First, we read the data and obtain a scatter plot matrix of the response variable Defective, \(Y\), against the independent variables. We notice that the relationship between \(Y\) and each of the \(X\)'s looks quadratic rather than linear, so a transformation might be needed.
defects <- read.table("https://gattonweb.uky.edu/sheather/book/docs/datasets/defects.txt", header=TRUE)
attach(defects)
# A scatter plot matrix of the defects data
pairs(Defective ~ Temperature + Density + Rate)
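As a numeric complement to the scatter plot matrix (an addition, not part of the original analysis), we can also inspect the pairwise correlations among the response and the predictors:
# Pairwise correlations among the response and the predictors
round(cor(defects[, c("Defective", "Temperature", "Density", "Rate")]), 3)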
After that we fit the regression model \[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon,\] where \(X_1\), \(X_2\), and \(X_3\) denote Temperature, Density, and Rate, respectively.
Then we plot the standardized residuals against each of the predictors and against the fitted values. Note that we add a locally weighted scatterplot smoother (LOWESS) to each plot to help visualize the relationship between the standardized residuals and each variable.
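Recall that the \(i\)-th standardized residual is \[r_i = \frac{\hat{e}_i}{\hat{\sigma}\sqrt{1-h_{ii}}},\] where \(\hat{e}_i\) is the ordinary residual, \(\hat{\sigma}\) is the estimate of the residual standard deviation, and \(h_{ii}\) is the \(i\)-th leverage value; this is the quantity computed by rstandard() below.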
m1 <- lm(Defective ~ Temperature + Density + Rate)
# Plots of the standardized residuals against X1, X2, X3, and the fitted values
par(mfrow=c(2,2))
StanRes1 <- rstandard(m1)
plot(Temperature, StanRes1, ylab = "Standardized Residuals")
lines(lowess(StanRes1 ~ Temperature), col = 2)
plot(Density, StanRes1, ylab = "Standardized Residuals")
lines(lowess(StanRes1 ~ Density), col = 2)
plot(Rate, StanRes1, ylab = "Standardized Residuals")
lines(lowess(StanRes1 ~ Rate), col = 2)
plot(m1$fitted.values, StanRes1, ylab = "Standardized Residuals", xlab = "Fitted Values")
lines(lowess(StanRes1 ~ m1$fitted.values), col = 2)
The plots of the standardized residuals against each predictor and against the fitted values clearly do not produce random scatter; each shows a parabola-like pattern. Thus the fitted model is not valid, and it is natural to consider a transformation of \(Y\).
Now, to check whether the relationship is quadratic, i.e., something like \(Y = g(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon) \approx [\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon]^2\), we plot the values of \(Y\) against the fitted values and add both a straight line and a quadratic curve. The plot confirms that the pattern is quadratic rather than linear.
# Plot of Y against fitted values with a straight line and a quadratic curve
par(mfrow = c(1,1))
fit1 <- m1$fitted.values
m2 <- lm(Defective ~ fit1 + I(fit1^2))
plot(fit1, Defective, xlab = "Fitted Values")
fitnew <- seq(-15,60, len = 76)
lines(fitnew,predict(m2, newdata = data.frame(fit1 = fitnew)))
abline(lsfit(m1$fitted.values, Defective), lty = 2)
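As a complementary numeric check (not part of the original analysis), an F-test comparing the straight-line and quadratic fits of \(Y\) on the fitted values should confirm that the quadratic term is highly significant:
# Compare the straight-line fit to the quadratic fit of Y on the fitted values
m2lin <- lm(Defective ~ fit1)
anova(m2lin, m2)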
We use the invResPlot function from the car package to estimate \(\lambda\), obtaining \(\hat{\lambda} \approx 0.44\).
library("car")
invResPlot(m1)
## lambda RSS
## 1 0.4359947 541.9376
## 2 -1.0000000 6465.6551
## 3 0.0000000 1572.7444
## 4 1.0000000 1156.2704
The inverse response plot suggests that we should transform \(Y\) by taking the square root, and thus consider the model \[\sqrt{Y} = g^{-1}(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon.\]
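To make the criterion behind the inverse response plot concrete, here is a minimal sketch, assuming that for each candidate \(\lambda\) the fitted values are regressed on \(Y^{\lambda}\) (with \(\log Y\) at \(\lambda = 0\)) and the residual sum of squares is recorded; it should roughly reproduce the RSS column in the output above:
# Sketch of the RSS criterion behind invResPlot (assumes the basic power
# transform Y^lambda, with log(Y) at lambda = 0)
rss_lambda <- function(lam) {
  ty <- if (lam == 0) log(Defective) else Defective^lam
  sum(resid(lm(fit1 ~ ty))^2)
}
sapply(c(-1, 0, 0.44, 1), rss_lambda)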
Now, we apply the Box-Cox procedure using the boxcox function from the R package MASS to estimate the parameter \(\lambda\) that maximizes the log-likelihood.
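Recall the Box-Cox family of transformations \[y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda}-1}{\lambda}, & \lambda \neq 0,\\ \log y, & \lambda = 0, \end{cases}\] where \(\hat{\lambda}\) is chosen to maximize the profile log-likelihood of the normal linear model fitted to \(y^{(\lambda)}\).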
# Log-likelihood for the Box-Cox transformation method
library(MASS)
bc <- boxcox(m1, lambda = seq(0.3, 0.65, length = 20))
bc$x[which.max(bc$y)] # the value of lambda that maximizes the log-likelihood
## [1] 0.4520202
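An approximate 95% confidence interval for \(\lambda\) can also be read off the profile log-likelihood returned by boxcox(), using the usual \(\chi^2_1\) cutoff (a standard construction, not shown in the original output):
# Lambdas whose log-likelihood lies within qchisq(0.95, 1)/2 of the maximum
# (note the interval is limited to the lambda grid searched above)
range(bc$x[bc$y > max(bc$y) - qchisq(0.95, 1)/2])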
We conclude from the Box-Cox procedure that \(\hat{\lambda} = 0.45 \approx 0.5\). Thus, it is reasonable to model the square root of the Defective variable as a linear function of \(X_1, X_2, X_3\), i.e.,
\[\sqrt{Y} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon.\] The next step is to examine whether the square-root transformation makes the relationship between \(\sqrt{Y}\) and each of \(X_1, X_2, X_3\) linear.
# Plots of square root of Y against each predictor
par(mfrow = c(2, 2))
plot(Temperature, sqrt(Defective), ylab = expression(sqrt(Defective)))
plot(Density, sqrt(Defective), ylab = expression(sqrt(Defective)))
plot(Rate, sqrt(Defective), ylab = expression(sqrt(Defective)))
It is evident from the plots that the relationship between \(\sqrt{Y}\) and each predictor is more linear than the relationship between Y and each predictor.
Now, we fit the model after transformation and perform diagnostic checks using the standardized residuals.
# Fit the model after transformation
mt <- lm(sqrt(Defective) ~ Temperature + Density + Rate)
# Plots of the standardized residuals (Diagnostic checks)
par(mfrow = c(2, 2))
StanRest <- rstandard(mt)
plot(Temperature, StanRest, ylab = "Standardized Residuals")
plot(Density, StanRest, ylab = "Standardized Residuals")
plot(Rate, StanRest, ylab = "Standardized Residuals")
plot(mt$fitted.values, StanRest, ylab = "Standardized Residuals", xlab = "Fitted Values")
# Plots of square root of Y against fitted values with a straight line added
par(mfrow = c(1, 1))
plot(mt$fitted.values, sqrt(Defective), xlab = "Fitted Values", ylab = expression(sqrt(Defective)))
abline(lsfit(mt$fitted.values, sqrt(Defective)))
# Use diagnostic plots provided by R
par(mfrow = c(2, 2))
plot(mt)
The diagnostic plots show that the transformed model is a valid model for the data, and from the regression output summary the fitted model is
\[\widehat{\sqrt{\text{Defective}}} = 5.59297 + 1.56516 \times \text{Temperature} - 0.29166 \times \text{Density} + 0.01290 \times \text{Rate}.\]
# Final Regression output
summary(mt)
##
## Call:
## lm(formula = sqrt(Defective) ~ Temperature + Density + Rate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.10147 -0.28502 -0.07716 0.34139 1.13951
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.59297 5.26401 1.062 0.2978
## Temperature 1.56516 0.66226 2.363 0.0259 *
## Density -0.29166 0.11954 -2.440 0.0218 *
## Rate 0.01290 0.01043 1.237 0.2273
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5677 on 26 degrees of freedom
## Multiple R-squared: 0.943, Adjusted R-squared: 0.9365
## F-statistic: 143.5 on 3 and 26 DF, p-value: 2.713e-16
Note that the variable Rate is not statistically significant, thus not supporting Ole's theory. On the other hand, the coefficient of Density is statistically significantly negative, which supports Ole's theory that increasing the density lowers the defect rate. However, Rate (which is related to the other two predictors) still needs to be considered when adjustments are made to one or both of the statistically significant predictors.
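Finally, since the fitted model is on the square-root scale, a point prediction on the original scale can be sketched by squaring the prediction (the predictor values below are hypothetical, chosen for illustration only; note that squaring the estimated mean of \(\sqrt{Y}\) slightly underestimates the mean of \(Y\)):
# Hypothetical new production run (illustrative values, not from the data)
new <- data.frame(Temperature = 2, Density = 25, Rate = 230)
pred_sqrt <- predict(mt, newdata = new, interval = "prediction")
pred_sqrt^2 # back-transform the prediction and its interval to the original scale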
detach(defects)