First, we read the data and obtain a scatter plot matrix of the response variable Defective, \(Y\), against the independent variables. We notice that the relationship between \(Y\) and each of the \(X\)'s looks quadratic rather than linear, so a transformation might be needed.
defects <- read.table("https://gattonweb.uky.edu/sheather/book/docs/datasets/defects.txt", header=TRUE)
attach(defects)
# A scatter plot matrix of the defects data
pairs(Defective ~ Temperature + Density + Rate)
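As a numeric complement to the scatter plot matrix (an addition, not part of the original analysis), we can also inspect the pairwise correlations among the response and the predictors:
# Pairwise correlations among the response and the predictors
round(cor(defects[, c("Defective", "Temperature", "Density", "Rate")]), 3)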
After that we fit the regression model \[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon,\] where \(X_1\), \(X_2\), and \(X_3\) denote Temperature, Density, and Rate, respectively.
Then we plot the standardized residuals against each of the predictors and against the fitted values. Note that we add a locally weighted scatterplot smoother (LOWESS) to each plot to help visualize the relationship between the standardized residuals and each variable.
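Recall that the \(i\)-th standardized residual is \[r_i = \frac{\hat{e}_i}{\hat{\sigma}\sqrt{1-h_{ii}}},\] where \(\hat{e}_i\) is the ordinary residual, \(\hat{\sigma}\) is the estimate of the residual standard deviation, and \(h_{ii}\) is the \(i\)-th leverage value; this is the quantity computed by rstandard() below.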
m1 <- lm(Defective ~ Temperature + Density + Rate)
# Plots of the standardized residuals against X1, X2, X3, and the fitted values
par(mfrow=c(2,2))
StanRes1 <- rstandard(m1)
plot(Temperature, StanRes1, ylab = "Standardized Residuals")
lines(lowess(StanRes1 ~ Temperature), col = 2)
plot(Density, StanRes1, ylab = "Standardized Residuals")
lines(lowess(StanRes1 ~ Density), col = 2)
plot(Rate, StanRes1, ylab = "Standardized Residuals")
lines(lowess(StanRes1 ~ Rate), col = 2)
plot(m1$fitted.values, StanRes1, ylab = "Standardized Residuals", xlab = "Fitted Values")
lines(lowess(StanRes1 ~ m1$fitted.values), col = 2)
The plots of the standardized residuals against each predictor and against the fitted values clearly do not produce random scatter; each shows a parabola-like pattern. Thus the fitted model is not valid, and it is natural to consider a transformation of \(Y\).
Now, to check whether the relationship is quadratic, i.e., something like \(Y = g(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon) \approx [\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon]^2\), we plot the values of \(Y\) against the fitted values and add both a straight line and a quadratic curve. The plot confirms that the pattern is quadratic rather than linear.
# Plot of Y against fitted values with a straight line and a quadratic curve
par(mfrow = c(1,1))
fit1 <- m1$fitted.values
m2 <- lm(Defective ~ fit1 + I(fit1^2))
plot(fit1, Defective, xlab = "Fitted Values")
fitnew <- seq(-15,60, len = 76)
lines(fitnew,predict(m2, newdata = data.frame(fit1 = fitnew)))
abline(lsfit(m1$fitted.values, Defective), lty = 2)
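As a complementary numeric check (not part of the original analysis), an F-test comparing the straight-line and quadratic fits of \(Y\) on the fitted values should confirm that the quadratic term is highly significant:
# Compare the straight-line fit to the quadratic fit of Y on the fitted values
m2lin <- lm(Defective ~ fit1)
anova(m2lin, m2)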
We use the invResPlot function from the car package to estimate \(\lambda\), obtaining \(\hat{\lambda} \approx 0.44\).
library("car")
invResPlot(m1)
## lambda RSS
## 1 0.4359947 541.9376
## 2 -1.0000000 6465.6551
## 3 0.0000000 1572.7444
## 4 1.0000000 1156.2704
The inverse response plot suggests that we should transform \(Y\) by taking the square root, and thus consider the model \[\sqrt{Y} = g^{-1}(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon.\]
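To make the criterion behind the inverse response plot concrete, here is a minimal sketch, assuming that for each candidate \(\lambda\) the fitted values are regressed on \(Y^{\lambda}\) (with \(\log Y\) at \(\lambda = 0\)) and the residual sum of squares is recorded; it should roughly reproduce the RSS column in the output above:
# Sketch of the RSS criterion behind invResPlot (assumes the basic power
# transform Y^lambda, with log(Y) at lambda = 0)
rss_lambda <- function(lam) {
  ty <- if (lam == 0) log(Defective) else Defective^lam
  sum(resid(lm(fit1 ~ ty))^2)
}
sapply(c(-1, 0, 0.44, 1), rss_lambda)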
Now, we apply the Box-Cox procedure using the boxcox function from the R package MASS to estimate the parameter \(\lambda\) that maximizes the log-likelihood.
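Recall the Box-Cox family of transformations \[y^{(\lambda)} = \begin{cases} \dfrac{y^{\lambda}-1}{\lambda}, & \lambda \neq 0,\\ \log y, & \lambda = 0, \end{cases}\] where \(\hat{\lambda}\) is chosen to maximize the profile log-likelihood of the normal linear model fitted to \(y^{(\lambda)}\).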
# Log-likelihood for the Box-Cox transformation method
library(MASS)
bc <- boxcox(m1, lambda = seq(0.3, 0.65, length = 20))
bc$x[which.max(bc$y)] # the value of lambda that maximizes the log-likelihood
## [1] 0.4520202
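An approximate 95% confidence interval for \(\lambda\) can also be read off the profile log-likelihood returned by boxcox(), using the usual \(\chi^2_1\) cutoff (a standard construction, not shown in the original output):
# Lambdas whose log-likelihood lies within qchisq(0.95, 1)/2 of the maximum
# (note the interval is limited to the lambda grid searched above)
range(bc$x[bc$y > max(bc$y) - qchisq(0.95, 1)/2])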
We conclude from the Box-Cox procedure that \(\hat{\lambda} = 0.45 \approx 0.5\). Thus, it is reasonable to model the square root of the Defective variable as a linear function of \(X_1, X_2, X_3\), i.e.,
\[\sqrt{Y} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon.\] The next step is to examine whether the square-root transformation makes the relationship between \(\sqrt{Y}\) and each of \(X_1, X_2, X_3\) linear.
# Plots of square root of Y against each predictor
par(mfrow = c(2, 2))
plot(Temperature, sqrt(Defective), ylab = expression(sqrt(Defective)))
plot(Density, sqrt(Defective), ylab = expression(sqrt(Defective)))
plot(Rate, sqrt(Defective), ylab = expression(sqrt(Defective)))
It is evident from the plots that the relationship between \(\sqrt{Y}\) and each predictor is more linear than the relationship between Y and each predictor.
Now, we fit the model after transformation and perform diagnostic checks using the standardized residuals.
# Fit the model after transformation
mt <- lm(sqrt(Defective) ~ Temperature + Density + Rate)
# Plots of the standardized residuals (Diagnostic checks)
par(mfrow = c(2, 2))
StanRest <- rstandard(mt)
plot(Temperature, StanRest, ylab = "Standardized Residuals")
plot(Density, StanRest, ylab = "Standardized Residuals")
plot(Rate, StanRest, ylab = "Standardized Residuals")
plot(mt$fitted.values, StanRest, ylab = "Standardized Residuals", xlab = "Fitted Values")
# Plots of square root of Y against fitted values with a straight line added
par(mfrow = c(1, 1))
plot(mt$fitted.values, sqrt(Defective), xlab = "Fitted Values", ylab = expression(sqrt(Defective)))
abline(lsfit(mt$fitted.values, sqrt(Defective)))
# Use diagnostic plots provided by R
par(mfrow = c(2, 2))
plot(mt)
The diagnostic plots show that the transformed model is a valid model for the data, and from the regression output summary the fitted model is
\[\widehat{\sqrt{\text{Defective}}} = 5.59297 + 1.56516 \times \text{Temperature} - 0.29166 \times \text{Density} + 0.01290 \times \text{Rate}.\]
# Final Regression output
summary(mt)
##
## Call:
## lm(formula = sqrt(Defective) ~ Temperature + Density + Rate)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.10147 -0.28502 -0.07716 0.34139 1.13951
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.59297 5.26401 1.062 0.2978
## Temperature 1.56516 0.66226 2.363 0.0259 *
## Density -0.29166 0.11954 -2.440 0.0218 *
## Rate 0.01290 0.01043 1.237 0.2273
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5677 on 26 degrees of freedom
## Multiple R-squared: 0.943, Adjusted R-squared: 0.9365
## F-statistic: 143.5 on 3 and 26 DF, p-value: 2.713e-16
Note that the variable Rate is not statistically significant, thus not supporting Ole's theory. On the other hand, the coefficient of Density is statistically significantly negative, which supports Ole's theory that increasing the density lowers the defect rate. However, Rate (which is related to the other two predictors) still needs to be considered when adjustments are made to one or both of the statistically significant predictors.
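Finally, since the fitted model is on the square-root scale, a point prediction on the original scale can be sketched by squaring the prediction (the predictor values below are hypothetical, chosen for illustration only; note that squaring the estimated mean of \(\sqrt{Y}\) slightly underestimates the mean of \(Y\)):
# Hypothetical new production run (illustrative values, not from the data)
new <- data.frame(Temperature = 2, Density = 25, Rate = 230)
pred_sqrt <- predict(mt, newdata = new, interval = "prediction")
pred_sqrt^2 # back-transform the prediction and its interval to the original scale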
detach(defects)