In this document we redo part of the analysis from Risk Analysis of the Space Shuttle: Pre-Challenger Prediction of Failure by Siddhartha R. Dalal, Edward B. Fowlkes, and Bruce Hoadley, published in the Journal of the American Statistical Association, Vol. 84, No. 408 (Dec., 1989), pp. 945-957, and available at http://www.jstor.org/stable/2290069.

On the fourth page of this article, the authors indicate that the maximum likelihood estimates of the logistic regression using only temperature are \(\hat{\alpha}=5.085\) and \(\hat{\beta}=-0.1156\), and that their asymptotic standard errors are \(s_{\hat{\alpha}}=3.052\) and \(s_{\hat{\beta}}=0.047\). The goodness of fit reported for this model is \(G^2=18.086\) with 21 degrees of freedom. Our goal is to reproduce the computation behind these values and Figure 4 of this article, possibly in a nicer-looking way.
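Throughout, \(\alpha\) and \(\beta\) are the parameters of the usual logistic model linking the failure probability \(p\) of an O-ring to the temperature \(t\):

\[
p(t) = \frac{1}{1 + e^{-(\alpha + \beta t)}} .
\]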

Technical information on the computer on which the analysis is run

We will be using the R language, through RStudio, with the ggplot2 library. The ggplot2 version is 3.3.5 and the R version is 3.6.3.

library(ggplot2)

Loading and inspecting data

Let’s start by reading the data (assumed here to be stored in a local CSV file):

data = read.csv("shuttle.csv", header = TRUE)  # adjust the file name/path to your local copy of the data
data

We know from our previous experience on this data set that filtering the data is a really bad idea. We will therefore keep all observations and process the data set as a whole.

Let’s visually inspect how temperature affects malfunction:

plot(data=data, Malfunction/Count ~ Temperature, ylim=c(0,1))

Logistic regression

Let’s assume O-rings independently fail with the same probability which solely depends on temperature. A logistic regression should allow us to estimate the influence of temperature.
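Concretely, writing \(p(t)\) for the above failure probability at temperature \(t\), the number of malfunctions observed on flight \(i\) is modeled as

\[
\mathrm{Malfunction}_i \sim \mathcal{B}\big(\mathrm{Count}_i,\; p(\mathrm{Temperature}_i)\big),
\]

which is why the call below passes the ratio Malfunction/Count as the response and Count as the weights.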

logistic_reg = glm(data=data, Malfunction/Count ~ Temperature, weights=Count,
                   family=binomial(link='logit'))
summary(logistic_reg)
## 
## Call:
## glm(formula = Malfunction/Count ~ Temperature, family = binomial(link = "logit"), 
##     data = data, weights = Count)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.95227  -0.78299  -0.54117  -0.04379   2.65152  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  5.08498    3.05247   1.666   0.0957 .
## Temperature -0.11560    0.04702  -2.458   0.0140 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 24.230  on 22  degrees of freedom
## Residual deviance: 18.086  on 21  degrees of freedom
## AIC: 35.647
## 
## Number of Fisher Scoring iterations: 5

The maximum likelihood estimates of the intercept and of the Temperature coefficient are thus \(\hat{\alpha}=5.08498\) and \(\hat{\beta}=-0.11560\), and their standard errors are \(s_{\hat{\alpha}} = 3.05247\) and \(s_{\hat{\beta}} = 0.04702\). The residual deviance corresponds to the goodness of fit \(G^2=18.086\) with 21 degrees of freedom. All these values match the ones reported by Dalal et al.
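These quantities can also be extracted programmatically rather than read off the printed summary (a quick check; the values are not reproduced here):

# Coefficient estimates and their standard errors
summary(logistic_reg)$coefficients
# Goodness of fit G^2 (residual deviance) and its degrees of freedom
deviance(logistic_reg)
df.residual(logistic_reg)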

Predicting failure probability

The temperature when launching the shuttle was 31°F. Let’s try to estimate the failure probability at such a temperature using our model:

# Evaluate the fitted model on a temperature grid from 30°F to 90°F
tempv = seq(from=30, to=90, by = .5)
rmv <- predict(logistic_reg,list(Temperature=tempv),type="response")
plot(tempv,rmv,type="l",ylim=c(0,1))
points(data=data, Malfunction/Count ~ Temperature)

This figure is very similar to Figure 4 of Dalal et al.
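For the record, the failure probability at the actual launch temperature of 31°F can also be obtained directly from the model; with the estimates above it is roughly 0.82 (a quick check, output not reproduced here):

predict(logistic_reg, list(Temperature=31), type="response")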

Confidence on the prediction

Let’s try to plot confidence intervals with ggplot2.

ggplot(data, aes(y=Malfunction/Count, x=Temperature)) + geom_point(alpha=.2, size = 2, color="blue") +
  geom_smooth(method = "glm", method.args = list(family = "binomial"), fullrange=T) +
  xlim(30,90) + ylim(0,1) + theme_bw()
## `geom_smooth()` using formula 'y ~ x'
## Warning in eval(family$initialize): non-integer #successes in a binomial glm!

Mmmh, I have a warning from ggplot2 indicating “non-integer #successes in a binomial glm!”. This seems fishy. Furthermore, this confidence region seems huge… It seems strange to me that the uncertainty grows so large at higher temperatures. Moreover, compared to my previous call to glm, I have not indicated the weights that account for the fact that each ratio Malfunction/Count corresponds to Count observations (if someone knows how to do this…). There must be something wrong.
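One possible answer to the parenthetical question: geom_smooth understands a weight aesthetic that is forwarded to the underlying fitting function, so something along these lines should reproduce the weighted fit (an untested sketch):

ggplot(data, aes(y=Malfunction/Count, x=Temperature)) +
  geom_point(alpha=.2, size = 2, color="blue") +
  geom_smooth(aes(weight=Count), method = "glm",
              method.args = list(family = "binomial"), fullrange=T) +
  xlim(30,90) + ylim(0,1) + theme_bw()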

So let’s provide the “raw” data to ggplot2.

# Expand the aggregated data into one 0/1 row per O-ring
data_flat = data.frame()
for(i in 1:nrow(data)) {
  temperature = data[i, "Temperature"]
  malfunction = data[i, "Malfunction"]
  # Count rows for this flight, initially all marked as "no malfunction"
  d = data.frame(Temperature=temperature, Malfunction=rep(0, times = data[i, "Count"]))
  if(malfunction > 0) {
    # Flag the first 'malfunction' rows as failures
    d[1:malfunction, "Malfunction"] = 1
  }
  data_flat = rbind(data_flat, d)
}
dim(data_flat)
## [1] 138   2
str(data_flat)
## 'data.frame':    138 obs. of  2 variables:
##  $ Temperature: int  66 66 66 66 66 66 70 70 70 70 ...
##  $ Malfunction: num  0 0 0 0 0 0 1 0 0 0 ...
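As an aside, the same expansion can be written more compactly with rep(); data_flat2 is just an illustrative name and is not used in the rest of the analysis:

# Repeat each temperature Count times, and generate Malfunction ones followed by zeros per flight
data_flat2 = data.frame(
  Temperature = rep(data$Temperature, data$Count),
  Malfunction = unlist(lapply(1:nrow(data), function(i)
    rep(c(1, 0), c(data$Malfunction[i], data$Count[i] - data$Malfunction[i]))))
)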

Let’s check whether I obtain the same regression or not:

logistic_reg_flat = glm(data=data_flat, Malfunction ~ Temperature, family=binomial(link='logit'))
summary(logistic_reg_flat)
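The two residual deviances can also be compared directly (a quick sanity check; values not reproduced here):

# Residual deviance and degrees of freedom of the "flat" 0/1 fit
deviance(logistic_reg_flat)
df.residual(logistic_reg_flat)
# Same quantities for the aggregated fit above
deviance(logistic_reg)
df.residual(logistic_reg)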

Perfect. The estimates and the standard errors are the same, although the residual deviance is different since the deviance is now measured with respect to each individual 0/1 observation and not to the ratios. Let’s now plot the regression for data_flat along with the ratios (data).

ggplot(data=data_flat, aes(y=Malfunction, x=Temperature)) +
  geom_smooth(method = "glm", method.args = list(family = "binomial"), fullrange=T) +
  geom_point(data=data, aes(y=Malfunction/Count, x=Temperature),alpha=.2, size = 2, color="blue") +
  geom_point(alpha=.5, size = .5) +
  xlim(30,90) + ylim(0,1) + theme_bw()
## `geom_smooth()` using formula 'y ~ x'

This confidence region seems much more reasonable (in accordance with the data) than the previous one. Let’s check whether it corresponds to the prediction obtained when calling predict directly. The prediction can be obtained on the response scale or through the link function.

Here is the “direct” (response) version I used in my very first plot:

pred = predict(logistic_reg_flat,list(Temperature=30),type="response",se.fit = T)
pred
## $fit
##        1 
## 0.834373 
## 
## $se.fit
##         1 
## 0.2293304 
## 
## $residual.scale
## [1] 1

The estimated failure probability at 30°F is thus \(0.834\). However, the \(se.fit\) value is hard to use on this scale: I obviously cannot simply add \(\pm 2\, se.fit\) to the fit to compute a confidence interval, since \(0.834 + 2 \times 0.229\) already exceeds 1.

Here is the “link” version:

pred_link = predict(logistic_reg_flat,list(Temperature=30),type="link",se.fit = T)
pred_link
## $fit
##        1 
## 1.616942 
## 
## $se.fit
## [1] 1.659473
## 
## $residual.scale
## [1] 1
logistic_reg$family$linkinv(pred_link$fit)
##        1 
## 0.834373

I recover \(0.834\) for the estimated failure probability at 30°F. But now, going through the linkinv function, we can actually use \(se.fit\):

critval = 1.96
logistic_reg$family$linkinv(c(pred_link$fit-critval*pred_link$se.fit,
                              pred_link$fit+critval*pred_link$se.fit))
##         1         1 
## 0.1630612 0.9923814
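In formula form, with \(\hat{\eta} = 1.617\) and \(s_{\hat{\eta}} = 1.659\) denoting the fit and se.fit on the link scale reported above, the computation just performed is

\[
\left[\,\mathrm{logit}^{-1}\big(\hat{\eta} - 1.96\, s_{\hat{\eta}}\big),\;
\mathrm{logit}^{-1}\big(\hat{\eta} + 1.96\, s_{\hat{\eta}}\big)\right],
\qquad \mathrm{logit}^{-1}(x) = \frac{1}{1+e^{-x}} .
\]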

The 95% confidence interval for our estimate is thus [0.163, 0.992]. This is what ggplot2 plotted above, so everything seems coherent.