a possible analysis

Commit cc1cbbd1 authored Jul 24, 2021 by feee8ecec645c53ceeda57e948ea51be

Showing with 133 additions and 0 deletions

exo5_en.Rmd module2/exo5/exo5_en.Rmd +133 -0

--- a/module2/exo5/exo5_en.Rmd
+++ b/module2/exo5/exo5_en.Rmd
+---
+title: "Analysis of the risk of failure of the O-rings on the Challenger shuttle"
+author: "Gkiouzepi Eleni"
+date: "24/7/2021"
+output: html_document
+---
+On January 27, 1986, the day before the takeoff of the shuttle _Challenger_,
+a three-hour teleconference was held between 
+Morton Thiokol (the manufacturer of one of the engines) and NASA. The
+discussion focused on the consequences of the
+temperature at take-off of 31°F (just below
+0°C) for the success of the flight and in particular on the performance of the
+O-rings used in the engines. Indeed, no test
+had been performed at this temperature.
+The following study takes up some of the analyses carried out that
+night with the objective of assessing the potential influence of
+the temperature and pressure to which the O-rings are subjected
+on their probability of malfunction. Our starting point is 
+the results of the experiments carried out by NASA engineers
+during the six years preceding the launch of the shuttle
+Challenger.
+# Loading the data
+We start by loading this data:
+```{r}
+# Read the test records; TRUE is spelled out because T can be reassigned
+data = read.csv("shuttle.csv", header=TRUE)
+data
+```
+The data set shows us the date of each test, the number of O-rings
+(there are 6 on the main launcher), the
+temperature (in Fahrenheit) and pressure (in psi), and finally the
+number of identified malfunctions.
+# Graphical inspection
+~~Flights without incidents do not provide any information
+on the influence of temperature or pressure on malfunction.
+We thus focus on the experiments in which at least one O-ring was defective.~~ **Wrong assumption**
+```{r}
+# mal = data[data$Malfunction>0,]
+```
+We have a high temperature variability but
+the pressure is almost always 200, which should
+simplify the analysis.
+How does the frequency of failure vary with temperature?
+```{r}
+plot(data=data, Malfunction/Count ~ Temperature, ylim=c(0,1))
+```
+At first glance, the dependence does not look very important, but let's try to
+estimate the impact of temperature $t$ on the probability of O-ring malfunction.
+# Estimation of the temperature influence
+Suppose that each of the six O-rings is damaged with the same
+probability and independently of the others and that this probability
+depends only on the temperature. If $p(t)$ is this probability, the
+number $D$ of malfunctioning O-rings during a flight at
+temperature $t$ follows a binomial law with parameters $n=6$ and
+$p=p(t)$. To link $p(t)$ to $t$, we will therefore perform a
+logistic regression.
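+
+As a sketch of what this model assumes, with $\alpha$ and $\beta$ being notation introduced here for the intercept and the temperature coefficient (they are not named in the text above):
+
+$$
+D \sim \mathcal{B}\big(6,\, p(t)\big), \qquad \log\frac{p(t)}{1-p(t)} = \alpha + \beta\, t \quad\Longleftrightarrow\quad p(t) = \frac{1}{1+e^{-(\alpha+\beta t)}}
+$$
+
+Under this form, a negative $\beta$ means that the probability of malfunction rises as the temperature drops.
+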
+```{r}
+logistic_reg = glm(data=data, Malfunction/Count ~ Temperature, weights=Count, 
+                   family=binomial(link='logit'))
+summary(logistic_reg)
+# mal_logistic_reg = glm(data=mal, Malfunction/Count ~ Temperature, weights=Count, 
+#                    family=binomial(link='logit'))
+# summary(mal_logistic_reg)
+```
+The maximum likelihood estimate of the temperature parameter is ~~0.001416~~ __-0.11560__
+and the standard error of this estimate is 0.047, so the estimate lies about 2.5 standard errors below zero. In other words,
+**WRONG** ~~we
+cannot distinguish any particular impact~~
+_the failure probability decreases as temperature increases: the coefficient acts on the log-odds, so each 1°F drop in temperature adds 0.1156 to the log-odds of malfunction, multiplying the odds by $e^{0.1156} \approx 1.12$,_ and we must still take our
+estimates with caution.
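+
+To see why the coefficient cannot be read directly as a change in probability, here is a short derivation, assuming the logistic form $p(t) = 1/(1+e^{-(\alpha+\beta t)})$ with $\alpha$ and $\beta$ as notation for the intercept and the temperature coefficient (symbols introduced here for illustration):
+
+$$
+\frac{dp}{dt} = \beta\, p(t)\,\big(1-p(t)\big), \qquad \left|\frac{dp}{dt}\right| \le \frac{|\beta|}{4} \approx \frac{0.1156}{4} \approx 0.029 \ \text{per °F}
+$$
+
+The bound is reached at $p = 1/2$; when $p$ is close to 0 or 1, one degree changes the probability by much less.
+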
+# Estimation of the probability of O-ring malfunction
+The expected temperature on the take-off day is 31°F. Let's try to
+estimate the probability of O-ring malfunction at
+this temperature from the model we just built:
+```{r}
+# shuttle=shuttle[shuttle$r!=0,]
+tempv = seq(from=30, to=90, by = .5)
+# rmv_mal <- predict(mal_logistic_reg,list(Temperature=tempv),type="response")
+# plot(tempv,rmv_mal,type="l",ylim=c(0,1))
+# points(data=mal, Malfunction/Count ~ Temperature)
+rmv <- predict(logistic_reg,list(Temperature=tempv),se.fit=T,type="response")
+plot(tempv,rmv$fit,type="l",ylim=c(0,1))
+lines(tempv,rmv$fit+rmv$se.fit,col="red")
+lines(tempv,rmv$fit-rmv$se.fit,col="red")
+points(data=data, Malfunction/Count ~ Temperature)
+```
+~~As expected from the initial data~~, the
+temperature has a **VERY** ~~no~~ significant impact on the probability of failure of the
+O-rings. At 31°F the fitted probability is ~~about 0.2~~ **over 0.8**; the upper red curve (fit plus one standard error, computed on the response scale) rises above 1.0 there, which only reflects how uncertain this extrapolation is, since a probability cannot exceed 1,~~as in the tests
+where we had a failure of at least one joint~~ **so with six joints we expect more than 4 of them to fail on average**. Let's ~~get back to the initial dataset to~~ estimate the probability of failure:
+```{r}
+# data_full = read.csv("shuttle.csv",header=T)
+# sum(data_full$Malfunction)/sum(data_full$Count)
+estim = predict(logistic_reg,list(Temperature=31),se.fit=T,type="response")
+estim
+```
+This probability is thus about $p=`r round(estim$fit, digits = 5)`\pm`r round(estim$se.fit, digits = 5)`$. Knowing that there is
+a primary and a secondary O-ring on each of the three parts of the
+launcher, the probability of failure of both joints of a launcher
+is $p^2 \approx `r round(estim$fit^2, digits = 2)`\pm`r round(2*estim$se.fit*estim$fit, digits = 2)`$. The probability of failure of any one of the
+launchers is $1-(1-p^2)^3 \approx `r round((1-(1-estim$fit^2)^3)*100, digits = 1)`\%$.  ~~That would really be
+bad luck.... Everything is under control, so the takeoff can happen
+tomorrow as planned~~. **ABORT! ABORT! ABORT THE MISSION!**
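+
+The reasoning behind these two formulas can be sketched as follows, assuming (as the model above does) that joints fail independently with the same probability $p$, and writing $\sigma_p$ for the standard error of $p$ (a symbol introduced here). The value $p = 0.8$ is only a round illustrative figure, not the fitted estimate:
+
+$$
+P(\text{both joints of one part fail}) = p^2, \qquad \sigma_{p^2} \approx 2\,p\,\sigma_p, \qquad P(\text{at least one part fails}) = 1-(1-p^2)^3
+$$
+
+$$
+p = 0.8 \;\Rightarrow\; p^2 = 0.64, \qquad 1-(1-0.64)^3 = 1 - 0.36^3 \approx 0.953
+$$
+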
+*Unfortunately, none of the above analysis was carried out properly and* the next day, the Challenger shuttle exploded,
+taking its seven crew members with it. The public was shocked and in
+the subsequent investigation, the reliability of the
+O-rings was questioned. Beyond the internal communication problems
+of NASA, which have a lot to do with this fiasco, the previous analysis
+includes (at least) a small problem.