diff --git a/module2/exo5/exo5_en.Rmd b/module2/exo5/exo5_en.Rmd
new file mode 100644
index 0000000000000000000000000000000000000000..819ba9f263ee410da787d67710fe27e860ab929c
--- /dev/null
+++ b/module2/exo5/exo5_en.Rmd
@@ -0,0 +1,133 @@
+---
+title: "Analysis of the risk of failure of the O-rings on the Challenger shuttle"
+author: "Gkiouzepi Eleni"
+date: "24/7/2021"
+output: html_document
+---
+
+On January 27, 1986, the day before the takeoff of the shuttle _Challenger_,
+a three-hour teleconference was held between
+Morton Thiokol (the manufacturer of one of the engines) and NASA. The
+discussion focused on the consequences of the
+temperature at take-off of 31°F (just below
+0°C) for the success of the flight and in particular on the performance of the
+O-rings used in the engines. Indeed, no test
+had been performed at this temperature.
+
+The following study takes up some of the analyses carried out that
+night with the objective of assessing the potential influence of
+the temperature and pressure to which the O-rings are subjected
+on their probability of malfunction. Our starting point is
+the results of the experiments carried out by NASA engineers
+during the six years preceding the launch of the shuttle
+Challenger.
+
+# Loading the data
+We start by loading this data:
+
+```{r}
+data = read.csv("shuttle.csv",header=TRUE)
+data
+```
+
+The data set shows us the date of each test, the number of O-rings
+(there are 6 on the main launcher), the
+temperature (in Fahrenheit) and pressure (in psi), and finally the
+number of identified malfunctions.
+
+# Graphical inspection
+~~Flights without incidents do not provide any information
+on the influence of temperature or pressure on malfunction.
+We thus focus on the experiments in which at least one O-ring was defective.~~ **Wrong assumption**
+
+```{r}
+# mal = data[data$Malfunction>0,]
+```
+
+The temperature varies considerably, but
+the pressure is almost always 200 psi, which should
+simplify the analysis.
+
+How does the frequency of failure vary with temperature?
+```{r}
+plot(data=data, Malfunction/Count ~ Temperature, ylim=c(0,1))
+```
+
+At first glance, the dependence does not look very strong, but let's try to
+estimate the impact of temperature $t$ on the probability of O-ring malfunction.
+
+# Estimation of the temperature influence
+
+Suppose that each of the six O-rings is damaged with the same
+probability and independently of the others and that this probability
+depends only on the temperature. If $p(t)$ is this probability, the
+number $D$ of malfunctioning O-rings during a flight at
+temperature $t$ follows a binomial distribution with parameters $n=6$ and
+$p=p(t)$. To link $p(t)$ to $t$, we will therefore perform a
+logistic regression.
+
+```{r}
+logistic_reg = glm(data=data, Malfunction/Count ~ Temperature, weights=Count,
+                   family=binomial(link='logit'))
+summary(logistic_reg)
+
+# mal_logistic_reg = glm(data=mal, Malfunction/Count ~ Temperature, weights=Count,
+#                        family=binomial(link='logit'))
+# summary(mal_logistic_reg)
+```
+
+The maximum likelihood estimate of the temperature parameter is ~~0.001416~~ __-0.11560__
+and the standard error of this estimate is 0.047, in other words
+**WRONG** ~~we
+cannot distinguish any particular impact~~
+_the probability of malfunction decreases as temperature rises: each 1°F drop in temperature raises the log-odds of an O-ring malfunction by 0.1156 (the odds are multiplied by $e^{0.1156} \approx 1.12$),_ and we must take our
+estimates with caution.
+
+
+# Estimation of the probability of O-ring malfunction
+The expected temperature on the take-off day is 31°F.
Let's try to
+estimate the probability of O-ring malfunction at
+this temperature from the model we just built:
+
+```{r}
+# shuttle=shuttle[shuttle$r!=0,]
+tempv = seq(from=30, to=90, by = .5)
+# rmv_mal <- predict(mal_logistic_reg,list(Temperature=tempv),type="response")
+# plot(tempv,rmv_mal,type="l",ylim=c(0,1))
+# points(data=mal, Malfunction/Count ~ Temperature)
+
+
+rmv <- predict(logistic_reg,list(Temperature=tempv),se.fit=TRUE,type="response")
+plot(tempv,rmv$fit,type="l",ylim=c(0,1))
+lines(tempv,rmv$fit+rmv$se.fit,col="red")
+lines(tempv,rmv$fit-rmv$se.fit,col="red")
+points(data=data, Malfunction/Count ~ Temperature)
+```
+
+~~As expected from the initial data,~~ The
+temperature has a **VERY** ~~no~~ significant impact on the probability of failure of the
+O-rings. It will be ~~about 0.2~~ **above 0.8 on average, with the upper error curve even exceeding 1.0 (a probability cannot exceed 1, so read this as near-certain failure)**, ~~as in the tests
+where we had a failure of at least one joint~~ **so we expect at least 4 of the 6 joints to fail**. Let's ~~get back to the initial dataset to~~ estimate the probability of failure:
+
+```{r}
+# data_full = read.csv("shuttle.csv",header=T)
+# sum(data_full$Malfunction)/sum(data_full$Count)
+
+estim = predict(logistic_reg,list(Temperature=31),se.fit=TRUE,type="response")
+estim
+```
+
+This probability is thus about $p=`r round(estim$fit, digits = 5)`\pm`r round(estim$se.fit, digits = 5)`$. Since there are
+a primary and a secondary O-ring on each of the three parts of the
+launcher, the probability of failure of both joints of a launcher
+is $p^2 \approx `r round(estim$fit^2, digits = 2)`\pm`r round(2*estim$se.fit*estim$fit, digits = 2)`$. The probability of failure of any one of the
+launchers is $1-(1-p^2)^3 \approx `r round((1-(1-estim$fit^2)^3)*100, digits = 1)`\%$. ~~That would really be
+bad luck.... Everything is under control, so the takeoff can happen
+tomorrow as planned.~~ **ABORT! ABORT!
ABORT THE MISSION!**
+
+*Unfortunately, none of the above analysis was carried out properly, and* the next day the Challenger shuttle exploded,
+killing all seven crew members. The public was shocked, and in
+the subsequent investigation the reliability of the
+O-rings was questioned. Beyond NASA's internal communication problems,
+which had a lot to do with this fiasco, the previous analysis
+includes (at least) a small problem.