From f943a2fe9d3a60fbde44c9f6b8e453f7dda7390a Mon Sep 17 00:00:00 2001
From: NourElh <734092651fcdd5add927271f472626a6@app-learninglab.inria.fr>
Date: Wed, 2 Nov 2022 12:45:23 +0000
Subject: [PATCH] Update exo5_en.Rmd

---
 module2/exo5/exo5_en.Rmd | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/module2/exo5/exo5_en.Rmd b/module2/exo5/exo5_en.Rmd
index f9003e3..7dc084b 100644
--- a/module2/exo5/exo5_en.Rmd
+++ b/module2/exo5/exo5_en.Rmd
@@ -30,14 +30,27 @@ data = read.csv("shuttle.csv",header=T)
 data
 ```
 
+
 The data set shows us the date of each test, the number of O-rings
 (there are 6 on the main launcher), the temperature (in Fahrenheit)
 and pressure (in psi), and finally the number of identified
 malfunctions.
 
+### Comment:
+There is no data point at temperatures as low as the one expected on launch day => lack of data.
+
 # Graphical inspection
 Flights without incidents do not provide any information
 on the influence of temperature or pressure on malfunction.
+### New analysis:
+However, I'd like to visualize the data of successful flights.
+```{r}
+d = data[data$Malfunction==0,]
+d
+```
+We can see that for high temperatures, there are failures, so excluding these data may prevent us from detecting the origin of the failures.
+
+### Continue
 We thus focus on the experiments in which at least one O-ring was defective.
 
 ```{r}
@@ -78,6 +91,14 @@ and the standard error of this estimator is 0.049, in other words we
 cannot distinguish any particular impact and we must take our
 estimates with caution.
 
+### New analysis
+```{r}
+library(ggplot2)
+ggplot(d,aes(x=Temperature, y=Malfunction/Count))+geom_point(alpha=.3,size=3)+theme_bw()+geom_smooth(method = "glm",method.args=list(family="binomial"))+xlim(0,200)
+```
+
+From the graph, we see that the uncertainty of this estimate is large, so the possibility that temperature has an effect cannot be ruled out.
+
 # Estimation of the probability of O-ring malfunction
 The expected temperature on the take-off day is 31°F. Let's try to
 estimate the probability of O-ring malfunction at
@@ -85,6 +106,8 @@ this temperature from the model we just built:
 
 ```{r}
 # shuttle=shuttle[shuttle$r!=0,]
+logistic_reg = glm(data=data, Malfunction/Count ~ Temperature, weights=Count,
+                   family=binomial(link='logit'))
 tempv = seq(from=30, to=90, by = .5)
 rmv <- predict(logistic_reg,list(Temperature=tempv),type="response")
 plot(tempv,rmv,type="l",ylim=c(0,1))
@@ -117,3 +140,6 @@ of NASA, which have a lot to do with this fiasco, the previous analysis
 includes (at least) a small problem.... Can you find it? You are free to
 modify this analysis and to look at this dataset from all angles in order
 to to explain what's wrong.
+
+### Comment:
+The lack of low-temperature data, the aggregation of the data, and ignoring the uncertainty of the estimate have led to this mistake.
-- 
2.18.1