Commit f943a2fe authored by NourElh

Update exo5_en.Rmd

parent 020c82b5
@@ -30,14 +30,27 @@ data = read.csv("shuttle.csv",header=T)
data
```
The data set shows us the date of each test, the number of O-rings
(there are 6 on the main launcher), the
temperature (in Fahrenheit) and pressure (in psi), and finally the
number of identified malfunctions.
### Comment:
There are no data points at temperatures as low as the one expected at launch, so the low-temperature range is not covered by the data.
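A quick way to check this is to compare the observed temperature range with the launch-day forecast. The sketch below assumes the `data` frame loaded above with its `Temperature` column; the helper name `is_extrapolation` is hypothetical, and 31 is the forecast temperature quoted in the estimation section.

```{r}
# Hypothetical helper: TRUE when `temp` lies outside the observed range,
# i.e. when any prediction at `temp` would be an extrapolation.
is_extrapolation <- function(temperatures, temp) {
  temp < min(temperatures) || temp > max(temperatures)
}

# Assumes `data` was loaded from shuttle.csv as above; the guard skips
# this part when `data` is not a data frame.
if (exists("data") && is.data.frame(data)) {
  print(range(data$Temperature))
  print(is_extrapolation(data$Temperature, 31))
}
```

If the second value printed is `TRUE`, any estimate at 31°F relies entirely on the shape of the fitted model rather than on nearby observations.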
# Graphical inspection
Flights without incidents do not provide any information
on the influence of temperature or pressure on malfunction.
### New analysis:
However, I would still like to look at the data from the successful flights.
```{r}
d = data[data$Malfunction==0,]
d
```
We can see that there are failures at high temperatures, so excluding these data may prevent us from identifying the origin of the failures.
### Continue
We thus focus on the experiments in which at least one O-ring was defective.
```{r}
@@ -78,6 +91,14 @@ and the standard error of this estimator is 0.049, in other words we
cannot distinguish any particular impact and we must take our
estimates with caution.
### New analysis
```{r}
library(ggplot2)
ggplot(d, aes(x = Temperature, y = Malfunction / Count)) +
  geom_point(alpha = .3, size = 3) + theme_bw() + xlim(0, 200) +
  geom_smooth(method = "glm", method.args = list(family = "binomial"))
```
The graph shows that the uncertainty on this estimate is large, so an effect of temperature cannot be ruled out.
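The width of that uncertainty band can also be read off numerically. The following is a minimal sketch, not part of the original analysis: the helper `predict_with_ci` is a hypothetical name, it assumes a fitted binomial `glm` whose only predictor is `Temperature`, and it builds an approximate normal interval on the link scale before mapping it back to a probability.

```{r}
# Hypothetical helper: predicted probability with an approximate
# confidence interval, computed on the link scale and mapped back
# to the probability scale with the model's inverse link function.
predict_with_ci <- function(model, temperature, level = 0.95) {
  p <- predict(model, newdata = data.frame(Temperature = temperature),
               type = "link", se.fit = TRUE)
  z <- qnorm(1 - (1 - level) / 2)
  inv <- family(model)$linkinv
  data.frame(Temperature = temperature,
             fit   = inv(p$fit),
             lower = inv(p$fit - z * p$se.fit),
             upper = inv(p$fit + z * p$se.fit))
}
```

For a model such as `logistic_reg`, calling `predict_with_ci(logistic_reg, 31)` would return the point estimate at 31°F together with its lower and upper bounds; a wide interval there is another sign that the data say little about low temperatures.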
# Estimation of the probability of O-ring malfunction
The expected temperature on the take-off day is 31°F. Let's try to
estimate the probability of O-ring malfunction at
@@ -85,6 +106,8 @@ this temperature from the model we just built:
```{r}
# shuttle=shuttle[shuttle$r!=0,]
logistic_reg = glm(data=data, Malfunction/Count ~ Temperature, weights=Count,
family=binomial(link='logit'))
tempv = seq(from=30, to=90, by = .5)
rmv <- predict(logistic_reg,list(Temperature=tempv),type="response")
plot(tempv,rmv,type="l",ylim=c(0,1))
@@ -117,3 +140,6 @@ of NASA, which have a lot to do with this fiasco, the previous analysis
includes (at least) a small problem.... Can you find it?
You are free to modify this analysis and to look at this dataset
from all angles in order to explain what's wrong.
### Comment:
Three things led to this mistake: the lack of low-temperature data, the aggregation of the data, and the failure to consider the uncertainty of the estimate.