Commit f943a2fe authored by NourElh

Update exo5_en.Rmd

parent 020c82b5
@@ -30,14 +30,27 @@ data = read.csv("shuttle.csv",header=T)
data
```
The data set shows us the date of each test, the number of O-rings
(there are 6 on the main launcher), the
temperature (in Fahrenheit) and pressure (in psi), and finally the
number of identified malfunctions.
### Comment:
The data set contains no observations at temperatures as low as the one expected at launch, so any prediction at that temperature is an extrapolation from the data.
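A minimal sketch of how this gap could be checked. It assumes a small made-up data frame `toy` standing in for `shuttle.csv` (the values are illustrative, not the real measurements) and takes the 31°F launch temperature quoted later in this notebook:

```r
# Hypothetical stand-in for shuttle.csv: values are made up for illustration.
toy <- data.frame(
  Temperature = c(53, 57, 58, 63, 66, 67, 70, 70, 75, 79),
  Malfunction = c(2, 1, 1, 1, 0, 0, 1, 0, 1, 0),
  Count       = 6
)
launch_temp <- 31  # expected launch temperature in Fahrenheit

# Observed temperature range in the (toy) data
temp_range <- range(toy$Temperature)

# TRUE when the launch temperature lies outside the observed range
is_extrapolation <- launch_temp < temp_range[1] || launch_temp > temp_range[2]

# Distance between the coldest observation and the launch temperature
gap <- temp_range[1] - launch_temp

print(temp_range)
print(is_extrapolation)
print(gap)
```

On the real data, the same three lines applied to `data$Temperature` would show how far the model has to extrapolate.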
# Graphical inspection
Flights without incidents do not provide any information
on the influence of temperature or pressure on malfunction.
### New analysis:
However, I'd like to inspect the data from the successful flights first.
```{r}
d = data[data$Malfunction==0,]
d
```
We can see that there are failures at high temperatures too, so excluding the successful flights may prevent us from detecting the origin of the failures.
### Continue
We thus focus on the experiments in which at least one O-ring was defective.
```{r}
@@ -78,6 +91,14 @@ and the standard error of this estimator is 0.049, in other words we
cannot distinguish any particular impact and we must take our
estimates with caution.
### New analysis
```{r}
library(ggplot2)
ggplot(d, aes(x=Temperature, y=Malfunction/Count)) +
  geom_point(alpha=.3, size=3) +
  theme_bw() +
  # logistic fit drawn together with its confidence band
  geom_smooth(method = "glm", method.args=list(family="binomial")) +
  xlim(0,200)
```
The graph shows a large uncertainty around this estimate, so we cannot rule out an effect of temperature.
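The uncertainty read off the graph can also be expressed numerically. The sketch below is only an illustration: it fits the same kind of logistic model on a made-up data frame `toy` (hypothetical values, not the real measurements) and computes a Wald 95% confidence interval for the temperature coefficient:

```r
# Hypothetical stand-in for shuttle.csv: values are made up for illustration.
toy <- data.frame(
  Temperature = c(53, 57, 58, 63, 66, 67, 70, 70, 75, 79),
  Malfunction = c(2, 1, 1, 1, 0, 0, 1, 0, 1, 0),
  Count       = 6
)

# Logistic regression on the proportion of failed O-rings, weighted by Count
fit <- glm(Malfunction / Count ~ Temperature, data = toy,
           weights = Count, family = binomial(link = "logit"))

# Estimate and standard error of the temperature coefficient
coefs <- summary(fit)$coefficients
est <- coefs["Temperature", "Estimate"]
se  <- coefs["Temperature", "Std. Error"]

# Wald 95% confidence interval: estimate +/- 1.96 standard errors
ci <- est + c(-1, 1) * qnorm(0.975) * se
print(c(estimate = est, std_error = se, lower = ci[1], upper = ci[2]))
```

If the interval contains 0, the data do not let us separate "no effect" from a real effect, which is different from showing that temperature has no effect.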
# Estimation of the probability of O-ring malfunction
The expected temperature on the take-off day is 31°F. Let's try to
estimate the probability of O-ring malfunction at
@@ -85,6 +106,8 @@ this temperature from the model we just built:
```{r}
# shuttle=shuttle[shuttle$r!=0,]
logistic_reg = glm(data=data, Malfunction/Count ~ Temperature, weights=Count,
family=binomial(link='logit'))
tempv = seq(from=30, to=90, by = .5)
rmv <- predict(logistic_reg,list(Temperature=tempv),type="response")
plot(tempv,rmv,type="l",ylim=c(0,1))
@@ -117,3 +140,6 @@ of NASA, which have a lot to do with this fiasco, the previous analysis
includes (at least) a small problem.... Can you find it?
You are free to modify this analysis and to look at this dataset
from all angles in order to explain what's wrong.
### Comment:
Three things led to this mistake: the lack of data at low temperatures, the aggregation of the data, and the failure to account for the uncertainty of the estimate.
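One way to make the uncertainty of the prediction explicit is to attach a confidence band to the estimated probability at the launch temperature. The sketch below assumes a made-up data frame `toy` (hypothetical values, not the real measurements) and builds the band on the logit scale before mapping it back to probabilities:

```r
# Hypothetical stand-in for shuttle.csv: values are made up for illustration.
toy <- data.frame(
  Temperature = c(53, 57, 58, 63, 66, 67, 70, 70, 75, 79),
  Malfunction = c(2, 1, 1, 1, 0, 0, 1, 0, 1, 0),
  Count       = 6
)

fit <- glm(Malfunction / Count ~ Temperature, data = toy,
           weights = Count, family = binomial(link = "logit"))

# Prediction at 31°F on the link (logit) scale, with its standard error
pred <- predict(fit, newdata = data.frame(Temperature = 31),
                type = "link", se.fit = TRUE)

# Approximate 95% band on the logit scale, mapped back with plogis()
z      <- qnorm(0.975)
p_hat  <- plogis(pred$fit)
p_low  <- plogis(pred$fit - z * pred$se.fit)
p_high <- plogis(pred$fit + z * pred$se.fit)
print(c(estimate = p_hat, lower = p_low, upper = p_high))
```

Working on the logit scale keeps the band inside [0, 1]. A wide band at 31°F would be the numerical counterpart of the lack of low-temperature data noted above.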