mooc-rr
Commit e73b08b7 authored Oct 15, 2025 by Wojciech Łoboda

ex5

parent d21d5e81
Showing 1 changed file with 103 additions and 57 deletions

module2/exo5/exo5_en.Rmd
@@ -3,26 +3,28 @@ title: "Analysis of the risk of failure of the O-rings on the Challenger shuttle
author: "Arnaud Legrand"
date: "28 juin 2018"
output: html_document
editor_options:
  markdown:
    wrap: 72
---
On January 27, 1986, the day before the takeoff of the shuttle
*Challenger*, a three-hour teleconference was held between Morton
Thiokol (the manufacturer of one of the engines) and NASA. The
discussion focused on the consequences of the temperature at take-off of
31°F (just below 0°C) for the success of the flight and in particular on
the performance of the O-rings used in the engines. Indeed, no test had
been performed at this temperature.

The following study takes up some of the analyses carried out that night
with the objective of assessing the potential influence of the
temperature and pressure to which the O-rings are subjected on their
probability of malfunction. Our starting point is the results of the
experiments carried out by NASA engineers during the six years preceding
the launch of the shuttle Challenger.

# Loading the data

We start by loading this data:

```{r}
...
@@ -31,41 +33,41 @@ data
```

The data set shows us the date of each test, the number of O-rings
(there are 6 on the main launcher), the temperature (in Fahrenheit) and
pressure (in psi), and finally the number of identified malfunctions.

# Graphical inspection

Flights without incidents do not provide any information on the
influence of temperature or pressure on malfunction. We thus focus on
the experiments in which at least one O-ring was defective.

```{r}
data = data[data$Malfunction>0,]
data
```

We have a high temperature variability but the pressure is almost always
200, which should simplify the analysis.

How does the frequency of failure vary with temperature?

```{r}
plot(data=data, Malfunction/Count ~ Temperature, ylim=c(0,1))
```

At first glance, the dependence does not look very strong, but let's
try to estimate the impact of temperature $t$ on the probability of
O-ring malfunction.

# Estimation of the temperature influence

Suppose that each of the six O-rings is damaged with the same
probability and independently of the others and that this probability
depends only on the temperature. If $p(t)$ is this probability, the
number $D$ of malfunctioning O-rings during a flight at temperature $t$
follows a binomial distribution with parameters $n=6$ and $p=p(t)$. To
link $p(t)$ to $t$, we will therefore perform a logistic regression.
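
Written out, the model described above reads as follows. This is a
sketch of the standard logistic-regression formulation; the symbols
$\alpha$ (intercept) and $\beta$ (temperature coefficient) are names
introduced here for illustration and do not appear in the original text.

$$
D \mid t \sim \mathcal{B}\bigl(6,\ p(t)\bigr), \qquad
\log\frac{p(t)}{1-p(t)} = \alpha + \beta t
\quad\Longleftrightarrow\quad
p(t) = \frac{1}{1 + e^{-(\alpha + \beta t)}}
$$

Under this parametrization, $\beta = 0$ means that the temperature has
no effect on the malfunction probability.
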
```{r}
logistic_reg = glm(data=data, Malfunction/Count ~ Temperature, weights=Count,
...
@@ -73,15 +75,16 @@ logistic_reg = glm(data=data, Malfunction/Count ~ Temperature, weights=Count,
summary(logistic_reg)
```

The maximum likelihood estimate of the temperature parameter is 0.001416
and the standard error of this estimate is 0.049. In other words, we
cannot distinguish any particular impact and we must take our estimates
with caution.
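
To make the "cannot distinguish any particular impact" statement
concrete, the sketch below computes a Wald $z$ statistic and an
approximate 95% confidence interval from the two numbers quoted above.
It assumes the usual normal approximation for the estimator, and the
variable names are illustrative.

```{r}
# Estimate and standard error quoted in the text above
beta_hat <- 0.001416
se_beta  <- 0.049

# Wald z statistic: estimate divided by its standard error
z <- beta_hat / se_beta

# Approximate 95% confidence interval (normal approximation)
ci <- beta_hat + c(-1, 1) * qnorm(0.975) * se_beta

# Two-sided p-value for the hypothesis "temperature coefficient = 0"
p_value <- 2 * pnorm(-abs(z))

print(z)
print(ci)
print(p_value)
```

The interval contains zero, so this fit cannot tell a positive
temperature effect from a negative one.
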

# Estimation of the probability of O-ring malfunction

The expected temperature on the take-off day is 31°F. Let's try to
estimate the probability of O-ring malfunction at this temperature from
the model we just built:

```{r}
# shuttle=shuttle[shuttle$r!=0,]
...
@@ -91,29 +94,72 @@ plot(tempv,rmv,type="l",ylim=c(0,1))
points(data=data, Malfunction/Count ~ Temperature)
```

As expected from the initial data, the temperature has no significant
impact on the probability of failure of the O-rings. It will be about
0.2, as in the tests where we had a failure of at least one joint. Let's
get back to the initial dataset to estimate the probability of failure:

```{r}
data_full = read.csv("shuttle.csv",header=TRUE)
sum(data_full$Malfunction)/sum(data_full$Count)
```

This probability is thus about $p=0.065$. Knowing that there is a
primary and a secondary O-ring on each of the three parts of the
launcher, the probability of failure of both joints of a launcher is
$p^2 \approx 0.00425$. The probability of failure of any one of the
launchers is $1-(1-p^2)^3 \approx 1.2\%$. That would really be bad
luck... Everything is under control, so the takeoff can happen tomorrow
as planned.
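
The arithmetic in this paragraph can be reproduced in a few lines. This
sketch takes the rounded value $p = 0.065$ from the text as its only
input, so the last digits may differ slightly from the figures quoted
above. The variable names are illustrative.

```{r}
# Per-O-ring failure probability, rounded value taken from the text
p <- 0.065

# Both the primary and the secondary O-ring of one part fail
# (independence assumed, as in the text)
p_both <- p^2

# At least one of the three parts loses both of its O-rings
p_any <- 1 - (1 - p_both)^3

print(p_both)
print(p_any)
```
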

But the next day, the Challenger shuttle exploded and took away with her
the seven crew members. The public was shocked and in the subsequent
investigation, the reliability of the O-rings was questioned. Beyond the
internal communication problems of NASA, which have a lot to do with
this fiasco, the previous analysis includes (at least) a small
problem... Can you find it? You are free to modify this analysis and to
look at this dataset from all angles in order to explain what's wrong.

## Finding the error

In the test data used for the fit, the range of temperatures is narrow:
all of them lie roughly between 60 and 70°F. Based on this data, we
cannot reason about what will happen at temperatures around 30°F. To
show this, we can visualize confidence intervals for our logistic
regression:

```{r}
# Create a sequence of Temperature values for plotting
newdata <- data.frame(Temperature = seq(0,
                                        max(data$Temperature),
                                        length.out = 100))
# Predict on the link (logit) scale with standard errors
pred <- predict(logistic_reg, newdata, type = "link", se.fit = TRUE)
# Compute 95% CI on the link scale
newdata$fit <- pred$fit
newdata$lower <- pred$fit - 1.96 * pred$se.fit
newdata$upper <- pred$fit + 1.96 * pred$se.fit
# Transform back to probability scale
newdata$fit_prob <- plogis(newdata$fit)
newdata$lower_prob <- plogis(newdata$lower)
newdata$upper_prob <- plogis(newdata$upper)
```
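
The chunk above builds the interval on the logit scale and only then
applies `plogis()`. A small sketch with made-up numbers (the values of
`eta` and `se` below are hypothetical, not taken from the fitted model)
illustrates why: the back-transformed interval always stays inside
$[0, 1]$, whereas an interval built directly on the probability scale
need not.

```{r}
# Hypothetical linear predictor and standard error (illustration only)
eta <- -1.4
se  <- 3

# Interval on the link (logit) scale, then mapped to probabilities
link_ci <- eta + c(-1, 1) * 1.96 * se
prob_ci <- plogis(link_ci)

# Naive alternative: symmetric interval directly on the probability scale,
# using the delta-method standard error p * (1 - p) * se
p_hat    <- plogis(eta)
naive_ci <- p_hat + c(-1, 1) * 1.96 * p_hat * (1 - p_hat) * se

print(prob_ci)   # always within [0, 1]
print(naive_ci)  # can fall outside [0, 1]
```
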
```{r}
library(ggplot2)
ggplot(newdata, aes(x = Temperature, y = fit_prob)) +
  geom_line(color = "blue") + # Predicted probability line
  geom_ribbon(aes(ymin = lower_prob, ymax = upper_prob), alpha = 0.2) + # 95% CI
  geom_point(data = data, aes(x = Temperature, y = Malfunction/Count), color = "red") + # observed proportions
  labs(y = "Probability of Malfunction",
       x = "Temperature") +
  theme_minimal()
```

The model was too simple: it took into account only temperature, not
pressure or other factors. We should repeat the analysis using pressure,
temperature, and malfunction types.
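
One possible way to follow up on this remark is sketched below, assuming
the column names used earlier in the document (`Malfunction`, `Count`,
`Temperature`, `Pressure`). The helper `fit_full_model()` is a
hypothetical name introduced here. It refits the logistic regression
with both covariates on whatever data frame it is given, and it is not
the analysis from the original study. Malfunction types are left out
because the text does not say which column would hold them.

```{r}
# Hypothetical helper: logistic regression on temperature and pressure.
# Assumes columns Malfunction, Count, Temperature and Pressure.
fit_full_model <- function(df) {
  glm(cbind(Malfunction, Count - Malfunction) ~ Temperature + Pressure,
      data = df,
      family = binomial(link = "logit"))
}
```

It could then be applied to a data frame such as `data_full` loaded
above, for example with `summary(fit_full_model(data_full))`.
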