{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Analysis of the risk of failure of the O-rings on the Challenger shuttle"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"On January 27, 1986, the day before the takeoff of the shuttle _Challenger_, had\n",
"a three-hour teleconference was held between \n",
"Morton Thiokol (the manufacturer of one of the engines) and NASA. The\n",
"discussion focused on the consequences of the\n",
"temperature at take-off of 31°F (just below\n",
"0°C) for the success of the flight and in particular on the performance of the\n",
"O-rings used in the engines. Indeed, no test\n",
"had been performed at this temperature.\n",
"\n",
"The following study takes up some of the analyses carried out that\n",
"night with the objective of assessing the potential influence of\n",
"the temperature and pressure to which the O-rings are subjected\n",
"on their probability of malfunction. Our starting point is \n",
"the results of the experiments carried out by NASA engineers\n",
"during the six years preceding the launch of the shuttle\n",
"Challenger."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Loading the data\n",
"We start by loading this data:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
"
\n",
"
\n",
"
Date
\n",
"
Count
\n",
"
Temperature
\n",
"
Pressure
\n",
"
Malfunction
\n",
"
\n",
" \n",
" \n",
"
\n",
"
0
\n",
"
4/12/81
\n",
"
6
\n",
"
66
\n",
"
50
\n",
"
0
\n",
"
\n",
"
\n",
"
1
\n",
"
11/12/81
\n",
"
6
\n",
"
70
\n",
"
50
\n",
"
1
\n",
"
\n",
"
\n",
"
2
\n",
"
3/22/82
\n",
"
6
\n",
"
69
\n",
"
50
\n",
"
0
\n",
"
\n",
"
\n",
"
3
\n",
"
11/11/82
\n",
"
6
\n",
"
68
\n",
"
50
\n",
"
0
\n",
"
\n",
"
\n",
"
4
\n",
"
4/04/83
\n",
"
6
\n",
"
67
\n",
"
50
\n",
"
0
\n",
"
\n",
"
\n",
"
5
\n",
"
6/18/82
\n",
"
6
\n",
"
72
\n",
"
50
\n",
"
0
\n",
"
\n",
"
\n",
"
6
\n",
"
8/30/83
\n",
"
6
\n",
"
73
\n",
"
100
\n",
"
0
\n",
"
\n",
"
\n",
"
7
\n",
"
11/28/83
\n",
"
6
\n",
"
70
\n",
"
100
\n",
"
0
\n",
"
\n",
"
\n",
"
8
\n",
"
2/03/84
\n",
"
6
\n",
"
57
\n",
"
200
\n",
"
1
\n",
"
\n",
"
\n",
"
9
\n",
"
4/06/84
\n",
"
6
\n",
"
63
\n",
"
200
\n",
"
1
\n",
"
\n",
"
\n",
"
10
\n",
"
8/30/84
\n",
"
6
\n",
"
70
\n",
"
200
\n",
"
1
\n",
"
\n",
"
\n",
"
11
\n",
"
10/05/84
\n",
"
6
\n",
"
78
\n",
"
200
\n",
"
0
\n",
"
\n",
"
\n",
"
12
\n",
"
11/08/84
\n",
"
6
\n",
"
67
\n",
"
200
\n",
"
0
\n",
"
\n",
"
\n",
"
13
\n",
"
1/24/85
\n",
"
6
\n",
"
53
\n",
"
200
\n",
"
2
\n",
"
\n",
"
\n",
"
14
\n",
"
4/12/85
\n",
"
6
\n",
"
67
\n",
"
200
\n",
"
0
\n",
"
\n",
"
\n",
"
15
\n",
"
4/29/85
\n",
"
6
\n",
"
75
\n",
"
200
\n",
"
0
\n",
"
\n",
"
\n",
"
16
\n",
"
6/17/85
\n",
"
6
\n",
"
70
\n",
"
200
\n",
"
0
\n",
"
\n",
"
\n",
"
17
\n",
"
7/29/85
\n",
"
6
\n",
"
81
\n",
"
200
\n",
"
0
\n",
"
\n",
"
\n",
"
18
\n",
"
8/27/85
\n",
"
6
\n",
"
76
\n",
"
200
\n",
"
0
\n",
"
\n",
"
\n",
"
19
\n",
"
10/03/85
\n",
"
6
\n",
"
79
\n",
"
200
\n",
"
0
\n",
"
\n",
"
\n",
"
20
\n",
"
10/30/85
\n",
"
6
\n",
"
75
\n",
"
200
\n",
"
2
\n",
"
\n",
"
\n",
"
21
\n",
"
11/26/85
\n",
"
6
\n",
"
76
\n",
"
200
\n",
"
0
\n",
"
\n",
"
\n",
"
22
\n",
"
1/12/86
\n",
"
6
\n",
"
58
\n",
"
200
\n",
"
1
\n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Date Count Temperature Pressure Malfunction\n",
"0 4/12/81 6 66 50 0\n",
"1 11/12/81 6 70 50 1\n",
"2 3/22/82 6 69 50 0\n",
"3 11/11/82 6 68 50 0\n",
"4 4/04/83 6 67 50 0\n",
"5 6/18/82 6 72 50 0\n",
"6 8/30/83 6 73 100 0\n",
"7 11/28/83 6 70 100 0\n",
"8 2/03/84 6 57 200 1\n",
"9 4/06/84 6 63 200 1\n",
"10 8/30/84 6 70 200 1\n",
"11 10/05/84 6 78 200 0\n",
"12 11/08/84 6 67 200 0\n",
"13 1/24/85 6 53 200 2\n",
"14 4/12/85 6 67 200 0\n",
"15 4/29/85 6 75 200 0\n",
"16 6/17/85 6 70 200 0\n",
"17 7/29/85 6 81 200 0\n",
"18 8/27/85 6 76 200 0\n",
"19 10/03/85 6 79 200 0\n",
"20 10/30/85 6 75 200 2\n",
"21 11/26/85 6 76 200 0\n",
"22 1/12/86 6 58 200 1"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"data = pd.read_csv(\"shuttle.csv\")\n",
"data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The data set shows us the date of each test, the number of O-rings (there are 6 on the main launcher), the temperature (in Fahrenheit) and pressure (in psi), and finally the number of identified malfunctions."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Graphical inspection\n",
"Flights without incidents do not provide any information\n",
"on the influence of temperature or pressure on malfunction.\n",
"We thus focus on the experiments in which at least one O-ring\n",
"was defective.\n",
"\n",
"### =====\n",
"They filtered out all flights with zero malfunctions, which is a problem because now the data only shows cases where something went wrong. This makes it impossible to see how temperature actually affects failures across all flights.\n",
"\n",
"--> Keep all flights, including those without malfunctions, so the model can learn the real failure probability"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"%matplotlib inline\n",
"pd.set_option('mode.chained_assignment',None) # this removes a useless warning from pandas\n",
"import matplotlib.pyplot as plt\n",
"\n",
"data[\"Frequency\"]=data.Malfunction/data.Count\n",
"data.plot(x=\"Temperature\",y=\"Frequency\",kind=\"scatter\",ylim=[0,1])\n",
"plt.grid(True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"At first glance, the dependence does not look very important, but let's try to\n",
"estimate the impact of temperature $t$ on the probability of O-ring malfunction."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Estimation of the temperature influence\n",
"\n",
"Suppose that each of the six O-rings is damaged with the same\n",
"probability and independently of the others and that this probability\n",
"depends only on the temperature. If $p(t)$ is this probability, the\n",
"number $D$ of malfunctioning O-rings during a flight at\n",
"temperature $t$ follows a binomial law with parameters $n=6$ and\n",
"$p=p(t)$. To link $p(t)$ to $t$, we will therefore perform a\n",
"logistic regression."
]
},
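{
"cell_type": "markdown",
"metadata": {},
"source": [
"The assumptions above can be written out explicitly. The following is only a sketch of the standard logistic model; the symbols $\\alpha$ (intercept) and $\\beta$ (temperature coefficient) are notation introduced here for illustration and do not appear in the original analysis.\n",
"\n",
"$$P(D = k \\mid t) = \\binom{6}{k}\\, p(t)^k \\,\\bigl(1 - p(t)\\bigr)^{6-k}, \\qquad k = 0, \\dots, 6$$\n",
"\n",
"$$\\log \\frac{p(t)}{1 - p(t)} = \\alpha + \\beta t \\quad\\Longleftrightarrow\\quad p(t) = \\frac{1}{1 + e^{-(\\alpha + \\beta t)}}$$\n",
"\n",
"Under this parametrisation, a negative $\\beta$ means that the failure probability decreases as the temperature rises."
]
},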
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"
Generalized Linear Model Regression Results
\n",
"
\n",
"
Dep. Variable:
Frequency
No. Observations:
23
\n",
"
\n",
"
\n",
"
Model:
GLM
Df Residuals:
21
\n",
"
\n",
"
\n",
"
Model Family:
Binomial
Df Model:
1
\n",
"
\n",
"
\n",
"
Link Function:
logit
Scale:
1.0000
\n",
"
\n",
"
\n",
"
Method:
IRLS
Log-Likelihood:
-3.9210
\n",
"
\n",
"
\n",
"
Date:
Wed, 12 Nov 2025
Deviance:
3.0144
\n",
"
\n",
"
\n",
"
Time:
23:11:29
Pearson chi2:
5.00
\n",
"
\n",
"
\n",
"
No. Iterations:
6
Covariance Type:
nonrobust
\n",
"
\n",
"
\n",
"
\n",
"
\n",
"
coef
std err
z
P>|z|
[0.025
0.975]
\n",
"
\n",
"
\n",
"
Intercept
5.0850
7.477
0.680
0.496
-9.570
19.740
\n",
"
\n",
"
\n",
"
Temperature
-0.1156
0.115
-1.004
0.316
-0.341
0.110
\n",
"
\n",
"
"
],
"text/plain": [
"\n",
"\"\"\"\n",
" Generalized Linear Model Regression Results \n",
"==============================================================================\n",
"Dep. Variable: Frequency No. Observations: 23\n",
"Model: GLM Df Residuals: 21\n",
"Model Family: Binomial Df Model: 1\n",
"Link Function: logit Scale: 1.0000\n",
"Method: IRLS Log-Likelihood: -3.9210\n",
"Date: Wed, 12 Nov 2025 Deviance: 3.0144\n",
"Time: 23:11:29 Pearson chi2: 5.00\n",
"No. Iterations: 6 Covariance Type: nonrobust\n",
"===============================================================================\n",
" coef std err z P>|z| [0.025 0.975]\n",
"-------------------------------------------------------------------------------\n",
"Intercept 5.0850 7.477 0.680 0.496 -9.570 19.740\n",
"Temperature -0.1156 0.115 -1.004 0.316 -0.341 0.110\n",
"===============================================================================\n",
"\"\"\""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import statsmodels.api as sm\n",
"\n",
"data[\"Success\"]=data.Count-data.Malfunction\n",
"data[\"Intercept\"]=1\n",
"\n",
"logmodel=sm.GLM(data['Frequency'], data[['Intercept','Temperature']], family=sm.families.Binomial(sm.families.links.logit)).fit()\n",
"\n",
"logmodel.summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The estimated coefficient for Temperature is -0.116 with a standard error of 0.115, which means we cannot confidently say that Temperature has a significant effect on O-ring failure. These estimates should be interpreted with caution."
]
},
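{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough check on the summary table, the confidence interval can be recomputed by hand from the rounded coefficient and standard error. This sketch assumes the usual normal (Wald) approximation with the quantile 1.96; small differences from the table are due to rounding.\n",
"\n",
"$$-0.1156 \\pm 1.96 \\times 0.115 = -0.1156 \\pm 0.2254 \\quad\\Longrightarrow\\quad [-0.341,\\ 0.110]$$\n",
"\n",
"On the odds scale, the point estimate corresponds to\n",
"\n",
"$$e^{-0.1156} \\approx 0.891,$$\n",
"\n",
"that is, each additional degree Fahrenheit multiplies the estimated odds of an O-ring malfunction by about 0.89. Because the interval above contains zero, the data as fitted here are also compatible with no temperature effect at all."
]
},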
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Estimation of the probability of O-ring malfunction\n",
"\n",
"The expected temperature on the take-off day is 31°F. Let's try to\n",
"estimate the probability of O-ring malfunction at\n",
"this temperature from the model we just built:\n",
"\n",
"\n",
"### ==================\n",
"A problem I find very confusing is that they said, \"These estimates should be interpreted with caution\" (even in the original analysis), but then they still relied on the model’s predictions and used those results to justify their reasoning. ????!"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"%matplotlib inline\n",
"data_pred = pd.DataFrame({'Temperature': np.linspace(start=30, stop=90, num=121), 'Intercept': 1})\n",
"data_pred['Frequency'] = logmodel.predict(data_pred[['Intercept','Temperature']])\n",
"data_pred.plot(x=\"Temperature\",y=\"Frequency\",kind=\"line\",ylim=[0,1])\n",
"plt.scatter(x=data[\"Temperature\"],y=data[\"Frequency\"])\n",
"plt.grid(True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"hideCode": false,
"hidePrompt": false,
"scrolled": true
},
"source": [
"Using the updated analysis, the predicted probability of O-ring failure clearly decreases as temperature increases, forming a curve across the observed range. This shows that at low temperatures like 31°F, the risk is much higher than the simple average would suggest, highlighting the danger under extreme conditions."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"# The code below is disregarded because the risk of O-ring failure strongly depends on temperature.\n",
"# data = pd.read_csv(\"shuttle.csv\")\n",
"# print(np.sum(data.Malfunction)/np.sum(data.Count))"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Predicted probability of O-ring failure at 31°F: 0.8177744062821891\n"
]
}
],
"source": [
"# Let's calculate again\n",
"p_challenger = logmodel.predict((1, 31))[0]\n",
"print(\"Predicted probability of O-ring failure at 31°F:\", p_challenger)"
]
},
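{
"cell_type": "markdown",
"metadata": {},
"source": [
"The prediction printed above can be reproduced by hand. This is a sketch that plugs the rounded coefficients from the summary table (intercept 5.0850, Temperature coefficient -0.1156) into the logit link reported in that table:\n",
"\n",
"$$\\log \\frac{p(31)}{1 - p(31)} = 5.0850 - 0.1156 \\times 31 = 5.0850 - 3.5836 = 1.5014$$\n",
"\n",
"$$p(31) = \\frac{1}{1 + e^{-1.5014}} \\approx 0.818$$\n",
"\n",
"This agrees with the value returned by `logmodel.predict` up to the rounding of the coefficients."
]
},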
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Prob for both joints: 0.6687549795701869\n",
"Prob for any one of launchers: 0.963654715320592\n"
]
}
],
"source": [
"print(\"Prob of failure for both joints: \", p_challenger**2)\n",
"print(\"Prob of failure for any one of launchers: \", 1-(1-p_challenger**2)**3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This probability is thus about p ≈ $0.818$. Knowing that there is a primary and a secondary O-ring on each of the three parts of the launcher, the probability of failure of both joints of a launcher is $p^2 \\approx 0.669$. The probability of failure of any one of the launchers is $1-(1-p^2)^3 \\approx 96.4\\%$. This is drastically higher than the old average-based estimate of p ≈ 0.065, which underestimated the risk because it ignored the extreme temperature on the launch day."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"celltoolbar": "Hide code",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}