Commit daa4fe3b authored by Tommy Rushton's avatar Tommy Rushton

Exercise 5.

parent b985c45a
module2/exo1/cosxsx.png

83.5 KB | W: | H:

module2/exo1/cosxsx.png

21.9 KB | W: | H:

module2/exo1/cosxsx.png
module2/exo1/cosxsx.png
module2/exo1/cosxsx.png
module2/exo1/cosxsx.png
  • 2-up
  • Swipe
  • Onion skin
......@@ -33,7 +33,7 @@ Challenger.
* Loading the data
We start by loading this data:
#+begin_src python :results value :session *python* :exports both
#+begin_src python :results value :session :exports both
import numpy as np
import pandas as pd
data = pd.read_csv("shuttle.csv")
......@@ -78,8 +78,13 @@ Flights without incidents do not provide any information
on the influence of temperature or pressure on malfunction.
We thus focus on the experiments in which at least one O-ring was defective.
#+begin_src python :results value :session *python* :exports both
data = data[data.Malfunction>0]
#+begin_quote
This is suspect. What if launches without defects were predominately
at higher temperatures?
#+end_quote
#+begin_src python :results value :session :exports both
data = data[data.Malfunction > 0]
data
#+end_src
......@@ -97,8 +102,12 @@ We have a high temperature variability but
the pressure is almost always 200, which should
simplify the analysis.
#+begin_quote
"Almost always" is alarming.
#+end_quote
How does the frequency of failure vary with temperature?
#+begin_src python :results output file :var matplot_lib_filename="freq_temp_python.png" :exports both :session *python*
#+begin_src python :results output file :var matplot_lib_filename="freq_temp_python.png" :exports both :session
import matplotlib.pyplot as plt
plt.clf()
......@@ -120,13 +129,17 @@ estimate the impact of temperature $t$ on the probability of O-ring malfunction.
Suppose that each of the six O-rings is damaged with the same
probability and independently of the others and that this probability
depends only on the temperature. If $p(t)$ is this probability, the
depends only on the temperature.
#+begin_quote
OK, yes, let's suppose that.
#+end_quote
If $p(t)$ is this probability, the
number $D$ of malfunctioning O-rings during a flight at
temperature $t$ follows a binomial law with parameters $n=6$ and
$p=p(t)$. To link $p(t)$ to $t$, we will therefore perform a
logistic regression.
#+begin_src python :results value :session *python* :exports both
#+begin_src python :results value :session :exports both
import statsmodels.api as sm
data["Success"]=data.Count-data.Malfunction
......@@ -134,7 +147,8 @@ data["Intercept"]=1
# logit_model=sm.Logit(data["Frequency"],data[["Intercept","Temperature"]]).fit()
logmodel=sm.GLM(data['Frequency'], data[['Intercept','Temperature']], family=sm.families.Binomial(sm.families.links.logit)).fit()
link = sm.families.links.Logit()
logmodel=sm.GLM(data['Frequency'], data[['Intercept','Temperature']], family=sm.families.Binomial(link)).fit()
logmodel.summary()
#+end_src
......@@ -142,15 +156,16 @@ logmodel.summary()
#+RESULTS:
#+begin_example
Generalized Linear Model Regression Results
==============================================================================
===============================================================================
Dep. Variable: Frequency No. Observations: 7
Model: GLM Df Residuals: 5
Model Family: Binomial Df Model: 1
Link Function: logit Scale: 1.0
Method: IRLS Log-Likelihood: -3.6370
Date: Fri, 20 Jul 2018 Deviance: 3.3763
Time: 16:56:08 Pearson chi2: 0.236
No. Iterations: 5
Link Function: Logit Scale: 1.0000
Method: IRLS Log-Likelihood: -2.5250
Date: jeu., 11 avril 2024 Deviance: 0.22231
Time: 15:05:37 Pearson chi2: 0.236
No. Iterations: 4 Pseudo R-squ. (CS): 1.926e-05
Covariance Type: nonrobust
===============================================================================
coef std err z P>|z| [0.025 0.975]
-------------------------------------------------------------------------------
......@@ -163,13 +178,23 @@ The most likely estimator of the temperature parameter is 0.0014
and the standard error of this estimator is 0.122, in other words we
cannot distinguish any particular impact and we must take our
estimates with caution.
#+begin_quote
Indeed... and look at that /p value/ (0.991) which more-or-less says,
for the subset of data, that temperature likely has /no effect/ on
likelihood of malfunction.
#+end_quote
* Estimation of the probability of O-ring malfunction
The expected temperature on the take-off day is 31°F. Let's try to
The expected temperature on the take-off day is 31°F.
#+begin_quote
A temperature at/around which we have /no data/. Extrapolating from
higher temperatures — bad idea.
#+end_quote
Let's try to
estimate the probability of O-ring malfunction at
this temperature from the model we just built:
#+begin_src python :results output file :var matplot_lib_filename="proba_estimate_python.png" :exports both :session *python*
#+begin_src python :results output file :var matplot_lib_filename="proba_estimate_python.png" :exports both :session
import matplotlib.pyplot as plt
data_pred = pd.DataFrame({'Temperature': np.linspace(start=30, stop=90, num=121), 'Intercept': 1})
......@@ -188,9 +213,14 @@ print(matplot_lib_filename)
As expected from the initial data, the
temperature has no significant impact on the probability of failure of the
O-rings. It will be about 0.2, as in the tests
where we had a failure of at least one joint. Let's get back to the initial dataset to estimate the probability of failure:
#+begin_src python :results output :session *python* :exports both
where we had a failure of at least one joint.
#+begin_quote
Opting not to exclude the entries where no malfunction occurred
reveals a strikingly different picture.
#+end_quote
Let's get back to the initial dataset to estimate the probability of failure:
#+begin_src python :results output :session :exports both
data = pd.read_csv("shuttle.csv")
print(np.sum(data.Malfunction)/np.sum(data.Count))
#+end_src
......@@ -198,7 +228,13 @@ print(np.sum(data.Malfunction)/np.sum(data.Count))
#+RESULTS:
: 0.06521739130434782
This probability is thus about $p=0.065$. Knowing that there is
This probability is thus about $p=0.065$.
#+begin_quote
This has an air of desperation about it. So now, we're just taking the
proportion of failures to total O-rings... across just the flights
where malfunctions were recorded?
#+end_quote
Knowing that there is
a primary and a secondary O-ring on each of the three parts of the
launcher, the probability of failure of both joints of a launcher
is $p^2 \approx 0.00425$. The probability of failure of any one of the
......
module2/exo5/freq_temp_python.png

12.3 KB | W: | H:

module2/exo5/freq_temp_python.png

11.9 KB | W: | H:

module2/exo5/freq_temp_python.png
module2/exo5/freq_temp_python.png
module2/exo5/freq_temp_python.png
module2/exo5/freq_temp_python.png
  • 2-up
  • Swipe
  • Onion skin
module2/exo5/proba_estimate_python.png

14.3 KB | W: | H:

module2/exo5/proba_estimate_python.png

14.1 KB | W: | H:

module2/exo5/proba_estimate_python.png
module2/exo5/proba_estimate_python.png
module2/exo5/proba_estimate_python.png
module2/exo5/proba_estimate_python.png
  • 2-up
  • Swipe
  • Onion skin
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment