{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Risk Analysis of the Space Shuttle: Pre-Challenger Prediction of Failure" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this document we reproduce part of the analysis presented in \n", "*Risk Analysis of the Space Shuttle: Pre-Challenger Prediction of Failure* by *Siddhartha R. Dalal, Edward B. Fowlkes, Bruce Hoadley*, published in *Journal of the American Statistical Association*, Vol. 84, No. 408 (Dec., 1989), pp. 945-957, and available at http://www.jstor.org/stable/2290069. \n", "\n", "On the fourth page of this article, the authors report that the maximum likelihood estimates of the logistic regression using only temperature are $\\hat{\\alpha}$ = **5.085** and $\\hat{\\beta}$ = **-0.1156**, with asymptotic standard errors $s_{\\hat{\\alpha}}$ = **3.052** and $s_{\\hat{\\beta}}$ = **0.047**. The goodness of fit reported for this model is $G^2$ = **18.086** with **21** degrees of freedom. Our goal is to reproduce the computation behind these values and Figure 4 of the article, possibly in a nicer-looking way." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Technical information on the computer on which the analysis is run" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will use Python 3 with the pandas, statsmodels, and numpy libraries."
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:18) \n", "[GCC 10.3.0]\n", "uname_result(system='Linux', node='felic-ThinkPad-E14-Gen-2', release='5.15.0-91-generic', version='#101~20.04.1-Ubuntu SMP Thu Nov 16 14:22:28 UTC 2023', machine='x86_64', processor='x86_64')\n", "IPython 8.6.0\n", "IPython.core.release 8.6.0\n", "PIL 9.3.0\n", "PIL.Image 9.3.0\n", "PIL._deprecate 9.3.0\n", "PIL._version 9.3.0\n", "_csv 1.0\n", "_ctypes 1.1.0\n", "_curses b'2.2'\n", "decimal 1.70\n", "_pydev_bundle.fsnotify 0.1.5\n", "_pydevd_frame_eval.vendored.bytecode 0.13.0.dev\n", "argparse 1.1\n", "backcall 0.2.0\n", "cffi 1.15.1\n", "csv 1.0\n", "ctypes 1.1.0\n", "cycler 0.10.0\n", "dateutil 2.8.2\n", "debugpy 1.6.3\n", "debugpy.public_api 1.6.3\n", "decimal 1.70\n", "decorator 5.1.1\n", "defusedxml 0.7.1\n", "distutils 3.8.13\n", "entrypoints 0.4\n", "executing 1.2.0\n", "executing.version 1.2.0\n", "http.server 0.6\n", "ipykernel 6.17.0\n", "ipykernel._version 6.17.0\n", "ipywidgets 8.0.2\n", "ipywidgets._version 8.0.2\n", "jedi 0.18.1\n", "joblib 1.2.0\n", "joblib.externals.cloudpickle 2.2.0\n", "joblib.externals.loky 3.3.0\n", "json 2.0.9\n", "jupyter_client 7.4.4\n", "jupyter_client._version 7.4.4\n", "jupyter_core 4.11.2\n", "jupyter_core.version 4.11.2\n", "kiwisolver 1.4.4\n", "kiwisolver._cext 1.4.4\n", "logging 0.5.1.2\n", "matplotlib 3.6.2\n", "matplotlib._version 3.6.2\n", "matplotlib_inline 0.1.6\n", "numpy 1.23.4\n", "numpy.core 1.23.4\n", "numpy.core._multiarray_umath 3.1\n", "numpy.lib 1.23.4\n", "numpy.linalg._umath_linalg 0.1.5\n", "numpy.version 1.23.4\n", "packaging 21.3\n", "packaging.__about__ 21.3\n", "pandas 2.0.3\n", "parso 0.8.3\n", "patsy 0.5.6\n", "patsy.version 0.5.6\n", "pexpect 4.8.0\n", "pickleshare 0.7.5\n", "pkg_resources._vendor.appdirs 1.4.3\n", "pkg_resources._vendor.more_itertools 8.12.0\n", 
"pkg_resources._vendor.packaging 21.3\n", "pkg_resources._vendor.packaging.__about__ 21.3\n", "pkg_resources._vendor.pyparsing 3.0.9\n", "pkg_resources._vendor.appdirs 1.4.3\n", "pkg_resources._vendor.more_itertools 8.12.0\n", "pkg_resources._vendor.packaging 21.3\n", "pkg_resources._vendor.pyparsing 3.0.9\n", "platform 1.0.8\n", "prompt_toolkit 3.0.32\n", "psutil 5.9.3\n", "ptyprocess 0.7.0\n", "pure_eval 0.2.2\n", "pure_eval.version 0.2.2\n", "pydevd 2.8.0\n", "pygments 2.13.0\n", "pyparsing 3.0.9\n", "pytz 2023.3.post1\n", "re 2.2.1\n", "scipy 1.9.3\n", "scipy._lib._uarray 0.8.8.dev0+aa94c5a4.scipy\n", "scipy._lib.decorator 4.0.5\n", "scipy.integrate._dop 1.20.3\n", "scipy.integrate._lsoda 1.20.3\n", "scipy.integrate._vode 1.20.3\n", "scipy.interpolate.dfitpack 1.20.3\n", "scipy.linalg._fblas 1.20.3\n", "scipy.linalg._flapack 1.20.3\n", "scipy.linalg._flinalg 1.20.3\n", "scipy.linalg._interpolative 1.20.3\n", "scipy.optimize.__nnls 1.20.3\n", "scipy.optimize._cobyla 1.20.3\n", "scipy.optimize._lbfgsb 1.20.3\n", "scipy.optimize._minpack2 1.20.3\n", "scipy.optimize._slsqp 1.20.3\n", "scipy.sparse.linalg._eigen.arpack._arpack 1.20.3\n", "scipy.sparse.linalg._isolve._iterative 1.20.3\n", "scipy.special._specfun 1.20.3\n", "scipy.stats._mvn 1.20.3\n", "scipy.stats._statlib 1.20.3\n", "seaborn 0.12.1\n", "seaborn.external.appdirs 1.4.4\n", "seaborn.external.husl 2.1.0\n", "setuptools 65.5.0\n", "distutils 3.8.13\n", "setuptools._vendor.more_itertools 8.8.0\n", "setuptools._vendor.ordered_set 3.1\n", "setuptools._vendor.packaging 21.3\n", "setuptools._vendor.packaging.__about__ 21.3\n", "setuptools._vendor.pyparsing 3.0.9\n", "setuptools._vendor.more_itertools 8.8.0\n", "setuptools._vendor.ordered_set 3.1\n", "setuptools._vendor.packaging 21.3\n", "setuptools._vendor.pyparsing 3.0.9\n", "setuptools.version 65.5.0\n", "six 1.16.0\n", "socketserver 0.4\n", "stack_data 0.6.0\n", "stack_data.version 0.6.0\n", "statsmodels 0.14.1\n", "statsmodels.__init__ 0.14.1\n", 
"statsmodels._version 0.14.1\n", "statsmodels.api 0.14.1\n", "statsmodels.tools.web 0.14.1\n", "traitlets 5.5.0\n", "traitlets._version 5.5.0\n", "urllib.request 3.8\n", "wcwidth 0.2.5\n", "xmlrpc.client 3.8\n", "zlib 1.0\n", "zmq 24.0.1\n", "zmq.sugar 24.0.1\n", "zmq.sugar.version 24.0.1\n" ] } ], "source": [ "def print_imported_modules():\n", " import sys\n", " for name, val in sorted(sys.modules.items()):\n", " if(hasattr(val, '__version__')): \n", " print(val.__name__, val.__version__)\n", "# else:\n", "# print(val.__name__, \"(unknown version)\")\n", "def print_sys_info():\n", " import sys\n", " import platform\n", " print(sys.version)\n", " print(platform.uname())\n", "\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import statsmodels.api as sm\n", "import seaborn as sns\n", "\n", "print_sys_info()\n", "print_imported_modules()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading and inspecting data\n", "Let's start by reading data." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|   | Date | Count | Temperature | Pressure | Malfunction |\n", 
"|---|---|---|---|---|---|\n", 
"| 0 | 4/12/81 | 6 | 66 | 50 | 0 |\n", 
"| 1 | 11/12/81 | 6 | 70 | 50 | 1 |\n", 
"| 2 | 3/22/82 | 6 | 69 | 50 | 0 |\n", 
"| 3 | 11/11/82 | 6 | 68 | 50 | 0 |\n", 
"| 4 | 4/04/83 | 6 | 67 | 50 | 0 |\n", 
"| 5 | 6/18/82 | 6 | 72 | 50 | 0 |\n", 
"| 6 | 8/30/83 | 6 | 73 | 100 | 0 |\n", 
"| 7 | 11/28/83 | 6 | 70 | 100 | 0 |\n", 
"| 8 | 2/03/84 | 6 | 57 | 200 | 1 |\n", 
"| 9 | 4/06/84 | 6 | 63 | 200 | 1 |\n", 
"| 10 | 8/30/84 | 6 | 70 | 200 | 1 |\n", 
"| 11 | 10/05/84 | 6 | 78 | 200 | 0 |\n", 
"| 12 | 11/08/84 | 6 | 67 | 200 | 0 |\n", 
"| 13 | 1/24/85 | 6 | 53 | 200 | 2 |\n", 
"| 14 | 4/12/85 | 6 | 67 | 200 | 0 |\n", 
"| 15 | 4/29/85 | 6 | 75 | 200 | 0 |\n", 
"| 16 | 6/17/85 | 6 | 70 | 200 | 0 |\n", 
"| 17 | 7/2903/85 | 6 | 81 | 200 | 0 |\n", 
"| 18 | 8/27/85 | 6 | 76 | 200 | 0 |\n", 
"| 19 | 10/03/85 | 6 | 79 | 200 | 0 |\n", 
"| 20 | 10/30/85 | 6 | 75 | 200 | 2 |\n", 
"| 21 | 11/26/85 | 6 | 76 | 200 | 0 |\n", 
"| 22 | 1/12/86 | 6 | 58 | 200 | 1 |\n", "