anomalies in the missing weeks

parent 1ed7d631
......@@ -30,7 +30,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
......@@ -49,7 +49,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 12,
"metadata": {},
"outputs": [
{
......@@ -80,7 +80,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 13,
"metadata": {},
"outputs": [
{
......@@ -141,7 +141,7 @@
"1958-04-26 316.48"
]
},
"execution_count": 3,
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
......@@ -154,7 +154,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 14,
"metadata": {},
"outputs": [
{
......@@ -183,7 +183,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 15,
"metadata": {},
"outputs": [
{
......@@ -228,7 +228,7 @@
"Index: []"
]
},
"execution_count": 5,
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
......@@ -240,7 +240,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 17,
"metadata": {},
"outputs": [
{
......@@ -264,28 +264,34 @@
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>index</th>\n",
" <th>ppm</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1958-03-29</th>\n",
" <th>0</th>\n",
" <td>1958-03-29</td>\n",
" <td>316.19</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1958-04-05</th>\n",
" <th>1</th>\n",
" <td>1958-04-05</td>\n",
" <td>317.31</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1958-04-12</th>\n",
" <th>2</th>\n",
" <td>1958-04-12</td>\n",
" <td>317.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1958-04-19</th>\n",
" <th>3</th>\n",
" <td>1958-04-19</td>\n",
" <td>317.58</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1958-04-26</th>\n",
" <th>4</th>\n",
" <td>1958-04-26</td>\n",
" <td>316.48</td>\n",
" </tr>\n",
" </tbody>\n",
......@@ -293,24 +299,65 @@
"</div>"
],
"text/plain": [
" ppm\n",
"1958-03-29 316.19\n",
"1958-04-05 317.31\n",
"1958-04-12 317.69\n",
"1958-04-19 317.58\n",
"1958-04-26 316.48"
" index ppm\n",
"0 1958-03-29 316.19\n",
"1 1958-04-05 317.31\n",
"2 1958-04-12 317.69\n",
"3 1958-04-19 317.58\n",
"4 1958-04-26 316.48"
]
},
"execution_count": 6,
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = raw_data.copy()\n",
"data = raw_data.copy().reset_index()\n",
"data.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We check whether every week is present in the data: this is not the case for the years listed below. Two situations arise:\n",
"- 1958 (first date of the series) and 2023 (last date of the series) $\\rightarrow$ no issue, since these years are only partially covered\n",
"- the other years $\\rightarrow$ a few weeks are missing; we continue the analysis as is but keep this in mind"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1958 25\n",
"1959 48\n",
"1962 48\n",
"1963 49\n",
"1964 31\n",
"1966 49\n",
"1984 48\n",
"2003 49\n",
"2005 49\n",
"2023 26\n"
]
}
],
"source": [
"# count the number of weeks present in each year\n",
"data['year'] = data['index'].apply(lambda x: x[:4])\n",
"years = data.groupby(['year'])['year'].count().index\n",
"nb_weeks = data.groupby(['year'])['year'].count().values\n",
"# print only the years with fewer than 50 weekly measurements\n",
"for x, y in zip(years, nb_weeks):\n",
"    if y < 50:\n",
"        print(x, y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
......
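The per-year count in the new cell shows which years are short, but not where the gaps are. A complementary check is to look at the spacing between consecutive dates. The following is a minimal sketch using only the standard library, assuming the dates are ISO strings as in `data['index']` and that measurements are nominally 7 days apart. The function name `find_gaps` and the sample dates are hypothetical and not taken from the notebook.

```python
from datetime import date


def find_gaps(iso_dates, step_days=7):
    """Return (previous, next, missing_weeks) for each gap longer than one step."""
    days = sorted(date.fromisoformat(d) for d in iso_dates)
    gaps = []
    for prev, nxt in zip(days, days[1:]):
        delta = (nxt - prev).days
        if delta > step_days:
            # e.g. a 21-day jump means two weekly measurements are absent
            gaps.append((prev.isoformat(), nxt.isoformat(), delta // step_days - 1))
    return gaps


# Made-up sample (not the real series): two weeks are absent after 2000-01-08
sample = ["2000-01-01", "2000-01-08", "2000-01-29", "2000-02-05"]
gaps = find_gaps(sample)
print(gaps)
```

In the notebook this could be called as `find_gaps(data['index'])`, which would list the start and end of each hole instead of a yearly total.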