anomalies in the missing weeks

parent 1ed7d631
...@@ -30,7 +30,7 @@ ...@@ -30,7 +30,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 1, "execution_count": 11,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
...@@ -49,7 +49,7 @@ ...@@ -49,7 +49,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 2, "execution_count": 12,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
...@@ -80,7 +80,7 @@ ...@@ -80,7 +80,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 3, "execution_count": 13,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
...@@ -141,7 +141,7 @@ ...@@ -141,7 +141,7 @@
"1958-04-26 316.48" "1958-04-26 316.48"
] ]
}, },
"execution_count": 3, "execution_count": 13,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
...@@ -154,7 +154,7 @@ ...@@ -154,7 +154,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 4, "execution_count": 14,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
...@@ -183,7 +183,7 @@ ...@@ -183,7 +183,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 5, "execution_count": 15,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
...@@ -228,7 +228,7 @@ ...@@ -228,7 +228,7 @@
"Index: []" "Index: []"
] ]
}, },
"execution_count": 5, "execution_count": 15,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
...@@ -240,7 +240,7 @@ ...@@ -240,7 +240,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 6, "execution_count": 17,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [
{ {
...@@ -264,28 +264,34 @@ ...@@ -264,28 +264,34 @@
" <thead>\n", " <thead>\n",
" <tr style=\"text-align: right;\">\n", " <tr style=\"text-align: right;\">\n",
" <th></th>\n", " <th></th>\n",
" <th>index</th>\n",
" <th>ppm</th>\n", " <th>ppm</th>\n",
" </tr>\n", " </tr>\n",
" </thead>\n", " </thead>\n",
" <tbody>\n", " <tbody>\n",
" <tr>\n", " <tr>\n",
" <th>1958-03-29</th>\n", " <th>0</th>\n",
" <td>1958-03-29</td>\n",
" <td>316.19</td>\n", " <td>316.19</td>\n",
" </tr>\n", " </tr>\n",
" <tr>\n", " <tr>\n",
" <th>1958-04-05</th>\n", " <th>1</th>\n",
" <td>1958-04-05</td>\n",
" <td>317.31</td>\n", " <td>317.31</td>\n",
" </tr>\n", " </tr>\n",
" <tr>\n", " <tr>\n",
" <th>1958-04-12</th>\n", " <th>2</th>\n",
" <td>1958-04-12</td>\n",
" <td>317.69</td>\n", " <td>317.69</td>\n",
" </tr>\n", " </tr>\n",
" <tr>\n", " <tr>\n",
" <th>1958-04-19</th>\n", " <th>3</th>\n",
" <td>1958-04-19</td>\n",
" <td>317.58</td>\n", " <td>317.58</td>\n",
" </tr>\n", " </tr>\n",
" <tr>\n", " <tr>\n",
" <th>1958-04-26</th>\n", " <th>4</th>\n",
" <td>1958-04-26</td>\n",
" <td>316.48</td>\n", " <td>316.48</td>\n",
" </tr>\n", " </tr>\n",
" </tbody>\n", " </tbody>\n",
...@@ -293,24 +299,65 @@ ...@@ -293,24 +299,65 @@
"</div>" "</div>"
], ],
"text/plain": [ "text/plain": [
" ppm\n", " index ppm\n",
"1958-03-29 316.19\n", "0 1958-03-29 316.19\n",
"1958-04-05 317.31\n", "1 1958-04-05 317.31\n",
"1958-04-12 317.69\n", "2 1958-04-12 317.69\n",
"1958-04-19 317.58\n", "3 1958-04-19 317.58\n",
"1958-04-26 316.48" "4 1958-04-26 316.48"
] ]
}, },
"execution_count": 6, "execution_count": 17,
"metadata": {}, "metadata": {},
"output_type": "execute_result" "output_type": "execute_result"
} }
], ],
"source": [ "source": [
"data = raw_data.copy()\n", "data = raw_data.copy().reset_index()\n",
"data.head()" "data.head()"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We check whether every week is present in the data: this is not the case for the years listed below. Two situations arise:\n",
"- for 1958 (first date of the series) and 2023 (last date of the series) $\\rightarrow$ no problem, since the series starts and ends partway through those years\n",
"- for the other years $\\rightarrow$ a few weeks are missing; we decide to continue the analysis as is, but keep this in mind"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1958 25\n",
"1959 48\n",
"1962 48\n",
"1963 49\n",
"1964 31\n",
"1966 49\n",
"1984 48\n",
"2003 49\n",
"2005 49\n",
"2023 26\n"
]
}
],
"source": [
"# count the number of weeks present per year\n",
"data['year'] = data['index'].apply(lambda x: x[:4])\n",
"years = data.groupby(['year'])['year'].count().index\n",
"nb_weeks = data.groupby(['year'])['year'].count().values\n",
"for x, y in zip(years, nb_weeks):\n",
" if y < 50:\n",
" print(x, y)"
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
......
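The new cell counts weekly records per year by slicing the first four characters of the date string. As a possible complement, the sketch below parses the dates and also reports where consecutive measurements are more than one week apart, which localizes the missing weeks instead of only counting them. The helper names `weeks_per_year` and `find_gaps`, the `threshold` and `max_gap_days` parameters, and the last date in the toy sample are assumptions made for illustration; they are not part of the notebook.

```python
import pandas as pd


def weeks_per_year(dates, threshold=50):
    """Count records per calendar year and keep years below `threshold`.

    `dates` is any iterable of ISO date strings (YYYY-MM-DD).
    The default threshold of 50 mirrors the `y < 50` test in the notebook.
    """
    parsed = pd.to_datetime(pd.Series(list(dates)))
    counts = parsed.dt.year.value_counts().sort_index()
    return counts[counts < threshold]


def find_gaps(dates, max_gap_days=7):
    """Return the pairs of consecutive dates separated by more than `max_gap_days`."""
    parsed = (
        pd.to_datetime(pd.Series(list(dates)))
        .sort_values()
        .reset_index(drop=True)
    )
    delta = parsed.diff().dt.days  # NaN for the first row
    mask = delta > max_gap_days    # NaN compares as False
    return pd.DataFrame(
        {
            "previous": parsed.shift(1)[mask],
            "current": parsed[mask],
            "gap_days": delta[mask].astype(int),
        }
    )


if __name__ == "__main__":
    # Toy sample: the first five dates appear in the notebook output,
    # the last one is hypothetical and only there to create a gap.
    sample = [
        "1958-03-29",
        "1958-04-05",
        "1958-04-12",
        "1958-04-19",
        "1958-04-26",
        "1958-05-17",
    ]
    print(weeks_per_year(sample))
    print(find_gaps(sample))
```

In the notebook this could presumably be called as `find_gaps(data['index'])`, assuming the `index` column still holds the date strings shown in `data.head()`.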