In 1958, Charles David Keeling began measuring the concentration of CO2 in the atmosphere at the Mauna Loa Observatory, Hawaii, United States; the measurements continue to this day. The initial aim was to study the seasonal variation, but interest later shifted to the upward trend in the context of climate change. In Keeling's honor, this dataset is often called the “Keeling Curve” (see https://en.wikipedia.org/wiki/Keeling_Curve for the history and significance of these data).
The data are available on the Scripps institute website. Use the file with the weekly observations. Be aware that this file is updated regularly with new observations, so note the download date and keep a local copy of the exact version you analyze. Also watch out for missing data.
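One way to follow the advice about keeping a dated local copy is sketched below in base R. The helper name `archive_copy`, the throwaway file and the date 2020-05-01 are all assumptions made for illustration; none of them comes from the original analysis.

```r
# Hypothetical helper: copy a downloaded file to a sibling file whose name
# carries the download date, e.g. data.csv -> data_2020-05-01.csv.
archive_copy <- function(path, date = Sys.Date()) {
  stamped <- paste0(
    tools::file_path_sans_ext(path), "_",
    format(date, "%Y-%m-%d"), ".", tools::file_ext(path)
  )
  # overwrite = FALSE leaves an existing dated copy untouched
  file.copy(path, stamped, overwrite = FALSE)
  stamped
}

# Usage with a throwaway file standing in for the real download
demo_file <- tempfile(fileext = ".csv")
writeLines("placeholder", demo_file)
dated_copy <- archive_copy(demo_file, as.Date("2020-05-01"))

# A checksum gives a second way to identify the exact version analyzed
checksum <- tools::md5sum(dated_copy)
```

Recording the checksum next to the date makes it possible to tell later whether a fresh download differs from the archived version.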
Time series processing
Some references:
The data file below contains 10 columns.
Missing values are denoted by -99.99
CO2 concentrations are measured on the ‘08A’ calibration scale
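Because -99.99 is a sentinel and not a measurement, it distorts any summary computed before it is handled. The sketch below shows one possible treatment, recoding the sentinel as `NA`, on a toy data frame; the concentration values are made up for illustration and are not taken from the Scripps file.

```r
# Toy data frame standing in for three columns of the file (values are made up)
toy <- data.frame(
  Year   = c(1958, 1958, 1958, 1958),
  Month  = c(1, 2, 3, 4),
  ObsCO2 = c(-99.99, -99.99, 315.0, 317.0)
)

# Recode the sentinel as NA so it is no longer treated as a measurement
toy$ObsCO2[toy$ObsCO2 == -99.99] <- NA

n_missing <- sum(is.na(toy$ObsCO2))          # number of missing months
mean_obs  <- mean(toy$ObsCO2, na.rm = TRUE)  # mean over real observations only
```

With `NA` in place, functions such as `mean()` and `summary()` report the missing values explicitly instead of silently averaging them in.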
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 3.5.3
## -- Attaching packages -------------------------------------------------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.1.0 v purrr 0.3.2
## v tibble 2.1.1 v dplyr 0.8.0.1
## v tidyr 0.8.3 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.4.0
## -- Conflicts ----------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(forecast)
library(lubridate)
##
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
##
## date
library(car)
## Loading required package: carData
## Warning: package 'carData' was built under R version 3.5.2
##
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
##
## recode
## The following object is masked from 'package:purrr':
##
## some
library(scales)
##
## Attaching package: 'scales'
## The following object is masked from 'package:purrr':
##
## discard
## The following object is masked from 'package:readr':
##
## col_factor
library(patchwork)
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
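# Read the monthly Mauna Loa file, skipping the first 57 lines of the file
dataCO2 <- read.csv("monthly_in_situ_co2_mlo.csv", sep = ",", skip = 57)
# Give the 10 columns explicit names
colnames(dataCO2) <- c("Year", "Month", "Date1", "Date2", "ObsCO2", "SeasAdjCO2",
                       "SplineAdjCO2", "SplineAdjCO2Trend", "ObsCO2Comp", "SeasAdjCO2Comp")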
summary(dataCO2)
##       Year          Month            Date1           Date2
##  Min.   :1958   Min.   : 1.000   Min.   :21231   Min.   :1958
##  1st Qu.:1973   1st Qu.: 4.000   1st Qu.:26968   1st Qu.:1974
##  Median :1989   Median : 7.000   Median :32704   Median :1990
##  Mean   :1989   Mean   : 6.507   Mean   :32705   Mean   :1990
##  3rd Qu.:2005   3rd Qu.: 9.500   3rd Qu.:38442   3rd Qu.:2005
##  Max.   :2020   Max.   :12.000   Max.   :44180   Max.   :2021
##      ObsCO2         SeasAdjCO2      SplineAdjCO2    SplineAdjCO2Trend
##  Min.   :-99.99   Min.   :-99.99   Min.   :-99.99   Min.   :-99.99
##  1st Qu.:328.40   1st Qu.:328.70   1st Qu.:328.46   1st Qu.:328.82
##  Median :351.34   Median :352.13   Median :351.33   Median :352.03
##  Mean   :346.18   Mean   :346.18   Mean   :348.95   Mean   :348.95
##  3rd Qu.:377.55   3rd Qu.:377.35   3rd Qu.:377.69   3rd Qu.:377.37
##  Max.   :414.83   Max.   :413.33   Max.   :414.94   Max.   :413.35
##    ObsCO2Comp     SeasAdjCO2Comp
##  Min.   :-99.99   Min.   :-99.99
##  1st Qu.:328.40   1st Qu.:328.70
##  Median :351.34   Median :352.13
##  Mean   :348.96   Mean   :348.95
##  3rd Qu.:377.55   3rd Qu.:377.35
##  Max.   :414.83   Max.   :413.33
Missing values (-99.99) in the observed series are replaced by the interpolated values (column ObsCO2Comp); rows that are still missing are then removed.
dataCO2 <- dataCO2[dataCO2$ObsCO2Comp != -99.99, ]
Create a Date column in YYYY-MM-DD format, dating each monthly value to the 15th.
dataCO2$Date <- ymd(paste0(dataCO2$Year, "-", dataCO2$Month, "-", "15"))
# Plot the full series; columns are referenced by name inside aes(), not with dataCO2$
ggplot(dataCO2, aes(Date, ObsCO2Comp)) +
  geom_line(color = "orange") +
  xlab("Year, Month") +
  scale_x_date(date_labels = "%Y-%m", date_breaks = "5 years") +
  ylab("CO2 Concentration (ppm)") +
  scale_y_continuous() +
  theme(axis.text.x = element_text(face = "bold", color = "#993333",
                                   size = 12, angle = 45, hjust = 1),
        axis.text.y = element_text(face = "bold", color = "#993333",
                                   size = 10, hjust = 1),
        axis.title.y = element_text(size = 10)) +
  ggtitle("Plot 1")
library(viridis)
## Loading required package: viridisLite
##
## Attaching package: 'viridis'
## The following object is masked from 'package:scales':
##
## viridis_pal
# Group by the Year column (unquoted); group_by("Year") would group on a constant string
dataCO2_by_year <- dataCO2 %>% group_by(Year)
# One line per year, colored by year, to compare the within-year pattern
ggplot(dataCO2_by_year, aes(Month, ObsCO2Comp)) +
  geom_line(aes(group = Year, colour = Year)) +
  xlab("Month") +
  ylab("CO2 Concentration (ppm)") +
  ggtitle("Seasonal plot")
The series is not stationary, as the plot shows: its level rises steadily over the years.
The series also shows a seasonal pattern within each year.
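The two observations above, a trend and a seasonal pattern, can be separated with a decomposition. The sketch below is one possible approach and not necessarily the method used in the rest of this analysis. It assumes R's built-in `co2` series (monthly Mauna Loa values, 1959 to 1997) as a stand-in for the Scripps file, and uses `stl()` and `diff()` from base R.

```r
# Assumption: the built-in `co2` monthly series stands in for the Scripps data
stopifnot(frequency(co2) == 12)

# Additive decomposition into seasonal, trend and remainder components
fit <- stl(co2, s.window = "periodic")
components <- fit$time.series

# Differencing: a lag-12 difference removes the yearly cycle,
# a lag-1 difference then removes a locally linear trend
co2_diff <- diff(diff(co2, lag = 12), lag = 1)
```

Calling `plot(fit)` displays the three components in stacked panels, and plotting `co2_diff` gives a visual check of whether the differenced series looks closer to stationary.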