This document is written in fulfilment of subject 4: "Latency and capacity estimation for a network connection from asymmetric measurements". The objective of this work is to compare actual latency measurements with a widely used linear model.
To this end, we are provided with two sets of measurements (collected with the `ping` command-line tool), which the original subject describes as follows:
- The [first dataset](http://mescal.imag.fr/membres/arnaud.legrand/teaching/2014/RICM4_EP_ping/liglab2.log.gz) explores a short on-campus connection
- The [second dataset](http://mescal.imag.fr/membres/arnaud.legrand/teaching/2014/RICM4_EP_ping/stackoverflow.log.gz) measures the performance of a connection to a remote Web site that is popular and therefore has a heavy load
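
For reference, a linear model of this kind is commonly written as shown below. The symbols are assumptions introduced here for illustration and are not notation taken from the text above: $T(S)$ is the time needed to send a message of size $S$, $L$ is the latency, and $C$ is the capacity of the connection.

$$T(S) = L + \frac{S}{C}$$

Under this form, the measured time plotted against the message size should be close to a straight line whose intercept is $L$ and whose slope is $1/C$.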
## Retrieving the data
The following function downloads either of the datasets and returns its content after decompressing it. If the file is already on disk, it is not downloaded again.
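
A minimal sketch of such a function is given here. The name `get_dataset`, its arguments, and the choice to return a data frame with a single `raw` column are assumptions made for illustration (the `raw` column matches the way `liglab2$raw` is used in the rest of this document).

```r
# Hypothetical helper: download a gzipped log once, then read it line by line.
get_dataset <- function(url, destfile = basename(url)) {
  # Skip the download when the file is already on disk
  if (!file.exists(destfile)) {
    download.file(url, destfile, mode = "wb")
  }
  # Read the compressed file through a gzip connection
  con <- gzfile(destfile, "rt")
  on.exit(close(con))
  lines <- readLines(con)
  # One row per log line, kept as an unprocessed string
  data.frame(raw = lines, stringsAsFactors = FALSE)
}
```

With this sketch, the first dataset could be loaded by calling `get_dataset()` on the URL of the first log file and assigning the result to `liglab2`.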
These are bare strings. They need some processing to be usable. Before that, let us check whether there are any blank lines:
```{r}
any(liglab2$raw == '')
```
Before going further, I will check whether there are lines with fewer than 10 items (after splitting). Such lines would cause issues in the conversion to a data frame, as some elements could be shifted to the wrong index after splitting.
```{r}
valid_rows <- sapply(
  strsplit(liglab2$raw, ' '),
  function(item) {length(item) == 10}
)
problematic_rows <- which(!valid_rows)
length(problematic_rows)
head(liglab2$raw[problematic_rows])
```
There are 377 rows that do not have the expected 10 elements. They are missing the latency information, so we have to remove them.
```{r}
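library(dplyr)
# TRUE for each line that splits into exactly 10 space-separated fields
valid_row_filter <- function(row) {
  lengths(strsplit(row, ' ')) == 10
}
# Keep only the complete measurements
liglab2 <- liglab2 %>%
  filter(valid_row_filter(raw))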
```
We can now extract the data we are looking for, namely the measurement date, the target machine's IP address, the number of bytes, and the latency.
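
One possible way to do this extraction is sketched below. It assumes that each valid line has the form `[timestamp] N bytes from host (ip): icmp_seq=… ttl=… time=X ms`, so that the fields of interest sit at positions 1, 2, 6 and 9 after splitting on spaces. The function name `parse_ping` and the output column names are hypothetical.

```r
# Hypothetical parser: turn raw ping lines into typed columns.
# Assumes 10 space-separated fields per line, laid out as described above.
parse_ping <- function(raw) {
  fields <- strsplit(raw, " ", fixed = TRUE)
  # Pick the i-th field of every line
  get <- function(i) vapply(fields, `[`, character(1), i)
  data.frame(
    # "[1421761682.052172]" -> seconds since the Unix epoch -> date-time
    date = as.POSIXct(as.numeric(gsub("\\[|\\]", "", get(1))),
                      origin = "1970-01-01", tz = "UTC"),
    # "(129.88.11.7):" -> "129.88.11.7"
    ip = gsub("[():]", "", get(6)),
    # Message size in bytes
    bytes = as.integer(get(2)),
    # "time=22.5" -> 22.5 (milliseconds)
    latency_ms = as.numeric(sub("time=", "", get(9), fixed = TRUE)),
    stringsAsFactors = FALSE
  )
}
```

Applied to the filtered data, `parse_ping(liglab2$raw)` would return one row per measurement with numeric size and latency columns that are ready for plotting.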
## Introduction material from the template document
This R Markdown document is made interactive using Shiny. Unlike the more traditional workflow of creating static reports, you can now create documents that allow your readers to change the assumptions underlying your analysis and see the results immediately.
To learn more, see [Interactive Documents](http://rmarkdown.rstudio.com/authoring_shiny.html).
## Inputs and Outputs
You can embed Shiny inputs and outputs in your document. Outputs are automatically updated whenever inputs change. This demonstrates how a standard R plot can be made interactive by wrapping it in the Shiny `renderPlot` function. The `selectInput` and `sliderInput` functions create the input widgets used to drive the plot.
```{r eruptions, echo=FALSE}
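library(shiny)
# Input widgets: number of histogram bins and density bandwidth
inputPanel(
  selectInput("n_breaks", label = "Number of bins:",
              choices = c(10, 20, 35, 50), selected = 20),
  sliderInput("bw_adjust", label = "Bandwidth adjustment:",
              min = 0.2, max = 2, value = 1, step = 0.2)
)

# Histogram with a density estimate, redrawn whenever an input changes
renderPlot({
  hist(faithful$eruptions, probability = TRUE, breaks = as.numeric(input$n_breaks),
       xlab = "Duration (minutes)", main = "Geyser eruption duration")
  dens <- density(faithful$eruptions, adjust = input$bw_adjust)
  lines(dens, col = "blue")
})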
```
## Embedded Application
It's also possible to embed an entire Shiny application within an R Markdown document using the `shinyAppDir` function. This example embeds a Shiny application located in another directory:
Note the use of the `height` parameter to determine how much vertical space the embedded application should occupy.
You can also use the `shinyApp` function to define an application inline rather than in an external directory.
In all of the R code chunks above, the `echo = FALSE` attribute is used. This prevents the R code within the chunk from rendering in the document alongside the Shiny components.
## An example from the doc
```{r, echo=FALSE}
## Only run this example in interactive R sessions