From e9b18e1405560d2d0fc5de05c5144337dae363a1 Mon Sep 17 00:00:00 2001
From: NourElh <734092651fcdd5add927271f472626a6@app-learninglab.inria.fr>
Date: Wed, 18 Jan 2023 20:30:43 +0000
Subject: [PATCH] Upload New File

---
 DoE/DoE.Rmd | 105 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 105 insertions(+)
 create mode 100644 DoE/DoE.Rmd

diff --git a/DoE/DoE.Rmd b/DoE/DoE.Rmd
new file mode 100644
index 0000000..a33978e
--- /dev/null
+++ b/DoE/DoE.Rmd
@@ -0,0 +1,105 @@
+---
+title: "Design of Experiments"
+author: "El-hassane Nour"
+date: "January 8, 2023"
+output:
+  pdf_document: default
+  html_document: default
+---
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = TRUE)
+```
+
+# Playing with the DoE Shiny Application
+The model studied in this experiment is a black box, where x1, x2, ..., x11 are controllable factors, z1, ..., z11 are uncontrollable factors, and y is the output.
+To approximate this unknown model, we first need to determine which variables have the most significant effect on the response y, using screening designs.
+
+Then we define and fit an analytical model of the response y as a function of the primary factors x, using regression together with LHS and optimal designs.
+
+## 1. First intuition
+My first intuition was to run an LHS (Latin hypercube sampling) design over the 11 factors, to get a general overview of the response behavior.
+```{r}
+library(DoE.wrapper)
+set.seed(45)
+# Maximin LHS design: 500 candidate runs over the 11 factors, each in [0, 1]
+design <- lhs.design(type = "maximin", nruns = 500, nfactors = 11, digits = NULL,
+                     seed = 20523,
+                     factor.names = list(X1 = c(0, 1), X2 = c(0, 1), X3 = c(0, 1),
+                                         X4 = c(0, 1), X5 = c(0, 1), X6 = c(0, 1),
+                                         X7 = c(0, 1), X8 = c(0, 1), X9 = c(0, 1),
+                                         X10 = c(0, 1), X11 = c(0, 1)))
+# D-optimal selection of 30 runs from the LHS candidate points
+design.Dopt <- Dopt.design(30, data = design, nRepeat = 5, randomize = TRUE,
+                           seed = 19573)
+design.Dopt
+```
+Once the design was generated, I ran it through the DoE Shiny app and got back a CSV file containing the design points together with the corresponding response.
+```{r}
+# Drop the first (index) column; keep the 11 factors and the response y
+df <- read.csv("exp.csv", header = TRUE,
+               colClasses = c("NULL", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA))
+df
+```
+## 2. Designing experiments to run
+### 2.1 Screening using Plackett-Burman designs
+Now we want to look at the factor effects in more detail, to identify which factors influence the response most. Since running a large number of experiments is tedious, we use a Plackett-Burman design, which screens the main effects of many factors in very few runs.
+```{r}
+library(FrF2)
+# 12-run Plackett-Burman design for the 11 two-level factors
+d <- pb(nruns = 12, n12.taguchi = FALSE, nfactors = 12 - 1, ncenter = 0,
+        replications = 1, repeat.only = FALSE, randomize = TRUE, seed = 26654,
+        factor.names = list(X1 = c(0, 1), X2 = c(0, 1), X3 = c(0, 1), X4 = c(0, 1),
+                            X5 = c(0, 1), X6 = c(0, 1), X7 = c(0, 1), X8 = c(0, 1),
+                            X9 = c(0, 1), X10 = c(0, 1), X11 = c(0, 1)))
+d
+```
+
+Here are the results:
+
+        X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11
+    1    1  0  1  1  0  1  1  1  0   0   0
+    2    1  0  0  0  1  0  1  1  0   1   1
+    3    0  0  1  0  1  1  0  1  1   1   0
+    4    0  1  1  0  1  1  1  0  0   0   1
+    5    0  1  0  1  1  0  1  1  1   0   0
+    6    1  1  0  1  1  1  0  0  0   1   0
+    7    1  1  1  0  0  0  1  0  1   1   0
+    8    1  0  1  1  1  0  0  0  1   0   1
+    9    0  0  0  0  0  0  0  0  0   0   0
+    10   1  1  0  0  0  1  0  1  1   0   1
+    11   0  0  0  1  0  1  1  0  1   1   1
+    12   0  1  1  1  0  0  0  1  0   1   1
+
+### 2.2 Regression and analysis of variance
+To quantify the relationship between the factors and the response, we fit linear regressions on the data generated by the application and analyze the variance to assess each factor's effect.
+
+#### Experiment 1: X1, X3, X4, X6, X7, X8 taken into account
+```{r}
+summary(lm(y ~ x1 + x3 + x4 + x6 + x7 + x8, data = df))
+```
+Nothing interesting :/ none of these factors stand out, and the R^2 is very small, which is bad news.
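+
+Before hand-picking more subsets, a quick sanity check is to regress y on all eleven main effects at once and look at which coefficients come out significant. This is a minimal sketch, assuming df contains only the lowercase factor columns x1, ..., x11 and the response y, as loaded above:
+```{r}
+# Rough screening: fit all main effects at once; factors with significant
+# t-values are the ones worth keeping in the smaller models that follow.
+summary(lm(y ~ ., data = df))
+```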
+
+#### Experiment 2: X1, X5, X7, X8, X10, X11 taken into account
+```{r}
+summary(lm(y ~ x1 + x5 + x7 + x8 + x10 + x11, data = df))
+```
+A very small improvement in R^2, but still far too low.
+
+#### Experiment 3: X3, X5, X6, X8, X9, X10 taken into account
+```{r}
+summary(lm(y ~ x3 + x5 + x6 + x8 + x9 + x10, data = df))
+```
+It seems that X9 is a significant factor influencing the model, and the coefficient of determination R^2 is now 0.73, which is pretty good.
+
+#### Trying the interaction X3:X6
+```{r}
+summary(lm(y ~ x3 + x9 + x6 + x3:x6, data = df))
+```
+
+Not very interesting. Let's try another experiment where X9 and X3 are set to 1.
+
+#### Experiment 8: X1, X3, X4, X5, X6, X9, X11 taken into account
+```{r}
+summary(lm(y ~ x1 + x3 + x4 + x5 + x6 + x9, data = df))
+```
+
+Removing X11 from this set has no effect. We decide to keep X9, X6 and X4, to combine X5 with X7, and to combine X1 with X3.
+
+```{r}
+summary(lm(y ~ x1:x3 + x6 + x5:x7 + x4 + x9, data = df))
+```
+We can say that our model is:
+
+y = -2.72\*X9 + 0.72\*X4 + 0.42\*X6 + 0.99\*X1\*X3 - 0.77\*X5\*X7 + 1.55
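+
+As a final check, we can evaluate this fitted equation on the observed data. A minimal sketch, assuming the same df as above and using the rounded coefficient estimates from the last regression:
+```{r}
+# Predicted response from the fitted equation (rounded coefficients)
+y_hat <- with(df, 1.55 - 2.72 * x9 + 0.72 * x4 + 0.42 * x6 +
+                  0.99 * x1 * x3 - 0.77 * x5 * x7)
+# Squared correlation between predictions and observations (comparable to R^2)
+cor(y_hat, df$y)^2
+# Observed vs. predicted: points near the identity line indicate a good fit
+plot(df$y, y_hat, xlab = "observed y", ylab = "predicted y")
+abline(0, 1)
+```
\ No newline at end of file
-- 
2.18.1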