Commit ee3c40ac authored by François Févotte

Ex 4-1: better way of handling the weights in the regression

parent b1342638
@@ -99,32 +99,23 @@ This corresponds to the values from the article of Dalal et al. The standard
 errors are
 $s_{\hat{\alpha}} = `j @printf "%.3f" σα`$ and
 $s_{\hat{\beta}} = `j @printf "%.3f" σβ`$,
-which is different from the $3.052$ and $0.047$ reported by Dalal et al. The
-deviance is
-$G^2 = `j @printf "%.3f" G²`$ with `j nDOF` degrees of freedom.
-I cannot find any value similar to the Goodness of fit ($G^2=18.086$) reported
-by Dalal et al. However, the number of degrees of freedom is similar to theirs
-(21).
+which is different from the $3.052$ and $0.047$ reported by Dalal et al.
+
+The deviance is $G^2 = `j @printf "%.3f" G²`$ with `j nDOF` degrees of freedom.
+I cannot find any value similar to the Goodness of fit reported by Dalal *et al.*
+($G^2=18.086$). However, the number of degrees of freedom differs from theirs
+(21), though it is at least close.
 
 There seems to be something wrong. Oh I know, I haven't indicated that my
 observations are actually the result of 6 observations for each rocket
-launch. The correct way to do this would be to weight the data using the `Count`
-column. Since I don't know how to do that with the
-[GLM](https://github.com/JuliaStats/GLM.jl) package I'm using, I will simply
-duplicate the data:
+launch. Let's indicate these weights (since the weights are always the same
+throughout all experiments, it does not change the estimates of the fit but it
+does influence the variance estimate).
 
 ```julia; wrap=false; hold=true
-weighted_data = DataFrame(Temperature=Int[], Frequency=Float64[])
-for row in eachrow(data)
-    for _ in 1:row.Count
-        push!(weighted_data, (Temperature=row.Temperature,
-                              Frequency=row.Frequency))
-    end
-end
-model = glm(@formula(Frequency ~ Temperature), weighted_data,
-            Binomial(), LogitLink())
+model = glm(@formula(Frequency ~ Temperature), data,
+            Binomial(), LogitLink();
+            wts=data.Count)
 α, β = coef(model)
 σα, σβ = stderror(model)
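A note on the `wts=data.Count` change: the statement that constant weights leave the estimates unchanged while altering the variance estimate can be checked from the binomial log-likelihood. The sketch below assumes the weights $w_i$ enter as prior weights (the number of trials behind each observed frequency $y_i$). The symbols $w_i$, $p_i$, $\tilde{x}_i = (1, x_i)^\top$, $\ell$ and $I$ are introduced here for illustration and do not appear in the notebook.

```latex
\begin{align*}
% Weighted log-likelihood of the logistic model (additive constant: binomial coefficients)
\ell_w(\alpha,\beta) &= \sum_i w_i \left[ y_i \log p_i + (1-y_i)\log(1-p_i) \right] + \text{const},
  \qquad p_i = \frac{1}{1+e^{-(\alpha+\beta x_i)}} \\
% Constant weights only rescale the objective, so the maximizer is unchanged
w_i = w \ \ \forall i \quad &\Longrightarrow \quad
  \ell_w = w\,\ell_1 + \text{const}
  \quad \Longrightarrow \quad (\hat{\alpha},\hat{\beta})_w = (\hat{\alpha},\hat{\beta})_1 \\
% The Fisher information is rescaled by the same factor
I_w(\alpha,\beta) &= \sum_i w_i\, p_i (1-p_i)\, \tilde{x}_i \tilde{x}_i^\top = w\, I_1(\alpha,\beta) \\
% Standard errors are square roots of the diagonal of the inverse information
s_{\hat{\alpha},w} &= \frac{s_{\hat{\alpha},1}}{\sqrt{w}}, \qquad
s_{\hat{\beta},w} = \frac{s_{\hat{\beta},1}}{\sqrt{w}}
\end{align*}
```

Under this assumption, $w = 6$ shrinks both standard errors by a factor $\sqrt{6} \approx 2.45$ relative to the unweighted fit, and the deviance, being a weighted sum over the same fitted probabilities, is multiplied by 6.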
@@ -142,8 +133,11 @@ $s_{\hat{\beta}} = `j @printf "%.3f" σβ`$,
 The Goodness of fit (Deviance) indicated for this model is
 $G^2 = `j @printf "%.3f" G²`$ with `j nDOF` degrees of freedom. Now $G^2$ is in
 good accordance with the results of the Dalal *et al.* article, but the number of
-degrees of freedom is 6 times larger than it should, due to my tampering of the
-data to duplicate them instead of weighting them.
+degrees of freedom is approximately 6 times larger than that of Dalal *et
+al.* Note that, even removing this factor (which is probably due to the way the
+number of residual degrees of freedom is defined in both libraries in the
+presence of weights), the values are similar but still differ by
+`j @printf "%2.0f" 100 * (nDOF/6/21 - 1)`%.
 
 # Predicting failure probability
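The remaining gap computed by `100 * (nDOF/6/21 - 1)` in this hunk can be estimated by hand. This is a sketch under two assumptions that the notebook does not state: the library takes the residual degrees of freedom to be the sum of the weights minus the number of fitted parameters, and Dalal *et al.* use the number of launches $N$ minus the same 2 parameters. $N$ is a symbol introduced here for illustration.

```latex
\begin{align*}
% Dalal et al.: one observation per launch, two fitted parameters
N - 2 &= 21 \quad \Longrightarrow \quad N = 23 \\
% Assumed weighted convention: 6 trials per launch, summed over all launches
n_{\mathrm{DOF}} &= 6N - 2 = 136 \\
% Relative difference left after dividing out the factor 6
\frac{n_{\mathrm{DOF}}/6}{21} - 1 &= \frac{6N - 2}{6\,(N - 2)} - 1
  = \frac{10}{6\,(N-2)} = \frac{10}{126} \approx 0.079
\end{align*}
```

If both assumptions hold, the two parameters are subtracted after multiplying by 6 rather than before, which alone accounts for a difference of about 8%.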