Commit ee3c40ac authored by François Févotte

Ex 4-1: better way to handle the weights in the regression

parent b1342638
@@ -99,32 +99,23 @@ This corresponds to the values from the article of Dalal et al. The standard
errors are
$s_{\hat{\alpha}} = `j @printf "%.3f" σα`$ and
$s_{\hat{\beta}} = `j @printf "%.3f" σβ`$,
which differ from the $3.052$ and $0.047$ reported by Dalal et al. The
deviance is
$G^2 = `j @printf "%.3f" G²`$ with `j nDOF` degrees of freedom.
which differ from the $3.052$ and $0.047$ reported by Dalal et al.
I cannot find any value similar to the Goodness of fit ($G^2=18.086$) reported
by Dalal et al. However, the number of degrees of freedom is similar to theirs
(21).
The deviance is $G^2 = `j @printf "%.3f" G²`$ with `j nDOF` degrees of freedom.
I cannot find any value similar to the goodness of fit reported by Dalal *et al.*
($G^2=18.086$). However, the number of degrees of freedom, while different, is at
least similar to theirs (21).
There seems to be something wrong. Oh I know, I haven't indicated that my
observations are actually the result of 6 observations for each rocket
launch. The correct way to do this would be to weight the data using the `Count`
column. Since I don't know how to do that with the
[GLM](https://github.com/JuliaStats/GLM.jl) package I'm using, I will simply
duplicate the data:
launch. Let's indicate these weights (since the weights are always the same
throughout all experiments, it does not change the estimates of the fit but it
does influence the variance estimate).
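
To see why a constant weight leaves the estimates unchanged but affects the variance, here is a short sketch of the weighted binomial log-likelihood. The symbols $w$, $y_i$, $p_i$, $t_i$, $\ell$ and $I$ are notation introduced here for illustration (they do not come from the original text): $w = 6$ is the number of observations per launch, $y_i$ the observed frequency, $p_i$ the fitted probability at temperature $t_i$, and a subscript $1$ denotes the unweighted fit.

```latex
\begin{align*}
\operatorname{logit}(p_i) &= \alpha + \beta\, t_i \\
\ell_w(\alpha, \beta)
  &= \sum_i w \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right] + \text{const}
   = w\, \ell_1(\alpha, \beta) + \text{const} \\
(\hat{\alpha}_w, \hat{\beta}_w)
  &= \arg\max \ell_w = \arg\max \ell_1
  && \text{(same estimates)} \\
I_w &= w\, I_1
  \;\Longrightarrow\;
  s_w = \frac{s_1}{\sqrt{w}}
  && \text{(smaller standard errors)} \\
G^2_w &= w\, G^2_1
  && \text{(larger deviance)}
\end{align*}
```

Under these assumptions, with $w = 6$ the standard errors would shrink by a factor $\sqrt{6} \approx 2.45$ and the deviance would grow sixfold relative to the unweighted fit.
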
```julia; wrap=false; hold=true
weighted_data = DataFrame(Temperature=Int[], Frequency=Float64[])
for row in eachrow(data)
for _ in 1:row.Count
push!(weighted_data, (Temperature=row.Temperature,
Frequency=row.Frequency))
end
end
model = glm(@formula(Frequency ~ Temperature), weighted_data,
Binomial(), LogitLink())
model = glm(@formula(Frequency ~ Temperature), data,
Binomial(), LogitLink();
wts=data.Count)
α, β = coef(model)
σα, σβ = stderror(model)
@@ -142,8 +133,11 @@ $s_{\hat{\beta}} = `j @printf "%.3f" σβ`$,
The Goodness of fit (Deviance) indicated for this model is
$G^2 = `j @printf "%.3f" G²`$ with `j nDOF` degrees of freedom. Now $G^2$ is in
good agreement with the results of the Dalal *et al.* article, but the number of
degrees of freedom is 6 times larger than it should be, due to my tampering with the
data to duplicate them instead of weighting them.
degrees of freedom is approximately 6 times larger than that of Dalal *et
al.* Note that, even after removing this factor (which is probably due to the way the
number of residual degrees of freedom is defined in both libraries in the
presence of weights), the values are similar but still differ by
`j @printf "%2.0f" 100 * (nDOF/6/21 - 1)`%.
# Predicting failure probability