AnovaBase.jl
Models
AnovaBase.AnovaModel
— Type
abstract type AnovaModel{M, N} end
An abstract supertype of all models for ANOVA.
AnovaBase.FullModel
— Type
FullModel{M, N} <: AnovaModel{M, N}
A wrapper of a regression model for conducting ANOVA.
M is the type of the regression model.
N is the number of predictors.
Fields
model: a regression model.
pred_id: the indices of the terms included in ANOVA. The source iterable can be obtained by predictors(model). This value may depend on type for certain models, e.g. type 1 ANOVA for a gamma regression model with inverse link.
type: type of ANOVA, either 1, 2 or 3.
Constructor
FullModel(model::RegressionModel, type::Int, null::Bool, test_intercept::Bool)
model: a regression model.
type: type of ANOVA, either 1, 2 or 3.
null: whether y ~ 0 is allowed.
test_intercept: whether the intercept is going to be tested.
AnovaBase.NestedModels
— Type
NestedModels{M, N} <: AnovaModel{M, N}
A wrapper of nested models of the same type for conducting ANOVA.
M is the type of the regression models.
N is the number of models.
Fields
model: a tuple of models.
Constructors
NestedModels(model::Vararg{M, N}) where {M, N}
NestedModels(model::NTuple{N, M}) where {M, N}
AnovaBase.MixedAovModels
— Type
MixedAovModels{M, N} <: AnovaModel{M, N}
A wrapper of nested models of multiple types for conducting ANOVA.
M is a union type of regression models.
N is the number of models.
Fields
model: a tuple of models.
Constructors
MixedAovModels{M}(model...) where M
MixedAovModels{M}(model::T) where {M, T <: Tuple}
AnovaBase.MultiAovModels
— Type
const MultiAovModels{M, N} = Union{NestedModels{M, N}, MixedAovModels{M, N}} where {M, N}
Wrappers of multiple models.
AnovaBase.MultiAovModels
— Method
MultiAovModels(model::NTuple{N, M}) where {M, N} -> NestedModels{M, N}
MultiAovModels(model::Vararg{M, N}) where {M, N} -> NestedModels{M, N}
MultiAovModels(model::T) where {T <: Tuple} -> MixedAovModels
MultiAovModels(model...) -> MixedAovModels
Construct NestedModels or MixedAovModels based on model types.
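The choice between the two wrappers follows ordinary Julia dispatch on tuple types: a homogeneous tuple matches NTuple{N, M}, while a tuple with mixed element types only matches Tuple. The following is a minimal sketch of that rule, using a hypothetical stand-in function wrapper_kind and plain numbers in place of fitted models; it is not AnovaBase code.

```julia
# Hypothetical stand-in: report which wrapper a tuple of "models" would select.
wrapper_kind(::NTuple{N, M}) where {N, M} = :NestedModels  # all elements share one type
wrapper_kind(::Tuple) = :MixedAovModels                    # element types differ

nested = wrapper_kind((1, 2, 3))  # homogeneous tuple of Int
mixed  = wrapper_kind((1, 2.0))   # Int and Float64
```

The more specific NTuple method wins whenever every element has the same concrete type, which mirrors the signatures listed above.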
AnovaBase.nestedmodels
— Method
nestedmodels(<model>; <keyword arguments>)
nestedmodels(<model type>, formula, data; <keyword arguments>)
Create nested models (NestedModels) from a model, or from a model type, formula and data.
ANOVA
AnovaBase.AnovaResult
— Type
AnovaResult{M, T, N}
Returned object of anova.
M is NestedModels or FullModel.
T is a subtype of GoodnessOfFit; either FTest or LRT.
N is the length of parameters.
Fields
anovamodel: NestedModels, MixedAovModels, or FullModel.
dof: degrees of freedom of models or predictors.
deviance: deviance(s) for calculating test statistics. See deviance for more details.
teststat: value(s) of test statistics.
pval: p-value(s) of test statistics.
otherstat: a NamedTuple containing extra statistics.
Constructor
AnovaResult(
anovamodel::M,
::Type{T},
dof::NTuple{N, Int},
deviance::NTuple{N, Float64},
teststat::NTuple{N, Float64},
pval::NTuple{N, Float64},
otherstat::NamedTuple
) where {N, M <: AnovaModel{<: RegressionModel, N}, T <: GoodnessOfFit}
AnovaBase.anova
— Method
anova(Test::Type{<: GoodnessOfFit}, <anovamodel>; <keyword arguments>)
anova(<models>...; test::Type{<: GoodnessOfFit}, <keyword arguments>)
anova(Test::Type{<: GoodnessOfFit}, <model>; <keyword arguments>)
anova(Test::Type{<: GoodnessOfFit}, <models>...; <keyword arguments>)
Analysis of variance.
Return AnovaResult{M, Test, N}. See AnovaResult for details.
anovamodel: an AnovaModel.
models: RegressionModel(s). If multiple models are provided, they should be nested and fitted with the same data, with the last one being the most complex.
Test: test statistic for goodness of fit. Available tests are LikelihoodRatioTest (LRT) and FTest.
Attributes
AnovaBase.anova_test
— Method
anova_test(::AnovaResult)
Test statistics of anova. See AnovaResult for details.
AnovaBase.anova_type
— Method
anova_type(aov::AnovaResult)
anova_type(model::MultiAovModels)
anova_type(model::FullModel)
Type of anova, either 1, 2 or 3.
AnovaBase.pval
— Method
pval(aov::AnovaResult)
P-values of the test statistics of anova. See AnovaResult for details.
AnovaBase.teststat
— Method
teststat(aov::AnovaResult)
Values of the test statistics of anova. See AnovaResult for details.
StatsAPI.deviance
— Method
deviance(aov::AnovaResult)
Return the stored deviance. The value represents different statistics for different models and tests. It may be deviance, Δdeviance, -2loglikelihood or other measures of model performance.
StatsAPI.dof
— Method
dof(aov::AnovaResult)
Degrees of freedom of each model or predictor.
StatsAPI.nobs
— Method
nobs(aov::AnovaResult)
nobs(aov::AnovaResult{<: MultiAovModels})
Number of observations.
Goodness of fit
AnovaBase.GoodnessOfFit
— Type
abstract type GoodnessOfFit end
An abstract supertype of goodness-of-fit tests.
AnovaBase.FTest
— Type
struct FTest <: GoodnessOfFit end
A type indicating that ANOVA is conducted by F-test. It can be passed as the first argument or as the keyword argument test.
AnovaBase.LikelihoodRatioTest
— Type
struct LikelihoodRatioTest <: GoodnessOfFit end
const LRT = LikelihoodRatioTest
A type indicating that ANOVA is conducted by likelihood-ratio test. It can be passed as the first argument or as the keyword argument test.
AnovaBase.canonicalgoodnessoffit
— Function
canonicalgoodnessoffit(::FixDispDist) = LRT
canonicalgoodnessoffit(::UnivariateDistribution) = FTest
const FixDispDist = Union{Bernoulli, Binomial, Poisson}
Return LRT if the distribution has a fixed dispersion; otherwise, FTest.
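The rule above amounts to a two-method dispatch: a specific method for the fixed-dispersion union and a fallback for every other distribution. A minimal sketch with stand-in types follows; ToyDist, ToyPoisson, ToyBinomial, ToyNormal, ToyLRT, ToyFTest, ToyFixDisp and toy_canonical_test are all illustrative names, not part of AnovaBase or Distributions.

```julia
# Hypothetical stand-ins for distribution types.
abstract type ToyDist end
struct ToyPoisson <: ToyDist end   # dispersion fixed by the distribution
struct ToyBinomial <: ToyDist end  # dispersion fixed by the distribution
struct ToyNormal <: ToyDist end    # dispersion estimated from data

# Hypothetical stand-ins for the test types.
struct ToyLRT end
struct ToyFTest end

const ToyFixDisp = Union{ToyPoisson, ToyBinomial}

# The union method is more specific than the abstract fallback.
toy_canonical_test(::ToyFixDisp) = ToyLRT
toy_canonical_test(::ToyDist) = ToyFTest

fixed = toy_canonical_test(ToyPoisson())
free  = toy_canonical_test(ToyNormal())
```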
Other interface
AnovaBase.ftest_nested
— Function
ftest_nested(models::MultiAovModels{M, N}, df, dfr, dev, σ²) where {M <: RegressionModel, N}
Calculate F-statistics and p-values based on given parameters.
models: nested models
df: degrees of freedom of each model
dfr: degrees of freedom of residuals of each model
dev: deviance of each model, i.e. unit deviance
σ²: squared dispersion of each model
The F-statistic is (devᵢ - devᵢ₋₁) / (dfᵢ₋₁ - dfᵢ) / σ² for the ith predictor.
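As a worked illustration of the formula, the following sketch evaluates it in plain Julia for three nested models. The numbers are made up, σ² is assumed to be a single shared scalar, and p-values are left out; this is not the package's implementation.

```julia
# Hypothetical inputs for three nested models, simplest first.
df  = (2, 3, 5)             # degrees of freedom of each model
dev = (120.0, 90.0, 60.0)   # deviance of each model
σ²  = 2.0                   # squared dispersion, assumed shared by all models

# F-statistic of model i against model i - 1, following the formula above.
fstat = ntuple(length(df) - 1) do k
    i = k + 1
    (dev[i] - dev[i - 1]) / (df[i - 1] - df[i]) / σ²
end
```

Both differences are negative when a larger model lowers the deviance, so each ratio is positive: here (90 - 120) / (2 - 3) / 2 = 15.0 and (60 - 90) / (3 - 5) / 2 = 7.5.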
AnovaBase.lrt_nested
— Function
lrt_nested(models::MultiAovModels{M, N}, df, dev, σ²) where {M <: RegressionModel, N}
Calculate likelihood ratios and p-values based on given parameters.
models: nested models
df: degrees of freedom of each model
dev: deviance of each model, i.e. unit deviance
σ²: squared dispersion of each model
The likelihood ratio of the ith predictor is LRᵢ = (devᵢ - devᵢ₋₁) / σ².
If dev is alternatively -2loglikelihood, σ² should be set to 1.
StatsAPI.dof_residual
— Method
dof_residual(aov::AnovaResult)
dof_residual(aov::AnovaResult{<: MultiAovModels})
Degrees of freedom of residuals.
By default, it applies dof_residual to models in aov.anovamodel.
AnovaBase.predictors
— Method
predictors(model::RegressionModel)
predictors(anovamodel::FullModel)
Return a tuple of Terms which are predictors of the model or anovamodel.
By default, it returns formula(model).rhs.terms; if the formula has a special structure, this function should be overloaded.
AnovaBase.anovatable
— Method
anovatable(aov::AnovaResult{<: FullModel, Test}; rownames = prednames(aov))
anovatable(aov::AnovaResult{<: MultiAovModels, Test}; rownames = string.(1:N))
anovatable(aov::AnovaResult{<: MultiAovModels, FTest, N}; rownames = string.(1:N)) where N
anovatable(aov::AnovaResult{<: MultiAovModels, LRT, N}; rownames = string.(1:N)) where N
Return a table with coefficients and related statistics of ANOVA.
When displaying aov in the REPL, rownames will be prednames(aov) for FullModel and string.(1:N) for MultiAovModels.
For MultiAovModels, there are two default methods for FTest and LRT; one can also define new methods dispatching on ::AnovaResult{NestedModels{M}} or ::AnovaResult{MixedAovModels{M}}, where M is a model type.
For FullModel, no default API is implemented.
The returned AnovaTable object implements the Tables.jl interface, and can be converted, e.g., to a DataFrame via using DataFrames; DataFrame(anovatable(aov)).
Developer utility
AnovaBase.dof_asgn
— Function
dof_asgn(v::Vector{Int})
Calculate the degrees of freedom of each predictor. 'assign' can be obtained by StatsModels.asgn(f::FormulaTerm). For a given trm::RegressionModel, it is the same as trm.mm.assign.
The index of the output matches the values in the original assign. If any index value is not in assign, the default is 0.
Examples
julia> dof_asgn([1, 2, 2, 3, 3, 3])
3-element Vector{Int64}:
1
2
3
julia> dof_asgn([2, 2, 3, 3, 3])
3-element Vector{Int64}:
0
2
3
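The behaviour shown in these examples can be reproduced by counting how often each index occurs. The sketch below does this in plain Julia with a hypothetical helper count_assign; it illustrates the idea and is not the package's implementation.

```julia
# Hypothetical stand-in for dof_asgn: count the occurrences of each index.
function count_assign(assign::Vector{Int})
    dof = zeros(Int, maximum(assign))  # indices that never occur stay at 0
    for i in assign
        dof[i] += 1                    # one column of the model matrix per occurrence
    end
    return dof
end

full    = count_assign([1, 2, 2, 3, 3, 3])
no_term = count_assign([2, 2, 3, 3, 3])   # index 1 is absent
```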
AnovaBase.prednames
— Function
prednames(<term>)
Return the name(s) of predictor(s). The return value is either a String, an iterable of Strings or nothing.
Examples
julia> iris = dataset("datasets", "iris");
julia> f = formula(lm(@formula(log(SepalLength) ~ SepalWidth + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
(SepalLength)->log(SepalLength)
Predictors:
1
SepalWidth(continuous)
PetalLength(continuous)
PetalWidth(continuous)
PetalLength(continuous) & PetalWidth(continuous)
julia> prednames(f)
["(Intercept)", "SepalWidth", "PetalLength", "PetalWidth", "PetalLength & PetalWidth"]
julia> prednames(InterceptTerm{false}())
prednames(aov::AnovaResult)
prednames(anovamodel::FullModel)
prednames(anovamodel::MultiAovModels)
prednames(<model>)
Return the names of predictors as a vector of strings. When there are multiple models, the return value is nothing.
AnovaBase.any_not_aliased_with_1
— Function
any_not_aliased_with_1(<terms>)
Return true if there are any terms not aliased with the intercept, e.g. ContinuousTerm or FunctionTerm.
Terms without schema are considered aliased with the intercept.
AnovaBase.getterms
— Function
getterms(<term>)
Return the symbol of term(s) as a vector of Expr or Symbol.
Examples
julia> iris = dataset("datasets", "iris");
julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
(SepalLength)->log(SepalLength)
Predictors:
1
Species(DummyCoding:3→2)
PetalLength(continuous)
PetalWidth(continuous)
PetalLength(continuous) & PetalWidth(continuous)
julia> getterms(f)
(Expr[:(log(SepalLength))], [:Species, :PetalLength, :PetalWidth])
julia> getterms(InterceptTerm{true}())
Symbol[]
AnovaBase.isinteract
— Function
isinteract(m::MatrixTerm, id1::Int, id2::Int)
isinteract(f::TupleTerm, id1::Int, id2::Int)
Determine if f[id2] is an interaction term of f[id1] and other terms.
Examples
julia> iris = dataset("datasets", "iris");
julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
(SepalLength)->log(SepalLength)
Predictors:
1
Species(DummyCoding:3→2)
PetalLength(continuous)
PetalWidth(continuous)
PetalLength(continuous) & PetalWidth(continuous)
julia> isinteract(f.rhs, 1, 2)
true
julia> isinteract(f.rhs, 3, 4)
false
julia> isinteract(f.rhs, 4, 5)
true
AnovaBase.select_super_interaction
— Function
select_super_interaction(m::MatrixTerm, id::Int)
select_super_interaction(f::TupleTerm, id::Int)
select_sub_interaction(m::MatrixTerm, id::Int)
select_sub_interaction(f::TupleTerm, id::Int)
select_not_super_interaction(m::MatrixTerm, id::Int)
select_not_super_interaction(f::TupleTerm, id::Int)
select_not_sub_interaction(m::MatrixTerm, id::Int)
select_not_sub_interaction(f::TupleTerm, id::Int)
Return a set of indices of f such that, for each function respectively:
1. select_super_interaction: the returned terms are interaction terms of f[id] and other terms.
2. select_sub_interaction: f[id] is an interaction term of the returned terms and other terms.
3. select_not_super_interaction: the returned terms are not interaction terms of f[id] and other terms.
4. select_not_sub_interaction: f[id] is not an interaction term of the returned terms and other terms.
Examples
julia> iris = dataset("datasets", "iris");
julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
(SepalLength)->log(SepalLength)
Predictors:
1
Species(DummyCoding:3→2)
PetalLength(continuous)
PetalWidth(continuous)
PetalLength(continuous) & PetalWidth(continuous)
julia> select_super_interaction(f.rhs, 3)
Set{Int64} with 2 elements:
5
3
julia> select_sub_interaction(f.rhs, 3)
Set{Int64} with 2 elements:
3
1
julia> select_not_super_interaction(f.rhs, 3)
Set{Int64} with 3 elements:
4
2
1
julia> select_not_sub_interaction(f.rhs, 3)
Set{Int64} with 3 elements:
5
4
2
AnovaBase.select_sub_interaction
— Function
select_sub_interaction(m::MatrixTerm, id::Int)
select_sub_interaction(f::TupleTerm, id::Int)
This function shares its docstring with select_super_interaction; see that entry for the description and examples.
AnovaBase.select_not_super_interaction
— Function
select_not_super_interaction(m::MatrixTerm, id::Int)
select_not_super_interaction(f::TupleTerm, id::Int)
This function shares its docstring with select_super_interaction; see that entry for the description and examples.
AnovaBase.select_not_sub_interaction
— Function
select_not_sub_interaction(m::MatrixTerm, id::Int)
select_not_sub_interaction(f::TupleTerm, id::Int)
This function shares its docstring with select_super_interaction; see that entry for the description and examples.
AnovaBase.subformula
— Function
subformula(f::FormulaTerm, id; kwargs...)
subformula(lhs::AbstractTerm, rhs::MatrixTerm, id::Int; reschema::Bool = false)
subformula(lhs::AbstractTerm, rhs::MatrixTerm, id; reschema::Bool = false)
subformula(lhs::AbstractTerm, rhs::NTuple{N, AbstractTerm}, id::Int; rhs_id::Int = 1, reschema::Bool = false)
Create a formula from the existing lhs and rhs (or rhs[tuple_id]), either truncated to 1:id when id is an integer, or with the terms in the collection id excluded. When id is 0, all terms in rhs (or rhs[tuple_id]) will be removed.
If reschema is true, all terms' schema will be removed.
Examples
julia> iris = dataset("datasets", "iris");
julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
(SepalLength)->log(SepalLength)
Predictors:
1
Species(DummyCoding:3→2)
PetalLength(continuous)
PetalWidth(continuous)
PetalLength(continuous) & PetalWidth(continuous)
julia> subformula(f, 2)
FormulaTerm
Response:
(SepalLength)->log(SepalLength)
Predictors:
1
Species(DummyCoding:3→2)
julia> subformula(f, [3, 5]; reschema = true)
FormulaTerm
Response:
(SepalLength)->log(SepalLength)
Predictors:
1
Species(DummyCoding:3→2)
PetalWidth(unknown)
julia> f = formula(fit(LinearMixedModel, @formula(SepalLength ~ SepalWidth + (SepalWidth|Species)), iris))
FormulaTerm
Response:
SepalLength(continuous)
Predictors:
1
SepalWidth(continuous)
(1 + SepalWidth | Species)
julia> subformula(f, 0)
FormulaTerm
Response:
SepalLength(continuous)
Predictors:
0
(1 + SepalWidth | Species)
AnovaBase.clear_schema
— Function
clear_schema(<terms with schema>) = <terms without schema>
Clear any applied schema on terms.
AnovaBase.extract_contrasts
— Function
extract_contrasts(f::FormulaTerm)
Extract a dictionary of contrasts. The keys are the symbols of the terms; the values are contrasts (AbstractContrasts).
AnovaBase._diff
— Function
_diff(t::NTuple)
Return a tuple of differences between adjacent elements of a tuple (later - former).
AnovaBase._diffn
— Function
_diffn(t::NTuple)
Return a tuple of differences between adjacent elements of a tuple (former - later).
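The two sign conventions can be illustrated with plain Julia. The helpers adjacent_diff and adjacent_diffn below are hypothetical stand-ins written for this example, not the package's own definitions.

```julia
# Hypothetical stand-ins for the two sign conventions described above.
adjacent_diff(t::NTuple{N, T}) where {N, T} = ntuple(i -> t[i + 1] - t[i], N - 1)   # later - former
adjacent_diffn(t::NTuple{N, T}) where {N, T} = ntuple(i -> t[i] - t[i + 1], N - 1)  # former - later

t = (1, 4, 9)
forward  = adjacent_diff(t)   # (4 - 1, 9 - 4)
backward = adjacent_diffn(t)  # (1 - 4, 4 - 9)
```

Returning a tuple of length N - 1 keeps the result type-stable, which suits fixed-size inputs such as the dof and deviance tuples of an AnovaResult.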
AnovaBase.AnovaTable
— Type
AnovaTable
A table with coefficients and related statistics of ANOVA. It is mostly modified from StatsBase.CoefTable.
Fields
cols: values of each statistic.
colnms: names of the statistics.
rownms: names of each row.
pvalcol: the index of the column representing p-values.
teststatcol: the index of the column representing test statistics.
Constructor
AnovaTable(cols::Vector, colnms::Vector, rownms::Vector, pvalcol::Int = 0, teststatcol::Int = 0)
AnovaTable(mat::Matrix, colnms::Vector, rownms::Vector, pvalcol::Int = 0, teststatcol::Int = 0)
AnovaBase.testname
— Function
testname(::Type{FTest}) = "F test"
testname(::Type{LRT}) = "Likelihood-ratio test"
Name of tests.