AnovaBase.jl

Models

AnovaBase.FullModel - Type
FullModel{M, N} <: AnovaModel{M, N}

A wrapper of a regression model for conducting ANOVA.

  • M is a type of regression model.
  • N is the number of predictors.

Fields

  • model: a regression model.
  • pred_id: the indices of terms included in ANOVA. The source iterable can be obtained by predictors(model). This value may depend on type for certain models, e.g. type 1 ANOVA for a gamma regression model with inverse link.
  • type: type of ANOVA, either 1, 2 or 3.

Constructor

FullModel(model::RegressionModel, type::Int, null::Bool, test_intercept::Bool)
  • model: a regression model.
  • type: type of ANOVA, either 1, 2 or 3.
  • null: whether y ~ 0 is allowed.
  • test_intercept: whether intercept is going to be tested.

AnovaBase.NestedModels - Type
NestedModels{M, N} <: AnovaModel{M, N}

A wrapper of nested models of the same type for conducting ANOVA.

  • M is a type of regression model.
  • N is the number of models.

Fields

  • model: a tuple of models.

Constructors

NestedModels(model::Vararg{M, N}) where {M, N}
NestedModels(model::NTuple{N, M}) where {M, N}

AnovaBase.MixedAovModels - Type
MixedAovModels{M, N} <: AnovaModel{M, N}

A wrapper of nested models of multiple types for conducting ANOVA.

  • M is a union type of regression models.
  • N is the number of models.

Fields

  • model: a tuple of models.

Constructors

MixedAovModels{M}(model...) where M 
MixedAovModels{M}(model::T) where {M, T <: Tuple}

AnovaBase.MultiAovModels - Type
const MultiAovModels{M, N} = Union{NestedModels{M, N}, MixedAovModels{M, N}} where {M, N}

Wrappers of multiple models.

AnovaBase.MultiAovModels - Method
MultiAovModels(model::NTuple{N, M}) where {M, N} -> NestedModels{M, N}
MultiAovModels(model::Vararg{M, N}) where {M, N} -> NestedModels{M, N}
MultiAovModels(model::T) where {T <: Tuple}      -> MixedAovModels
MultiAovModels(model...)                         -> MixedAovModels

Construct NestedModels or MixedAovModels based on model types.
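
The dispatch pattern behind these four methods can be sketched in plain Julia. SameTypeWrapper, MixedTypeWrapper and wrap are hypothetical stand-ins for NestedModels, MixedAovModels and MultiAovModels; this illustrates the idea under those assumptions and is not the package's implementation.

```julia
# Hypothetical stand-ins; none of these names are part of AnovaBase.
struct SameTypeWrapper{M, N}
    model::NTuple{N, M}
end

struct MixedTypeWrapper{T <: Tuple}
    model::T
end

# A homogeneous tuple matches the more specific NTuple method.
wrap(model::NTuple{N, M}) where {M, N} = SameTypeWrapper{M, N}(model)
# Any other tuple falls back to the mixed wrapper.
wrap(model::T) where {T <: Tuple} = MixedTypeWrapper{T}(model)
# Splatted arguments are collected into a tuple and dispatched again.
wrap(model...) = wrap(model)

a = wrap(1, 2, 3)   # every element is an Int
b = wrap(1, 2.0)    # an Int and a Float64
```

Here a is a SameTypeWrapper because all elements share one concrete type, while b falls back to MixedTypeWrapper.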

AnovaBase.nestedmodels - Method
nestedmodels(model; keyword_arguments...)
nestedmodels(model_type, formula, data; keyword_arguments...)

Create nested models NestedModels from a model or model_type, formula and data.


ANOVA

AnovaBase.AnovaResult - Type
AnovaResult{M, T, N}

Returned object of anova.

  • M is NestedModels, MixedAovModels, or FullModel.
  • T is a subtype of GoodnessOfFit; either FTest or LRT.
  • N is the length of parameters.

Fields

  • anovamodel: NestedModels, MixedAovModels, or FullModel.
  • dof: degrees of freedom of models or predictors.
  • deviance: deviance(s) for calculating test statistics. See deviance for more details.
  • teststat: value(s) of test statistics.
  • pval: p-value(s) of test statistics.
  • otherstat: a NamedTuple containing extra statistics.

Constructor

AnovaResult(
        anovamodel::M,
        ::Type{T},
        dof::NTuple{N, Int},
        deviance::NTuple{N, Float64},
        teststat::NTuple{N, Float64},
        pval::NTuple{N, Float64},
        otherstat::NamedTuple
) where {N, M <: AnovaModel{<: RegressionModel, N}, T <: GoodnessOfFit}

AnovaBase.anova - Method
anova(Test::Type{<: GoodnessOfFit}, anovamodel; keyword_arguments...)
anova(models...; test::Type{<: GoodnessOfFit}, keyword_arguments...)
anova(Test::Type{<: GoodnessOfFit}, model; keyword_arguments...)
anova(Test::Type{<: GoodnessOfFit}, models...; keyword_arguments...)

Analysis of variance.

Return AnovaResult{M, Test, N}. See AnovaResult for details.

  • anovamodel: an AnovaModel.
  • model(s): RegressionModel(s). If multiple models are provided, they should be nested and fitted with the same data, with the last one being the most complex.
  • Test: test statistics for goodness of fit. Available tests are LikelihoodRatioTest (LRT) and FTest.

Attributes

AnovaBase.anova_type - Method
anova_type(aov::AnovaResult)
anova_type(model::MultiAovModels)
anova_type(model::FullModel)

Type of anova, either 1, 2 or 3.

StatsAPI.deviance - Method
deviance(aov::AnovaResult)

Return the stored deviance. The value represents different statistics for different models and tests. It may be deviance, Δdeviance, -2loglikelihood or other measures of model performance.

StatsAPI.dof - Method
dof(aov::AnovaResult)

Degrees of freedom of each model or predictor.

StatsAPI.nobs - Method
nobs(aov::AnovaResult)
nobs(aov::AnovaResult{<: MultiAovModels})

Number of observations.


Goodness of fit

AnovaBase.FTest - Type
struct FTest <: GoodnessOfFit end

Type indicating that ANOVA is conducted by F-test. It can be passed as the first argument or as the keyword argument test of the anova function.

AnovaBase.LikelihoodRatioTest - Type
struct LikelihoodRatioTest <: GoodnessOfFit end
const LRT = LikelihoodRatioTest

Type indicating that ANOVA is conducted by likelihood-ratio test. It can be passed as the first argument or as the keyword argument test of the anova function.

AnovaBase.canonicalgoodnessoffit - Function
canonicalgoodnessoffit(::FixDispDist) = LRT
canonicalgoodnessoffit(::UnivariateDistribution) = FTest

const FixDispDist = Union{Bernoulli, Binomial, Poisson}

Return LRT if the distribution has a fixed dispersion; otherwise, FTest.
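
The fallback-by-dispatch idea can be sketched without loading Distributions.jl. Every name prefixed with Demo or demo_ below is a hypothetical stand-in for the real distribution and test types; the sketch only mirrors the two method signatures shown above.

```julia
# Hypothetical stand-ins for distribution and test types.
abstract type DemoDistribution end
struct DemoBernoulli <: DemoDistribution end
struct DemoPoisson   <: DemoDistribution end
struct DemoNormal    <: DemoDistribution end

struct DemoLRT end
struct DemoFTest end

# Distributions treated as having a fixed dispersion in this sketch.
const DemoFixDisp = Union{DemoBernoulli, DemoPoisson}

# The Union method is more specific than the abstract-type fallback.
demo_canonical(::DemoFixDisp) = DemoLRT
demo_canonical(::DemoDistribution) = DemoFTest

poisson_test = demo_canonical(DemoPoisson())
normal_test  = demo_canonical(DemoNormal())
```

A fixed-dispersion stand-in resolves to DemoLRT, and any other distribution falls through to DemoFTest.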


Other interface

AnovaBase.ftest_nested - Function
ftest_nested(models::MultiAovModels{M, N}, df, dfr, dev, σ²) where {M <: RegressionModel, N}

Calculate F-statistics and p-values based on given parameters.

  • models: nested models
  • df: degrees of freedom of each model
  • dfr: residual degrees of freedom of each model
  • dev: deviance of each model, i.e. unit deviance
  • σ²: squared dispersion of each model

The F-statistic is (devᵢ - devᵢ₋₁) / (dfᵢ₋₁ - dfᵢ) / σ² for the ith predictor.
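
As a worked illustration of the formula, the snippet below evaluates it for three nested models using only Base Julia. The numbers are made up for the example, σ² is assumed to be a single scalar, and p-values are left out because they would require a distribution package.

```julia
# Hypothetical inputs; not taken from any fitted model.
df  = (2, 3, 5)              # degrees of freedom of each model
dev = (120.0, 90.0, 60.0)    # deviance of each model
σ²  = 2.0                    # squared dispersion (assumed scalar here)

# Apply (devᵢ - devᵢ₋₁) / (dfᵢ₋₁ - dfᵢ) / σ² for i = 2, 3.
fstat = ntuple(length(dev) - 1) do k
    i = k + 1
    (dev[i] - dev[i - 1]) / (df[i - 1] - df[i]) / σ²
end
# fstat == (15.0, 7.5)
```

Both the numerator and the denominator are negative when deviance falls as degrees of freedom grow, so the resulting statistics are positive.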

AnovaBase.lrt_nested - Function
lrt_nested(models::MultiAovModels{M, N}, df, dev, σ²) where {M <: RegressionModel, N}

Calculate likelihood ratios and p-values based on given parameters.

  • models: nested models
  • df: degrees of freedom of each model
  • dev: deviance of each model, i.e. unit deviance
  • σ²: squared dispersion of each model

The likelihood ratio of the ith predictor is LRᵢ = (devᵢ - devᵢ₋₁) / σ².

If dev is alternatively -2loglikelihood, σ² should be set to 1.

StatsAPI.dof_residual - Method
dof_residual(aov::AnovaResult)    
dof_residual(aov::AnovaResult{<: MultiAovModels})

Degrees of freedom of residuals.

By default, it applies dof_residual to models in aov.anovamodel.

AnovaBase.predictors - Method
predictors(model::RegressionModel)
predictors(anovamodel::FullModel)

Return a tuple of Terms which are predictors of the model or anovamodel.

By default, it returns formula(model).rhs.terms; if the formula has special structures, this function should be overloaded.

AnovaBase.anovatable - Method
anovatable(aov::AnovaResult{<: FullModel, Test}; rownames = prednames(aov))
anovatable(aov::AnovaResult{<: MultiAovModels, Test}; rownames = string.(1:N))
anovatable(aov::AnovaResult{<: MultiAovModels, FTest, N}; rownames = string.(1:N)) where N
anovatable(aov::AnovaResult{<: MultiAovModels, LRT, N}; rownames = string.(1:N)) where N

Return a table with coefficients and related statistics of ANOVA.

When displaying aov in the REPL, rownames will be prednames(aov) for FullModel and string.(1:N) for MultiAovModels.

For MultiAovModels, there are two default methods for FTest and LRT; users can also define new methods dispatching on ::AnovaResult{NestedModels{M}} or ::AnovaResult{MixedAovModels{M}} where M is a model type.

For FullModel, no default API is implemented.

The returned AnovaTable object implements the Tables.jl interface, and can be converted e.g. to a DataFrame via using DataFrames; DataFrame(anovatable(aov)).


Developer utility

AnovaBase.dof_asgn - Function
dof_asgn(v::Vector{Int})

Calculate the degrees of freedom of each predictor from the assign vector v. The assign vector can be obtained by StatsModels.asgn(f::FormulaTerm). For a given trm::RegressionModel, it is the same as trm.mm.assign.

The index of the output matches the values in the original assign. If an index value does not appear in assign, its degrees of freedom default to 0.

Examples

julia> dof_asgn([1, 2, 2, 3, 3, 3])
3-element Vector{Int64}:
 1
 2
 3

julia> dof_asgn([2, 2, 3, 3, 3])
3-element Vector{Int64}:
 0
 2
 3
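
The counting behind these two results can be reproduced in plain Julia. count_assign is a hypothetical name and the sketch is not the package's implementation; it assumes v is non-empty and contains only positive indices.

```julia
# Hypothetical re-implementation of the counting rule, for illustration only.
function count_assign(v::Vector{Int})
    # One slot per predictor index up to the largest one; unused slots stay 0.
    dofs = zeros(Int, maximum(v))
    for i in v
        dofs[i] += 1   # each column assigned to predictor i adds one degree of freedom
    end
    return dofs
end

full    = count_assign([1, 2, 2, 3, 3, 3])
no_icpt = count_assign([2, 2, 3, 3, 3])
```

The second call leaves slot 1 at zero because no column is assigned to index 1, matching the second example above.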

AnovaBase.prednames - Function
prednames(term)

Return the name(s) of predictor(s). Return value is either a String, an iterable of Strings or nothing.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ SepalWidth + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  SepalWidth(continuous)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> prednames(f)
["(Intercept)", "SepalWidth", "PetalLength", "PetalWidth", "PetalLength & PetalWidth"]

julia> prednames(InterceptTerm{false}())

prednames(aov::AnovaResult)
prednames(anovamodel::FullModel) 
prednames(anovamodel::MultiAovModels)
prednames(model)

Return the names of predictors as a vector of strings. When there are multiple models, the return value is nothing.

AnovaBase.any_not_aliased_with_1 - Function
any_not_aliased_with_1(terms)

Return true if there are any terms not aliased with the intercept, e.g. ContinuousTerm or FunctionTerm.

Terms without schema are considered aliased with the intercept.

AnovaBase.getterms - Function
getterms(term)

Return the symbol of term(s) as a vector of Expr or Symbol.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> getterms(f)
(Expr[:(log(SepalLength))], [:Species, :PetalLength, :PetalWidth])

julia> getterms(InterceptTerm{true}())
Symbol[]

AnovaBase.isinteract - Function
isinteract(m::MatrixTerm, id1::Int, id2::Int)
isinteract(f::TupleTerm, id1::Int, id2::Int)

Determine if f[id2] is an interaction term of f[id1] and other terms.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> isinteract(f.rhs, 1, 2)
true

julia> isinteract(f.rhs, 3, 4)
false

julia> isinteract(f.rhs, 4, 5)
true

AnovaBase.select_super_interaction - Function
select_super_interaction(m::MatrixTerm, id::Int)
select_super_interaction(f::TupleTerm, id::Int)
select_sub_interaction(m::MatrixTerm, id::Int)
select_sub_interaction(f::TupleTerm, id::Int)
select_not_super_interaction(m::MatrixTerm, id::Int)
select_not_super_interaction(f::TupleTerm, id::Int)
select_not_sub_interaction(m::MatrixTerm, id::Int)
select_not_sub_interaction(f::TupleTerm, id::Int)

Return a set of indices of f. The four functions select, respectively:

  1. select_super_interaction: terms that are interaction terms of f[id] and other terms.

  2. select_sub_interaction: terms of which f[id] is an interaction term, together with other terms.

  3. select_not_super_interaction: terms that are not interaction terms of f[id] and other terms.

  4. select_not_sub_interaction: terms of which f[id] is not an interaction term.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> select_super_interaction(f.rhs, 3)
Set{Int64} with 2 elements:
  5
  3

julia> select_sub_interaction(f.rhs, 3)
Set{Int64} with 2 elements:
  3
  1

julia> select_not_super_interaction(f.rhs, 3)
Set{Int64} with 3 elements:
  4
  2
  1

julia> select_not_sub_interaction(f.rhs, 3)
Set{Int64} with 3 elements:
  5
  4
  2

AnovaBase.subformula - Function
subformula(f::FormulaTerm, id; kwargs...)
subformula(lhs::AbstractTerm, rhs::MatrixTerm, id::Int; reschema::Bool = false)
subformula(lhs::AbstractTerm, rhs::MatrixTerm, id; reschema::Bool = false)
subformula(lhs::AbstractTerm, rhs::NTuple{N, AbstractTerm}, id::Int; rhs_id::Int = 1, reschema::Bool = false)

Create a formula from the existing lhs and rhs (or rhs[rhs_id]), either truncated to terms 1:id when id is an integer, or with the terms indexed by the collection id excluded. When id is 0, all terms in rhs (or rhs[rhs_id]) are removed.

If reschema is true, all terms' schema will be removed.

Examples

julia> iris = dataset("datasets", "iris");

julia> f = formula(lm(@formula(log(SepalLength) ~ Species + PetalLength * PetalWidth), iris))
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalLength(continuous)
  PetalWidth(continuous)
  PetalLength(continuous) & PetalWidth(continuous)

julia> subformula(f, 2)
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)

julia> subformula(f, [3, 5]; reschema = true)
FormulaTerm
Response:
  (SepalLength)->log(SepalLength)
Predictors:
  1
  Species(DummyCoding:3→2)
  PetalWidth(unknown)

julia> f = formula(fit(LinearMixedModel, @formula(SepalLength ~ SepalWidth + (SepalWidth|Species)), iris))
FormulaTerm
Response:
  SepalLength(continuous)
Predictors:
  1
  SepalWidth(continuous)
  (1 + SepalWidth | Species)

julia> subformula(f, 0)
FormulaTerm
Response:
  SepalLength(continuous)
Predictors:
  0
  (1 + SepalWidth | Species)

AnovaBase.extract_contrasts - Function
extract_contrasts(f::FormulaTerm)

Extract a dictionary of contrasts. The keys are symbols of term; the values are contrasts (AbstractContrasts).

AnovaBase._diff - Function
_diff(t::NTuple)

Return a tuple of differences between adjacent elements of a tuple (later - former).

AnovaBase._diffn - Function
_diffn(t::NTuple)

Return a tuple of differences between adjacent elements of a tuple (former - later).
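
The two sign conventions can be compared side by side with a small sketch. adjacent_diff and adjacent_diffn are hypothetical names written for this illustration; they follow the descriptions above rather than the package source.

```julia
# Hypothetical helpers mirroring the two documented conventions.
# later - former: positive when the tuple is increasing.
adjacent_diff(t::NTuple{N, T}) where {N, T} = ntuple(i -> t[i + 1] - t[i], N - 1)
# former - later: positive when the tuple is decreasing.
adjacent_diffn(t::NTuple{N, T}) where {N, T} = ntuple(i -> t[i] - t[i + 1], N - 1)

d  = adjacent_diff((1, 3, 6))             # (2, 3)
dn = adjacent_diffn((10.0, 7.0, 6.5))     # (3.0, 0.5)
```

The first form suits quantities that grow with model complexity, such as degrees of freedom. The second suits quantities that shrink, such as deviance.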

AnovaBase.AnovaTable - Type
AnovaTable

A table with coefficients and related statistics of ANOVA. It is mostly modified from StatsBase.CoefTable.

Fields

  • cols: values of each statistic.
  • colnms: names of statistics.
  • rownms: names of each row.
  • pvalcol: the index of the column representing p-values.
  • teststatcol: the index of the column representing test statistics.

Constructor

AnovaTable(cols::Vector, colnms::Vector, rownms::Vector, pvalcol::Int = 0, teststatcol::Int = 0)
AnovaTable(mat::Matrix, colnms::Vector, rownms::Vector, pvalcol::Int = 0, teststatcol::Int = 0)

AnovaBase.testname - Function
testname(::Type{FTest}) = "F test"
testname(::Type{LRT}) = "Likelihood-ratio test"

Name of tests.
