| gaitpoisson.mlm {VGAM} | R Documentation |
Fits a generally–altered, –inflated and –truncated Poisson regression (using a multinomial logit model for the altered or inflated values). The truncation may include values in the upper tail. Currently only one of altered or inflated are allowed in a single model.
gaitpoisson.mlm(alter = NULL, inflate = NULL,
truncate = NULL, max.support = Inf,
zero = c("pobs", "pstr"), parallel.ap = FALSE, parallel.ip = FALSE,
llambda = "loglink", type.fitted = c("mean", "pobs.a", "pstr.i",
"Pobs.a", "Pstr.i", "prob.a", "prob.i", "prob.t", "lhs.prob"),
imethod = 1, mux.inflate = 0.5, ilambda = NULL,
ishrinkage = 0.95, probs.y = 0.35)
alter, inflate, truncate |
Vector of altered, inflated and truncated values,
i.e., nonnegative integers.
A See |
llambda |
Link function for the parent distribution.
See |
parallel.ap, parallel.ip |
Single logical each.
Constrain the MLM probabilities to be equal?
If so then this applies to all
|
type.fitted |
See The choice |
max.support |
See |
imethod, ilambda, mux.inflate |
See |
probs.y, ishrinkage |
See |
zero |
See |
The full distribution being considered can be abbreviated GAIT–Pois–MLM–MLM, which is where the inner distribution for ordinary values is the Poisson distribution, and the outer distributions for the altered and inflated values are unstructured probabilities. Thus the distribution being fitted is a mixture of a Poisson and two MLMs with the support of one of the MLMs being equal to the set of altered values and the other for the inflated values. Hence the probability for each inflated value comes from two sources: the parent distribution and a MLM.
The two submodels that may be fitted can be abbreviated
GAT–Pois–MLM or
GIT–Pois–MLM.
An alternative variant of this distribution,
more structured in nature,
replaces the MLMs by Poisson distributions with potentially
different means—see gaitpoisson.mix.
This family function may be described as nonparametric while
gaitpoisson.mix is parametric.
For the GIT model,
the ordering of the linear/additive predictors corresponds to
length(inflate) equalling 0 and more than 0;
the dimension grows accordingly.
The same idea holds for the GAT model.
This function currently does not handle multiple responses.
Further details are at Gaitpois.
Apart from the order of the linear/additive predictors,
the following are (or should be) equivalent:
gaitpoisson.mlm() and poissonff(),
gaitpoisson.mlm(alter = 0) and zapoisson(zero = "pobs0"),
gaitpoisson.mlm(inflate = 0) and zipoisson(zero = "pstr0"),
gaitpoisson.mlm(truncate = 0) and pospoisson().
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm,
and vgam.
The fitted.values slot of the fitted object,
which should be extracted by the generic function fitted,
is similar to gaitpoisson.mix.
The default is to
return the mean mu.
The choice type.fitted = "pobs.a" returns
a matrix of all
the altered probabilities (Greek symbol omega);
each column is an omega, and the last column is labelled
"(Others)".
The choice type.fitted = "Pobs.a" returns the same.
The choice "pstr.i" returns
a matrix of all
the inflated probabilities (Greek symbol phi);
each column is a phi, and the last column is labelled
"(Others)".
The choice "Pstr.i" returns a length(inflate)-column
matrix with probabilities coming from the two sources.
The choice "prob.a" returns the sum of the
probability of having
an altered value, given the estimate of lambda from
the parent distribution; i.e., it is summed over alter
and evaluated at the parent PMF.
The choice "prob.i" is similar to "prob.a" but
summed over inflate.
Likewise,
the choice "prob.t"
summed over truncate.
Using inflate requires more caution than alter
because gross inflation is ideally needed for it to work safely.
Deflation or no inflation will produce numerical problems,
hence set trace = TRUE to monitor convergence.
See gaitpoisson.mix for other general details
and warnings.
T. W. Yee
Yee, T. W. and Ma, C. (2020). Generally–altered, –inflated and –truncated regression, with application to heaped and seeped count data. In preparation.
Gaitpois,
gaitpoisson.mix,
gatnbinomial.mlm,
zipoisson,
pospoisson,
CommonVGAMffArguments,
rootogram4,
simulate.vlm.
ivec <- c(3, 4) # Inflate these values
tvec <- c(5, 7) # Truncate these values
max.support <- 20 # Values beyond here are truncated
pstr.i <- 0.1 # Common value for all values of ivec
gdata <- data.frame(x2 = runif(nn <- 1000))
gdata <- transform(gdata, lambda = exp(1.5 + 0.5 * x2))
gdata <- transform(gdata,
y1 = rgaitpois(nn, lambda, pstr.mlm.i = pstr.i, inflate.mlm = ivec,
truncate = tvec, max.support = max.support))
gaitpoisson.mlm(inflate = ivec)
with(gdata, table(y1))
fit1 <- vglm(y1 ~ x2, crit = "coef", trace = TRUE, data = gdata,
gaitpoisson.mlm(inflate = ivec, truncate = tvec,
parallel.ip = TRUE, # Good idea here
max.support = max.support))
head(fitted(fit1, type.fitted = "Pstr.i"))
head(predict(fit1))
log(pstr.i / (1 - length(ivec) * pstr.i)) # Value of some of the etas
coef(fit1, matrix = TRUE)
summary(fit1)