| segmented {segmented} | R Documentation |
Fits regression models with segmented relationships between the response and one or more explanatory variables. Break-point estimates are provided.
segmented(obj, seg.Z, psi, control = seg.control(),
model = TRUE, ...)
## Default S3 method:
segmented(obj, seg.Z, psi, control = seg.control(),
model = TRUE, ...)
## S3 method for class 'lm'
segmented(obj, seg.Z, psi, control = seg.control(),
model = TRUE, ...)
## S3 method for class 'glm'
segmented(obj, seg.Z, psi, control = seg.control(),
model = TRUE, ...)
## S3 method for class 'Arima'
segmented(obj, seg.Z, psi, control = seg.control(),
model = TRUE, ...)
obj |
standard ‘linear’ model of class "lm" or "glm". Since version 0.5.0-0 any regression fit may be supplied (see 'Details'). |
seg.Z |
a formula with no response variable, such as |
psi |
named list of vectors. The names have to match the variables in the |
control |
a list of parameters for controlling the fitting process.
See the documentation for |
model |
logical value indicating if the model.frame should be returned. |
... |
optional arguments. |
Given a linear regression model (usually of class "lm" or "glm"), segmented tries to estimate
a new model having broken-line relationships with the variables specified in seg.Z.
A segmented (or broken-line) relationship is defined by the slope
parameters and the break-points where the linear relation changes. The number of breakpoints
of each segmented relationship is fixed via the psi argument, where initial
values for the break-points must be specified. The model
is estimated simultaneously yielding point estimates and relevant approximate
standard errors of all the model parameters, including the break-points.
Since version 0.2-9.0 segmented implements the bootstrap restarting algorithm described in Wood (2001).
The bootstrap restarting is expected to escape the local optima of the objective function when the
segmented relationship is flat and the log likelihood can have multiple local optima.
Since version 0.5-0.0 the default method segmented.default has been added to estimate segmented relationships in
general (besides "lm" and "glm" fits) regression models, such as Cox regression or quantile regression (for a single percentile).
The objective function to be minimized is the (minus) value extracted by the logLik function or it may be passed on via
the fn.obj argument in seg.control. See example below. While the default method is expected to work with any regression
fit (where the usual coef(), update(), and logLik() returns appropriate results), it is not recommended for
"lm" or "glm" fits (as segmented.default is slower than the specific methods segmented.lm and segmented.glm), although
final results are the same. However the object returned by segmented.default is not of class "segmented", as currently
the segmented methods are not guaranteed to work for ‘generic’ (i.e., besides "lm" and "glm") regression fits. The user
could try each "segmented" method on the returned object by calling it explicitly (e.g. via plot.segmented() or confint.segmented()).
The returned object depends on the last component returned by seg.control.
If last=TRUE, the default, segmented returns an object of class "segmented" which inherits
from the class "lm" or "glm" depending on the class of obj. Otherwise a list is
returned, where the last component is the fitted model at the final iteration,
see seg.control.
An object of class "segmented" is a list containing the components of the
original object obj with additionally the followings:
psi |
estimated break-points and relevant (approximate) standard errors |
it |
number of iterations employed |
epsilon |
difference in the objective function when the algorithm stops |
model |
the model frame |
psi.history |
a list or a vector including the breakpoint estimates at each step |
seed |
the integer vector containing the seed just before the bootstrap resampling. Returned only if bootstrap restart is employed |
.. |
Other components are not of direct interest of the user |
It is well-known that the log-likelihood function for the
break-point may be not concave, especially
for poor clear-cut kink-relationships. In these circumstances the initial guess
for the break-point, i.e. the psi argument, must be provided with care. For instance visual
inspection of a, possibly smoothed, scatter-plot is usually a good way to obtain some idea on breakpoint location.
However bootstrap restarting, implemented since version 0.2-9.0, is relatively more robust to starting values specified
in psi. Alternatively an automatic procedure may be implemented by specifying psi=NA and
stop.if.error=FALSE in seg.control: experience suggests to increase the number of iterations
via it.max in seg.control(). This automatic procedure, however, is expected to overestimate
the number of breakpoints.
The algorithm will start if the it.max argument returned by seg.control
is greater than zero. If it.max=0 segmented will estimate a new linear model with
break-point(s) fixed at the values reported in psi.
In the returned fit object, ‘U.’ is put before the name of the segmented variable to mean the difference-in-slopes coefficient.
Methods specific to the class "segmented" are
print.segmented
summary.segmented
print.summary.segmented
plot.segmented
lines.segmented
confint.segmented
vcov.segmented
predict.segmented
points.segmented
Others are inherited from the class "lm" or "glm" depending on the
class of obj.
Vito M. R. Muggeo, vito.muggeo@unipa.it
Muggeo, V.M.R. (2003) Estimating regression models with unknown break-points. Statistics in Medicine 22, 3055–3071.
Muggeo, V.M.R. (2008) Segmented: an R package to fit regression models with broken-line relationships. R News 8/1, 20–25.
set.seed(12)
xx<-1:100
zz<-runif(100)
yy<-2+1.5*pmax(xx-35,0)-1.5*pmax(xx-70,0)+15*pmax(zz-.5,0)+rnorm(100,0,2)
dati<-data.frame(x=xx,y=yy,z=zz)
out.lm<-lm(y~x,data=dati)
#simple example: 1 segmented variable, 1 breakpoint: you do not need to specify
# the starting value for psi
o<-segmented(out.lm,seg.Z=~z)
#1 segmented variable, 2 breakpoints: you have to specify starting values (vector) for psi:
o<-segmented(out.lm,seg.Z=~x,psi=c(30,60),
control=seg.control(display=FALSE))
slope(o)
#2 segmented variables: starting values requested via a named list
out.lm<-lm(y~z,data=dati)
o1<-update(o,seg.Z=~x+z,psi=list(x=c(30,60),z=.3))
#the default method leads to the same results (but it is slower)
#o1<-segmented.default(out.lm,seg.Z=~x+z,psi=list(x=c(30,60),z=.3))
#o1<-segmented.default(out.lm,seg.Z=~x+z,psi=list(x=c(30,60),z=.3),
# control=seg.control(fn.obj="sum(x$residuals^2)"))
#automatic procedure to estimate breakpoints in the covariate x
# Notice: bootstrap restart is not allowed!
o<-segmented.lm(out.lm,seg.Z=~x+z,psi=list(x=NA,z=.3),
control=seg.control(stop.if.error=FALSE,n.boot=0, it.max=20))
#assess the progress of the breakpoint estimates throughout the iterations
## Not run:
par(mfrow=c(2,1))
draw.history(o, "x")
draw.history(o, "z")
## End(Not run)
#try to increase the number of iterations and re-assess the
#convergence diagnostics
#An example using the default method:
# Cox regression with a segmented relationship
## Not run:
library(survival)
data(stanford2)
o<-coxph(Surv(time, status)~age, data=stanford2)
os<-segmented(o, ~age, psi=40) #estimate the breakpoint in the age effect
summary(os) #actually it means summary.coxph(os)
plot(os) #it does not work
plot.segmented(os) #call explicitly plot.segmented() to plot the fitted piecewise lines
## End(Not run)