biglm.big.matrix, bigglm.big.matrix {bigmemory} | R Documentation |
This is a wrapper to Thomas Lumley's biglm
package, allowing its use with data stored in big.matrix
objects.
biglm.big.matrix(formula, data, fc=NULL, chunksize=NULL, weights=NULL, sandwich=FALSE)
bigglm.big.matrix(formula, data, family=gaussian(), fc=NULL, chunksize=NULL, weights=NULL, sandwich=FALSE, maxit=8, tolerance=1e-7, start=NULL)
formula |
a model formula. |
data |
a big.matrix. |
fc |
either column indices or names of variables that are factors. |
chunksize |
an integer giving the maximum size of the chunks of data that are processed iteratively. |
weights |
a one-sided, single term formula specifying weights (see biglm for more information). |
sandwich |
TRUE to compute the Huber/White sandwich covariance matrix (see biglm for more information). |
family |
a glm family object. |
maxit |
maximum number of Fisher scoring iterations. |
tolerance |
tolerance for the change in coefficients (as a multiple of the standard error). |
start |
optional starting values for coefficients. If NULL, maxit should be at least 2, as some quantities will not be computed on the first iteration. |
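The role of chunksize can be illustrated with a small base R sketch of fitting a linear model one chunk of rows at a time. This is not how biglm is implemented: biglm updates a QR decomposition (Algorithm AS274), whereas the sketch below accumulates cross-products and solves the normal equations, which is simpler but less numerically stable. The helper name chunked_lm_coef is hypothetical and used only for illustration.

```r
# Illustrative stand-in only: accumulate X'X and X'y chunk by chunk, then solve.
# biglm itself uses incremental QR updating (AS274), not the normal equations.
chunked_lm_coef <- function(X, y, chunksize) {  # hypothetical helper
  p <- ncol(X)
  xtx <- matrix(0, p, p)   # running sum of X'X
  xty <- numeric(p)        # running sum of X'y
  for (s in seq(1, nrow(X), by = chunksize)) {
    rows <- s:min(s + chunksize - 1, nrow(X))
    Xc <- X[rows, , drop = FALSE]           # only this chunk is touched
    xtx <- xtx + crossprod(Xc)
    xty <- xty + drop(crossprod(Xc, y[rows]))
  }
  drop(solve(xtx, xty))
}

X <- cbind(1, iris$Sepal.Width)   # intercept plus one predictor
y <- iris$Sepal.Length
chunked <- chunked_lm_coef(X, y, chunksize = 40)
full <- unname(coef(lm(Sepal.Length ~ Sepal.Width, data = iris)))
all.equal(chunked, full)
```

Only one chunk of rows is needed in memory at a time, which is the property that makes the chunked approach suitable for data held in a big.matrix.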
See Thomas Lumley's biglm package for more information; chunksize defaults to floor(nrow(data)/ncol(data)^2).
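As a quick check of the arithmetic, the documented default can be evaluated for a couple of matrix shapes. The function name default_chunksize is hypothetical; it simply restates the expression above and is not part of the package.

```r
# Hypothetical helper restating the documented default for chunksize.
default_chunksize <- function(n_rows, n_cols) floor(n_rows / n_cols^2)

default_chunksize(150, 5)    # iris-sized data: floor(150 / 25) = 6
default_chunksize(1e6, 10)   # floor(1e6 / 100) = 10000
```

Note that the expression evaluates to 0 whenever nrow(data) is smaller than ncol(data)^2, so supplying chunksize explicitly may be worthwhile for wide matrices.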
an object of class biglm.
Michael J. Kane
Algorithm AS274, Applied Statistics (1992), Vol. 41, No. 2.
Thomas Lumley (2005). biglm: bounded memory linear and generalized linear models. R package version 0.4.
# This example is quite silly, using the iris data. But it shows that our
# wrapper to Lumley's biglm() function produces the same answer as the plain
# old lm() function.

## Not run:
x <- matrix(unlist(iris), ncol=5)
colnames(x) <- names(iris)
x <- as.big.matrix(x)
head(x)

silly.biglm <- biglm.big.matrix(Sepal.Length ~ Sepal.Width + Species,
                                data=x, fc="Species")
summary(silly.biglm)

y <- data.frame(x[,])
y$Species <- as.factor(y$Species)
head(y)

silly.lm <- lm(Sepal.Length ~ Sepal.Width + Species, data=y)
summary(silly.lm)

## End(Not run)