write.big.matrix, read.big.matrix {bigmemory} | R Documentation |
Create a big.matrix by reading from a suitably-formatted ASCII file, or
write the contents of a big.matrix to a file.
write.big.matrix(x, fileName = NA, row.names = FALSE, col.names = FALSE, sep = ',')
read.big.matrix(fileName, sep = ',', header = FALSE, row.names = NULL, col.names = NULL, type = NA, skip = 0, separated = FALSE, shared = FALSE, backingfile = NULL, backingpath = NULL, descriptorfile = NULL, preserve = TRUE, extraCols = NULL)
x |
a big.matrix. |
fileName |
the name of an input/output file. |
sep |
a field delimiter. |
header |
if TRUE, the first line (after a possible skip) should contain column names. |
row.names |
if TRUE, use the first column of the file for row names; if a vector of names, use them even if row names appear to exist in the file. |
col.names |
if TRUE, use the first row of the file for column names; if a vector of names, use them even if column names exist in the file. |
type |
the data type of the matrix elements, preferably specified explicitly ("integer", for example). |
skip |
number of lines to skip at the head of the file. |
separated |
use separated column organization of the data instead of column-major organization. |
shared |
if TRUE, load the object into shared memory. |
backingfile |
the root name for the file(s) for the cache of x. |
backingpath |
the path to the directory containing the file backing cache. |
descriptorfile |
the file to be used for the description of the filebacked matrix. |
preserve |
if this is a filebacked big.matrix, it is preserved by default, even after the end of the R session, unless this option is set to FALSE. |
extraCols |
the optional number of extra columns to be appended to the matrix for future use. |
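The file-backing arguments above can be combined as in the following minimal sketch. The file names `example.txt`, `example.bin`, and `example.desc` are hypothetical, and the sketch assumes that the backing and descriptor files are created inside `backingpath`; behavior may differ between bigmemory versions. The file name is passed positionally so that the spelling of the first argument does not matter.

```r
library(bigmemory)

# Use a fresh scratch directory so the backing files do not collide
# with files left over from an earlier run (hypothetical location).
dir <- tempfile("bm_")
dir.create(dir)

# Write a small integer matrix to an ASCII file to serve as input.
src <- file.path(dir, "example.txt")
write.big.matrix(as.big.matrix(matrix(1:10, 5, 2)), src)

# Read it back into a file-backed big.matrix. The names below are
# illustrative assumptions, not defaults of the package.
z <- read.big.matrix(src, type = "integer",
                     backingfile = "example.bin",
                     backingpath = dir,
                     descriptorfile = "example.desc")
z[, ]
```

With the default preserve = TRUE, the backing file is expected to remain in the scratch directory after the matrix object is no longer in use.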
Currently, files must contain only one atomic type (all integer, for example).
Once (if) we implement something like big.data.frame, this assumption will be
relaxed. We have other ideas for useful options as well, including the reading
and writing of subsets of columns.
When reading from a file, if type is not specified we try to make a
reasonable guess for you, without making any guarantees at this point. The
same is true for the field separator. Warning messages will be printed to
alert you of this. Unless you have really large integer values, we strongly
recommend you consider "short". If you have something that is essentially
categorical, you might even be able to use "char", with huge memory savings
in large data sets.
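As a minimal sketch of that advice, the snippet below states both the separator and the type explicitly rather than relying on the guess. The temporary file is created only for illustration. The values 1 to 10 fit comfortably in "short" (2 bytes per element, against 4 for "integer").

```r
library(bigmemory)

# Write a small integer matrix to a temporary comma-separated file.
tmp <- tempfile(fileext = ".txt")
x <- as.big.matrix(matrix(1:10, 5, 2), type = "integer")
write.big.matrix(x, tmp, sep = ",")

# Give the separator and the type explicitly instead of letting
# read.big.matrix guess them from the file.
y <- read.big.matrix(tmp, sep = ",", type = "short")
y[, ]
```

The same call with type = "char" would store each element in a single byte, which is only safe when every value in the file fits in that range.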
A big.matrix object is returned by read.big.matrix, while write.big.matrix
creates an output file in the present working directory.
John W. Emerson and Michael J. Kane
# Without specifying the type, this big.matrix x will hold integers.
x <- as.big.matrix(matrix(1:10, 5, 2))
x[2,2] <- NA
x[,]
write.big.matrix(x, "foo.txt")

# Just for fun, I'll read it back in as character (1-byte integers):
y <- read.big.matrix("foo.txt", type="char")
y[,]

# Other examples:
w <- as.big.matrix(matrix(1:10, 5, 2), type='double')
w[1,2] <- NA
w[2,2] <- -Inf
w[3,2] <- Inf
w[4,2] <- NaN
w[,]
write.big.matrix(w, "bar.txt")
w <- read.big.matrix("bar.txt", type="double")
w[,]
w <- read.big.matrix("bar.txt", type="short")
w[,]

# Another example using row names (which we don't like).
x <- as.big.matrix(as.matrix(iris), type='double')
rownames(x) <- as.character(1:nrow(x))
write.big.matrix(x, 'IrisData.txt', col.names=TRUE, row.names=TRUE)
y <- read.big.matrix("IrisData.txt", header=TRUE)