injectSNPs {BSgenome} | R Documentation |
Inject SNPs from a SNPlocs data package into a genome.
injectSNPs(x, SNPlocs_pkgname)

SNPlocs_pkgname(x)
SNPcount(x)
SNPlocs(x, seqname)

## Related utilities
available.SNPs(type=getOption("pkgType"))
installed.SNPs()
x | A BSgenome object.
SNPlocs_pkgname | The name of a SNPlocs data package containing SNP information for the single sequences contained in x.
seqname | The name of a single sequence in x.
type | Character string indicating the type of package. The default is getOption("pkgType").
injectSNPs returns a copy of the original genome x in which some or all of the single sequences were altered by injecting the SNPs defined in the SNPlocs data package specified through the SNPlocs_pkgname argument. The SNPs in the altered genome are represented by an IUPAC ambiguity code at each SNP location.
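To make the ambiguity-code representation concrete, here is a toy sketch in base R. It is not how injectSNPs is implemented: the helper inject_toy_snps, the "C/T" allele notation, and the small two-allele lookup table are all hypothetical and exist only for illustration. The letter assignments themselves (M, R, W, S, Y, K) are the standard IUPAC codes for two-nucleotide ambiguities.

```r
# Two-allele subset of the IUPAC nucleotide ambiguity codes.
# Keys are the two alleles sorted alphabetically and pasted together.
iupac_two_allele <- c(AC = "M", AG = "R", AT = "W",
                      CG = "S", CT = "Y", GT = "K")

# Hypothetical helper (illustration only): overwrite each SNP position of a
# plain character string with the ambiguity code for its two alleles.
inject_toy_snps <- function(dna, pos, alleles) {
  letters_vec <- strsplit(dna, "", fixed = TRUE)[[1]]
  for (i in seq_along(pos)) {
    # "G/A" -> c("G", "A") -> sorted c("A", "G") -> key "AG"
    key <- paste(sort(strsplit(alleles[i], "/", fixed = TRUE)[[1]]),
                 collapse = "")
    letters_vec[pos[i]] <- iupac_two_allele[[key]]
  }
  paste(letters_vec, collapse = "")
}

# A C/T SNP at position 2 and a G/A SNP at position 7
toy <- inject_toy_snps("ACGTACGT", pos = c(2, 7), alleles = c("C/T", "G/A"))
toy
```

In this sketch the letter at position 2 becomes Y (C or T) and the letter at position 7 becomes R (A or G), while every other position is left unchanged.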
SNPlocs_pkgname, SNPcount and SNPlocs return NULL if no SNPs were injected in x (i.e. if x is not a BSgenome object returned by a previous call to injectSNPs). Otherwise SNPlocs_pkgname returns the name of the package from which the SNPs were injected, SNPcount the number of SNPs for each altered sequence in x, and SNPlocs their locations in the sequence whose name is specified by seqname.
available.SNPs returns a character vector containing the names of the SNPlocs data packages that are currently available on the Bioconductor repositories for your version of R/Bioconductor. A SNPlocs data package contains basic SNP information (location and alleles) for a given organism.
installed.SNPs returns a character vector containing the names of the SNPlocs data packages that are already installed.
injectSNPs, SNPlocs_pkgname, SNPcount and SNPlocs have the side effect of trying to load the SNPlocs data package if it is not already loaded.
H. Pages
BSgenome-class, IUPAC_CODE_MAP, injectHardMask, letterFrequencyInSlidingView, .inplaceReplaceLetterAt
## What SNPlocs data packages are already installed:
installed.SNPs()

## What SNPlocs data packages are available:
available.SNPs()

if (interactive()) {
  ## Make your choice and install with:
  source("http://bioconductor.org/biocLite.R")
  biocLite("SNPlocs.Hsapiens.dbSNP.20100427")
}

## Inject SNPs from dbSNP into the Human genome:
library(BSgenome.Hsapiens.UCSC.hg19)
Hsapiens
SNPlocs_pkgname(Hsapiens)

SNP_Hsapiens <- injectSNPs(Hsapiens, "SNPlocs.Hsapiens.dbSNP.20100427")
SNP_Hsapiens  # note the extra "with SNPs injected from ..." line
SNPlocs_pkgname(SNP_Hsapiens)
SNPcount(SNP_Hsapiens)
head(SNPlocs(SNP_Hsapiens, "chr1"))

alphabetFrequency(Hsapiens$chr1)
alphabetFrequency(SNP_Hsapiens$chr1)

## Find runs of SNPs of length at least 25 in chr1. Might require
## more memory than some platforms can handle (e.g. 32-bit Windows
## and maybe some Mac OS X machines with little memory):
is_32bit_windows <- .Platform$OS.type == "windows" &&
                    .Platform$r_arch == "i386"
is_macosx <- substr(R.version$os, start=1, stop=6) == "darwin"
if (!is_32bit_windows && !is_macosx) {
  chr1 <- injectHardMask(SNP_Hsapiens$chr1)
  ambiguous_letters <- paste(DNA_ALPHABET[5:15], collapse="")
  lf <- letterFrequencyInSlidingView(chr1, 25, ambiguous_letters)
  sl <- slice(as.integer(lf), lower=25)
  v1 <- Views(chr1, start(sl), end(sl)+24)
  v1
  max(width(v1))  # length of longest SNP run
}