Title: | Pre-processing for GC-IMS data |
---|---|
Description: | A package for the analysis of ion mobility spectrometry (IMS) measurements, as well as samples from multicapillary columns coupled with IMS (MCC-IMS) and gas chromatography coupled to ion mobility spectrometry (GC-IMS). The package provides a complete workflow for the analysis, importing the data, preprocessing the spectra as well as classification and regression techniques for the modelling of the spectra. The package also includes visualization helpers, to represent topographic plots, extracted and total ion chromatograms and IMS spectra. |
Authors: | Sergio Oller Moreno [aut, cre] , Celia Mallafré Muro [aut] , Luis Fernandez [aut] , Eduardo Caballero Saldivar [aut] , Arnau Blanco Borrego [aut] , Antonio Pardo Matínez [aut] , Santiago Marco Colás [aut] , Institute for Bioengineering of Catalonia [cph] |
Maintainer: | Sergio Oller Moreno <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.1 |
Built: | 2024-11-18 11:30:03 UTC |
Source: | https://github.com/sipss/GCIMS |
Add peak list rectangles to a raw plot
add_peaklist_rect( plt, peaklist, color_by = NULL, col_prefix = "", pdata = NULL, palette = P40 )
add_peaklist_rect( plt, peaklist, color_by = NULL, col_prefix = "", pdata = NULL, palette = P40 )
plt |
The output of |
peaklist |
A data frame with at least the columns: |
color_by |
A character with a column name of |
col_prefix |
After clustering, besides |
pdata |
A phenotype data data frame, with a SampleID column to be merged into peaklist so color_by can specify
a phenotype
|
palette |
A character vector with color names to use drawing the rectangles. Use |
If peaklist
includes dt_apex_ms
and rt_apex_s
a cross will be plotted on the peak apex.
The given plt
with rectangles showing the ROIs and crosses showing the apexes
dt <- 1:10 rt <- 1:10 int <- matrix(0.0, nrow = length(dt), ncol = length(rt)) int[2, 4:8] <- c(.5, .5, 1, .5, 0.5) int[3, 4:8] <- c(0.5, 2, 2, 2, 0.5) int[4, 4:8] <- c(1, 2, 5, 2, 1) int[5, 4:8] <- c(0.5, 2, 2, 2, 0.5) int[6, 4:8] <- c(.5, .5, 1, .5, 0.5) dummy_obj <-GCIMSSample( drift_time = dt, retention_time = rt, data = int ) plt <- plot(dummy_obj) # Add a rectangle on top of the plot rect <- data.frame( dt_min_ms = 2.75, dt_max_ms = 5.6, rt_min_s = 4.6, rt_max_s = 7.4 ) add_peaklist_rect( plt = plt, peaklist = rect )
dt <- 1:10 rt <- 1:10 int <- matrix(0.0, nrow = length(dt), ncol = length(rt)) int[2, 4:8] <- c(.5, .5, 1, .5, 0.5) int[3, 4:8] <- c(0.5, 2, 2, 2, 0.5) int[4, 4:8] <- c(1, 2, 5, 2, 1) int[5, 4:8] <- c(0.5, 2, 2, 2, 0.5) int[6, 4:8] <- c(.5, .5, 1, .5, 0.5) dummy_obj <-GCIMSSample( drift_time = dt, retention_time = rt, data = int ) plt <- plot(dummy_obj) # Add a rectangle on top of the plot rect <- data.frame( dt_min_ms = 2.75, dt_max_ms = 5.6, rt_min_s = 4.6, rt_max_s = 7.4 ) add_peaklist_rect( plt = plt, peaklist = rect )
The alignment uses a multiplicative correction in drift time and a Parametric Time Warping correction in retention time
## S4 method for signature 'GCIMSDataset' align( object, method_rt = "ptw", align_dt = TRUE, align_ip = TRUE, reference_sample_idx = NULL, ... )
## S4 method for signature 'GCIMSDataset' align( object, method_rt = "ptw", align_dt = TRUE, align_ip = TRUE, reference_sample_idx = NULL, ... )
object |
A GCIMSDataset object, modified in-place |
method_rt |
Method for alignment, should be "ptw" or "pow" if pow is selected the package "pow" must be installed, to do so visit: https://github.com/sipss/pow |
align_dt |
if |
align_ip |
if |
reference_sample_idx |
One number, the index of the sample to use as reference for the alignment in retention time, if NULL the reference will be calculated automatically depending on the method |
... |
additional parameters for POW alignment |
The modified GCIMSDataset
Align a GCIMSSample object, in retention time
## S4 method for signature 'GCIMSSample' align(object, method_rt, ric_ref, ric_ref_rt, ...)
## S4 method for signature 'GCIMSSample' align(object, method_rt, ric_ref, ric_ref_rt, ...)
object |
A GCIMSSample object |
method_rt |
Method for alignment, should be "ptw" or "pow" |
ric_ref |
The reference Reverse Ion Chromatogram |
ric_ref_rt |
The retention times corresponding to |
... |
Additional arguments passed on to the alignment method. |
The modified GCIMSSample
Align a GCIMSSample in drift time with a multiplicative correction
alignDt(object, rip_ref_ms)
alignDt(object, rip_ref_ms)
object |
A GCIMSSample object |
rip_ref_ms |
The position of the RIP in ms |
The modified GCIMSSample
Plots to interpret alignment results
alignPlots(object)
alignPlots(object)
object |
A GCIMSDataset object, modified in-place |
A list with plots created with ggplot2.
Align a GCIMSSample in retention time with a multiplicative correction
alignRt_ip(object, min_start, rt_ref)
alignRt_ip(object, min_start, rt_ref)
object |
A GCIMSSample object |
min_start |
minimun injection point, to calculate where to begin the spectrums and cut as few points as posible |
rt_ref |
retention time reference |
The modified GCIMSSample
Align a GCIMSSample in retention time with parametric optimized warping
alignRt_pow( object, ric_ref, ric_ref_rt, lambdas = pracma::logspace(-2, 4, 31), p = 10, max_it = 5000, lambda1 = 10^6 )
alignRt_pow( object, ric_ref, ric_ref_rt, lambdas = pracma::logspace(-2, 4, 31), p = 10, max_it = 5000, lambda1 = 10^6 )
object |
A GCIMSSample object |
ric_ref |
The reference Reverse Ion Chromatogram |
ric_ref_rt |
The retention times corresponding to |
lambdas |
a vector with the penalties to test the POW |
p |
By default |
max_it |
Maximum number of iterations |
lambda1 |
Regularization parameter for second derivative of warp |
The modified GCIMSSample
Align a GCIMSSample in retention time using parametric time warping
alignRt_ptw(object, ric_ref, ric_ref_rt, ploynomial_order = 5)
alignRt_ptw(object, ric_ref, ric_ref_rt, ploynomial_order = 5)
object |
A GCIMSSample object |
ric_ref |
The reference Reverse Ion Chromatogram |
ric_ref_rt |
The retention times corresponding to |
ploynomial_order |
maximum order of the polynomial to be used by default 5 |
The modified GCIMSSample
Turn the intensity matrix into a data frame
## S3 method for class 'GCIMSSample' as.data.frame( x, row.names = NULL, optional = FALSE, dt_range = NULL, rt_range = NULL, dt_idx = NULL, rt_idx = NULL, ... )
## S3 method for class 'GCIMSSample' as.data.frame( x, row.names = NULL, optional = FALSE, dt_range = NULL, rt_range = NULL, dt_idx = NULL, rt_idx = NULL, ... )
x |
A GCIMSSample object |
row.names |
|
optional |
logical. If |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
dt_idx |
A numeric vector with the drift time indices to extract (or a logical vector of the length of drift time) |
rt_idx |
A numeric vector with the retention time indices to extract (or a logical vector of the length of retention time) |
... |
unused |
A data frame with dt_ms
, rt_s
and Intensity
columns
Peak grouping function, exposing several options useful for benchmarking.
clusterPeaks( peaks, ..., distance_method = "euclidean", dt_cluster_spread_ms = 0.1, rt_cluster_spread_s = 20, distance_between_peaks_from_same_sample = 100, clustering = list(method = "hclust"), verbose = FALSE )
clusterPeaks( peaks, ..., distance_method = "euclidean", dt_cluster_spread_ms = 0.1, rt_cluster_spread_s = 20, distance_between_peaks_from_same_sample = 100, clustering = list(method = "hclust"), verbose = FALSE )
peaks |
A data frame with at least the following columns:
|
... |
Ignored. All other parameters beyond |
distance_method |
A string. One of the distance methods from stats::dist, "sd_scaled_euclidean" or "mahalanobis" |
dt_cluster_spread_ms , rt_cluster_spread_s
|
The typical spread of the clusters. Used for scaling.
dimensions when computing distances. When |
distance_between_peaks_from_same_sample |
The distance between two peaks from the same sample will be set to |
clustering |
A named list with "method" and the supported method, as well as further options.
For For |
verbose |
logical, to control printing in the function |
A list with :
peak_list_clustered
: The peak list with a "cluster" column
cluster_stats
: Cluster statistics (cluster size...)
dist
: peak to peak distance object
extra_clustering_info
: Arbitrary clustering extra information, that depends on the clustering method
peak_list_fn <- system.file("extdata", "peak_list.rds", package = "GCIMS") peak_list <- readRDS(peak_list_fn) peak_clustering <- clusterPeaks(peak_list)
peak_list_fn <- system.file("extdata", "peak_list.rds", package = "GCIMS") peak_list <- readRDS(peak_list_fn) peak_clustering <- clusterPeaks(peak_list)
To process an entire dataset, we need a table that describes the samples, and you may want to add for the analysis.
create_annotations_table( samples_dir, glob = c("*.mea", "*.mea.gz"), recursive = TRUE, verbose = TRUE )
create_annotations_table( samples_dir, glob = c("*.mea", "*.mea.gz"), recursive = TRUE, verbose = TRUE )
samples_dir |
A directory that contains samples |
glob |
One or more globs for sample extensions. See the examples. |
recursive |
Whether to look for samples into |
verbose |
If set to |
The table needs to have at least two columns, one with the file name of
the sample (FileName
) and another one with the sample name (SampleID
), that you can set
as you like. Besides, you can add additional columns with any
metadata/annotations/phenotypes you may consider relevant.
This function will help you list all the samples from a directory to a table. The example below will show you how to save this table as an Excel or CSV file, for you to conveniently modify it and how you can read it back for further analysis.
A data frame with the SampleID and FileName columns
# How to create the annotations table: # # First you must tell R where your samples are. Please change "samples_dir" # below to your samples directory. On Windows you can use: # samples_dir <- choose.dir(getwd(), "Choose the folder where the samples are") # On other systems you can use: # library(tcltk) # samples_dir <- tclvalue(tkchooseDirectory()) # In this example we use a folder with some demo files: samples_dir <- system.file("extdata", "sample_formats", package = "GCIMS") # Then you need to provide the extension to look into. If you use `glob = "*.*"` you # will catch all files and you can filter the annotations table afterwards: annotations <- create_annotations_table(samples_dir, glob = "*.mea.gz") # You can write the annotations table to an Excel or a CSV file: # For Excel you may need to install the writexl package: # install.packages("writexl") # And then you can use: # writexl::write_xlsx(annotations, "annotations.xlsx") # For csv just use: # write.csv(annotations, "annotations.csv") # # Modify manually the excel or CSV file # # Read it again into R as follows: # # For Excel you may need to install the readxl package: # install.packages("readxl") # And then you can use: # annotations <- readxl::read_excel("annotations.xlsx") # For csv just use: # annotations <- read.csv("annotations.csv")
# How to create the annotations table: # # First you must tell R where your samples are. Please change "samples_dir" # below to your samples directory. On Windows you can use: # samples_dir <- choose.dir(getwd(), "Choose the folder where the samples are") # On other systems you can use: # library(tcltk) # samples_dir <- tclvalue(tkchooseDirectory()) # In this example we use a folder with some demo files: samples_dir <- system.file("extdata", "sample_formats", package = "GCIMS") # Then you need to provide the extension to look into. If you use `glob = "*.*"` you # will catch all files and you can filter the annotations table afterwards: annotations <- create_annotations_table(samples_dir, glob = "*.mea.gz") # You can write the annotations table to an Excel or a CSV file: # For Excel you may need to install the writexl package: # install.packages("writexl") # And then you can use: # writexl::write_xlsx(annotations, "annotations.xlsx") # For csv just use: # write.csv(annotations, "annotations.csv") # # Modify manually the excel or CSV file # # Read it again into R as follows: # # For Excel you may need to install the readxl package: # install.packages("readxl") # And then you can use: # annotations <- readxl::read_excel("annotations.xlsx") # For csv just use: # annotations <- read.csv("annotations.csv")
A scales transformation to be used with ggplot2.
cubic_root_trans()
cubic_root_trans()
This function is exported because we are using it in vignettes, but it may become unavailable in future versions
A scale transformation object of name "cubic_root"
library(ggplot2) x <- 1:10 y <- x^3 df <- data.frame(x = x, y = y) ggplot(data.frame(x=x, y=y)) + geom_point(aes(x = x, y = y)) + scale_y_continuous(trans=cubic_root_trans())
library(ggplot2) x <- 1:10 y <- x^3 df <- data.frame(x = x, y = y) ggplot(data.frame(x=x, y=y)) + geom_point(aes(x = x, y = y)) + scale_y_continuous(trans=cubic_root_trans())
Decimate a GCIMS dataset keeping 1 out of n points
## S4 method for signature 'GCIMSDataset' decimate(object, dt_factor = 1L, rt_factor = 1L)
## S4 method for signature 'GCIMSDataset' decimate(object, dt_factor = 1L, rt_factor = 1L)
object |
A GCIMSDataset object, modified in-place |
dt_factor |
Keep one every |
rt_factor |
Keep one every |
The modified GCIMSDataset
This method assumes that the sample has been low-pass filtered to avoid aliasing issues
## S4 method for signature 'GCIMSSample' decimate(object, dt_factor = 1L, rt_factor = 1L)
## S4 method for signature 'GCIMSSample' decimate(object, dt_factor = 1L, rt_factor = 1L)
object |
A GCIMSSample object |
dt_factor |
Keep one every |
rt_factor |
Keep one every |
The modified GCIMSSample
Delayed operations enables us to process our samples faster on big datasets. See the details section for details on how they work.
DelayedOperation( name, fun = NULL, params = list(), params_iter = list(), fun_extract = NULL, fun_aggregate = NULL )
DelayedOperation( name, fun = NULL, params = list(), params_iter = list(), fun_extract = NULL, fun_aggregate = NULL )
name |
A named for de delayed operation, only used for printing. |
fun |
A function that takes a sample and returns a modified sample |
params |
A named list with additional arguments to be passed to function |
params_iter |
A named list with additional arguments to be passed to
function. Compared to |
fun_extract |
A function that takes a modified sample and returns an extracted object. |
fun_aggregate |
A function that takes a dataset and a list of extracted objects and returns a modified dataset. |
Let's say we have a pipeline with two actions (e.g. smooth() and detectPeaks()). and we want to apply it to a dataset with two samples (e.g s1, s2).
This is a simple pseudocode to execute all actions in all samples. The code is written so you can get an idea of how :
dataset = list(s1, s2) actions = list(smooth, detectPeaks) for (action in actions) { for (i in seq_along(dataset)) { dataset[[i]] <- action(dataset[[i]]) } }
When the dataset is big, samples are stored in disk, and loaded/saved when used:
dataset = list(s1, s2) actions = list(smooth, detectPeaks) for (action in actions) { for (i in seq_along(dataset)) { sample <- read_from_disk(i) sample <- action(sample) save_to_disk(sample) } }
So actually, we can avoid "saving and loading" by changing the loop order:
dataset = list(s1, s2) actions = list(smooth, detectPeaks) for (i in seq_along(dataset)) { sample <- read_from_disk(i) for (action in actions) { sample <- action(sample) } save_to_disk(sample) }
This requires that when we apply an operation to the dataset, the operation is delayed, so we can stack many delayed operations and run them all at once.
The DelayedOperation class allows us to store all pending actions and run them afterwards when the data is needed.
Besides, samples can be processed in parallel if enough cores and RAM are available.
The DelayedOperation class also considers that sometimes we want to extract
some information from each sample (e.g. the Reverse Ion Chromatogram)
and build some matrix with the Reverse Ion Chromatograms of all samples. It changes
the loops above, so after each action modifies each sample, we can extract something
out of the sample and save it. After all actions have executed, we can aggregate
the results we have extracted and save them into the dataset. This is used for instance
in the getRIC()
implementation, to extract the RIC from each sample and afterwards
aggregate it into a matrix. This is implemented here with the fun_extract
and
fun_aggregate
functions.
A DelayedOperation object
DelayedOperation is an S4 class to store a delayed operation
Delayed operations are not applied to the dataset immediately, but rather when some data from the dataset is required. When working on large datasets, keeping all samples in RAM may be impossible, and the DelayedDatasetDisk architecture becomes convenient, where samples are stored in a directory, loaded processed and saved individually.
Under such arquitecture, it is more efficient to load a sample, run as many operations as possible on it and save the sample, instead of loading a sample, running one operation, saving the sample.
See how to create such delayed operations and more details at
vignette("creating-a-workflow-step", package = "GCIMS")
.
name
A named for de delayed operation, only used for printing.
fun
A function that takes a sample object and returns a sample object, usually with some change (filtered,...)
params
A named list with additional arguments to be passed to fun
params_iter
A named list with additional arguments to be passed to
fun
. Compared to params
, each argument must be a named list of length the number of samples, so
each sample will receive its corresponding parameter according to its name
fun_extract
A function that takes the modified sample object returned
by fun
and extracts some component out of it. This component will be stored in the dataset for faster access.
fun_aggregate
A function that takes a dataset object and a list of extracted results (the output of all fun_extract
calls)
and modifies the dataset.
This function downloads three samples in .mea.gz format. It is useful to run the introductory vignette.
download_three_ketones_dataset(outdir = "2021-mixture-six-ketones-demo")
download_three_ketones_dataset(outdir = "2021-mixture-six-ketones-demo")
outdir |
Name of the directory where the samples will be saved |
Nothing (Files are created in the given folder)
## Not run: download_three_ketones_dataset(outdir = "sample_dataset") list.files("sample_dataset") ## End(Not run)
## Not run: download_three_ketones_dataset(outdir = "sample_dataset") list.files("sample_dataset") ## End(Not run)
Get the drift time of the chromatogram
## S4 method for signature 'GCIMSChromatogram' dtime(object)
## S4 method for signature 'GCIMSChromatogram' dtime(object)
object |
A GCIMSChromatogram |
The drift time where this chromatogram was extracted from (in ms)
Other GCIMSChromatogram:
GCIMSChromatogram
,
GCIMSChromatogram-class
,
estimateBaseline,GCIMSChromatogram-method
,
findPeaks,GCIMSChromatogram-method
,
intensity,GCIMSChromatogram-method
,
rtime,GCIMSChromatogram-method
,
smooth,GCIMSChromatogram-method
Get A reference drift time vector for the dataset
## S4 method for signature 'GCIMSDataset' dtime(object, sample = NULL)
## S4 method for signature 'GCIMSDataset' dtime(object, sample = NULL)
object |
A GCIMSDataset |
sample |
A number or a string with the sample index or name. If |
a drift time vector
The baseline is estimated by connecting local minima and interpolating from those.
The local minima are identified as "the minima in each region of length x"
The length of the regions are given in seconds in the region_s
parameter.
## S4 method for signature 'GCIMSChromatogram' estimateBaseline(object, rt_length_s) ## S4 method for signature 'GCIMSChromatogram' baseline(object, rt_range = NULL, rt_idx = NULL, .error_if_missing = TRUE) ## S4 replacement method for signature 'GCIMSChromatogram' baseline(object) <- value
## S4 method for signature 'GCIMSChromatogram' estimateBaseline(object, rt_length_s) ## S4 method for signature 'GCIMSChromatogram' baseline(object, rt_range = NULL, rt_idx = NULL, .error_if_missing = TRUE) ## S4 replacement method for signature 'GCIMSChromatogram' baseline(object) <- value
object |
A GCIMSChromatogram object |
rt_length_s |
The length of the baseline region. It should be comparable or longer than the peak width |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
rt_idx |
A numeric vector with the retention time indices to extract (or a logical vector of the length of retention time) |
.error_if_missing |
A logical. If |
value |
A vector with a baseline of the same length as |
The modified GCIMSChromatogram
baseline(GCIMSChromatogram)
: Get the baseline
baseline(GCIMSChromatogram) <- value
: Set the baseline
Other GCIMSChromatogram:
GCIMSChromatogram
,
GCIMSChromatogram-class
,
dtime,GCIMSChromatogram-method
,
findPeaks,GCIMSChromatogram-method
,
intensity,GCIMSChromatogram-method
,
rtime,GCIMSChromatogram-method
,
smooth,GCIMSChromatogram-method
Estimate the baseline of a GCIMS Sample using a connect local minima algorithm
## S4 method for signature 'GCIMSDataset' estimateBaseline( object, dt_peak_fwhm_ms, dt_region_multiplier, rt_length_s, remove = TRUE )
## S4 method for signature 'GCIMSDataset' estimateBaseline( object, dt_peak_fwhm_ms, dt_region_multiplier, rt_length_s, remove = TRUE )
object |
A GCIMSDataset object, modified in-place |
dt_peak_fwhm_ms |
Full Width at Half Maximum in milliseconds. Used to determine the length of the regions where local minima are searched. |
dt_region_multiplier |
A multiplier to calculate the region |
rt_length_s |
The length of the baseline region. It should be comparable or longer than the peak width |
remove |
A boolean, if TRUE it removes the baseline from the intensity |
The modified GCIMSDataset
The baseline is estimated by connecting local minima and interpolating from those.
The local minima are identified as "the minima in each region of length x"
The length of the regions are estimated as fwhm * a multiplier / 2.3482
. This
assumes it's several times
## S4 method for signature 'GCIMSSample' estimateBaseline( object, dt_peak_fwhm_ms, dt_region_multiplier, rt_length_s, remove = TRUE ) ## S4 method for signature 'GCIMSSample' baseline( object, dt_range = NULL, rt_range = NULL, dt_idx = NULL, rt_idx = NULL, .error_if_missing = TRUE ) ## S4 replacement method for signature 'GCIMSSample' baseline(object) <- value
## S4 method for signature 'GCIMSSample' estimateBaseline( object, dt_peak_fwhm_ms, dt_region_multiplier, rt_length_s, remove = TRUE ) ## S4 method for signature 'GCIMSSample' baseline( object, dt_range = NULL, rt_range = NULL, dt_idx = NULL, rt_idx = NULL, .error_if_missing = TRUE ) ## S4 replacement method for signature 'GCIMSSample' baseline(object) <- value
object |
A GCIMSSample object |
dt_peak_fwhm_ms |
Full Width at Half Maximum in milliseconds. Used to determine the length of the regions where local minima are searched. |
dt_region_multiplier |
A multiplier to calculate the region |
rt_length_s |
The length of the baseline region. It should be comparable or longer than the peak width |
remove |
A boolean, if TRUE it removes the baseline from the intensity |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
dt_idx |
A numeric vector with the drift time indices to extract (or a logical vector of the length of drift time) |
rt_idx |
A numeric vector with the retention time indices to extract (or a logical vector of the length of retention time) |
.error_if_missing |
A logical. If |
value |
A matrix with the sample baseline of the same dimensions as |
The modified GCIMSSample
baseline(GCIMSSample)
: Get the baseline
baseline(GCIMSSample) <- value
: Set the baseline
The baseline is estimated by connecting local minima and interpolating from those.
The local minima are identified as "the minima in each region of length x"
The length of the regions are estimated as fwhm * a multiplier / 2.3482
. This
assumes it's several times
## S4 method for signature 'GCIMSSpectrum' estimateBaseline(object, dt_peak_fwhm_ms, dt_region_multiplier = 12) ## S4 method for signature 'GCIMSSpectrum' baseline(object, dt_range = NULL, dt_idx = NULL, .error_if_missing = TRUE) ## S4 replacement method for signature 'GCIMSSpectrum' baseline(object) <- value
## S4 method for signature 'GCIMSSpectrum' estimateBaseline(object, dt_peak_fwhm_ms, dt_region_multiplier = 12) ## S4 method for signature 'GCIMSSpectrum' baseline(object, dt_range = NULL, dt_idx = NULL, .error_if_missing = TRUE) ## S4 replacement method for signature 'GCIMSSpectrum' baseline(object) <- value
object |
A GCIMSSpectrum object |
dt_peak_fwhm_ms |
Full Width at Half Maximum in milliseconds. Used to determine the length of the regions where local minima are searched. |
dt_region_multiplier |
A multiplier to calculate the region |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
dt_idx |
A numeric vector with the drift time indices to extract (or a logical vector of the length of drift time) |
.error_if_missing |
A logical. If |
value |
A vector with a baseline of the same length as |
The modified GCIMSSpectrum
baseline(GCIMSSpectrum)
: Get the baseline
baseline(GCIMSSpectrum) <- value
: Set the baseline
Filter GCIMSDataset samples by drift time
## S4 method for signature 'GCIMSDataset' filterDt(object, dt_range)
## S4 method for signature 'GCIMSDataset' filterDt(object, dt_range)
object |
A GCIMSDataset object |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
The given object, with a delayed operation to filter retention times
base_dir <- system.file("extdata", "sample_formats", package = "GCIMS") annot <- data.frame(SampleID = "Sample1", FileName = "small.mea.gz") dataset <- GCIMSDataset$new(annot, base_dir) filterDt(dataset, dt_range = c(5, 10))
base_dir <- system.file("extdata", "sample_formats", package = "GCIMS") annot <- data.frame(SampleID = "Sample1", FileName = "small.mea.gz") dataset <- GCIMSDataset$new(annot, base_dir) filterDt(dataset, dt_range = c(5, 10))
Filter GCIMSSample samples by drift time
## S4 method for signature 'GCIMSSample' filterDt(object, dt_range)
## S4 method for signature 'GCIMSSample' filterDt(object, dt_range)
object |
A GCIMSSample object |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
A subset of the sample, only in the selected dt_range
sample_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") s <- read_mea(sample_file) s <- filterDt(s, dt_range = c(5, 9.5))
sample_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") s <- read_mea(sample_file) s <- filterDt(s, dt_range = c(5, 9.5))
Filter GCIMSDataset samples by retention time
## S4 method for signature 'GCIMSDataset' filterRt(object, rt_range)
## S4 method for signature 'GCIMSDataset' filterRt(object, rt_range)
object |
A GCIMSDataset object |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
The given object, with a delayed operation to filter retention times
base_dir <- system.file("extdata", "sample_formats", package = "GCIMS") annot <- data.frame(SampleID = "Sample1", FileName = "small.mea.gz") dataset <- GCIMSDataset$new(annot, base_dir) filterRt(dataset, rt_range = c(5, 50))
base_dir <- system.file("extdata", "sample_formats", package = "GCIMS") annot <- data.frame(SampleID = "Sample1", FileName = "small.mea.gz") dataset <- GCIMSDataset$new(annot, base_dir) filterRt(dataset, rt_range = c(5, 50))
Filter GCIMSSample samples by retention time
## S4 method for signature 'GCIMSSample' filterRt(object, rt_range)
## S4 method for signature 'GCIMSSample' filterRt(object, rt_range)
object |
A GCIMSSample object |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
A subset of the sample, only in the selected rt_range
sample_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") s <- read_mea(sample_file) s <- filterRt(s, rt_range = c(5, 50))
sample_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") s <- read_mea(sample_file) s <- filterRt(s, rt_range = c(5, 50))
Find Peaks in an object
findPeaks(object, ...)
findPeaks(object, ...)
object |
An object to find peaks on |
... |
Additional arguments for downstream methods |
The object, with found peaks
Peak detection for a GCIMSChromatogram
## S4 method for signature 'GCIMSChromatogram' findPeaks(object, ...)
## S4 method for signature 'GCIMSChromatogram' findPeaks(object, ...)
object |
A GCIMSChromatogram object |
... |
Arguments passed on to
|
The modified GCIMSChromatogram, with a peak list
Other GCIMSChromatogram:
GCIMSChromatogram
,
GCIMSChromatogram-class
,
dtime,GCIMSChromatogram-method
,
estimateBaseline,GCIMSChromatogram-method
,
intensity,GCIMSChromatogram-method
,
rtime,GCIMSChromatogram-method
,
smooth,GCIMSChromatogram-method
Peak detection on the GCIMS dataset
## S4 method for signature 'GCIMSDataset' findPeaks(object, ...)
## S4 method for signature 'GCIMSDataset' findPeaks(object, ...)
object |
A GCIMSDataset object, modified in-place |
... |
Arguments passed on to
|
The modified GCIMSDataset, with a peak list
Peak detection for a GCIMSSample
## S4 method for signature 'GCIMSSample' findPeaks(object, ...)
## S4 method for signature 'GCIMSSample' findPeaks(object, ...)
object |
A GCIMSSample object |
... |
Arguments passed on to
|
The modified GCIMSSample, with a peak list
Peak detection for a GCIMSSpectrum
## S4 method for signature 'GCIMSSpectrum' findPeaks(object, ...)
## S4 method for signature 'GCIMSSpectrum' findPeaks(object, ...)
object |
A GCIMSSpectrum object |
... |
Arguments passed on to
|
The modified GCIMSSpectrum, with a peak list
Generics defined at the GCIMS package. We are open to moving them to an existing generics-only package if you need so.
dtime(object, ...) getTIS(object, ...) getRIC(object, ...) plotTIS(object, ...) plotRIC(object, ...) filterDt(object, ...) decimate(object, ...) align(object, ...) prealign(object, ...) estimateBaseline(object, ...) baseline(object, ...) baseline(object) <- value integratePeaks(object, ...)
dtime(object, ...) getTIS(object, ...) getRIC(object, ...) plotTIS(object, ...) plotRIC(object, ...) filterDt(object, ...) decimate(object, ...) align(object, ...) prealign(object, ...) estimateBaseline(object, ...) baseline(object, ...) baseline(object) <- value integratePeaks(object, ...)
object |
An object to get the baseline |
... |
Additional arguments for downstream methods |
value |
baseline to set |
A numeric vector with the drift time
The Total Ion Spectrum as a numeric vector or a matrix (depending if the object is one sample or several)
The Reverse Ion Chromatogram, as a numeric vector or a matrix (depending if the object is one sample or several)
A plot
A plot
The object, modified
The object, modified
The object, modified
The object, modified
The object, with a baseline estimated
The baseline of the object
The object
The object, with integrated peaks
dtime()
: Get drift time vector
getTIS()
: Get the Total Ion Spectrum
getRIC()
: Get the Reverse Ion Chromatogram
plotTIS()
: Plot total ion spectrum
plotRIC()
: Plot Reverse Ion Chromatogram
filterDt()
: Filter in Drift time
decimate()
: Decimate an object
align()
: Align an object
prealign()
: Align an object in drift time
estimateBaseline()
: Estimate the baseline in an object
baseline()
: Get the baseline of an object
baseline(object) <- value
: Set the baseline of an object
integratePeaks()
: Integrate peaks of an object
x <- GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3) ) dtime(x) # c(1,2)
x <- GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3) ) dtime(x) # c(1,2)
Create a GCIMSChromatogram object
GCIMSChromatogram( retention_time, intensity, drift_time_idx = NA_integer_, drift_time_ms = NA_real_, description = "", baseline = NULL, peaks = NULL, peaks_debug_info = NULL )
GCIMSChromatogram( retention_time, intensity, drift_time_idx = NA_integer_, drift_time_ms = NA_real_, description = "", baseline = NULL, peaks = NULL, peaks_debug_info = NULL )
retention_time |
A numeric vector with retention times |
intensity |
A numeric vector with the corresponding intensities |
drift_time_idx |
The index or indices used to get the intensity |
drift_time_ms |
The drift times corresponding to |
description |
A string with a description (used as plot title, useful e.g. to know the sample it came from) |
baseline |
A numeric vector of the same length as |
peaks |
A data frame with peaks, use |
peaks_debug_info |
A list with arbitrary debug information from |
A GCIMSChromatogram object
Other GCIMSChromatogram:
GCIMSChromatogram-class
,
dtime,GCIMSChromatogram-method
,
estimateBaseline,GCIMSChromatogram-method
,
findPeaks,GCIMSChromatogram-method
,
intensity,GCIMSChromatogram-method
,
rtime,GCIMSChromatogram-method
,
smooth,GCIMSChromatogram-method
GCIMSChromatogram( retention_time = seq(from = 0, to = 10, length.out = 200), intensity = 1:200 )
GCIMSChromatogram( retention_time = seq(from = 0, to = 10, length.out = 200), intensity = 1:200 )
GCIMSChromatogram is an S4 class to store a GCIMS Chromatogram It can be a single chromatogram or the aggregation of several chromatograms.
## S3 method for class 'GCIMSChromatogram' as.data.frame(x, row.names = NULL, optional = FALSE, ...) ## S4 method for signature 'GCIMSChromatogram' description(object) ## S4 replacement method for signature 'GCIMSChromatogram,ANY' description(object) <- value ## S4 method for signature 'GCIMSChromatogram' peaks(object) ## S4 replacement method for signature 'GCIMSChromatogram' peaks(object) <- value ## S4 method for signature 'GCIMSChromatogram,ANY' plot(x, y, ...)
## S3 method for class 'GCIMSChromatogram' as.data.frame(x, row.names = NULL, optional = FALSE, ...) ## S4 method for signature 'GCIMSChromatogram' description(object) ## S4 replacement method for signature 'GCIMSChromatogram,ANY' description(object) <- value ## S4 method for signature 'GCIMSChromatogram' peaks(object) ## S4 replacement method for signature 'GCIMSChromatogram' peaks(object) <- value ## S4 method for signature 'GCIMSChromatogram,ANY' plot(x, y, ...)
x |
A GCIMSChromatogram object to plot |
row.names |
|
optional |
logical. If |
... |
additional arguments to be passed to or from methods. |
object |
A GCIMSChromatogram object |
value |
A data frame with the peak list |
y |
From the generic |
The description of the chromatogram
The chromatogram object
A data frame with the peaks in the chromatogram
The GCIMSChromatogram object
A ggplot2 plot object
as.data.frame(GCIMSChromatogram)
: Coerce to data frame
description(GCIMSChromatogram)
: Get the description
description(object = GCIMSChromatogram) <- value
: Set the description
peaks(GCIMSChromatogram)
: Get the peak list
peaks(GCIMSChromatogram) <- value
: Set the peak list
plot(x = GCIMSChromatogram, y = ANY)
: plot method
retention_time
A numeric vector with retention times
intensity
A numeric vector with the corresponding intensities
baseline
A numeric vector of the same length as intensity
with the corresponding baseline,
or NULL
if not set. Use estimateBaseline()
to estimate it, baseline()
to directly access it.
drift_time_idx
The index or indices used to get the intensity
drift_time_ms
The drift times corresponding to drift_time_idx
.
description
A string with a description (used as plot title, useful e.g. to know the sample it came from)
peaks
A data frame with peaks, use findPeaks()
to fill it, or peaks()
to set/get it
peaks_debug_info
A list with arbitrary debug information from findPeaks()
Other GCIMSChromatogram:
GCIMSChromatogram
,
dtime,GCIMSChromatogram-method
,
estimateBaseline,GCIMSChromatogram-method
,
findPeaks,GCIMSChromatogram-method
,
intensity,GCIMSChromatogram-method
,
rtime,GCIMSChromatogram-method
,
smooth,GCIMSChromatogram-method
GCIMSDataset is an R6 class to store a dataset.
When the dataset is created, the on_ram
option controls whether the actual
data is stored not in memory or it is read/saved from/to files as
needed, so the dataset object scales with large number of samples.
GCIMSDataset$new_from_list()
: Create a new GCIMSDataset from a list of samples
GCIMSDataset$new_from_saved_dir()
: Create a new on disk GCIMSDataset from a directory
new_from_list()
Create a new GCIMSDataset object from a list of samples. Note that with this
constructor on_ram
is TRUE
by default
GCIMSDataset$new_from_list( samples, pData=NULL, scratch_dir = NULL, keep_intermediate = FALSE, on_ram = TRUE )
new_from_saved_dir()
Creates a new GCIMSDataset object from a directory where a GCIMSDataset with
on_ram=FALSE
was saved.
GCIMSDataset$new_from_saved_dir( input_dir, scratch_dir = dirname(input_dir) )
input_dir
: The path to the directory where the dataset.rds
is saved and all the corresponding sample_*.rds
files are.
Typically a subdirectory of scratch_dir
.
scratch_dir
: The new scratch directory where further processing samples will be saved. By default it is the parent of input_dir
.
pData
A data frame with at least the SampleID and filename columns.
align
To store alignment results
peaks
To store the peak list
TIS
A matrix of n_samples vs drift time, with the Total Ion Spectrum of each sample
RIC
A matrix of n_samples vs retention time, with the Reverse Ion Chromatogram of each sample
dt_ref
A numeric drift time of reference
rt_ref
A numeric retention time of reference
userData
A list to store arbitrary data in the dataset
sampleNames
The sample names of the GCIMSDataset samples
new()
Create a new GCIMSDataset object
GCIMSDataset$new( pData = NULL, base_dir = NULL, ..., samples = NULL, parser = "default", scratch_dir = NULL, keep_intermediate = FALSE, on_ram = FALSE )
pData
A data frame holding phenotype data for the samples (or NULL
). The data frame
should at least have a SampleID
column, and a filename
column if samples are stored in files.
base_dir
The base directory. Sample i
is found on file.path(base_dir, pData$filename[i])
.
...
Unused
samples
A named list of GCIMSSample
objects to be included in the dataset (or NULL
). Names
should correspond to the SampleID
column in the pData
data frame.
parser
Function that takes a file path and returns a GCIMSSample object. Use "default"
to use the
default parser in the GCIMS package, that supports .mea
files (from GAS). Check
out vignette("importing-custom-data-formats", package = "GCIMS")
for more information
scratch_dir
A directory where intermediate and processed samples will be stored
keep_intermediate
If TRUE
, intermediate results will not be deleted (ignored if on_ram
is TRUE
).
on_ram
If TRUE
, samples are not stored on disk, but rather kept on RAM. Set it to TRUE
only with
small datasets.
dummy_dataset <- GCIMSDataset$new( pData = data.frame(SampleID = character(), filename = character(0)), base_dir = tempdir() )
print()
prints the dataset to the screen
GCIMSDataset$print()
subset()
Create a new dataset containing a subset of the samples
GCIMSDataset$subset(samples, inplace = FALSE, new_scratch_dir = NA)
samples
A numeric vector (sample indices), a character vector (sample names)
or a logical vector of the length equal to the number of samples in the
dataset (TRUE
elements will be subset)
inplace
if TRUE
subset happens in-place, otherwise subset will return a copy.
new_scratch_dir
A new scratch directory, only used if inplace=FALSE
and the dataset is on-disk.
A GCIMSDataset (new or the current one depending on inplace
), with the requested sample subset
.impl__subset__()
Do not call this method. It does an inplace subset. Use
obj$subset(samples, inplace = TRUE)
instead
GCIMSDataset$.impl__subset__(samples)
samples
A numeric vector (sample indices), a character vector (sample names)
or a logical vector of the length equal to the number of samples in the
dataset (TRUE
elements will be subset)
The given GCIMSDataset object, with a subset of the samples
appendDelayedOp()
Appends a delayed operation to the dataset so it will run afterwards
GCIMSDataset$appendDelayedOp(operation)
operation
A DelayedOperation object
The modified GCIMSDataset object
hasDelayedOps()
Find out if the dataset has pending operations
GCIMSDataset$hasDelayedOps()
Returns TRUE
if the dataset has pending operations, FALSE
otherwise
realize()
Execute all pending operations on the dataset
GCIMSDataset$realize(keep_intermediate = NA)
keep_intermediate
logical or NA
. Only when the analysis is on disk, keep intermediate result files.
If NA
, the keep_intermediate
option given at the dataset initialization takes precedence.
The dataset object, invisibly
getSample()
Get a sample from a GCIMSDataset
GCIMSDataset$getSample(sample)
sample
Either an integer (sample index) or a string (sample name)
The GCIMSSample object
extract_dtime_rtime()
Sets an action to extract the reference retention and drift times
GCIMSDataset$extract_dtime_rtime()
getRIC()
Get the Reverse Ion Chromatogram
GCIMSDataset$getRIC()
A matrix with the reverse ion chromatograms for all samples
extract_RIC_and_TIS()
Extracts the RIC and the TIS
GCIMSDataset$extract_RIC_and_TIS()
The GCIMSDataset
is_on_disk()
Whether the dataset is saved on disk or stored in RAM
GCIMSDataset$is_on_disk()
TRUE
if on disk, FALSE
otherwise
copy()
Creates a copy of the dataset. If the dataset is stored on disk, then
a new scratch_dir
must be used.
GCIMSDataset$copy(scratch_dir = NA)
scratch_dir
The scratch directory where samples being processed will be stored, if the copy is on disk.
A new GCIMSDataset object
updateScratchDir()
For on-disk datasets, copy all samples to a new scratch dir.
This is useful when creating copies of the dataset, using the dataset$copy()
method.
GCIMSDataset$updateScratchDir(scratch_dir, override_current_dir = NULL)
scratch_dir
The new scratch_dir
, must be different from the current one
override_current_dir
Typically used only internally, overrides the location of the samples. Useful when we are loading a dataset from a directory and the directory was moved since it was saved.
getCurrentDir()
Get the directory where processed samples are being saved, on on-disk datasets.
GCIMSDataset$getCurrentDir()
Either a path or NULL
. NULL
is returned if samples have not been saved (either
because have not been loaded or because the dataset is stored on RAM)
clone()
The objects of this class are cloneable with this method.
GCIMSDataset$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------ ## Method `GCIMSDataset$new` ## ------------------------------------------------ dummy_dataset <- GCIMSDataset$new( pData = data.frame(SampleID = character(), filename = character(0)), base_dir = tempdir() )
## ------------------------------------------------ ## Method `GCIMSDataset$new` ## ------------------------------------------------ dummy_dataset <- GCIMSDataset$new( pData = data.frame(SampleID = character(), filename = character(0)), base_dir = tempdir() )
GCIMSDataset_fromList
GCIMSDataset_fromList( samples, pData = NULL, scratch_dir = NULL, keep_intermediate = FALSE, on_ram = TRUE )
GCIMSDataset_fromList( samples, pData = NULL, scratch_dir = NULL, keep_intermediate = FALSE, on_ram = TRUE )
samples |
A named list of GCIMSSample objects. names should match |
pData |
A data frame with at least the SampleID and filename columns. |
scratch_dir |
A directory to save intermediate results. |
keep_intermediate |
Whether to keep sample files for intermediate results. Only used if |
on_ram |
logical. Whether the dataset should be kept stored on RAM or on disk. |
A GCIMSDataset object
# Create a new GCIMSDataset with the convenient constructor function: sample1 <- GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3) ) dummy_obj <- GCIMSDataset_fromList( pData = data.frame(SampleID = "Sample1", Sex = "female"), samples = list(Sample1 = sample1) )
# Create a new GCIMSDataset with the convenient constructor function: sample1 <- GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3) ) dummy_obj <- GCIMSDataset_fromList( pData = data.frame(SampleID = "Sample1", Sex = "female"), samples = list(Sample1 = sample1) )
Create a GCIMSSample object
GCIMSSample(drift_time, retention_time, data, ...)
GCIMSSample(drift_time, retention_time, data, ...)
drift_time , retention_time , data , ...
|
See the Slots section in GCIMSSample page |
A GCIMSSample object
# Create a new GCIMSSample with the convenient constructor function: dummy_obj <-GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3), gc_column = "Optional column name", drift_gas = "nitrogen", drift_tube_length = 98.0 # in mm )
# Create a new GCIMSSample with the convenient constructor function: dummy_obj <-GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3), gc_column = "Optional column name", drift_gas = "nitrogen", drift_tube_length = 98.0 # in mm )
GCIMS Sample is an S4 class to store one sample with the drift and retention time ranges and other relevant attributes (GC column, drift tube length...) if available
The actual spectra is stored in the data
slot, in a matrix,
where the first index (rows) corresponds to drift times and the
second to retention times (columns).
drift_time
numeric. (required)
retention_time
numeric. (required)
data
matrix A matrix with drift time in the rows and retention time in columns. (required)
gc_column
character. (optional) The type of chromatographic column used
drift_tube_length
numeric (optional) The length of the drift tube, in mm
drift_gas
character. (optional) The drift gas used (e.g "nitrogen")
params
list (optional) Arbitrary list of parameters and annotations
history
character. A character vector with a summary of information of the processing details the sample has gone through already
filepath
character. A string with the path to the raw data
description
A string (optional). A sample name or ID or description used in plots
proc_params
list (internal). Data processing parameters computed and used internally.
peaks
A data frame (internal). The peak list, typically set using findPeaks()
.
Use peaks()
to get/set this.
peaks_debug_info
A list with arbitrary debug information from findPeaks()
.
baseline
A matrix of the same dimensions as data
with the baseline. Use estimateBaseline()
to estimate it
and baseline()
to get or set it.
class_version
"numeric_version" (internal) The GCIMSSample object defines internally a class version, so if a GCIMSSample object is saved, the GCIMS package is updated and the GCIMSSample class has changed during the upgrade it may be possible to upgrade the previously saved object when it's loaded.
# Create a new GCIMSSample with methods::new() dummy_obj <-methods::new( "GCIMSSample", drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3), gc_column = "Optional column name", drift_gas = "nitrogen", drift_tube_length = 98.0 # in mm )
# Create a new GCIMSSample with methods::new() dummy_obj <-methods::new( "GCIMSSample", drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3), gc_column = "Optional column name", drift_gas = "nitrogen", drift_tube_length = 98.0 # in mm )
Methods for the GCIMSSample class
## S3 method for class 'GCIMSSample' x[i, j, ...] ## S3 method for class 'GCIMSSample' dim(x) ## S3 method for class 'GCIMSSample' subset(x, dt_idx = NULL, rt_idx = NULL, dt_range = NULL, rt_range = NULL, ...) ## S4 method for signature 'GCIMSSample' description(object) ## S4 replacement method for signature 'GCIMSSample,ANY' description(object) <- value ## S4 method for signature 'GCIMSSample' peaks(object) ## S4 replacement method for signature 'GCIMSSample' peaks(object) <- value
## S3 method for class 'GCIMSSample' x[i, j, ...] ## S3 method for class 'GCIMSSample' dim(x) ## S3 method for class 'GCIMSSample' subset(x, dt_idx = NULL, rt_idx = NULL, dt_range = NULL, rt_range = NULL, ...) ## S4 method for signature 'GCIMSSample' description(object) ## S4 replacement method for signature 'GCIMSSample,ANY' description(object) <- value ## S4 method for signature 'GCIMSSample' peaks(object) ## S4 replacement method for signature 'GCIMSSample' peaks(object) <- value
x |
A GCIMSSample object |
i |
index for drift time to subset |
j |
index for retention time to subset |
... |
ignored |
dt_idx |
A numeric vector with the drift time indices to extract (or a logical vector of the length of drift time) |
rt_idx |
A numeric vector with the retention time indices to extract (or a logical vector of the length of retention time) |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
object |
A GCIMSSample object |
value |
A data frame with the peak list |
[
: object x
with features i
and cells j
An integer vector with the number of rows and columns of the matrix
subset
: A subsetted GCIMSSample
object
[
: Simple subsetter for GCIMSSample objects
dim(GCIMSSample)
: Dimension of the data matrix
subset(GCIMSSample)
: Subset a GCIMSSample object
description(GCIMSSample)
: Get the description
description(object = GCIMSSample) <- value
: Set the description
peaks(GCIMSSample)
: Get the peak list
peaks(GCIMSSample) <- value
: Set the peak list
# `[' examples obj <- GCIMSSample(drift_time=1:2, retention_time=1:3, data = matrix(1:6, nrow=2, ncol=3)) dim(obj)
# `[' examples obj <- GCIMSSample(drift_time=1:2, retention_time=1:3, data = matrix(1:6, nrow=2, ncol=3)) dim(obj)
Functions to extract the drift time, the retention time and the intensity.
## S4 method for signature 'GCIMSSample' dtime(object) ## S4 method for signature 'GCIMSSample' rtime(object) ## S4 method for signature 'GCIMSSample' intensity( object, dt_range = NULL, rt_range = NULL, dt_idx = NULL, rt_idx = NULL ) ## S4 replacement method for signature 'GCIMSSample' intensity(object) <- value
## S4 method for signature 'GCIMSSample' dtime(object) ## S4 method for signature 'GCIMSSample' rtime(object) ## S4 method for signature 'GCIMSSample' intensity( object, dt_range = NULL, rt_range = NULL, dt_idx = NULL, rt_idx = NULL ) ## S4 replacement method for signature 'GCIMSSample' intensity(object) <- value
object |
A GCIMSSample object. |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
dt_idx |
A numeric vector with the drift time indices to extract (or a logical vector of the length of drift time) |
rt_idx |
A numeric vector with the retention time indices to extract (or a logical vector of the length of retention time) |
value |
A matrix of dimensions |
The drift time of the sample
The retention time of the sample
dtime(GCIMSSample)
: Get the drift time vector
rtime(GCIMSSample)
: Get the retention time vector
intensity(GCIMSSample)
: Get the intensity matrix
intensity(GCIMSSample) <- value
: Set the intensity matrix
mea_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") gcims_sample <- read_mea(mea_file) my_matrix <- intensity(gcims_sample, dt_range = c(7, 8), rt_range = c(1,30)) mea_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") gcims_sample <- read_mea(mea_file) my_matrix <- intensity(gcims_sample) intensity(gcims_sample) <- my_matrix/100
mea_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") gcims_sample <- read_mea(mea_file) my_matrix <- intensity(gcims_sample, dt_range = c(7, 8), rt_range = c(1,30)) mea_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") gcims_sample <- read_mea(mea_file) my_matrix <- intensity(gcims_sample) intensity(gcims_sample) <- my_matrix/100
GCIMSSpectrum is an S4 class to store a GCIMS Spectrum. It can be a single spectrum or the aggregation of several spectra.
GCIMSSpectrum(...) ## S4 method for signature 'GCIMSSpectrum' description(object) ## S4 replacement method for signature 'GCIMSSpectrum,ANY' description(object) <- value ## S4 method for signature 'GCIMSSpectrum' dtime(object) ## S4 method for signature 'GCIMSSpectrum' rtime(object) ## S4 method for signature 'GCIMSSpectrum' intensity(object, dt_range = NULL, dt_idx = NULL) ## S4 method for signature 'GCIMSSpectrum' peaks(object) ## S4 replacement method for signature 'GCIMSSpectrum' peaks(object) <- value ## S4 method for signature 'GCIMSSpectrum,ANY' plot(x, y, ...)
GCIMSSpectrum(...) ## S4 method for signature 'GCIMSSpectrum' description(object) ## S4 replacement method for signature 'GCIMSSpectrum,ANY' description(object) <- value ## S4 method for signature 'GCIMSSpectrum' dtime(object) ## S4 method for signature 'GCIMSSpectrum' rtime(object) ## S4 method for signature 'GCIMSSpectrum' intensity(object, dt_range = NULL, dt_idx = NULL) ## S4 method for signature 'GCIMSSpectrum' peaks(object) ## S4 replacement method for signature 'GCIMSSpectrum' peaks(object) <- value ## S4 method for signature 'GCIMSSpectrum,ANY' plot(x, y, ...)
... |
See the slots section |
object |
A GCIMSSpectrum object |
value |
A data frame with the peak list |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
dt_idx |
A numeric vector with the drift time indices to extract (or a logical vector of the length of drift time) |
x |
A GCIMSSpectrum object to plot |
y |
unused |
A GCIMSSpectrum object
A data frame with the peaks in the spectrum
The GCIMSSpectrum object
GCIMSSpectrum()
: Friendly constructor
description(GCIMSSpectrum)
: Get the description
description(object = GCIMSSpectrum) <- value
: Set the description
dtime(GCIMSSpectrum)
: Get the drift time vector
rtime(GCIMSSpectrum)
: Get the retention time where this spectrum was extracted
intensity(GCIMSSpectrum)
: Get the intensity matrix
peaks(GCIMSSpectrum)
: Get the peak list
peaks(GCIMSSpectrum) <- value
: Set the peak list
plot(x = GCIMSSpectrum, y = ANY)
: plot method
drift_time
A numeric vector with drift times
intensity
A numeric vector with the corresponding intensities
baseline
A numeric vector of the same length as intensity
with the corresponding baseline,
or NULL
if not set. Use estimateBaseline()
to estimate it, baseline()
to directly access it.
retention_time_idx
The index or indices used to get the intensity
retention_time_s
The retention times corresponding to the retention time indices.
description
A string with a description (used as plot title, useful e.g. to know the sample it came from)
peaks
A data frame with peaks, use findPeaks()
to fill it, or peaks()
to set/get it
peaks_debug_info
A list with arbitrary debug information from findPeaks()
spec <- GCIMSSpectrum(drift_time = 1:10, intensity = c(1:5, 6:2))
spec <- GCIMSSpectrum(drift_time = 1:10, intensity = c(1:5, 6:2))
Get the extracted ion chromatogram
getChromatogram( object, dt_range = NULL, rt_range = NULL, dt_idx = NULL, rt_idx = NULL, aggregate = colSums )
getChromatogram( object, dt_range = NULL, rt_range = NULL, dt_idx = NULL, rt_idx = NULL, aggregate = colSums )
object |
A GCIMSSample object |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
dt_idx |
A numeric vector with the drift time indices to extract (or a logical vector of the length of drift time) |
rt_idx |
A numeric vector with the retention time indices to extract (or a logical vector of the length of retention time) |
aggregate |
Function that takes the subsetted intensity matrix according to the region of interest
and aggregates the drift times, returning a vector representing the chromatogram intensity. |
A GCIMSChromatogram object
x <- GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3) ) getChromatogram(x) # Take the maximum intensity in the region for each retention time: sp1 <- getChromatogram(x, aggregate = function(x) apply(x, 2, max))
x <- GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3) ) getChromatogram(x) # Take the maximum intensity in the region for each retention time: sp1 <- getChromatogram(x, aggregate = function(x) apply(x, 2, max))
Get Reverse Ion Chromatogram
## S4 method for signature 'GCIMSDataset' getRIC(object)
## S4 method for signature 'GCIMSDataset' getRIC(object)
object |
A GCIMSDataset object |
The RIC matrix
Get the reverse ion chromatogram
## S4 method for signature 'GCIMSSample' getRIC(object)
## S4 method for signature 'GCIMSSample' getRIC(object)
object |
A GCIMSSample object |
A numeric vector with the reverse ion chromatogram
sample_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") s <- read_mea(sample_file) ric <- getRIC(s)
sample_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") s <- read_mea(sample_file) ric <- getRIC(s)
Get IMS spectrum from a sample
getSpectrum( object, dt_range = NULL, rt_range = NULL, dt_idx = NULL, rt_idx = NULL, aggregate = rowSums )
getSpectrum( object, dt_range = NULL, rt_range = NULL, dt_idx = NULL, rt_idx = NULL, aggregate = rowSums )
object |
A GCIMSSample object |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
dt_idx |
A numeric vector with the drift time indices to extract (or a logical vector of the length of drift time) |
rt_idx |
A numeric vector with the retention time indices to extract (or a logical vector of the length of retention time) |
aggregate |
Function that takes the subsetted intensity matrix according to the region of interest
and aggregates the retention times, returning a vector representing the spectrum intensity. |
A GCIMSSpectrum object
x <- GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3) ) getSpectrum(x, rt_idx = 2) # Take the maximum intensity in the region for each drift time: sp1 <- getSpectrum(x, aggregate = function(x) apply(x, 1, max))
x <- GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3) ) getSpectrum(x, rt_idx = 2) # Take the maximum intensity in the region for each drift time: sp1 <- getSpectrum(x, aggregate = function(x) apply(x, 1, max))
Get Total Ion Spectra matrix
## S4 method for signature 'GCIMSDataset' getTIS(object)
## S4 method for signature 'GCIMSDataset' getTIS(object)
object |
A GCIMSDataset object |
A matrix with samples in rows and the drift time in columns
Get the total ion spectrum
## S4 method for signature 'GCIMSSample' getTIS(object)
## S4 method for signature 'GCIMSSample' getTIS(object)
object |
A GCIMSSample object |
A numeric vector with the total ion spectrum
sample_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") s <- read_mea(sample_file) tis <- getTIS(s)
sample_file <- system.file("extdata", "sample_formats", "small.mea.gz", package = "GCIMS") s <- read_mea(sample_file) tis <- getTIS(s)
Impute a Peak table
imputePeakTable(peak_table, dataset, cluster_stats)
imputePeakTable(peak_table, dataset, cluster_stats)
peak_table |
A matrix, with samples in rows and clusters in columns. It must have row names and column names. |
dataset |
The dataset object to extract samples from |
cluster_stats |
A data frame with the |
The imputed peak_table
# We are going to create a peak table matrix, typically resulting from [peakTable()] # The peak table may have some missing values # Since the missing values correspond to peaks that have not been detected # in those particular samples, we can integrate the region where they should appear # to get a value different than zero that reflects the noise level. # # Ingredients: # - The GCIMSSample objects, so we can integrate the regions of interest (given as a GCIMSDataset) # - The peak table matrix we want to impute # - The definition of the regions corresponding to each cluster (cluster_stats) # # We will prepare here a synthetic example, check the vignette for a real use case # # Imagine we have information on the location of Cluster1 and Cluster2 cluster_stats <- data.frame( cluster = c("Cluster1", "Cluster2"), dt_min_ms = c(8, 10), dt_max_ms = c(9, 12), rt_min_s = c(120, 300), rt_max_s = c(128, 320) ) # We have a peak table for two samples and two peaks peak_table <- matrix(NA_real_, nrow = 2, ncol = 2) rownames(peak_table) <- c("Sample1", "Sample2") colnames(peak_table) <- c("Cluster1", "Cluster2") # where we previously integrated Cluster2 in Sample 1 and Cluster1 in Sample2: peak_table["Sample1", "Cluster2"] <- 9.5 peak_table["Sample2", "Cluster1"] <- 3.6 # The table has missing values, because some peaks were not detected. # Maybe they are close to the noise level, or maybe they do not exist peak_table # We will fill the missing values by integrating whatever we find # (typically noise or small peaks) in the cluster regions of each sample. So we # need the sample matrices. # # Let's build dummy Sample1 and Sample2: ## Create drift time and retention time vectors: dt <- seq(from = 0, to = 13, by = 0.1) # ms rt <- seq(from = 0, to = 350, by = 1) # s ## Create matrices with random gaussian noise. set.seed(42) s1_intensity <- matrix( rnorm(length(dt)*length(rt), sd = 0.1), nrow = length(dt), ncol = length(rt) ) s2_intensity <- matrix( rnorm(length(dt)*length(rt), sd = 0.1), nrow = length(dt), ncol = length(rt) ) # The matrix will have a pleateau in a region where the peak is supposed # to be, so when we impute the region corresponding to Sample1-Cluster1 we see a # higher value: s1_intensity[dt > 8.25 & dt < 8.75, rt > 122 & rt < 126] <- 1 ## Create GCIMSSample objects s1 <- GCIMSSample( drift_time = dt, retention_time = rt, data = s1_intensity ) s2 <- GCIMSSample( drift_time = dt, retention_time = rt, data = s2_intensity ) ## And a dataset with the samples: dataset <- GCIMSDataset_fromList(list(Sample1 = s1, Sample2 = s2)) # Now we can impute the table peak_table_imp <- imputePeakTable( peak_table = peak_table, dataset = dataset, cluster_stats = cluster_stats ) peak_table_imp
# We are going to create a peak table matrix, typically resulting from [peakTable()] # The peak table may have some missing values # Since the missing values correspond to peaks that have not been detected # in those particular samples, we can integrate the region where they should appear # to get a value different than zero that reflects the noise level. # # Ingredients: # - The GCIMSSample objects, so we can integrate the regions of interest (given as a GCIMSDataset) # - The peak table matrix we want to impute # - The definition of the regions corresponding to each cluster (cluster_stats) # # We will prepare here a synthetic example, check the vignette for a real use case # # Imagine we have information on the location of Cluster1 and Cluster2 cluster_stats <- data.frame( cluster = c("Cluster1", "Cluster2"), dt_min_ms = c(8, 10), dt_max_ms = c(9, 12), rt_min_s = c(120, 300), rt_max_s = c(128, 320) ) # We have a peak table for two samples and two peaks peak_table <- matrix(NA_real_, nrow = 2, ncol = 2) rownames(peak_table) <- c("Sample1", "Sample2") colnames(peak_table) <- c("Cluster1", "Cluster2") # where we previously integrated Cluster2 in Sample 1 and Cluster1 in Sample2: peak_table["Sample1", "Cluster2"] <- 9.5 peak_table["Sample2", "Cluster1"] <- 3.6 # The table has missing values, because some peaks were not detected. # Maybe they are close to the noise level, or maybe they do not exist peak_table # We will fill the missing values by integrating whatever we find # (typically noise or small peaks) in the cluster regions of each sample. So we # need the sample matrices. # # Let's build dummy Sample1 and Sample2: ## Create drift time and retention time vectors: dt <- seq(from = 0, to = 13, by = 0.1) # ms rt <- seq(from = 0, to = 350, by = 1) # s ## Create matrices with random gaussian noise. set.seed(42) s1_intensity <- matrix( rnorm(length(dt)*length(rt), sd = 0.1), nrow = length(dt), ncol = length(rt) ) s2_intensity <- matrix( rnorm(length(dt)*length(rt), sd = 0.1), nrow = length(dt), ncol = length(rt) ) # The matrix will have a pleateau in a region where the peak is supposed # to be, so when we impute the region corresponding to Sample1-Cluster1 we see a # higher value: s1_intensity[dt > 8.25 & dt < 8.75, rt > 122 & rt < 126] <- 1 ## Create GCIMSSample objects s1 <- GCIMSSample( drift_time = dt, retention_time = rt, data = s1_intensity ) s2 <- GCIMSSample( drift_time = dt, retention_time = rt, data = s2_intensity ) ## And a dataset with the samples: dataset <- GCIMSDataset_fromList(list(Sample1 = s1, Sample2 = s2)) # Now we can impute the table peak_table_imp <- imputePeakTable( peak_table = peak_table, dataset = dataset, cluster_stats = cluster_stats ) peak_table_imp
Integrate peaks in a GCIMSDataset
## S4 method for signature 'GCIMSDataset' integratePeaks( object, peak_list, integration_size_method = c("fixed_size", "free_size"), rip_saturation_threshold = 0.1 )
## S4 method for signature 'GCIMSDataset' integratePeaks( object, peak_list, integration_size_method = c("fixed_size", "free_size"), rip_saturation_threshold = 0.1 )
object |
The GCIMSDataset object, modified inline |
peak_list |
A data frame with peak lists |
integration_size_method |
Either fixed_size or free_size |
rip_saturation_threshold |
The threshold |
A modified GCIMSDataset object
Peak integration for a GCIMSSample
## S4 method for signature 'GCIMSSample' integratePeaks( object, peak_list, integration_size_method = c("fixed_size", "free_size"), rip_saturation_threshold = 0.1, verbose = FALSE )
## S4 method for signature 'GCIMSSample' integratePeaks( object, peak_list, integration_size_method = c("fixed_size", "free_size"), rip_saturation_threshold = 0.1, verbose = FALSE )
object |
A GCIMSSample object |
peak_list |
A data frame with the peak list |
integration_size_method |
If "fixed_size", the ROI integration limits are the same for all the peaks that belong to the same cluster. If "free_size", each ROI has its own integration limits, regardless of the cluster it is assigned to. |
rip_saturation_threshold |
Used to compute the "Saturation" column. If the ratio of the RIP intensity at the ROI apex with respect to the maximum RIP is below this threshold, the RIP is considered almost depleted, and it's more likely that the ROI suffers from non-linearities. |
verbose |
If |
The modified GCIMSSample, with an updated peak list
Get the intensity vector
## S4 method for signature 'GCIMSChromatogram' intensity(object, rt_range = NULL, rt_idx = NULL)
## S4 method for signature 'GCIMSChromatogram' intensity(object, rt_range = NULL, rt_idx = NULL)
object |
A GCIMSChromatogram object |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
rt_idx |
A numeric vector with the retention time indices to extract (or a logical vector of the length of retention time) |
The retention intensity vector
Other GCIMSChromatogram:
GCIMSChromatogram
,
GCIMSChromatogram-class
,
dtime,GCIMSChromatogram-method
,
estimateBaseline,GCIMSChromatogram-method
,
findPeaks,GCIMSChromatogram-method
,
rtime,GCIMSChromatogram-method
,
smooth,GCIMSChromatogram-method
Extract the volume of each ROI across samples to create a peak table.
omit_times(peak_list, rt_time_2_omit = NULL, dt_time_2_omit = NULL)
omit_times(peak_list, rt_time_2_omit = NULL, dt_time_2_omit = NULL)
peak_list |
The output of |
rt_time_2_omit |
A vector including a set of retention times where ROIs detected
should not be considered. As default is is set as |
dt_time_2_omit |
A vector including a set of drift times where ROIs detected
should not be considered. As default is is set as |
A peak_list
without the ROIs present in the retention and drift times
not desired.
peak_list <- data.frame( rt_apex_s = c(1, 2, 3, 3, 4, 4, 5, 5, 6, 6), dt_apex_ms = c(2, 4, 6, 4, 8, 4, 10, 4, 4, 12) ) peak_list_filt <- omit_times(peak_list, dt_time_2_omit = 4)
peak_list <- data.frame( rt_apex_s = c(1, 2, 3, 3, 4, 4, 5, 5, 6, 6), dt_apex_ms = c(2, 4, 6, 4, 8, 4, 10, 4, 4, 12) ) peak_list_filt <- omit_times(peak_list, dt_time_2_omit = 4)
Overlay a peak list to a plot
overlay_peaklist( peaklist = NULL, ..., dt_range = NULL, rt_range = NULL, pdata = NULL, color_by = NULL, mapping_roi = c(dt_min_ms = "dt_min_ms", dt_max_ms = "dt_max_ms", rt_min_s = "rt_min_s", rt_max_s = "rt_max_s"), palette = P40 )
overlay_peaklist( peaklist = NULL, ..., dt_range = NULL, rt_range = NULL, pdata = NULL, color_by = NULL, mapping_roi = c(dt_min_ms = "dt_min_ms", dt_max_ms = "dt_max_ms", rt_min_s = "rt_min_s", rt_max_s = "rt_max_s"), palette = P40 )
peaklist |
A data frame with at least the columns: |
... |
Ignored. |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
pdata |
A phenotype data data frame, with a |
color_by |
A character with a column name of |
mapping_roi |
A 4-elements named character vector with the names of the columns from |
palette |
A character vector with color names to use drawing the rectangles. Use |
If peaklist
includes dt_apex_ms
and rt_apex_s
a cross will be plotted on the peak apex.
A list with the ggplot layers to overlay, including geom_rect
and possibly geom_point
and scale_fill_manual
.
dt <- 1:10 rt <- 1:10 int <- matrix(0.0, nrow = length(dt), ncol = length(rt)) int[2, 4:8] <- c(.5, .5, 1, .5, 0.5) int[3, 4:8] <- c(0.5, 2, 2, 2, 0.5) int[4, 4:8] <- c(1, 2, 5, 2, 1) int[5, 4:8] <- c(0.5, 2, 2, 2, 0.5) int[6, 4:8] <- c(.5, .5, 1, .5, 0.5) dummy_obj <-GCIMSSample( drift_time = dt, retention_time = rt, data = int ) plt <- plot(dummy_obj) # Add a rectangle on top of the plot rect <- data.frame( dt_min_ms = 2.75, dt_max_ms = 5.6, rt_min_s = 4.6, rt_max_s = 7.4 ) plt + overlay_peaklist( peaklist = rect )
dt <- 1:10 rt <- 1:10 int <- matrix(0.0, nrow = length(dt), ncol = length(rt)) int[2, 4:8] <- c(.5, .5, 1, .5, 0.5) int[3, 4:8] <- c(0.5, 2, 2, 2, 0.5) int[4, 4:8] <- c(1, 2, 5, 2, 1) int[5, 4:8] <- c(0.5, 2, 2, 2, 0.5) int[6, 4:8] <- c(.5, .5, 1, .5, 0.5) dummy_obj <-GCIMSSample( drift_time = dt, retention_time = rt, data = int ) plt <- plot(dummy_obj) # Add a rectangle on top of the plot rect <- data.frame( dt_min_ms = 2.75, dt_max_ms = 5.6, rt_min_s = 4.6, rt_max_s = 7.4 ) plt + overlay_peaklist( peaklist = rect )
Get/Set the phenotype data
## S4 method for signature 'GCIMSDataset' pData(object) ## S4 replacement method for signature 'GCIMSDataset,ANY' pData(object) <- value
## S4 method for signature 'GCIMSDataset' pData(object) ## S4 replacement method for signature 'GCIMSDataset,ANY' pData(object) <- value
object |
A GCIMSDataset object |
value |
The data frame with annotations, it should have a FileName column and a SampleID column. |
A data frame with the phenotype data
pData(object = GCIMSDataset) <- value
: Set pData
Get the peak list
## S4 method for signature 'GCIMSDataset' peaks(object) ## S4 replacement method for signature 'GCIMSDataset' peaks(object) <- value
## S4 method for signature 'GCIMSDataset' peaks(object) ## S4 replacement method for signature 'GCIMSDataset' peaks(object) <- value
object |
A GCIMSDataset object |
value |
The data frame with a peak list |
A data frame with the detected peaks
peaks(GCIMSDataset) <- value
: Set the peak list
Extract the volume of each ROI across samples to create a peak table.
peakTable(peak_list_clustered, aggregate_conflicting_peaks = NULL)
peakTable(peak_list_clustered, aggregate_conflicting_peaks = NULL)
peak_list_clustered |
A peak list with clusters assigned. Also, you can create your own peak table
and use it as input value for |
aggregate_conflicting_peaks |
|
A list with three fields: peak_table
, peak_table_matrix
, and peak_table_duplicity
.
peak_table
, and peak_table_matrix
, provide information of the peak table. peak_table
is a dataframe
containing cluster volumes, whose columns represent samples and rows clusters. peak_table_matrix
presents
the same information content as peak_table
but in matrix form. Note that in peak_table
columns represent
clusters and rows samples. Finally, peak_table_duplicity
is a dataframe that shows ROI duplicity information
among clusters. Ideally, only one peak per sample should belong to a cluster.
# Create your peak table from scratch: pl <- data.frame( SampleID = c("S1", "S1", "S2", "S2"), cluster = c("Cluster1", "Cluster2", "Cluster1", "Cluster2"), Volume = c(10, 20, 8, 18) ) peak_table <- peakTable(pl) peak_table$peak_table_matrix # You can use imputePeakTable() to fill in the missing values # If the clustering doesn't work great, you may end up with two peaks # from the same sample on the same cluster. This does not make sense # empirically, because it's either one or the other. In case of such # ambiguity, peakTable() will give an error. # # If you want, you can override the error by taking the average volume # of those ambiguous peaks, or the maximum, using, # e.g. `aggregate_conflicting_peaks = max`. # # In any case, you will get information on how many peaks were aggregated # in the `peak_table_duplicity` field (ideally should be full of `1`): peak_table$peak_table_duplicity
# Create your peak table from scratch: pl <- data.frame( SampleID = c("S1", "S1", "S2", "S2"), cluster = c("Cluster1", "Cluster2", "Cluster1", "Cluster2"), Volume = c(10, 20, 8, 18) ) peak_table <- peakTable(pl) peak_table$peak_table_matrix # You can use imputePeakTable() to fill in the missing values # If the clustering doesn't work great, you may end up with two peaks # from the same sample on the same cluster. This does not make sense # empirically, because it's either one or the other. In case of such # ambiguity, peakTable() will give an error. # # If you want, you can override the error by taking the average volume # of those ambiguous peaks, or the maximum, using, # e.g. `aggregate_conflicting_peaks = max`. # # In any case, you will get information on how many peaks were aggregated # in the `peak_table_duplicity` field (ideally should be full of `1`): peak_table$peak_table_duplicity
Wraps the plt
with plotly::ggplotly()
and sets the xaxis
and yaxis
ticks to "auto"
, so the axis labels are updated when zooming.
plot_interactive(plt)
plot_interactive(plt)
plt |
A ggplot plot |
A plotly plot
d <- data.frame(x = c(1,2), y=c(1,2)) plt <- ggplot2::ggplot(d) + ggplot2::geom_point(ggplot2::aes(x = x, y = y)) plot_interactive(plt)
d <- data.frame(x = c(1,2), y=c(1,2)) plt <- ggplot2::ggplot(d) + ggplot2::geom_point(ggplot2::aes(x = x, y = y)) plot_interactive(plt)
Topographical plot of a GCIMSSample object
## S4 method for signature 'GCIMSSample,ANY' plot( x, dt_range = NULL, rt_range = NULL, ..., remove_baseline = FALSE, trans = "cubic_root" )
## S4 method for signature 'GCIMSSample,ANY' plot( x, dt_range = NULL, rt_range = NULL, ..., remove_baseline = FALSE, trans = "cubic_root" )
x |
A GCIMSSample object |
dt_range |
A numeric vector of length 2 with the drift time range to plot (in milliseconds) |
rt_range |
A numeric vector of length 2 with the retention time range to plot (in seconds) |
... |
Ignored |
remove_baseline |
Set to |
trans |
The transformation to the intensity values. "cubic_root" is the default. "intensity" is also valid.
See the |
A plot of the GCIMSSample
dummy_obj <-GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3), gc_column = "Optional column name", drift_gas = "nitrogen", drift_tube_length = 98.0 # in mm ) plot(dummy_obj)
dummy_obj <-GCIMSSample( drift_time = 1:2, retention_time = 1:3, data = matrix(1:6, nrow = 2, ncol = 3), gc_column = "Optional column name", drift_gas = "nitrogen", drift_tube_length = 98.0 # in mm ) plot(dummy_obj)
Plot Reverse Ion Chromatograms
## S4 method for signature 'GCIMSDataset' plotRIC(object, rt_range = NULL, sample = NULL)
## S4 method for signature 'GCIMSDataset' plotRIC(object, rt_range = NULL, sample = NULL)
object |
A GCIMSDataset object |
rt_range |
The minimum and maximum retention times to extract (length 2 vector) |
sample |
A number or a string with the sample index or name. If |
A plot
Plot Total Ion Spectra
## S4 method for signature 'GCIMSDataset' plotTIS(object, dt_range = NULL, sample = NULL)
## S4 method for signature 'GCIMSDataset' plotTIS(object, dt_range = NULL, sample = NULL)
object |
A GCIMSDataset object |
dt_range |
The minimum and maximum drift times to extract (length 2 vector) |
sample |
A number or a string with the sample index or name. If |
The plot of the TIS
Align a GCIMSSample object, in drift time and to the injection point in retention time
## S4 method for signature 'GCIMSSample' prealign(object, align_dt, align_ip, rip_ref_ms, min_start, rt_ref)
## S4 method for signature 'GCIMSSample' prealign(object, align_dt, align_ip, rip_ref_ms, min_start, rt_ref)
object |
A GCIMSSample object |
align_dt |
if |
align_ip |
if |
rip_ref_ms |
The reference position of the Reactant Ion Peak in the dataset (in ms) |
min_start |
minimun injection point, to calculate where to begin the spectrums and cut as few points as posible, to be used in injection point alignment |
rt_ref |
retention time reference for alignment to injection point |
The modified GCIMSSample
This function reads a .mea file (supporting gzip compressed .mea.gz files as well) and returns a GCIMS object
read_mea(filename)
read_mea(filename)
filename |
A .mea or a .mea.gz path to a file |
Thanks to Daniel Sanders and Thomas Wortelmann from GAS Dortmund for providing the necessary support to implement this function.
The GC-IMS sample in a GCIMSSample object
mea_file <- system.file("extdata/sample_formats/small.mea.gz", package = "GCIMS") sample <- read_mea(mea_file)
mea_file <- system.file("extdata/sample_formats/small.mea.gz", package = "GCIMS") sample <- read_mea(mea_file)
Runs all delayed operations on the object
realize(object, keep_intermediate = NA)
realize(object, keep_intermediate = NA)
object |
A GCIMSDataset object, modified in-place |
keep_intermediate |
A logical, whether to keep the intermediate files of
the previous realization once this one finishes. If |
The same GCIMSDataset object, without pending operations
base_dir <- system.file("extdata", "sample_formats", package = "GCIMS") annot <- data.frame(SampleID = "Sample1", FileName = "small.mea.gz") dataset <- GCIMSDataset$new(annot, base_dir) print(dataset) realize(dataset) print(dataset)
base_dir <- system.file("extdata", "sample_formats", package = "GCIMS") annot <- data.frame(SampleID = "Sample1", FileName = "small.mea.gz") dataset <- GCIMSDataset$new(annot, base_dir) print(dataset) realize(dataset) print(dataset)
Get the retention time vector
## S4 method for signature 'GCIMSChromatogram' rtime(object)
## S4 method for signature 'GCIMSChromatogram' rtime(object)
object |
A GCIMSChromatogram |
The retention time vector (in s)
Other GCIMSChromatogram:
GCIMSChromatogram
,
GCIMSChromatogram-class
,
dtime,GCIMSChromatogram-method
,
estimateBaseline,GCIMSChromatogram-method
,
findPeaks,GCIMSChromatogram-method
,
intensity,GCIMSChromatogram-method
,
smooth,GCIMSChromatogram-method
Get a reference retention time vector for the dataset
## S4 method for signature 'GCIMSDataset' rtime(object, sample = NULL)
## S4 method for signature 'GCIMSDataset' rtime(object, sample = NULL)
object |
A GCIMSDataset |
sample |
A number or a string with the sample index or name. If |
a retention time vector
Sample names
## S4 method for signature 'GCIMSDataset' sampleNames(object) ## S4 replacement method for signature 'GCIMSDataset,ANY' sampleNames(object) <- value
## S4 method for signature 'GCIMSDataset' sampleNames(object) ## S4 replacement method for signature 'GCIMSDataset,ANY' sampleNames(object) <- value
object |
GCIMSDataset object |
value |
A character vector of length the number of samples with the sample names |
The GCIMSDataset object
sampleNames(object = GCIMSDataset) <- value
: Sample names
Show progress bar
show_progress_bar()
show_progress_bar()
logic to show the progress bar
Smoothing a GCIMS chromatogram using a Savitzky-Golay filter
## S4 method for signature 'GCIMSChromatogram' smooth(x, rt_length_s = 3, rt_order = 2L)
## S4 method for signature 'GCIMSChromatogram' smooth(x, rt_length_s = 3, rt_order = 2L)
x |
A GCIMSChromatogram object |
rt_length_s |
The length of the filter in retention time (in s) |
rt_order |
The order of the filter in retention time |
The modified GCIMSChromatogram
Other GCIMSChromatogram:
GCIMSChromatogram
,
GCIMSChromatogram-class
,
dtime,GCIMSChromatogram-method
,
estimateBaseline,GCIMSChromatogram-method
,
findPeaks,GCIMSChromatogram-method
,
intensity,GCIMSChromatogram-method
,
rtime,GCIMSChromatogram-method
Smoothing a GCIMS dataset using a Savitzky-Golay filter
## S4 method for signature 'GCIMSDataset' smooth(x, dt_length_ms = 0.14, rt_length_s = 3, dt_order = 2, rt_order = 2)
## S4 method for signature 'GCIMSDataset' smooth(x, dt_length_ms = 0.14, rt_length_s = 3, dt_order = 2, rt_order = 2)
x |
A GCIMSDataset object, modified in-place |
dt_length_ms |
the length of the filter in drift time (in ms) |
rt_length_s |
The length of the filter in retention time (in s) |
dt_order |
The order of the filter in drift time |
rt_order |
The order of the filter in retention time |
The modified GCIMSDataset
Smoothing a GCIMS sample using a Savitzky-Golay filter
## S4 method for signature 'GCIMSSample' smooth(x, dt_length_ms, rt_length_s, dt_order = 2, rt_order = 2)
## S4 method for signature 'GCIMSSample' smooth(x, dt_length_ms, rt_length_s, dt_order = 2, rt_order = 2)
x |
A GCIMSSample object |
dt_length_ms |
the length of the filter in drift time (in ms) |
rt_length_s |
The length of the filter in retention time (in s) |
dt_order |
The order of the filter in drift time |
rt_order |
The order of the filter in retention time |
The modified GCIMSSample
Smoothing a GCIMS Spectrum using a Savitzky-Golay filter
## S4 method for signature 'GCIMSSpectrum' smooth(x, dt_length_ms, dt_order = 2)
## S4 method for signature 'GCIMSSpectrum' smooth(x, dt_length_ms, dt_order = 2)
x |
A GCIMSSpectrum object |
dt_length_ms |
the length of the filter in drift time (in ms) |
dt_order |
The order of the filter in drift time |
The modified GCIMSSpectrum
This function is useful when you have saved a GCIMSSample object with a previous version of the GCIMS package and you want to load it using a new version of the package.
## S4 method for signature 'GCIMSSample' updateObject(object, ..., verbose = FALSE)
## S4 method for signature 'GCIMSSample' updateObject(object, ..., verbose = FALSE)
object |
A GCIMSSample object, typically that has been serialized and loaded from disk |
... |
Unused |
verbose |
Unused |
The function allows you to update the old object, adding missing slots, etc so it is fully compatible with the new class definition.
The updated GCIMSSample object
obj <- GCIMSSample(drift_time=1:2, retention_time=1:3, data = matrix(1:6, nrow=2, ncol=3)) # Update the object: newobj <- updateObject(obj)
obj <- GCIMSSample(drift_time=1:2, retention_time=1:3, data = matrix(1:6, nrow=2, ncol=3)) # Update the object: newobj <- updateObject(obj)