1. Introduction

The enrichment analysis module performs metabolite set enrichment analysis (MSEA) for human and mammalian species based on several libraries containing ~6300 groups of metabolite sets. Users can upload either 1) a list of compounds, 2) a list of compounds with concentrations, or 3) a concentration table.

2. Enrichment Analysis Workflow

Below we will go over 2 use-cases to perform Enrichment Analysis, the first using as input a list of compounds, and the second as input a concentration table.

2.1 Over representation analysis

We will go over two analysis workflows, the first is when the input is a list to perform over representation analysis. The first step is to create a vector containing a list of compound names. The list will then be cross-referenced (CrossReferencing) against the MetaboAnalyst compound libraries (HMDB, PubChem, KEGG, etc.), and any compounds without a hit will have NA. This step may take long due to downloading of libraries if they do not already exist in your working directory.

library(MetaboAnalystR)

## When input is a list

# Create vector consisting of compounds for enrichment analysis 
tmp.vec <- c("Acetoacetic acid", "Beta-Alanine", "Creatine", "Dimethylglycine", "Fumaric acid", "Glycine", "Homocysteine", "L-Cysteine", "L-Isolucine", "L-Phenylalanine", "L-Serine", "L-Threonine", "L-Tyrosine", "L-Valine", "Phenylpyruvic acid", "Propionic acid", "Pyruvic acid", "Sarcosine")

# Create mSetObj
mSet<-InitDataObjects("conc", "msetora", FALSE)
## [1] "MetaboAnalyst R objects intialized ..."
#Set up mSetObj with the list of compounds
mSet<-Setup.MapData(mSet, tmp.vec);

# Cross reference list of compounds against libraries (hmdb, pubchem, chebi, kegg, metlin)
mSet<-CrossReferencing(mSet, "name");
## [1] "Loaded files from MetaboAnalyst web-server."
## [1] "Loaded files from MetaboAnalyst web-server."

To view the compound name map to identify any compounds within the uploaded list without hits…

# Example compound name map
mSet$name.map 

$query.vec
 [1] "Acetoacetic acid"   "Beta-Alanine"       "Creatine"           "Dimethylglycine"    "Fumaric acid"      
 [6] "Glycine"            "Homocysteine"       "L-Cysteine"         "L-Isolucine"        "L-Phenylalanine"   
[11] "L-Serine"           "L-Threonine"        "L-Tyrosine"         "L-Valine"           "Phenylpyruvic acid"
[16] "Propionic acid"     "Pyruvic acid"       "Sarcosine"         

$hit.inx
 [1]  42  40  46  62  88  78 588 446  NA 104 120 109 103 702 131 159 164 185

$hit.values
 [1] "Acetoacetic acid"   "Beta-Alanine"       "Creatine"           "Dimethylglycine"    "Fumaric acid"      
 [6] "Glycine"            "Homocysteine"       "L-Cysteine"         NA                   "L-Phenylalanine"   
[11] "L-Serine"           "L-Threonine"        "L-Tyrosine"         "L-Valine"           "Phenylpyruvic acid"
[16] "Propionic acid"     "Pyruvic acid"       "Sarcosine"         

$match.state
 [1] 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1

Continute with the enrichment analysis…

# Create the mapping results table
mSet<-CreateMappingResultTable(mSet)
## [1] "Loaded files from MetaboAnalyst web-server."
# Input the name of the compound without any matches 
mSet<-PerformDetailMatch(mSet, "L-Isolucine");
## [1] "Loaded files from MetaboAnalyst web-server."
## [1] "Loaded files from MetaboAnalyst web-server."
# Create list of candidates to replace the compound
mSet <- GetCandidateList(mSet);
## [1] "Loaded files from MetaboAnalyst web-server."
# Identify the name of the compound to replace
mSet<-SetCandidate(mSet, "L-Isolucine", "L-Isoleucine");
## [1] "Loaded files from MetaboAnalyst web-server."
# Set the metabolite filter
mSet<-SetMetabolomeFilter(mSet, F);

# Select metabolite set library, refer to 
mSet<-SetCurrentMsetLib(mSet, "pathway", 0);

# Calculate hypergeometric score, results table generated in your working directory
mSet<-CalculateHyperScore(mSet)
## [1] "Loaded files from MetaboAnalyst web-server."
# Plot the ORA, bar-graph
mSet<-PlotORA(mSet, "ora_0_", "bar", "png", 72, width=NA)

2.2 Quantitative Enrichment Analysis

Below, we will go over a second analysis workflow to perform QEA, where the data input is a concentration table consisting of concentrations of 77 urine samples from cancer patients (cachexic vs. control) measured by 1H NMR - Eisner et al. 2010.

# Create mSetObj
mSet<-InitDataObjects("conc", "msetqea", FALSE)
## [1] "MetaboAnalyst R objects intialized ..."
# Read in data table
mSet<-Read.TextData(mSet, "http://www.metaboanalyst.ca/MetaboAnalyst/resources/data/human_cachexia.csv", "rowu", "disc");

# Perform cross-referencing of compound names
mSet<-CrossReferencing(mSet, "name");
## [1] "Loaded files from MetaboAnalyst web-server."
## [1] "Loaded files from MetaboAnalyst web-server."
# Create mapping results table
mSet<-CreateMappingResultTable(mSet)
## [1] "Loaded files from MetaboAnalyst web-server."
# Mandatory check of data 
mSet<-SanityCheckData(mSet);
##  [1] "Successfully passed sanity check!"                                                                    
##  [2] "Samples are not paired."                                                                              
##  [3] "2 groups were detected in samples."                                                                   
##  [4] "Only English letters, numbers, underscore, hyphen and forward slash (/) are allowed."                 
##  [5] "<font color=\"orange\">Other special characters or punctuations (if any) will be stripped off.</font>"
##  [6] "All data values are numeric."                                                                         
##  [7] "A total of 0 (0%) missing values were detected."                                                      
##  [8] "<u>By default, these values will be replaced by a small value.</u>"                                   
##  [9] "Click <b>Skip</b> button if you accept the default practice"                                          
## [10] "Or click <b>Missing value imputation</b> to use other methods"
# Replace missing values with minimum concentration levels
mSet<-ReplaceMin(mSet);

# Perform no normalization
mSet<-PreparePrenormData(mSet)
mSet<-Normalization(mSet, "NULL", "NULL", "NULL", "PIF_178", ratio=FALSE, ratioNum=20)

# Plot normalization
mSet<-PlotNormSummary(mSet, "norm_0_", "png", 72, width=NA)

# Plot sample-wise normalization
mSet<-PlotSampleNormSummary(mSet, "snorm_0_", "png", 72, width=NA)

# Set the metabolome filter
mSet<-SetMetabolomeFilter(mSet, F);

# Set the metabolite set library to pathway
mSet<-SetCurrentMsetLib(mSet, "pathway", 0);

# Calculate the global test score
mSet<-CalculateGlobalTestScore(mSet)
## [1] "Loaded files from MetaboAnalyst web-server."
# Plot the QEA
mSet<-PlotQEA.Overview(mSet, "qea_0_", "bar", "png", 72, width=NA)

3. Sweave Report

To prepare the sweave report, please use the PreparePDFReport function. You must ensure that you have the nexessary Latex libraries to generate the report (i.e. pdflatex, LaTexiT). The object created must be named mSet, and specify the user name in quotation marks.