2 Main steps of the pipeline
2.1 Starting with GWENA
GWENA can be installed from one of three sources:
- the official version of the last Bioconductor release (recommended).
- the last stable version from the Bioc Devel branch.
- the day-to-day development version from the Github repository.
if (!requireNamespace("BiocManager", quietly=TRUE))
  install.packages("BiocManager")
# 1. From Bioconductor release
BiocManager::install("GWENA")
# 2. From Bioconductor devel
BiocManager::install("GWENA", version = "devel")
# 3. From Github repository
BiocManager::install("Kumquatum/GWENA")
# OR
if (!requireNamespace("devtools", quietly=TRUE))
  install.packages("devtools")
devtools::install_github("Kumquatum/GWENA")
Package loading:
library(GWENA)
library(magrittr) # Not mandatory, we use the pipe `%>%` to ease readability.
threads_to_use <- 2
2.2 Input data
2.2.1 The expression data
GWENA supports expression matrices coming from either RNA-seq or microarray experiments. Expression data have to be stored as text or spreadsheet files and formatted with genes as columns and samples as rows. To read such a file with R, use the function appropriate to the data separator (e.g. read.csv, read.table). Moreover, the expression data have to be normalized and transcript expression reduced to the gene level (see How can I reduce my transcriptomic data to the gene level?), since GWENA is designed to build gene co-expression networks.
In this vignette, we use the microarray data set GSE85358 from the Kuehne et al. study. This data was gathered from a skin ageing study and has been processed and normalized with the R script provided in Additional data n°10 of the corresponding article.
# Import expression table
data("kuehne_expr")
# If kuehne_expr was in a file :
# kuehne_expr = read.table(<path_to_file>, header=TRUE, row.names=1)
# Number of genes
ncol(kuehne_expr)
#> [1] 15801
# Number of samples
nrow(kuehne_expr)
#> [1] 48
# Overview of expression table
kuehne_expr[1:5,1:5]
#> A_19_P00325768 A_19_P00800244 A_19_P00801821 A_19_P00802027
#> 253949420929_1_1 10.27450 5.530172 10.75672 16.78277
#> 253949420929_1_2 10.23440 5.712894 11.05393 16.25480
#> 253949420929_1_3 10.54336 5.889068 10.92150 16.39615
#> 253949420929_1_4 10.32649 5.646343 10.55770 16.37210
#> 253949420929_2_1 10.13626 5.726866 11.23012 16.41413
#> A_19_P00802201
#> 253949420929_1_1 8.549254
#> 253949420929_1_2 8.313369
#> 253949420929_1_3 8.469018
#> 253949420929_1_4 7.983723
#> 253949420929_2_1 7.521542
# Checking expression data set is correctly defined
is_data_expr(kuehne_expr)
#> $bool
#> [1] TRUE
#>
#> $reason
#> NULL
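As a minimal sketch of the expected layout, the following base R code reads a small comma-separated table in which genes are stored as rows (a common export format) and transposes it so that genes become columns and samples become rows. The inline text, gene names and sample names are made up for illustration; with a real file, the `text` argument would be replaced by a file path.

```r
# Toy expression table with genes as rows and samples as columns
# (all names and values are hypothetical).
raw_text <- "gene,sample_1,sample_2,sample_3
geneA,10.2,10.5,10.1
geneB,5.5,5.7,5.9
geneC,16.8,16.3,16.4
geneD,8.5,8.3,8.4"

genes_as_rows <- read.csv(text = raw_text, header = TRUE, row.names = 1)

# GWENA expects genes as columns and samples as rows, so transpose.
# t() returns a matrix; as.data.frame() restores a data.frame.
toy_expr <- as.data.frame(t(genes_as_rows))

dim(toy_expr)       # number of samples, then number of genes
colnames(toy_expr)  # gene names
```

The transposed table can then be checked with is_data_expr as shown above.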
2.2.2 The metadata
To be able to perform the phenotypic association step of the pipeline (optional), we need to specify in another matrix the information associated with each sample (e.g. condition, treatment, phenotype, experiment date…). This information is often provided in a separate file (also text or spreadsheet) and can be read in R with read.csv or read.table functions.
# Import phenotype table (also called traits)
data("kuehne_traits")
# If kuehne_traits was in a file :
# kuehne_traits = read.table(<path_to_file>, header=TRUE, row.names=1)
# Phenotype
unique(kuehne_traits$Condition)
#> [1] "young" "old"
# Overview of traits table
kuehne_traits[1:5,]
#> Slide Array Exp Condition Age
#> 1 253949420929 1 1_1 young 23
#> 2 253949420929 2 1_2 old 66
#> 3 253949420929 3 1_3 young 21
#> 4 253949420929 4 1_4 old 62
#> 5 253949420929 5 2_1 young 25
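Before going further, it can be worth checking that every sample of the expression table has a matching row in the traits table. The sketch below assumes, as in the Kuehne data shown above, that the sample names of the expression table are the Slide and Exp columns joined by an underscore. The toy values are hypothetical.

```r
# Toy traits table mimicking the structure of kuehne_traits
toy_traits <- data.frame(
  Slide = c("253949420929", "253949420929", "253949420929"),
  Exp = c("1_1", "1_2", "1_3"),
  Condition = c("young", "old", "young"),
  stringsAsFactors = FALSE
)

# Toy expression matrix: samples as rows, genes as columns
toy_expr <- matrix(
  c(10.3, 10.2, 10.5, 5.5, 5.7, 5.9),
  nrow = 3,
  dimnames = list(
    c("253949420929_1_1", "253949420929_1_2", "253949420929_1_3"),
    c("geneA", "geneB")
  )
)

# Rebuild sample identifiers from the traits and compare both ways
sample_ids <- paste(toy_traits$Slide, toy_traits$Exp, sep = "_")
missing_in_expr <- setdiff(sample_ids, rownames(toy_expr))
missing_in_traits <- setdiff(rownames(toy_expr), sample_ids)
all_matched <- length(missing_in_expr) == 0 && length(missing_in_traits) == 0
all_matched
```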
2.2.3 Using a SummarizedExperiment object
GWENA is also compatible with SummarizedExperiment objects. The previous dataset can therefore be converted into one and used in the next steps.
se_kuehne <- SummarizedExperiment::SummarizedExperiment(
assays = list(expr = t(kuehne_expr)),
colData = S4Vectors::DataFrame(kuehne_traits)
)
S4Vectors::metadata(se_kuehne) <- list(
experiment_type = "Expression profiling by array",
transcriptomic_technology = "Microarray",
GEO_accession_id = "GSE85358",
overall_design = paste("Gene expression in epidermal skin samples from the",
"inner forearms 24 young (20 to 25 years) and 24 old",
"(55 to 66 years) human volunteers were analysed",
"using Agilent Whole Human Genome Oligo Microarrays",
"8x60K V2."),
contributors = c("Kuehne A", "Hildebrand J", "Soehle J", "Wenck H",
"Terstegen L", "Gallinat S", "Knott A", "Winnefeld M",
"Zamboni N"),
title = paste("An integrative metabolomics and transcriptomics study to",
"identify metabolic alterations in aged skin of humans in",
"vivo"),
URL = "https://www.ncbi.nlm.nih.gov/pubmed/28201987",
PMIDs = 28201987
)
2.3 Gene filtering
Although the co-expression method implemented within GWENA is designed to manage and filter out weakly co-expressed genes, it is advisable to first reduce the dataset size. Indeed, loading a full expression matrix without filtering out uninformative data will result in excessive processing time, CPU and memory usage, and data storage. However, users are urged to filter carefully, as filtering will impact the building of the gene network.
Multiple filtering methods are natively implemented:
- For RNA-seq and microarray data:
  - filter_low_var: filtering on low variation of expression
- For RNA-seq data:
  - filter_RNA_seq(<...>, method = "at least one"): only one sample needs to have a value above the minimal count threshold in the gene
  - filter_RNA_seq(<...>, method = "mean"): the mean of all samples for the gene needs to be above min_count
  - filter_RNA_seq(<...>, method = "all"): all samples for the gene need to be above min_count
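To make the difference between the three RNA-seq rules concrete, here is a base R sketch that re-implements them on a toy count matrix. The function `filter_counts_sketch` is hypothetical and is not GWENA's code; in particular, the use of a strict "greater than" comparison is an assumption.

```r
# Hypothetical re-implementation of the three rules described above.
# counts: samples as rows, genes as columns.
filter_counts_sketch <- function(counts, min_count = 5,
                                 method = c("at least one", "mean", "all")) {
  method <- match.arg(method)
  keep <- switch(method,
    "at least one" = apply(counts, 2, function(x) any(x > min_count)),
    "mean"         = colMeans(counts) > min_count,
    "all"          = apply(counts, 2, function(x) all(x > min_count))
  )
  counts[, keep, drop = FALSE]
}

# Toy counts (matrix() fills column by column, so each triplet is one gene)
toy_counts <- matrix(
  c(0, 0, 12,   # geneA: one sample above 5, mean of 4
    4, 6, 8,    # geneB: mean of 6, but one sample below 5
    10, 12, 9,  # geneC: all samples above 5
    1, 2, 0),   # geneD: no sample above 5
  nrow = 3,
  dimnames = list(paste0("sample_", 1:3),
                  c("geneA", "geneB", "geneC", "geneD"))
)

colnames(filter_counts_sketch(toy_counts, 5, "at least one"))  # geneA, geneB, geneC
colnames(filter_counts_sketch(toy_counts, 5, "mean"))          # geneB, geneC
colnames(filter_counts_sketch(toy_counts, 5, "all"))           # geneC
```

The rules go from the most permissive ("at least one") to the most stringent ("all").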
NB: The authors of WGCNA (used in GWENA for network building) advise against using differentially expressed (DE) genes as a filter since its module detection method is based on unsupervised clustering. Moreover, using DE genes will break the scale-free property (small-world network) on which the adjacency matrix is calculated.
In this example, we will filter out the genes with low variability using the filter_low_var function.
kuehne_expr_filtered <- filter_low_var(kuehne_expr, pct = 0.7, type = "median")
# Remaining number of genes
ncol(kuehne_expr_filtered)
#> [1] 11060
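The idea behind this kind of filter can be sketched in base R: compute a variability score per gene and keep the requested fraction of the most variable genes. The function `filter_low_var_sketch` below is hypothetical and uses the plain variance, whereas filter_low_var offers several variation types, so it should be read as an illustration and not as GWENA's implementation. Rounding the number of kept genes down is consistent with the output above (0.7 × 15801 = 11060.7, and 11060 genes remain).

```r
# Hypothetical sketch: keep the `pct` most variable genes.
# expr: samples as rows, genes as columns (with column names).
filter_low_var_sketch <- function(expr, pct = 0.8) {
  gene_var <- apply(expr, 2, var)
  n_keep <- floor(pct * ncol(expr))
  keep <- names(sort(gene_var, decreasing = TRUE))[seq_len(n_keep)]
  # Subset by name so that the original column order is preserved
  expr[, colnames(expr) %in% keep, drop = FALSE]
}

toy_expr <- cbind(
  geneA = c(1, 1, 1, 1),    # constant: variance 0
  geneB = c(1, 2, 3, 4),
  geneC = c(0, 10, 0, 10),  # most variable
  geneD = c(5, 5, 5, 6),
  geneE = c(2, 4, 6, 8)
)

toy_filtered <- filter_low_var_sketch(toy_expr, pct = 0.7)
colnames(toy_filtered)  # geneB, geneC, geneE
```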
2.4 Network building
Gene co-expression networks are sets of genes (nodes) linked to each other (edges) according to the strength of their relationship. In GWENA, this strength is estimated by computing a (dis)similarity score, which can be based on a distance (Euclidean, Minkowski, …) but is usually a correlation. Pearson’s correlation is the most popular; however, GWENA uses Spearman’s correlation by default. It is less sensitive to outliers, which are frequent in transcriptomic datasets, and does not assume that the data follow a normal distribution.
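The effect of a single outlier on the two correlations can be seen on a small made-up example. The vectors below are hypothetical: they are perfectly monotonic, but the last value of `y` is an outlier.

```r
x <- 1:10
y <- c(1:9, 100)  # the last sample is an outlier

pearson  <- cor(x, y, method = "pearson")
spearman <- cor(x, y, method = "spearman")

pearson   # about 0.59: strongly pulled down by the outlier
spearman  # 1: the ranks of x and y are identical
```

Because Spearman's correlation only uses ranks, the size of the outlier has no influence as long as the ordering of the samples is unchanged.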
The co-expression network is built according to the following sub-steps :
- A correlation (or distance) between each pair of genes is computed.
- The correlation distributions are fitted to a power law.
- An adjacency score is computed by adjusting previous correlations by the fitted power law.
- A topological overlap score is computed by accounting for the network’s topology.
These successive adjustments improve the detection of modules for the next step.
# In order to speed up the example execution time, we only take an
# arbitrary sample of the genes.
kuehne_expr_filtered <- kuehne_expr_filtered[, 1:1000]
net <- build_net(kuehne_expr_filtered, cor_func = "spearman",
n_threads = threads_to_use)
# Power selected :
net$metadata$power
#> [1] 8
# Fit of the power law to data ($R^2$) :
fit_power_table <- net$metadata$fit_power_table
fit_power_table[fit_power_table$Power == net$metadata$power, "SFT.R.sq"]
#> [1] 0.917733
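The sub-steps listed above can be sketched in base R on a toy matrix. This is an illustration under assumptions and not the code run by build_net: the power is fixed by hand instead of being fitted, the network is treated as unsigned, and the topological overlap uses the usual unsigned formula, TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where l_ij sums the products of adjacencies over shared neighbours and k_i is the connectivity of gene i.

```r
set.seed(1)
# Toy expression matrix: 20 samples as rows, 6 genes as columns
toy_expr <- matrix(rnorm(20 * 6), nrow = 20,
                   dimnames = list(NULL, paste0("gene", 1:6)))

# 1. Correlation between each pair of genes
cor_mat <- cor(toy_expr, method = "spearman")

# 2. Power: fixed here for illustration (build_net selects it from the fit)
power <- 6

# 3. Adjacency: correlations raised to the power (unsigned network)
adj <- abs(cor_mat)^power
diag(adj) <- 0

# 4. Topological overlap
shared <- adj %*% adj            # l_ij: strength of shared neighbours
k <- rowSums(adj)                # k_i: connectivity of each gene
min_k <- outer(k, k, pmin)       # min(k_i, k_j) for every pair
tom <- (shared + adj) / (min_k + 1 - adj)
diag(tom) <- 1

round(tom[1:3, 1:3], 4)
```

Raising the correlations to a power shrinks weak correlations much more than strong ones, and the topological overlap then rewards pairs of genes that share the same neighbours.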
2.5 Modules detection
At this point, the network is a complete graph: all nodes are connected to all other nodes with different strengths. Because gene co-expression networks have a scale-free property, some groups of genes are strongly linked with one another. In co-expression networks, these groups are called modules and are assumed to represent genes working together toward a common set of functions.
Such modules can be detected using unsupervised learning or modeling. GWENA uses hierarchical clustering, but other methods can be used (k-means, Gaussian mixture models, etc.).
detection <- detect_modules(kuehne_expr_filtered,
net$network,
detailled_result = TRUE,
merge_threshold = 0.25)
Important: Module 0 contains all genes that did not fit into any modules.
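The principle of module detection by hierarchical clustering can be sketched with base R only. The simulated data, the use of 1 - |correlation| as dissimilarity and the cut into a fixed number of clusters are simplifying assumptions made for this illustration; detect_modules works on the network built above and chooses the number of modules itself.

```r
set.seed(42)
n_samples <- 30

# Two hidden signals, each driving a group of three genes
signal_1 <- rnorm(n_samples)
signal_2 <- rnorm(n_samples)
toy_expr <- cbind(
  sapply(1:3, function(i) signal_1 + rnorm(n_samples, sd = 0.1)),
  sapply(1:3, function(i) signal_2 + rnorm(n_samples, sd = 0.1))
)
colnames(toy_expr) <- paste0("gene", 1:6)

# Dissimilarity between genes, then hierarchical clustering
dissimilarity <- as.dist(1 - abs(cor(toy_expr, method = "spearman")))
tree <- hclust(dissimilarity, method = "average")

# Cut the tree into two groups of genes (the "modules" of this toy example)
modules <- cutree(tree, k = 2)
split(names(modules), modules)
```

Genes driven by the same signal end up in the same group, which is the behaviour expected from modules.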
Since this operation tends to create multiple smaller modules with highly similar expression profiles (based on the eigengene of each), such modules are usually merged into one.
# Number of modules before merging :
length(unique(detection$modules_premerge))
#> [1] 10
# Number of modules after merging:
length(unique(detection$modules))
#> [1] 4
plot_modules_merge(modules_premerge = detection$modules_premerge,
modules_merged = detection$modules)
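The merging logic can be illustrated with a base R sketch. It assumes that the eigengene of a module is the first principal component of its scaled expression, and that two modules are merged when the dissimilarity between their eigengenes (1 - correlation) is below merge_threshold. The helper functions and the simulated modules are hypothetical.

```r
# Eigengene: first principal component of the module expression
# (samples as rows, genes of the module as columns).
module_eigengene <- function(expr_module) {
  pca <- prcomp(expr_module, center = TRUE, scale. = TRUE)
  eigengene <- pca$x[, 1]
  # The sign of a principal component is arbitrary:
  # orient it like the average expression of the module
  if (cor(eigengene, rowMeans(scale(expr_module))) < 0) eigengene <- -eigengene
  eigengene
}

set.seed(7)
n_samples <- 30
signal_1 <- rnorm(n_samples)
signal_2 <- rnorm(n_samples)
noisy_genes <- function(signal, n_genes) {
  sapply(seq_len(n_genes), function(i) signal + rnorm(length(signal), sd = 0.2))
}

# Modules a and b follow the same signal, module c follows another one
eigengenes <- cbind(
  a = module_eigengene(noisy_genes(signal_1, 4)),
  b = module_eigengene(noisy_genes(signal_1, 3)),
  c = module_eigengene(noisy_genes(signal_2, 4))
)

merge_threshold <- 0.25
eigengene_dissim <- 1 - cor(eigengenes)
to_merge <- eigengene_dissim < merge_threshold
to_merge
```

In this toy case, modules a and b are flagged for merging while module c stays separate.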
The resulting modules contain more genes, whose distribution can be seen with a simple bar plot.
ggplot2::ggplot(data.frame(detection$modules %>% stack),
ggplot2::aes(x = ind)) + ggplot2::stat_count() +
ggplot2::ylab("Number of genes") +
ggplot2::xlab("Module")
Each of the modules presents a distinct profile, which can be plotted in two figures to separate the positive (+ facet) and negative (- facet) correlations profile. As a summary of this profile, the eigengene (red line) is displayed to act as a signature.
# plot_expression_profiles(kuehne_expr_filtered, detection$modules)
2.6 Biological integration
2.6.1 Functional enrichment
A popular way to explore the modules consists of linking them with a known biological function by using curated gene sets. Among the available ones, Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), WikiPathways, Reactome, and Human Phenotype Ontology (HPO) put modules into a broader systemic perspective.
In contrast, reference databases such as TRANSFAC, miRTarBase, Human Protein Atlas (HPA), and CORUM give more details about tissue/cell/condition information.
Using the over-representation analysis (ORA) tool GOSt from g:Profiler, we can retrieve the biological association for each module and plot it as follows.
enrichment <- bio_enrich(detection$modules)
plot_enrichment(enrichment)
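The statistical idea behind an over-representation analysis can be sketched with a one-sided hypergeometric test in base R. This is only an illustration of the principle: the function `ora_pvalue`, the gene names and the gene set are hypothetical, and the sketch does not reproduce the multiple-testing correction applied by g:Profiler.

```r
# Probability of observing at least as many genes of the gene set
# in the module as actually observed, under random sampling.
ora_pvalue <- function(module_genes, gene_set, background) {
  module_genes <- intersect(module_genes, background)
  gene_set <- intersect(gene_set, background)
  overlap <- length(intersect(module_genes, gene_set))
  phyper(overlap - 1,
         m = length(gene_set),                       # genes in the set
         n = length(background) - length(gene_set),  # genes outside the set
         k = length(module_genes),                   # genes drawn (module size)
         lower.tail = FALSE)
}

background <- paste0("g", 1:100)
gene_set <- paste0("g", 1:10)           # hypothetical annotated gene set

module_enriched <- paste0("g", 6:15)    # shares 5 genes with the set
module_disjoint <- paste0("g", 51:60)   # shares no gene with the set

p_enriched <- ora_pvalue(module_enriched, gene_set, background)
p_disjoint <- ora_pvalue(module_disjoint, gene_set, background)
c(enriched = p_enriched, disjoint = p_disjoint)
```

A small p-value indicates that the module contains more genes of the set than expected by chance. The choice of the background (here all 100 toy genes) directly affects the result.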
2.6.2 Phenotypic association
If phenotypic information is available about the samples provided, an association test can help to determine if a module is specifically linked to a trait. In this case, module 1 seems to be strongly linked to Age.
# With data.frame/matrix
phenotype_association <- associate_phenotype(
detection$modules_eigengenes,
kuehne_traits %>% dplyr::select(Condition, Age, Slide))
# With SummarizedExperiment
phenotype_association <- associate_phenotype(
detection$modules_eigengenes,
SummarizedExperiment::colData(se_kuehne) %>%
as.data.frame %>%
dplyr::select(Condition, Age, Slide))
plot_modules_phenotype(phenotype_association)
Combining phenotypic information with the previous functional enrichment can guide further analysis.
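The principle of such an association test can be sketched in base R as a correlation test between a module eigengene and each trait, with categorical traits recoded as numbers. The simulated eigengene and the recoding below are assumptions made for the illustration and are not necessarily what associate_phenotype does internally.

```r
set.seed(3)
# Toy traits for 10 samples
age <- c(21, 23, 25, 22, 24, 62, 66, 58, 60, 64)
condition <- ifelse(age < 40, "young", "old")

# Simulated eigengene of a module whose expression follows age
eigengene <- as.numeric(scale(age)) + rnorm(length(age), sd = 0.2)

# Numeric trait: direct correlation test
age_test <- cor.test(eigengene, age, method = "pearson")

# Two-level categorical trait: recode as 0/1 before testing
condition_numeric <- as.numeric(condition == "old")
condition_test <- cor.test(eigengene, condition_numeric, method = "pearson")

c(age_cor = unname(age_test$estimate), age_p = age_test$p.value)
c(cond_cor = unname(condition_test$estimate), cond_p = condition_test$p.value)
```

When many modules and traits are tested, the resulting p-values should be interpreted with multiple testing in mind.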
2.7 Graph visualization and topological analysis
Information can be retrieved from the network topology itself. For example, hub genes are highly connected genes known to be associated with key biological functions. They can be detected by different methods:
- get_hub_high_co: Highest connectivity, select the top n (n depending on parameter given) highest connected genes. Similar to WGCNA’s selection of hub genes.
- get_hub_degree: Superior degree, select genes whose degree is greater than the average connection degree of the network. Definition from network theory.
- get_hub_kleinberg: Kleinberg’s score, select genes whose Kleinberg’s score is superior to the provided threshold.
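The first two definitions can be illustrated on a toy weighted network in base R. The sketch assumes that the degree of a gene is the sum of the weights of its edges, which may differ from the exact computation done by the GWENA functions. Gene names and weights are hypothetical.

```r
genes <- paste0("g", 1:5)
adjacency <- matrix(0, nrow = 5, ncol = 5, dimnames = list(genes, genes))

# g1 is strongly connected to every other gene, g2 and g3 are weakly linked
adjacency["g1", c("g2", "g3", "g4", "g5")] <- c(0.9, 0.8, 0.9, 0.7)
adjacency["g2", "g3"] <- 0.2
adjacency <- adjacency + t(adjacency)  # make the matrix symmetric

# Weighted degree of each gene
degree <- rowSums(adjacency)

# "Superior degree": genes above the average degree of the network
hubs_degree <- names(degree)[degree > mean(degree)]

# "Highest connectivity": top n most connected genes
n <- 2
hubs_top_n <- names(sort(degree, decreasing = TRUE))[seq_len(n)]

hubs_degree  # "g1"
hubs_top_n   # "g1" "g2"
```

The two definitions do not necessarily return the same genes: the top n selection always returns n genes, while the degree-based one depends on how the connectivity is distributed.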
Manipulating graph objects can be quite demanding in memory and CPU usage. Caution is advised when plotting networks larger than 100 genes. Since co-expression networks are complete graphs, all genes are connected to each other, which makes them hard to read. To clarify the visualization, edges with a similarity score below a threshold are removed.
module_example <- detection$modules$`2`
graph <- build_graph_from_sq_mat(net$network[module_example, module_example])
plot_module(graph, upper_weight_th = 0.999995,
vertex.label.cex = 0,
node_scaling_max = 7,
legend_cex = 1)
#> [,1] [,2]
#> [1,] 50.58012 140.5504
#> [2,] 21.08917 152.0063
#> [3,] 27.39993 135.1931
#> ... (layout coordinates of the remaining 385 of 388 nodes omitted)
2.8 Networks comparison
A co-expression network can be built for each of the experimental conditions studied (e.g. control/test) and the networks can then be compared with each other to detect differences in co-expression patterns. These may indicate breaks of inhibition, inefficiency of a transcription factor, etc. These analyses can focus on preserved modules between conditions (e.g. to detect housekeeping genes), or unpreserved modules (e.g. to detect genes contributing to a disease).
GWENA uses a comparison test based on random re-assignment of gene names inside modules to see whether patterns inside modules change (from the NetRep package). This permutation test is repeated a large number of times to evaluate the significance of the result obtained.
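The logic of such a permutation test can be sketched in base R with a single, simple statistic: the mean absolute correlation between the genes of a module, measured in the other condition. Everything below is a simplified illustration on simulated data, with hypothetical names; NetRep evaluates several preservation statistics and is not re-implemented here.

```r
set.seed(11)
n_samples <- 24
n_genes <- 50

# Simulated expression in the "test" condition: 50 random genes...
test_expr <- matrix(rnorm(n_samples * n_genes), nrow = n_samples,
                    dimnames = list(NULL, paste0("gene", seq_len(n_genes))))

# ...among which the 5 genes of a module detected in the other
# condition are still co-expressed (they follow a common signal)
module_genes <- paste0("gene", 1:5)
module_signal <- rnorm(n_samples)
for (g in module_genes) {
  test_expr[, g] <- module_signal + rnorm(n_samples, sd = 0.2)
}

cor_test <- cor(test_expr, method = "spearman")

# Statistic: mean absolute correlation inside a set of genes
mean_abs_cor <- function(gene_set) {
  sub_cor <- abs(cor_test[gene_set, gene_set])
  mean(sub_cor[upper.tri(sub_cor)])
}

observed <- mean_abs_cor(module_genes)

# Null distribution: same statistic on random gene sets of the same size
n_perm <- 500
null_stats <- replicate(
  n_perm,
  mean_abs_cor(sample(colnames(test_expr), length(module_genes)))
)

# Permutation p-value (the +1 avoids reporting a p-value of exactly 0)
p_value <- (1 + sum(null_stats >= observed)) / (1 + n_perm)
c(observed = observed, p_value = p_value)
```

A small p-value means that the genes of the module are more strongly co-expressed in the test condition than random gene sets of the same size, which is evidence that the module is preserved. The smallest reachable p-value is 1 / (n_perm + 1), hence the need for a large number of permutations.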
To perform the comparison, all previous steps leading to module detection need to be done for each condition. To save CPU, memory and time, the keep_matrices parameter of the build_net function can be set to "both" so that the correlation and adjacency matrices are kept and can be passed to compare_conditions. If not, the matrices are re-computed in compare_conditions.
# Expression by condition with data.frame/matrix
samples_by_cond <- lapply(kuehne_traits$Condition %>% unique, function(cond){
df <- kuehne_traits %>%
dplyr::filter(Condition == cond) %>%
dplyr::select(Slide, Exp)
apply(df, 1, paste, collapse = "_")
}) %>% setNames(kuehne_traits$Condition %>% unique)
expr_by_cond <- lapply(samples_by_cond %>% names, function(cond){
samples <- samples_by_cond[[cond]]
kuehne_expr_filtered[which(rownames(kuehne_expr_filtered) %in% samples),]
}) %>% setNames(samples_by_cond %>% names)
# Expression by condition with SummarizedExperiment
se_expr_by_cond <- lapply(unique(se_kuehne$Condition), function(cond){
se_kuehne[, se_kuehne$Condition == cond]
}) %>% setNames(unique(se_kuehne$Condition))
# Network building and modules detection by condition
net_by_cond <- lapply(expr_by_cond, build_net, cor_func = "spearman",
n_threads = threads_to_use, keep_matrices = "both")
mod_by_cond <- mapply(detect_modules, expr_by_cond,
lapply(net_by_cond, `[[`, "network"),
MoreArgs = list(detailled_result = TRUE),
SIMPLIFY = FALSE)
comparison <- compare_conditions(expr_by_cond,
lapply(net_by_cond, `[[`, "adja_mat"),
lapply(net_by_cond, `[[`, "cor_mat"),
lapply(mod_by_cond, `[[`, "modules"),
pvalue_th = 0.05)
The final object contains a table summarizing the comparison of the modules, directly available with the comparison$result$young$old$comparison command.
The comparison takes into account the permutation test result and the Z summary.
comparison |
---|
preserved |
preserved |
inconclusive |
preserved |
moderately preserved |
preserved |
The detail of the p-values can also be displayed as a heatmap. Since all evaluation metrics of compare_conditions need to be significant for a module to be considered preserved or unpreserved, it can be interesting to see which metrics prevented a module from being significant.
plot_comparison_stats(comparison$result$young$old$p.values)