| mjn {pegas} | R Documentation |
This function computes the median-joining network (MJN) as described by Bandelt et al. (1999).
mjn(x, epsilon = 0, max.n.cost = 10000, prefix = "median.vector_")
x | a matrix (or data frame) of DNA sequences or binary 0/1 data. |
epsilon | tolerance parameter: the larger the value, the wider the search for new median vectors. |
max.n.cost | the maximum number of costs to be computed. |
prefix | the prefix used to label the median vectors. |
MJN is a network method where intermediate (unobserved) sequences (the
median vectors) are reconstructed and included in the final
network. Unlike mst, rmst, and msn,
mjn works with the original sequences, the distances being
calculated internally using a Hamming distance method (with
dist(x, "manhattan") for binary data or dist.dna(x,
"N") for DNA sequences).
The parameter epsilon controls how the search for new median
vectors is performed: the larger this parameter, the wider the search
(see the example with binary data).
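One rough way to see the effect of epsilon is to count how many median vectors each run adds. The sketch below is illustrative only: n_median_vectors is a hypothetical helper (not part of pegas), and it assumes that the data attribute of the result holds every input row plus one row per median vector. The pegas call is guarded so that the block also runs when the package is not installed.

```r
# Hypothetical helper (not part of pegas): number of median vectors in a
# network returned by mjn(), taken as the rows of the "data" attribute
# beyond the rows of the input.
# Assumption: mjn() kept every row of the input 'x'.
n_median_vectors <- function(net, x) {
  nrow(attr(net, "data")) - nrow(x)
}

# Guarded usage: runs only when pegas is installed.
if (requireNamespace("pegas", quietly = TRUE)) {
  # Binary data from the Examples section.
  x <- matrix(c(0, 0, 0, 0, 0, 0, 0, 0, 0,
                1, 1, 1, 1, 0, 0, 0, 0, 0,
                1, 0, 0, 0, 1, 1, 1, 0, 0,
                0, 1, 1, 1, 1, 1, 0, 1, 1),
              nrow = 4, ncol = 9, byrow = TRUE,
              dimnames = list(LETTERS[1:4], NULL))
  for (eps in 0:2) {
    net <- pegas::mjn(x, epsilon = eps)
    cat("epsilon =", eps, ":", n_median_vectors(net, x), "median vectors\n")
  }
}
```

The counts are printed rather than stated here, since they depend on the installed version of pegas.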
If the sequences are very divergent, the search for new median vectors
can take a very long time. The argument max.n.cost controls how
many such vectors are added to the network (the default value should
prevent the function from running endlessly).
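The distance calculation mentioned above can be checked by hand for binary data. The following base R sketch (no pegas required) counts mismatching sites directly and compares the result with dist(x, "manhattan"). The hamming helper is hypothetical and written for illustration only; it is not the code used inside mjn.

```r
# Four binary haplotypes (rows) scored at nine sites (columns);
# these are the binary data from the Examples section.
x <- matrix(c(0, 0, 0, 0, 0, 0, 0, 0, 0,
              1, 1, 1, 1, 0, 0, 0, 0, 0,
              1, 0, 0, 0, 1, 1, 1, 0, 0,
              0, 1, 1, 1, 1, 1, 0, 1, 1),
            nrow = 4, ncol = 9, byrow = TRUE,
            dimnames = list(LETTERS[1:4], NULL))

# Hypothetical helper: number of sites at which each pair of rows differs.
hamming <- function(m) {
  n <- nrow(m)
  d <- matrix(0, n, n, dimnames = list(rownames(m), rownames(m)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(m[i, ] != m[j, ])
    }
  }
  as.dist(d)
}

d_hamming   <- hamming(x)
d_manhattan <- dist(x, method = "manhattan")

# For 0/1 data each site contributes |0 - 1| = 1 per mismatch,
# so the two distances coincide.
all.equal(as.vector(d_hamming), as.vector(d_manhattan))
```

For 0/1 data the Manhattan distance sums one unit per mismatching site, which is why it can stand in for the Hamming distance here.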
an object of class "haploNet" with an extra attribute
(data) containing the original data together with the median vectors.
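To separate observed sequences from reconstructed ones in the returned data, one option is to match labels against the prefix. This is a minimal sketch: is_median_vector is a hypothetical helper, the labels below are made up for illustration, and it assumes that median vectors are labelled with the prefix followed by an index and that these labels are available as rownames(attr(nt, "data")).

```r
# Hypothetical helper (not part of pegas): flag labels that start with
# the prefix passed to mjn().
is_median_vector <- function(labels, prefix = "median.vector_") {
  startsWith(labels, prefix)
}

# Illustrative labels, not real mjn() output.
# Assumption: with a real network 'nt', similar labels could be obtained
# with rownames(attr(nt, "data")).
labels <- c("A", "B", "C", "D", "median.vector_1", "median.vector_2")

# Split the labels into observed sequences and median vectors.
split(labels, ifelse(is_median_vector(labels), "median", "observed"))
```

If a different prefix was passed to mjn, the same value would need to be given to the helper.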
This version still needs to be tested with large data sets. Bandelt et al. (1999) reported long computing times because of the need to compute a large number of median vectors.
Emmanuel Paradis
Bandelt, H. J., Forster, P. and Röhl, A. (1999) Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution, 16, 37–48.
## data in Table 1 of Bandelt et al. (1999):
x <- c(0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 0, 0, 0, 0, 0,
1, 0, 0, 0, 1, 1, 1, 0, 0,
0, 1, 1, 1, 1, 1, 0, 1, 1)
x <- matrix(x, 4, 9, byrow = TRUE)
rownames(x) <- LETTERS[1:4]
(nt0 <- mjn(x))
(nt1 <- mjn(x, 1))
(nt2 <- mjn(x, 2))
plot(nt0)
## Not run:
## same as in Fig. 4 of Bandelt et al. (1999):
plotNetMDS(nt2, dist(attr(nt2, "data"), "manhattan"), 3)
## End(Not run)
## data in Table 2 of Bandelt et al. (1999):
z <- list(c("g", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a"),
c("a", "g", "g", "a", "a", "a", "a", "a", "a", "a", "a", "a"),
c("a", "a", "a", "g", "a", "a", "a", "a", "a", "a", "g", "g"),
c("a", "a", "a", "a", "g", "g", "a", "a", "a", "a", "g", "g"),
c("a", "a", "a", "a", "a", "a", "a", "a", "g", "g", "c", "c"),
c("a", "a", "a", "a", "a", "a", "g", "g", "g", "g", "a", "a"))
names(z) <- c("A1", "A2", "B1", "B2", "C", "D")
z <- as.matrix(as.DNAbin(z))
(ntz <- mjn(z, 2))
## Not run:
## same as in Fig. 5 of Bandelt et al. (1999):
plotNetMDS(ntz, dist.dna(attr(ntz, "data"), "N"), 3)
## End(Not run)