ds.mincount {preseqR} R Documentation

Estimating the expected number of species represented r or more times

Description

The function estimates the expected number of species represented at least r times in a random sample based on the initial sample.

Usage

ds.mincount(n, r=1, mt=20)

Arguments

n

A two-column matrix. The first column is the frequency j = 1, 2, …; the second column is n_j, the number of species represented exactly j times in the initial sample. The first column must be sorted in ascending order.

mt

A positive integer constraining possible rational function approximations. Default is 20.

r

A vector of positive integers. Default is 1.
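As a minimal sketch of the format expected for the argument n, the two-column matrix could be built from raw per-species counts with base R alone. The vector counts and its values are hypothetical and only illustrate the layout; they are not part of preseqR.

```r
## Hypothetical number of times each species was observed in the initial sample
counts <- c(1, 1, 2, 1, 3, 2, 1, 5)

## Tabulate how many species were observed exactly j times
freq <- table(counts)

## Two-column matrix: column 1 is j (ascending), column 2 is n_j
n <- cbind(j = as.numeric(names(freq)), n_j = as.numeric(freq))
n
```

Because table() orders the distinct numeric values from smallest to largest, the first column comes out in ascending order, as the argument requires.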

Details

Unlike ds.mincount.bootstrap, this function does not bootstrap the initial sample. Its estimates can therefore be less stable than the bootstrap estimates, but it is much faster. In general, we recommend ds.mincount.bootstrap for estimating the expected number of species represented at least r times in a sample.

See ds.mincount.bootstrap for more information.

Value

FUN

The constructed estimator for the number of species represented at least r times in a sample. The input of the estimator is a vector of sampling efforts t, i.e. the sample sizes relative to the initial sample. For example, t = 2 means the sample is twice the size of the initial sample.

FUN.elements

A list of two components for the estimator. The estimator can be expressed as

\hat{E}(S_r(t)) = \sum_{i=1}^{l} c_i \left(\frac{t}{t - x_i}\right)^r.

FUN.elements contains both the coefficients c_i and the roots x_i.

M

The number of terms used when applying rational function approximation to the power series of the average discovery rate.

M.adjust

The number of terms in the estimator, equal to l.
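To make the sampling effort t and the sum-of-terms form of the estimator concrete, the following base R sketch converts absolute sample sizes into relative efforts and evaluates a sum of the form c_i (t / (t - x_i))^r. The histogram, the target sizes, the coefficients and the roots are all hypothetical values chosen for illustration, the helper eval_mincount is not a preseqR function, and the sketch assumes real-valued coefficients and roots.

```r
## Hypothetical histogram: column 1 is j, column 2 is n_j
n <- cbind(j = c(1, 2, 3, 5), n_j = c(4, 2, 1, 1))

## Size of the initial sample: total number of observations
N <- sum(n[, 1] * n[, 2])

## Relative sampling efforts for two hypothetical target sample sizes
target_sizes <- c(32, 160)
t_rel <- target_sizes / N

## Hypothetical coefficients c_i and roots x_i (illustrative only)
coefs <- c(1200, 300)
roots <- c(-0.5, -3)

## Evaluate sum_i c_i * (t / (t - x_i))^r for each effort t
eval_mincount <- function(t, r, coefs, roots) {
  sapply(t, function(ti) sum(coefs * (ti / (ti - roots))^r))
}

est <- eval_mincount(t_rel, r = 1, coefs, roots)
est
```

With negative roots, each term t / (t - x_i) lies between 0 and 1 and grows toward 1 as t increases, so in this sketch the value approaches the sum of the coefficients for very large t.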

Author(s)

Chao Deng

References

Kalinin, V. (1965). Functionals related to the Poisson distribution and statistical structure of a text. Articles on Mathematical Statistics and the Theory of Probability, 202-220.

Daley, T., & Smith, A. D. (2013). Predicting the molecular complexity of sequencing libraries. Nature Methods, 10(4), 325-327.

Examples

## load library
library(preseqR)

## import data
data(ShakespeareWordHist)

## construct the estimator for the number of unique words
## represented at least once, twice or twenty times in a sample
estimator = ds.mincount(ShakespeareWordHist, r=c(1,2,20))

## print the elements of the estimator
estimator$FUN.elements

## the number of unique words represented at least once, twice or twenty times
## when the sample is 10 or 20 times the size of the initial sample
estimator$FUN(c(10, 20))

[Package preseqR version 3.1.2 Index]