| TermDocFreq {textmineR} | R Documentation |
This function takes a document term matrix as input and returns a data frame with columns for term frequency, document frequency, and inverse document frequency.
TermDocFreq(dtm)
dtm |
A document term matrix of class |
Returns a data.frame or tibble with four columns.
The first column, term, is a vector of token labels.
The second column, term_freq, is the number of times term
appears in the entire corpus. The third column, doc_freq, is the
number of documents in which term appears.
The fourth column, idf, is the log-weighted
inverse document frequency of term.
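To make the four columns concrete, the following sketch computes equivalent quantities by hand on a toy matrix. It is an illustration under stated assumptions, not the package's implementation: toy_dtm and toy_freq are hypothetical names, a dense base R matrix stands in for the sparse input the function expects, and idf is assumed to be the natural log of the number of documents divided by doc_freq.

```r
# Toy document term matrix: 3 documents (rows) by 3 terms (columns).
# A dense base R matrix is used here only for illustration.
toy_dtm <- matrix(
  c(2, 0, 1,
    1, 1, 0,
    0, 0, 3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"),
                  c("apple", "banana", "cherry"))
)

# term_freq: total count of each term across the whole corpus
term_freq <- colSums(toy_dtm)

# doc_freq: number of documents in which each term appears at least once
doc_freq <- colSums(toy_dtm > 0)

# idf: ASSUMED definition, natural log of (number of documents / doc_freq)
idf <- log(nrow(toy_dtm) / doc_freq)

# Assemble a data frame with the same four column names described above
toy_freq <- data.frame(
  term = colnames(toy_dtm),
  term_freq = term_freq,
  doc_freq = doc_freq,
  idf = idf,
  stringsAsFactors = FALSE
)

print(toy_freq)
```

In this toy corpus, banana appears in only one of the three documents, so it receives the largest idf, while apple and cherry each appear in two documents and share a smaller value.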
# Load a pre-formatted dtm and topic model
data(nih_sample_dtm)
data(nih_sample_topic_model)

# Get the term frequencies
term_freq_mat <- TermDocFreq(nih_sample_dtm)

str(term_freq_mat)