| LatentSemanticAnalysis {text2vec} | R Documentation |
Creates an LSA (latent semantic analysis) model. See https://en.wikipedia.org/wiki/Latent_semantic_analysis for details.
LatentSemanticAnalysis LSA
R6Class object.
For usage details see Methods, Arguments and Examples sections.
lsa = LatentSemanticAnalysis$new(n_topics, method = c("randomized", "irlba"))
lsa$fit_transform(x, ...)
lsa$transform(x, ...)
lsa$components
$new(n_topics): create an LSA model with n_topics latent topics
$fit_transform(x, ...): fit the model to an input sparse matrix (preferably in dgCMatrix format) and then transform x to the latent space
$transform(x, ...): transform new data x to the latent space
lsa: An LSA object.
x: An input document-term matrix, preferably in dgCMatrix format.
n_topics: integer, the desired number of latent topics.
method: character, one of c("randomized", "irlba"). Defines the underlying SVD algorithm. For very large data, "randomized" is usually faster and more accurate.
...: Arguments to internal functions. Notably useful for fit_transform(): these arguments are passed to the irlba or svdr functions, which are used as the backend for SVD.
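As a rough illustration of what the SVD backend computes, the sketch below runs a truncated SVD with base R's svd() on a small made-up dense matrix. It is not text2vec's implementation: the toy matrix, the variable names, and the choice to scale document embeddings by the singular values are all assumptions made for this example.

```r
# Toy document-term matrix: 6 documents x 8 terms (made-up counts).
set.seed(1)
x = matrix(rpois(6 * 8, lambda = 1), nrow = 6, ncol = 8)

# Keep only the k leading singular triplets (the "latent topics").
k = 2
s = svd(x, nu = k, nv = k)

# Document embeddings: left singular vectors scaled by singular values.
doc_embedding = s$u %*% diag(s$d[1:k], k, k)

# Topic-by-term loadings: transposed right singular vectors.
components = t(s$v)

# Projecting x onto the right singular vectors gives the same embeddings,
# which is how data can be mapped into an already fitted latent space.
projected = x %*% s$v
stopifnot(isTRUE(all.equal(doc_embedding, projected)))
```

For large sparse inputs a full svd() is impractical, which is why partial or randomized solvers such as those selected by the method argument are used instead.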
library(text2vec)
data("movie_review")
N = 100
tokens = word_tokenizer(tolower(movie_review$review[1:N]))
dtm = create_dtm(itoken(tokens), hash_vectorizer())
n_topics = 10
lsa_1 = LatentSemanticAnalysis$new(n_topics)
d1 = lsa_1$fit_transform(dtm)
# the same, but wrapped with S3 methods
d2 = fit_transform(dtm, lsa_1)
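A hedged sketch of projecting unseen documents with $transform(): it assumes the new documents are vectorized with the same hash_vectorizer(), so that their columns line up with the matrix the model was fitted on. The split of movie_review into "train" and "new" reviews and the variable names are illustrative only.

```r
library(text2vec)
data("movie_review")

# One vectorizer shared by both matrices, so columns mean the same thing.
vectorizer = hash_vectorizer()

tokens_train = word_tokenizer(tolower(movie_review$review[1:100]))
tokens_new = word_tokenizer(tolower(movie_review$review[101:110]))

dtm_train = create_dtm(itoken(tokens_train), vectorizer)
dtm_new = create_dtm(itoken(tokens_new), vectorizer)

# Fit on the first 100 reviews, then map 10 unseen reviews
# into the same latent space without refitting.
lsa_2 = LatentSemanticAnalysis$new(n_topics = 10)
d_train = lsa_2$fit_transform(dtm_train)
d_new = lsa_2$transform(dtm_new)

dim(d_new)
```

Each row of d_new corresponds to one new review, expressed in the latent topics learned from the training reviews.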