read.paragraph2vec {doc2vec}    R Documentation
Read a binary paragraph2vec model from disk
read.paragraph2vec(file)
file: the path to the model file
An object of class paragraph2vec, which is a list with the following elements:
model: an Rcpp pointer to the model
model_path: the path to the model on disk
dim: the dimension of the embedding matrix
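As a minimal sketch of how the returned list might be inspected, the helper below reads a model and collects the elements listed above. The name `describe_model` is hypothetical and not part of doc2vec. It assumes the file was written earlier with `write.paragraph2vec`, and the element names are taken from the description above rather than checked against a particular package version.

```r
# Hypothetical helper (not part of doc2vec): read a saved model and
# report the list elements documented for the returned object.
describe_model <- function(file) {
  # Fail early with a readable message if the path is wrong
  if (!file.exists(file)) {
    stop("model file not found: ", file, call. = FALSE)
  }
  model <- doc2vec::read.paragraph2vec(file = file)
  # The returned object is documented as having class paragraph2vec
  stopifnot(inherits(model, "paragraph2vec"))
  list(
    class      = class(model),
    dim        = model$dim,         # dimension of the embedding matrix
    model_path = model$model_path   # path to the model on disk
  )
}
```

Calling `describe_model("mymodel.bin")` after a model has been saved under that name should return a small list that can be printed or logged. The `model` element is left out because it is only a pointer.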
library(doc2vec)
library(tokenizers.bpe)
data(belgium_parliament, package = "tokenizers.bpe")
# Keep the non-empty French texts with fewer than 1000 words
x <- subset(belgium_parliament, language %in% "french")
x <- subset(x, nchar(text) > 0 & txt_count_words(text) < 1000)

# Train a model; the PV-DBOW fit overwrites the PV-DM one
model <- paragraph2vec(x = x, type = "PV-DM", dim = 100, iter = 20)
model <- paragraph2vec(x = x, type = "PV-DBOW", dim = 100, iter = 20)

# Save the model to disk and read it back
path <- "mymodel.bin"
write.paragraph2vec(model, file = path)
model <- read.paragraph2vec(file = path)

# Inspect the vocabulary and the embeddings of the reloaded model
vocab <- summary(model, type = "vocabulary", which = "docs")
vocab <- summary(model, type = "vocabulary", which = "words")
embedding <- as.matrix(model, which = "docs")
embedding <- as.matrix(model, which = "words")