sentiment_by {sentimentr}    R Documentation

Polarity Score (Sentiment Analysis) By Groups

Description

Approximate the sentiment (polarity) of text by grouping variable(s). See sentiment for a full description of the sentiment detection algorithm, the sentiment/valence shifter keys that can be passed into the function, and the other arguments that can be passed.

Usage

sentiment_by(text.var, by = NULL,
  averaging.function = sentimentr::average_downweighted_zero, group.names,
  ...)

Arguments

text.var

The text variable. Also takes a sentimentr or sentiment_by object.

by

The grouping variable(s). Default NULL uses the original row/element indices; if you used a column of 12 rows for text.var these 12 rows will be used as the grouping variable. Also takes a single grouping variable or a list of 1 or more grouping variables.
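
To make the grouping behaviour concrete, the sketch below uses base R only and made-up sentence scores. The scores, element_id, and speaker vectors are hypothetical, and a plain mean stands in for the package's averaging function. It only illustrates how sentence-level scores collapse to one value per original element (the by = NULL case) versus one value per level of a grouping variable.

```r
# Hypothetical sentence-level scores; NOT output from sentimentr
scores     <- c(0.25, 0, -0.5, 0.75, 0.5, -0.25)
element_id <- c(1, 1, 1, 2, 3, 3)              # input element each sentence came from
speaker    <- c("A", "A", "A", "B", "A", "A")  # grouping value, repeated per sentence

# Analogue of by = NULL: one summary value per original element
by_element <- tapply(scores, element_id, mean)

# Analogue of by = speaker: one summary value per level of the grouping variable
by_speaker <- tapply(scores, speaker, mean)

print(by_element)
print(by_speaker)
```

With by = NULL there are as many groups as input elements (three here); with a grouping variable there are as many groups as distinct levels (two here).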

averaging.function

A function for performing the group-by averaging. The default, average_downweighted_zero, downweights zero values in the averaging. Note that the function must handle NAs. The sentimentr functions average_weighted_mixed_sentiment and average_mean are also available. The former upweights negative values for cases where the analyst suspects the speaker is likely to surround negatives with positives (mixed) as a polite social convention while the affective state is still negative. The latter is a standard mean average.
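
As a minimal sketch of a user-supplied averaging function, the hypothetical average_median below takes a numeric vector of scores and returns a single number, dropping NAs first as the argument description requires. The name, the na.rm argument, and the choice of the median are illustrative assumptions and not part of sentimentr.

```r
# Hypothetical custom averaging function: median of the non-missing scores.
# The `...` absorbs any extra arguments so the function tolerates being
# called with more than just the score vector.
average_median <- function(x, na.rm = TRUE, ...) {
    if (na.rm) x <- x[!is.na(x)]
    if (length(x) == 0) return(NA_real_)  # nothing left to summarize
    stats::median(x)
}

# Made-up sentence scores, including a missing value
average_median(c(0.5, NA, -0.25, 0))
```

Under these assumptions, such a function should be usable in the same way as the built-in helpers, for example sentiment_by(mytext, averaging.function = average_median).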

group.names

A vector of names corresponding to the groups. Generally for internal use.

...

Other arguments passed to sentiment.

Value

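Returns a data.table with the grouping variables plus summary columns, including word_count, sd, and ave_sentiment (the averaged sentiment/polarity score for each group).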

See Also

Other sentiment functions: sentiment

Examples

mytext <- c(
   'do you like it?  It is red. But I hate really bad dogs',
   'I am the best friend.',
   "Do you really like it?  I'm not happy"
)

## works on a character vector, but this is not the preferred method:
## it repeats the cost of sentence boundary disambiguation every time
## `sentiment` is run
## Not run: 
sentiment(mytext)
sentiment_by(mytext)

## End(Not run)

## preferred method: split into sentences once, then reuse the result
mytext <- get_sentences(mytext)

sentiment_by(mytext)
sentiment_by(mytext, averaging.function = average_mean)
sentiment_by(mytext, averaging.function = average_weighted_mixed_sentiment)
get_sentences(sentiment_by(mytext))

(mysentiment <- sentiment_by(mytext, question.weight = 0))
stats::setNames(get_sentences(sentiment_by(mytext, question.weight = 0)),
    round(mysentiment[["ave_sentiment"]], 3))

pres_dat <- get_sentences(presidential_debates_2012)

## Not run: 
## less optimized way
with(presidential_debates_2012, sentiment_by(dialogue, person))

## End(Not run)

## Not run: 
sentiment_by(pres_dat, 'person')

(out <- sentiment_by(pres_dat, c('person', 'time')))
plot(out)
plot(uncombine(out))

sentiment_by(out, presidential_debates_2012$person)
with(presidential_debates_2012, sentiment_by(out, time))

highlight(with(presidential_debates_2012, sentiment_by(out, list(person, time))))

## End(Not run)

## Not run: 
## tidy approach
library(dplyr)
library(magrittr)

cannon_reviews %>%
   mutate(review_split = get_sentences(text)) %$%
   sentiment_by(review_split)

## End(Not run)

[Package sentimentr version 2.2.3 Index]