check_text {textclean}R Documentation

Check Text For Potential Problems

Description

Uncleaned text may result in errors, warnings, and incorrect results in subsequent analysis. check_text checks text for potential problems and suggests possible fixes. Potential text anomalies that are detected include: factors, missing ending punctuation, empty cells, double punctuation, non-space after comma, no alphabetic characters, non-ASCII, missing value, and potentially misspelled words.

Usage

check_text(x, file = NULL)

Arguments

x

The text variable.

file

A connection, or a character string naming the file to print to. If NULL prints to the console. Note that this is assigned as an attribute and passed to print.

Value

Returns a list with the following potential text faults reports:

Note

The output is a list but prints as a pretty formatted output with potential problem elements, the accompanying text, and possible suggestions to fix the text.

Examples

## Not run: 
x <- c("i like", "<p>i want. </p>thet them ther .", "I am ! that|", "", NA, 
    "&quot;they&quot;,were there", ".", "   ", "?", "3;", "I like goud eggs!", 
    "i 4like...", "\\tgreat",  "She said \"yes\"")
check_text(x)
print(check_text(x), include.text=FALSE)

y <- c("A valid sentence.", "yet another!")
check_text(y)

## End(Not run)

[Package textclean version 0.7.2 Index]