| stri_extract_all {stringi} | R Documentation |
These functions extract substrings that match a given pattern.
stri_extract_all_* extracts all the matches, whereas
stri_extract_first_* and stri_extract_last_* return only the first
or the last match, respectively.
stri_extract_all(str, ..., regex, fixed, coll, charclass)
stri_extract_first(str, ..., regex, fixed, coll, charclass)
stri_extract_last(str, ..., regex, fixed, coll, charclass)
stri_extract(str, ..., regex, fixed, coll, charclass, mode = c("first", "all",
"last"))
stri_extract_all_charclass(str, pattern, merge = TRUE, simplify = FALSE,
omit_no_match = FALSE)
stri_extract_first_charclass(str, pattern)
stri_extract_last_charclass(str, pattern)
stri_extract_all_coll(str, pattern, simplify = FALSE, omit_no_match = FALSE,
..., opts_collator = NULL)
stri_extract_first_coll(str, pattern, ..., opts_collator = NULL)
stri_extract_last_coll(str, pattern, ..., opts_collator = NULL)
stri_extract_all_regex(str, pattern, simplify = FALSE,
omit_no_match = FALSE, ..., opts_regex = NULL)
stri_extract_first_regex(str, pattern, ..., opts_regex = NULL)
stri_extract_last_regex(str, pattern, ..., opts_regex = NULL)
stri_extract_all_fixed(str, pattern, simplify = FALSE,
omit_no_match = FALSE, ..., opts_fixed = NULL)
stri_extract_first_fixed(str, pattern, ..., opts_fixed = NULL)
stri_extract_last_fixed(str, pattern, ..., opts_fixed = NULL)
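str | character vector with strings to search in |
... | supplementary arguments passed to the underlying functions, including additional settings for opts_collator, opts_regex, and opts_fixed |
mode | single string; one of: "first" (the default), "all", "last" |
pattern, regex, fixed, coll, charclass | character vector defining search patterns; for more details refer to stringi-search |
merge | single logical value; should consecutive matches be merged into one string; applies to stri_extract_all_charclass only |
simplify | single logical value; if TRUE or NA, then a character matrix is returned; otherwise (FALSE, the default), a list of character vectors is returned |
omit_no_match | single logical value; if FALSE (the default), then a missing value indicates that there was no match |
opts_collator, opts_fixed, opts_regex | a named list used to tune a search engine's settings; see stri_opts_collator, stri_opts_fixed, and stri_opts_regex, respectively |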
Vectorized over str and pattern.
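As a minimal sketch of what vectorization means here (the input strings are made up for illustration), str and pattern are paired element by element, and the shorter argument is recycled:

```r
library(stringi)

# Element-wise pairing: the first pattern is applied to the first string,
# the second pattern to the second string.
paired <- stri_extract_first_regex(c("abc1", "x22"), c("[a-z]+", "[0-9]+"))

# Recycling: a single string is searched with each pattern in turn.
recycled <- stri_extract_first_regex("XaaaaX", c("\\p{Ll}", "\\p{Ll}+"))

print(paired)
print(recycled)
```

Both calls return a character vector with one element per (str, pattern) pair.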
If you would like to extract regex capture groups individually,
check out stri_match.
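To illustrate the difference, the following sketch (with a made-up "key=value" input) compares an extract call, which returns only the whole match, with stri_match_first_regex, which also returns each capture group:

```r
library(stringi)

x <- "key=value"
pat <- "(\\w+)=(\\w+)"

# Extraction yields the whole match only.
whole <- stri_extract_first_regex(x, pat)

# Matching yields a character matrix: column 1 holds the whole match,
# the following columns hold the capture groups.
groups <- stri_match_first_regex(x, pat)

print(whole)
print(groups)
```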
stri_extract, stri_extract_all, stri_extract_first,
and stri_extract_last are convenience functions.
They simply call the appropriate stri_extract_*_* function, depending on the arguments used.
Calling one of those underlying functions directly
will make your code run slightly faster.
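A short sketch of this equivalence, using an arbitrary example string: the convenience wrappers and the underlying functions they dispatch to should give identical results.

```r
library(stringi)

x <- "stringi is so good!"

# Convenience wrapper: the search engine is selected by the argument name.
via_wrapper <- stri_extract_all(x, regex = "\\p{L}+")
# Underlying function called directly.
direct <- stri_extract_all_regex(x, "\\p{L}+")

# stri_extract() additionally selects first/all/last through `mode`.
via_mode <- stri_extract(x, regex = "\\p{L}+", mode = "last")
direct_last <- stri_extract_last_regex(x, "\\p{L}+")

print(identical(via_wrapper, direct))
print(identical(via_mode, direct_last))
```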
For stri_extract_all*, if simplify=FALSE (the default), then
a list of character vectors is returned. Each list element
represents the results of a separate search scenario.
If a pattern is not found and omit_no_match=FALSE,
then a character vector of length 1
containing a single NA value is generated.
Otherwise, i.e. if simplify is not FALSE,
then stri_list2matrix with byrow=TRUE argument
is called on the resulting object.
In such a case, a character matrix with an appropriate number of rows
(according to the length of str, pattern, etc.)
is returned. Note that stri_list2matrix's fill argument is set
to an empty string and NA,
for simplify equal to TRUE and NA, respectively.
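The return shapes described above can be compared side by side. This is a sketch on made-up input, in which the second string contains no match:

```r
library(stringi)

x <- c("a1b22", "none", "333")

# Default: a list with one character vector per input string;
# the no-match case is reported as a single NA.
as_list <- stri_extract_all_regex(x, "[0-9]+")

# omit_no_match=TRUE: the no-match case becomes an empty character vector.
no_na <- stri_extract_all_regex(x, "[0-9]+", omit_no_match = TRUE)

# simplify=TRUE: a character matrix, short rows padded with empty strings.
m_empty <- stri_extract_all_regex(x, "[0-9]+", simplify = TRUE)

# simplify=NA: a character matrix, short rows padded with NA.
m_na <- stri_extract_all_regex(x, "[0-9]+", simplify = NA)

print(as_list)
print(no_na)
print(m_empty)
print(m_na)
```

The matrices have one row per input string and as many columns as the longest list of matches.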
stri_extract_first* and stri_extract_last*,
on the other hand, return a character vector.
An NA element indicates no match.
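For instance, in the following sketch (made-up input) each call returns a character vector of the same length as str, with NA where nothing matched:

```r
library(stringi)

x <- c("a1b22", "none")

# One result per input string; NA marks the string without any digits.
first_num <- stri_extract_first_regex(x, "[0-9]+")
last_num <- stri_extract_last_regex(x, "[0-9]+")

print(first_num)
print(last_num)
```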
Other search_extract: stri_extract_all_boundaries,
stri_match_all,
stringi-search
stri_extract_all('XaaaaX', regex=c('\\p{Ll}', '\\p{Ll}+', '\\p{Ll}{2,3}', '\\p{Ll}{2,3}?'))
stri_extract_all('Bartolini', coll='i')
stri_extract_all('stringi is so good!', charclass='\\p{Zs}') # all whitespaces
stri_extract_all_charclass(c('AbcdeFgHijK', 'abc', 'ABC'), '\\p{Ll}')
stri_extract_all_charclass(c('AbcdeFgHijK', 'abc', 'ABC'), '\\p{Ll}', merge=FALSE)
stri_extract_first_charclass('AaBbCc', '\\p{Ll}')
stri_extract_last_charclass('AaBbCc', '\\p{Ll}')
stri_extract_all_coll(c('AaaaaaaA', 'AAAA'), 'a')
stri_extract_first_coll(c('Yy\u00FD', 'AAA'), 'y', strength=2, locale="sk_SK")
stri_extract_last_coll(c('Yy\u00FD', 'AAA'), 'y', strength=1, locale="sk_SK")
stri_extract_all_regex('XaaaaX', c('\\p{Ll}', '\\p{Ll}+', '\\p{Ll}{2,3}', '\\p{Ll}{2,3}?'))
stri_extract_first_regex('XaaaaX', c('\\p{Ll}', '\\p{Ll}+', '\\p{Ll}{2,3}', '\\p{Ll}{2,3}?'))
stri_extract_last_regex('XaaaaX', c('\\p{Ll}', '\\p{Ll}+', '\\p{Ll}{2,3}', '\\p{Ll}{2,3}?'))
stri_list2matrix(stri_extract_all_regex('XaaaaX', c('\\p{Ll}', '\\p{Ll}+')))
stri_extract_all_regex('XaaaaX', c('\\p{Ll}', '\\p{Ll}+'), simplify=TRUE)
stri_extract_all_regex('XaaaaX', c('\\p{Ll}', '\\p{Ll}+'), simplify=NA)
stri_extract_all_fixed("abaBAba", "Aba", case_insensitive=TRUE)
stri_extract_all_fixed("abaBAba", "Aba", case_insensitive=TRUE, overlap=TRUE)