| slide_index {slider} | R Documentation |
slide_index() is similar to slide(), but allows a secondary .i-ndex
vector to be provided.
This is often useful in business calculations, when
you want to perform a rolling computation looking "3 months back", which
is approximately, but not exactly, equal to 3 * 30 days. slide_index() allows
for these irregular window sizes.
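As a small illustration of why "3 months back" and 90 days differ, the sketch below compares the two lookbacks from one date. It uses only base R; the date `2019-06-15` and the variable names are arbitrary choices made for this example.

```r
# Arbitrary reference date, chosen for illustration only.
today <- as.Date("2019-06-15")

# A fixed 90 day lookback.
ninety_days_back <- today - 90

# A calendar-aware 3 month lookback, using base R's seq() method for dates.
three_months_back <- seq(today, by = "-3 months", length.out = 2)[2]

ninety_days_back                       # "2019-03-17"
three_months_back                      # "2019-03-15"
as.integer(today - three_months_back)  # 92 days, not 90
```

The gap changes with the months involved, which is why a fixed number of days cannot stand in for a calendar period.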
slide_index(.x, .i, .f, ..., .before = 0L, .after = 0L, .complete = FALSE)
slide_index_vec(
  .x,
  .i,
  .f,
  ...,
  .before = 0L,
  .after = 0L,
  .complete = FALSE,
  .ptype = NULL
)
slide_index_dbl(.x, .i, .f, ..., .before = 0L, .after = 0L, .complete = FALSE)
slide_index_int(.x, .i, .f, ..., .before = 0L, .after = 0L, .complete = FALSE)
slide_index_lgl(.x, .i, .f, ..., .before = 0L, .after = 0L, .complete = FALSE)
slide_index_chr(.x, .i, .f, ..., .before = 0L, .after = 0L, .complete = FALSE)
slide_index_dfr(
  .x,
  .i,
  .f,
  ...,
  .before = 0L,
  .after = 0L,
  .complete = FALSE,
  .names_to = rlang::zap(),
  .name_repair = c("unique", "universal", "check_unique")
)
slide_index_dfc(
  .x,
  .i,
  .f,
  ...,
  .before = 0L,
  .after = 0L,
  .complete = FALSE,
  .size = NULL,
  .name_repair = c("unique", "universal", "check_unique", "minimal")
)
.x |
The vector to iterate over and apply .f to. |
.i |
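The index vector that determines the window sizes. The lower bound
of the window range will be computed as .i - .before, and the upper bound as .i + .after.
There are 3 restrictions on the index:
The size of the index must match the size of .x; they will not be recycled to their common size.
The index must be an increasing vector, but duplicate values are allowed.
The index cannot have missing values.
|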
.f |
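If a function, it is used as is. If a formula, e.g. ~ .x + 2, it is converted to a function. There are three ways to refer to the arguments:
For a single argument function, use .
For a two argument function, use .x and .y
For more arguments, use ..1, ..2, ..3 etc.
This syntax allows you to create very compact anonymous functions. |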
... |
Additional arguments passed on to the mapped function. |
.before, .after |
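The number of values before or after the current element of .i to include in the sliding window.
Any object that can be added to or subtracted from .i with + and - can be used, for example a lubridate period.
The ranges that result from computing .i - .before and .i + .after have the same 3 restrictions as .i itself. |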
.complete |
Should .f be evaluated on complete windows only? If FALSE, the default, partial computations are allowed. |
.ptype |
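A prototype corresponding to the type of the output.
If NULL, the default, the output type is determined by computing the common type across the results of the calls to .f.
If supplied, the result of each call to .f will be cast to that type, and the final output will have that type.
If getOption("vctrs.no_guessing") is TRUE, the .ptype must be supplied. |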
.names_to |
|
.name_repair |
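One of "unique", "universal", or "check_unique". See vctrs::vec_as_names() for the meaning of these options.
With vctrs::vec_rbind(), the repair function is applied to all inputs separately, because their columns must be aligned before the rows are bound. |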
.size |
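If NULL, the default, the number of rows in the output is determined using the standard recycling rules. Alternatively, specify the desired number of rows, and any inputs of length 1 will be recycled appropriately. |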
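The window rule described for .i, .before, and .after can be sketched in base R. This is only an illustration, not slider's implementation: `index_windows()` is a hypothetical helper written for this example, and it assumes that the window for element `k` covers every position whose index value lies between `i[k] - before` and `i[k] + after`, inclusive.

```r
# Hypothetical helper, for illustration only; not part of slider.
# Returns, for each position, the positions that fall inside its window.
index_windows <- function(i, before = 0, after = 0) {
  # Index restrictions: no missing values, and non-decreasing order.
  stopifnot(!anyNA(i), all(diff(as.numeric(i)) >= 0))

  lapply(seq_along(i), function(k) {
    which(i >= i[k] - before & i <= i[k] + after)
  })
}

# An irregular date index: there are gaps after the 2nd and 3rd values.
i <- as.Date("2019-08-15") + c(0, 1, 4, 6, 7)

windows <- index_windows(i, before = 1)
windows
# Positions per window: 1 | 1 2 | 3 | 4 | 4 5
```

Because the bounds are computed from the index values rather than from positions, `"2019-08-19"` and `"2019-08-21"` each end up alone in their window even though they are adjacent in the vector.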
A vector fulfilling the following invariants:
slide_index()
vec_size(slide_index(.x)) == vec_size(.x)
vec_ptype(slide_index(.x)) == list()
slide_index_vec() and slide_index_*() variants
vec_size(slide_index_vec(.x)) == vec_size(.x)
vec_size(slide_index_vec(.x)[[1]]) == 1L
vec_ptype(slide_index_vec(.x, .ptype = ptype)) == ptype
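The invariants above are written in shorthand that omits .i and .f. A minimal check with concrete inputs might look like the following; it assumes the slider and vctrs packages are installed, and the data and object names are made up for this example.

```r
library(slider)
library(vctrs)

x <- c(10, 20, 30, 40, 50)
i <- as.Date("2019-08-15") + c(0, 1, 4, 6, 7)

# slide_index() returns a list with one element per element of `.x`.
out_list <- slide_index(x, i, sum, .before = 1)
vec_size(out_list) == vec_size(x)
is.list(out_list)

# The typed variants return an atomic vector of the same size.
out_dbl <- slide_index_dbl(x, i, sum, .before = 1)
vec_size(out_dbl) == vec_size(x)
identical(vec_ptype(out_dbl), double())

# With slide_index_vec(), the output type follows `.ptype` when supplied.
out_int <- slide_index_vec(
  1:5, i, ~ length(.x),
  .before = 1,
  .ptype = integer()
)
identical(vec_ptype(out_int), integer())
```

Each call to .f in the typed variants has to return a value of size 1, which is what the `[[1]]` invariant expresses.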
slide(), hop_index(), slide_index2()
x <- 1:5
# In some cases, sliding over `x` with a strict window size of 2
# will fit your use case.
slide(x, ~.x, .before = 1)
# However, if this `i` is a date vector paired with `x`, when computing
# rolling calculations you might want to iterate over `x` while
# respecting the fact that `i` is an irregular sequence.
i <- as.Date("2019-08-15") + c(0:1, 4, 6, 7)
# For example, a "2 day" window should not pair `"2019-08-19"` and
# `"2019-08-21"` together, even though they are next to each other in `x`.
# `slide_index()` computes the lookback value from the current date in `.i`,
# meaning that if you are currently on `"2019-08-21"` and look back 1 day,
# it will correctly not include `"2019-08-19"`.
slide_index(i, i, ~.x, .before = 1)
# We could have equivalently used a lubridate period object for this as well,
# since `i - lubridate::days(1)` is allowed
slide_index(i, i, ~.x, .before = lubridate::days(1))
# ---------------------------------------------------------------------------
# When `.i` has repeated values, they are always grouped together.
i <- c(2017, 2017, 2018, 2019, 2020, 2020)
slide_index(i, i, ~.x)
slide_index(i, i, ~.x, .after = 1)
# ---------------------------------------------------------------------------
# Rolling regressions
# Rolling regressions are easy with `slide_index()` because:
# - Data frame `.x` values are iterated over rowwise
# - The index is respected by using `.i`
set.seed(123)
df <- data.frame(
  y = rnorm(100),
  x = rnorm(100),
  i = as.Date("2019-08-15") + c(0, 2, 4, 6:102) # <- irregular
)
# 20 day rolling regression. Current day + 19 days back.
# Additionally, set `.complete = TRUE` to not compute partial results.
regr <- slide_index(df, df$i, ~lm(y ~ x, .x), .before = 19, .complete = TRUE)
regr[16:18]
# The first 16 slots are NULL because there is no possible way to
# look back 19 days from the 16th index position and construct a full
# window. But on the 17th index position, `"2019-09-03"`, if we look
# back 19 days we get to `"2019-08-15"`, which is the same value as
# `i[1]` so a full window can be constructed.
df$i[16] - 19 >= df$i[1] # FALSE
df$i[17] - 19 >= df$i[1] # TRUE
# ---------------------------------------------------------------------------
# Accessing the current index value
# A very simplistic version of `purrr::map2()`
fake_map2 <- function(.x, .y, .f, ...) {
  Map(.f, .x, .y, ...)
}
# Occasionally you need to access the index value that you are currently on.
# This is generally not possible with a single call to `slide_index()`, but
# can be easily accomplished by following up a `slide_index()` call with a
# `purrr::map2()`. In this example, we want to use the distance from the
# current index value (in days) as a multiplier on `x`. Values further
# away from the current date get a higher multiplier.
set.seed(123)
# 25 random days past 2000-01-01
i <- sort(as.Date("2000-01-01") + sample(100, 25))
df <- data.frame(i = i, x = rnorm(25))
weight_by_distance <- function(df, i) {
  df$weight <- abs(as.integer(df$i - i))
  df$x_weighted <- df$x * df$weight
  df
}
# Use `slide_index()` to just generate the rolling data.
# Here we take the current date + 5 days before + 5 days after.
dfs <- slide_index(df, df$i, ~.x, .before = 5, .after = 5)
# Follow up with a `map2()` with `i` as the second input.
# This allows you to track the current `i` value and weight accordingly.
result <- fake_map2(dfs, df$i, weight_by_distance)
head(result)