storr {storr}    R Documentation

Object cache

Description

Create an object cache; a "storr". A storr is a simple key-value store where the actual content is stored in a content-addressable way (so that duplicate objects are only stored once) and with a caching layer so that repeated lookups are fast even if the underlying storage driver is slow.

Usage

storr(driver, default_namespace = "objects")

Arguments

driver

A driver object

default_namespace

Default namespace to store objects in. By default "objects" is used. Setting this is useful if you want two different storr objects that point at the same underlying storage but store things in different namespaces.

Details

To create a storr you need to provide a "driver" object. There are three in this package: driver_environment for ephemeral in-memory storage, driver_rds for serialized storage to disk, and driver_dbi for use with DBI-compliant database interfaces. The redux package (on CRAN) provides a storr driver that uses Redis.

There are convenience functions (e.g., storr_environment and storr_rds) that may be easier to use than this function.

Once a storr has been made it provides a number of methods. Because storr uses R6 (R6Class) objects, each method is accessed by using $ on a storr object (see the examples). The methods are described below in the "Methods" section.

The default_namespace affects all methods of the storr object that refer to namespaces; if a namespace is not given, then the action (get, set, del, list, import, export) will affect the default_namespace. By default this is "objects".

Methods

destroy

Totally destroys the storr by telling the driver to destroy all the data and then deleting the driver. This will remove all data and cannot be undone.

Usage: destroy()

flush_cache

Flush the temporary cache of objects that accumulates as the storr is used. Should not need to be called often.

Usage: flush_cache()

set

Set a key to a value.

Usage: set(key, value, namespace = self$default_namespace, use_cache = TRUE)

Arguments:

Value: Invisibly, the hash of the saved object.
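
A minimal sketch of the return value of set, assuming the storr package is installed. The key name "a" is illustrative only.

```r
# Assumes the storr package is installed; the key name is illustrative.
st <- storr::storr_environment()

# set() returns the hash invisibly, so assign it to inspect it.
h <- st$set("a", 1:10)

# The key now points at that hash, and the hash indexes the stored object.
same_hash <- identical(h, st$get_hash("a"))
has_object <- st$exists_object(h)
```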

set_by_value

Like set but saves the object with a key that is the same as the hash of the object. Equivalent to $set(digest::digest(value), value).

Usage: set_by_value(value, namespace = self$default_namespace, use_cache = TRUE)

Arguments:
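
As an illustration of set_by_value (a sketch, assuming the storr package is installed), the returned hash doubles as the key:

```r
# Assumes the storr package is installed.
st <- storr::storr_environment()

# Store the value under a key equal to its own hash.
h <- st$set_by_value(c(1, 2, 3))

# The only key in the default namespace is the hash itself.
keys <- st$list()
value <- st$get(h)
```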

get

Retrieve an object from the storr. If the requested value is not found then a KeyError will be raised (an R error, but can be caught with tryCatch; see the "storr" vignette).

Usage: get(key, namespace = self$default_namespace, use_cache = TRUE)

Arguments:
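
One way to handle a missing key is to catch the KeyError condition with tryCatch. The sketch below assumes the storr package is installed; get_or_default is a hypothetical helper written for this example, not part of storr.

```r
# Assumes the storr package is installed.
st <- storr::storr_environment()
st$set("present", 42)

# Hypothetical helper (not part of storr): return `default` rather than
# failing when the key is missing.
get_or_default <- function(st, key, default = NULL) {
  tryCatch(st$get(key), KeyError = function(e) default)
}

found <- get_or_default(st, "present", NA)
fallback <- get_or_default(st, "absent", NA)
```

Catching only KeyError means that any other error raised during the lookup still propagates.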

get_hash

Retrieve the hash of an object stored in the storr (rather than the object itself).

Usage: get_hash(key, namespace = self$default_namespace)

Arguments:

del

Delete an object from the storr.

Usage: del(key, namespace = self$default_namespace)

Arguments:

Value: A logical vector the same length as the recycled length of key/namespace, with each element being TRUE if an object was deleted, FALSE otherwise.
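
A short sketch of the value returned by del when several keys are given, assuming the storr package is installed. The key names are illustrative.

```r
# Assumes the storr package is installed; key names are illustrative.
st <- storr::storr_environment()
st$set("a", 1)

# One element per key: the first key existed, the second was never set.
deleted <- st$del(c("a", "never_set"))

# After deletion the key is no longer listed.
remaining <- st$list()
```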

duplicate

Duplicate the value of a set of keys into a second set of keys. Because the value stored against a key is just the hash of its content, this operation is very efficient: it does not copy the data, only the pointer to the data (the storr vignette explains the storage model in more detail). Multiple keys (and/or namespaces) can be provided, with keys and namespaces recycled as needed. However, the number of source and destination keys must be the same. The order of operations is not defined, so the behaviour is undefined if the sets of keys overlap.

Usage: duplicate(key_src, key_dest, namespace = self$default_namespace, namespace_src = namespace, namespace_dest = namespace)

Arguments:
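
The following sketch (assuming the storr package is installed, with illustrative key names) shows that duplicate adds a key without adding a stored object:

```r
# Assumes the storr package is installed; key names are illustrative.
st <- storr::storr_environment()
st$set("original", c(1, 2, 3))

# Point a second key at the same content.
st$duplicate("original", "copy")

# Two keys, but both map to the same hash, so only one object is stored.
keys <- st$list()
n_hashes <- length(st$list_hashes())
same <- identical(st$get_hash("original"), st$get_hash("copy"))
```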

fill

Set one or more keys (potentially across namespaces) to the same value, without duplicating the serialisation effort or the stored data.

Usage: fill(key, value, namespace = self$default_namespace, use_cache = TRUE)

Arguments:
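
A minimal sketch of fill, assuming the storr package is installed. The key names and the value are illustrative.

```r
# Assumes the storr package is installed; key names are illustrative.
st <- storr::storr_environment()

# Point three keys at one shared value.
st$fill(c("a", "b", "c"), 0)

# Each key returns the value, but only one object is stored.
values <- st$mget(c("a", "b", "c"))
n_hashes <- length(st$list_hashes())
```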

clear

Clear a storr. This function might be slow as it will iterate over each key. Future versions of storr might allow drivers to implement a bulk clear method that will allow faster clearing.

Usage: clear(namespace = self$default_namespace)

Arguments:

exists

Test if a key exists within a namespace

Usage: exists(key, namespace = self$default_namespace)

Arguments:

Value: A logical vector the same length as the recycled length of key/namespace, with each element being TRUE if the object exists and FALSE otherwise.
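
The recycling of key and namespace can be sketched as follows, assuming the storr package is installed. The namespace names "other" and "empty" are illustrative.

```r
# Assumes the storr package is installed; names are illustrative.
st <- storr::storr_environment()
st$set("a", 1)
st$set("a", 2, namespace = "other")

# Several keys in the default namespace.
present <- st$exists(c("a", "b"))

# One key checked across several namespaces.
across <- st$exists("a", c("objects", "other", "empty"))
```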

exists_object

Test if an object with a given hash exists within the storr

Usage: exists_object(hash)

Arguments:

mset

Set multiple elements at once

Usage: mset(key, value, namespace = self$default_namespace, use_cache = TRUE)

Arguments:

Details: The arguments key and namespace are recycled such that either can be given as a scalar if the other is a vector. Other recycling is not allowed.

mget

Get multiple elements at once

Usage: mget(key, namespace = self$default_namespace, use_cache = TRUE, missing = NULL)

Arguments:

Details: The arguments key and namespace are recycled such that either can be given as a scalar if the other is a vector. Other recycling is not allowed.

Value: A list with a length of the recycled length of key and namespace. If any elements are missing, then an attribute missing will indicate the elements that are missing (this will be an integer vector with the indices of the values that were not found in the storr).
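
A sketch of mset and mget together, assuming the storr package is installed. It shows how the missing argument and the "missing" attribute behave when one requested key is absent; the key names are illustrative.

```r
# Assumes the storr package is installed; key names are illustrative.
st <- storr::storr_environment()

# Set two keys at once; values are given as a list, one per key.
st$mset(c("a", "b"), list(1, "two"))

# Request three keys, one of which was never set. Missing entries are
# filled with the value of `missing` (here NA rather than the default NULL).
res <- st$mget(c("a", "nope", "b"), missing = NA)

# Positions of the keys that were not found.
missing_idx <- attr(res, "missing")
```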

mset_by_value

Set multiple elements at once, by value. A cross between mset and set_by_value.

Usage: mset_by_value(value, namespace = self$default_namespace, use_cache = TRUE)

Arguments:

gc

Garbage collect the storr. Because keys do not directly map to objects, but instead map to hashes which map to objects, it is possible that hash/object pairs can persist with nothing pointing at them. Running gc will remove these objects from the storr.

Usage: gc()
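
To illustrate how an object can become unreferenced, the sketch below overwrites a key and then runs gc. It assumes the storr package is installed; the key name and values are illustrative.

```r
# Assumes the storr package is installed; the key name is illustrative.
st <- storr::storr_environment()
st$set("a", 1:3)

# Overwriting the key leaves the first object with nothing pointing at it.
st$set("a", letters)
before <- length(st$list_hashes())

# gc() removes objects that no key refers to.
st$gc()
after <- length(st$list_hashes())
```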

get_value

Get the content of an object given its hash.

Usage: get_value(hash, use_cache = TRUE)

Arguments:

Value: The object if it is present; otherwise a HashError is thrown.

set_value

Add an object value, but don't add a key. You will not need to use this very often, but it is used internally.

Usage: set_value(value, use_cache = TRUE)

Arguments:

Value: Invisibly, the hash of the object.

mset_value

Add a vector of object values, but don't add keys. You will not need to use this very often, but it is used internally.

Usage: mset_value(values, use_cache = TRUE)

Arguments:

list

List all keys stored in a namespace.

Usage: list(namespace = self$default_namespace)

Arguments:

Value: A sorted character vector (possibly zero-length).

list_hashes

List all hashes stored in the storr

Usage: list_hashes()

Value: A sorted character vector (possibly zero-length).

list_namespaces

List all namespaces known to the database

Usage: list_namespaces()

Value: A sorted character vector (possibly zero-length).

import

Import R objects from an environment.

Usage: import(src, list = NULL, namespace = self$default_namespace, skip_missing = FALSE)

Arguments:

export

Export objects from the storr into something else.

Usage: export(dest, list = NULL, namespace = self$default_namespace, skip_missing = FALSE)

Arguments:

Value: Invisibly, dest, which allows use of e <- st$export(new.env()) and x <- st$export(list()).
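
A sketch of export into an environment and a list, followed by import into a second storr. It assumes the storr package is installed; the key names and the second storr st2 are illustrative.

```r
# Assumes the storr package is installed; names are illustrative.
st <- storr::storr_environment()
st$mset(c("x", "y"), list(1, 2))

# Export every key in the default namespace into a new environment.
e <- st$export(new.env(parent = emptyenv()))

# Export only selected keys into a list.
l <- st$export(list(), list = "x")

# Import the contents of the environment into a separate storr.
st2 <- storr::storr_environment()
st2$import(e)
```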

archive_export

Export objects from the storr into a special "archive" storr, which is a storr_rds with name mangling turned on (which encodes keys with base64 so that they do not violate filesystem naming conventions).

Usage: archive_export(path, names = NULL, namespace = NULL)

Arguments:

archive_import

Inverse of archive_export; import objects from a storr that was created by archive_export.

Usage: archive_import(path, names = NULL, namespace = NULL)

Arguments:

index_export

Generate a data.frame with an index of objects present in a storr. This can be saved (for an rds storr) in lieu of the keys/ directory and re-imported with index_import. It will provide a more version control friendly export of the data in a storr.

Usage: index_export(namespace = NULL)

Arguments:

index_import

Import an index.

Usage: index_import(index)

Arguments:

Examples

st <- storr(driver_environment())
## Set "mykey" to hold the mtcars dataset:
st$set("mykey", mtcars)
## and get the object:
st$get("mykey")
## List known keys:
st$list()
## List hashes
st$list_hashes()
## List keys in another namespace:
st$list("namespace2")
## We can store things in other namespaces:
st$set("x", mtcars, "namespace2")
st$set("y", mtcars, "namespace2")
st$list("namespace2")
## Duplicate data do not cause duplicate storage: despite having three
## keys we only have one bit of data:
st$list_hashes()
st$del("mykey")

## Storr objects can be created that have a default namespace that is
## not "objects" by using the default_namespace argument (this
## one also points at the same memory as the first storr).
st2 <- storr(driver_environment(st$driver$envir),
             default_namespace = "namespace2")
## All functions now use "namespace2" as the default namespace:
st2$list()
st2$del("x")
st2$del("y")

[Package storr version 1.2.5 Index]