public class SortingCollection<T>
extends java.lang.Object
implements java.lang.Iterable<T>
When iterating over the collection, the number of file handles required is numRecordsInCollection/maxRecordsInRam. If this becomes a limiting factor, a file handle cache could be added.
If the Snappy DLL is available and the snappy.disable system property is not set to true, Snappy is used to compress temporary files.
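
As a rough illustration of the file-handle arithmetic above, the sketch below estimates how many temporary files, and therefore open handles during iteration, a given workload might need. The class and method names are hypothetical and not part of htsjdk. Rounding up is an assumption, since the text only states the ratio. The second method simply reads the snappy.disable system property named above.

```java
// Hypothetical helper, not part of htsjdk.
class SpillEstimate {
    /** Estimated spill files if one file is written per maxRecordsInRam records (rounded up). */
    static long estimatedSpillFiles(long numRecordsInCollection, int maxRecordsInRam) {
        if (maxRecordsInRam <= 0) {
            throw new IllegalArgumentException("maxRecordsInRam must be positive");
        }
        return (numRecordsInCollection + maxRecordsInRam - 1) / maxRecordsInRam;
    }

    /** True when the snappy.disable system property is set to "true". */
    static boolean snappyDisabledByProperty() {
        return Boolean.getBoolean("snappy.disable");
    }

    public static void main(String[] args) {
        // 10 million records with 500,000 held in RAM at a time.
        System.out.println(estimatedSpillFiles(10_000_000L, 500_000));
        System.out.println(snappyDisabledByProperty());
    }
}
```

If the estimate approaches the operating system's open-file limit, raising the in-memory record count is the simplest way to lower it.
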
Modifier and Type | Class and Description |
---|---|
static interface | SortingCollection.Codec<T>: Clients must implement this interface, which defines how records are written to and read from a file. |
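
To make the codec idea concrete, here is a minimal sketch of a record codec for strings. The `RecordCodec` interface below is a hypothetical stand-in defined only for this example; it is not the `SortingCollection.Codec<T>` interface, whose actual methods should be taken from its own Javadoc. Returning `null` at end of stream is likewise an assumption of the sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a record codec; NOT the htsjdk interface.
interface RecordCodec<T> {
    void setOutputStream(OutputStream os);
    void setInputStream(InputStream is);
    void encode(T value);
    T decode(); // returns null at end of stream (assumption of this sketch)
}

class StringCodec implements RecordCodec<String> {
    private DataOutputStream out;
    private DataInputStream in;

    @Override public void setOutputStream(OutputStream os) { out = new DataOutputStream(os); }
    @Override public void setInputStream(InputStream is) { in = new DataInputStream(is); }

    @Override public void encode(String value) {
        try {
            out.writeUTF(value); // length-prefixed, so records are self-delimiting
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override public String decode() {
        try {
            return in.readUTF();
        } catch (EOFException e) {
            return null; // no more records
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the given values to a byte buffer and reads the first one back. */
    static String roundTripFirst(String... values) {
        StringCodec codec = new StringCodec();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        codec.setOutputStream(buffer);
        for (String v : values) {
            codec.encode(v);
        }
        codec.setInputStream(new ByteArrayInputStream(buffer.toByteArray()));
        return codec.decode();
    }
}
```

The property that matters is that a decoded record equals the record that was encoded, since records that spill to disk come back as new objects.
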
Modifier and Type | Method and Description |
---|---|
void | add(T rec) |
void | cleanup(): Delete any temporary files. |
void | doneAdding(): Can be called after the caller is done adding to the collection, in order to possibly free up memory. |
boolean | isDestructiveIteration() |
CloseableIterator<T> | iterator(): Prepare to iterate through the records in order. |
static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM): Syntactic sugar around the constructor, to save some typing of type parameters. |
static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling): Syntactic sugar around the constructor, to save some typing of type parameters. |
static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling, java.nio.file.Path... tmpDir): Syntactic sugar around the constructor, to save some typing of type parameters. |
static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.util.Collection<java.io.File> tmpDirs): Deprecated since 2017-09. Use newInstanceFromPaths(Class, Codec, Comparator, int, Collection) instead. |
static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.io.File... tmpDir): Deprecated since 2017-09. Use newInstance(Class, Codec, Comparator, int, Path...) instead. |
static <T> SortingCollection<T> | newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.nio.file.Path... tmpDir): Syntactic sugar around the constructor, to save some typing of type parameters. |
static <T> SortingCollection<T> | newInstanceFromPaths(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.util.Collection<java.nio.file.Path> tmpDirs): Syntactic sugar around the constructor, to save some typing of type parameters. |
void | setDestructiveIteration(boolean destructiveIteration): Tell this collection that it is allowed to discard data during iteration in order to reduce memory footprint, precluding a second iteration. |
void | spillToDisk(): Sort the records in memory, write them to a file, and clear the buffer of records in memory. |
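
The life cycle summarized in the table (add records, spill sorted runs to disk, iterate in order, clean up) can be sketched with a tiny stand-alone sorter. This is an illustration of the general spill-and-merge technique under simplifying assumptions (single-line strings, natural ordering, the default temp directory); `MiniSortingCollection` is a hypothetical class and does not reproduce the htsjdk implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative only: a tiny spill-and-merge sorter for single-line strings.
class MiniSortingCollection {
    /** Next unread line of one spill file, plus the reader it came from. */
    private static final class Head {
        final String line;
        final BufferedReader source;
        Head(String line, BufferedReader source) { this.line = line; this.source = source; }
    }

    private final int maxRecordsInRam;
    private final List<String> ramRecords = new ArrayList<>();
    private final List<Path> spillFiles = new ArrayList<>();

    MiniSortingCollection(int maxRecordsInRam) { this.maxRecordsInRam = maxRecordsInRam; }

    void add(String rec) {
        if (ramRecords.size() == maxRecordsInRam) spillToDisk();
        ramRecords.add(rec);
    }

    /** Sort the in-memory records, write them to a temp file, and clear the buffer. */
    void spillToDisk() {
        try {
            Collections.sort(ramRecords);
            Path file = Files.createTempFile("mini-sort", ".txt");
            Files.write(file, ramRecords);
            spillFiles.add(file);
            ramRecords.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Merge all sorted runs; one reader (file handle) is open per spill file. */
    List<String> sortedRecords() {
        if (spillFiles.isEmpty()) { // everything fit in RAM
            List<String> copy = new ArrayList<>(ramRecords);
            Collections.sort(copy);
            return copy;
        }
        if (!ramRecords.isEmpty()) spillToDisk();
        List<BufferedReader> readers = new ArrayList<>();
        List<String> result = new ArrayList<>();
        try {
            PriorityQueue<Head> heads = new PriorityQueue<>(Comparator.comparing((Head h) -> h.line));
            for (Path file : spillFiles) {
                BufferedReader reader = Files.newBufferedReader(file);
                readers.add(reader);
                String first = reader.readLine();
                if (first != null) heads.add(new Head(first, reader));
            }
            while (!heads.isEmpty()) {
                Head head = heads.poll(); // smallest remaining record across all runs
                result.add(head.line);
                String next = head.source.readLine();
                if (next != null) heads.add(new Head(next, head.source));
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            for (BufferedReader r : readers) {
                try { r.close(); } catch (IOException ignored) { }
            }
        }
    }

    /** Delete any temporary files. */
    void cleanup() {
        for (Path file : spillFiles) {
            try { Files.deleteIfExists(file); } catch (IOException ignored) { }
        }
        spillFiles.clear();
    }

    static List<String> demo() {
        MiniSortingCollection c = new MiniSortingCollection(2);
        for (String s : new String[] {"delta", "alpha", "echo", "charlie", "bravo"}) c.add(s);
        try {
            return c.sortedRecords();
        } finally {
            c.cleanup();
        }
    }
}
```

In `demo()`, five records with two held in RAM produce three spill files, so three readers are open during the merge. This matches the file-handle ratio described in the class overview.
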
public void add(T rec)
public void doneAdding()
public boolean isDestructiveIteration()
public void setDestructiveIteration(boolean destructiveIteration)
public void spillToDisk()
public CloseableIterator<T> iterator()
Specified by: iterator in interface java.lang.Iterable<T>
public void cleanup()
@Deprecated public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.io.File... tmpDir)
Deprecated. Since 2017-09. Use newInstance(Class, Codec, Comparator, int, Path...) instead.
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines the output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
tmpDir - Where to write files of records that will not fit in RAM.

@Deprecated public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.util.Collection<java.io.File> tmpDirs)
Deprecated. Since 2017-09. Use newInstanceFromPaths(Class, Codec, Comparator, int, Collection) instead.
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines the output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
tmpDirs - Where to write files of records that will not fit in RAM.

public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling)
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines the output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
printRecordSizeSampling - If true, record size will be sampled and output at DEBUG log level.

public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, boolean printRecordSizeSampling, java.nio.file.Path... tmpDir)
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines the output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
printRecordSizeSampling - If true, record size will be sampled and output at DEBUG log level.
tmpDir - Where to write files of records that will not fit in RAM.

public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM)
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines the output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.

public static <T> SortingCollection<T> newInstance(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.nio.file.Path... tmpDir)
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines the output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
tmpDir - Where to write files of records that will not fit in RAM.

public static <T> SortingCollection<T> newInstanceFromPaths(java.lang.Class<T> componentType, SortingCollection.Codec<T> codec, java.util.Comparator<T> comparator, int maxRecordsInRAM, java.util.Collection<java.nio.file.Path> tmpDirs)
Parameters:
componentType - Class of the record to be sorted. Necessary because Java generics are erased at runtime.
codec - For writing records to file and reading them back into RAM.
comparator - Defines the output sort order.
maxRecordsInRAM - How many records to accumulate in memory before spilling to disk.
tmpDirs - Where to write files of records that will not fit in RAM.
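
Callers moving off the deprecated java.io.File overloads need their temporary directories as java.nio.file.Path values. The sketch below shows one way to do that conversion using only the standard library. `TmpDirs` is a hypothetical helper, not part of htsjdk, and falling back to the `java.io.tmpdir` system property is an assumption of the example rather than documented behavior of this class.

```java
import java.io.File;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical migration helper, not part of htsjdk.
class TmpDirs {
    /** Converts legacy File-based temp directories to Paths, preserving order. */
    static List<Path> toPaths(Collection<File> tmpDirs) {
        List<Path> paths = new ArrayList<>(tmpDirs.size());
        for (File dir : tmpDirs) {
            paths.add(dir.toPath());
        }
        return paths;
    }

    /** Uses the JVM default temp directory (an assumption of this sketch). */
    static List<Path> defaultTmpDirs() {
        List<File> dirs = new ArrayList<>();
        dirs.add(new File(System.getProperty("java.io.tmpdir")));
        return toPaths(dirs);
    }
}
```

The resulting list could be passed as the tmpDirs argument of newInstanceFromPaths(Class, Codec, Comparator, int, Collection), or turned into an array with `toArray(new Path[0])` for the Path... overloads.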