Class FastqInputFormat

    • Nested Class Summary

      • Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat

        org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected boolean splittable  
      • Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat

        DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      protected boolean isSplitable(org.apache.hadoop.mapreduce.JobContext context, org.apache.hadoop.fs.Path filename)
      Checks whether the given file is splittable.
      • Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat

        addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, listStatus, makeSplit, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
      • Methods inherited from class org.apache.hadoop.mapreduce.InputFormat

        createRecordReader
    • Field Detail

      • splittable

        protected boolean splittable
    • Constructor Detail

      • FastqInputFormat

        public FastqInputFormat()
    • Method Detail

      • isSplitable

        protected boolean isSplitable(org.apache.hadoop.mapreduce.JobContext context,
                                      org.apache.hadoop.fs.Path filename)
        Checks whether the given file is splittable. A file is splittable if it is:
        - Uncompressed.
        - Compressed with the BGZFEnhancedGzipCodec and the underlying stream is a BGZF stream. BGZFEnhancedGzipCodec matches any file with a .gz extension, so the codec may also be selected for an ordinary (non-block) gzipped file, which is not splittable. To tell the two cases apart, HTSJDK's built-in check for BGZF streams is used.
        - Compressed with any other splittable codec (e.g., .bgz/BGZFCodec, .bz2/BZip2Codec).
        Overrides:
        isSplitable in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<Void,org.apache.hadoop.io.Text>
        Parameters:
        context - The job context to get the configuration from.
        filename - The path of the input file.
        Returns:
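        true if the file is uncompressed or compressed with a splittable codec; false if it is compressed in a non-splittable way (for example, a plain gzip file with a .gz extension).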
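
        The decision described for isSplitable can be illustrated with a small standalone sketch. This is not the class's actual implementation: it has no Hadoop or HTSJDK dependency, the class and method names (SplittabilitySketch, looksLikeBgzf, isSplittable) are hypothetical, and the file extension stands in for Hadoop's codec lookup. Any extension other than .gz, .bgz, or .bz2 is assumed to mean an uncompressed file. The BGZF test inspects the fixed gzip header fields and assumes the "BC" extra subfield comes first in the header.

```java
public class SplittabilitySketch {

    // Hypothetical helper: true if the leading bytes look like a BGZF block header.
    // A BGZF block is a gzip member with the FEXTRA flag set and a "BC" extra subfield.
    static boolean looksLikeBgzf(byte[] header) {
        return header.length >= 14
            && (header[0] & 0xFF) == 0x1F   // gzip magic, first byte
            && (header[1] & 0xFF) == 0x8B   // gzip magic, second byte
            && header[2] == 8               // compression method: deflate
            && (header[3] & 4) != 0         // FLG.FEXTRA is set
            && header[12] == 'B'            // extra subfield identifier, first byte
            && header[13] == 'C';           // extra subfield identifier, second byte
    }

    // Hypothetical stand-in for the splittability decision; the extension
    // replaces the codec lookup that a real input format would perform.
    static boolean isSplittable(String filename, byte[] header) {
        if (filename.endsWith(".gz")) {
            // A .gz file may be plain gzip or BGZF; only BGZF is splittable.
            return looksLikeBgzf(header);
        }
        if (filename.endsWith(".bgz") || filename.endsWith(".bz2")) {
            // Codecs that are splittable by design.
            return true;
        }
        // Assumption: anything else is uncompressed and therefore splittable.
        return true;
    }

    public static void main(String[] args) {
        byte[] bgzfHeader = {0x1F, (byte) 0x8B, 8, 4, 0, 0, 0, 0, 0, (byte) 0xFF,
                             6, 0, 'B', 'C', 2, 0, 0x1B, 0};
        byte[] plainGzipHeader = {0x1F, (byte) 0x8B, 8, 0, 0, 0, 0, 0, 0, 3};

        System.out.println("reads.fq          : " + isSplittable("reads.fq", new byte[0]));
        System.out.println("reads.fq.gz (BGZF): " + isSplittable("reads.fq.gz", bgzfHeader));
        System.out.println("reads.fq.gz (gzip): " + isSplittable("reads.fq.gz", plainGzipHeader));
        System.out.println("reads.fq.bz2      : " + isSplittable("reads.fq.bz2", new byte[0]));
    }
}
```

        The sketch shows why the .gz case needs a look at the stream contents: the extension alone cannot separate a block-compressed file from an ordinary gzip file.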