Class FeatureGenerator


  • public class FeatureGenerator
    extends java.lang.Object
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static java.lang.String[] str
      Definitions of all the adopted SYBYL atom types.
    • Method Summary

      Modifier and Type Method Description
      java.lang.String generateAllFeatures​(java.util.ArrayList<java.lang.String> molf, java.util.ArrayList<java.util.ArrayList<java.lang.String>> atomf, java.lang.String pathToInputFile)
      This function is used to generate all the features for the molecules contained in the input file.
      java.lang.String[] generateAtomicFeatures​(org.openscience.cdk.interfaces.IAtomContainer molecule)
      This function is used to generate the atomic features for a single molecule.
      java.util.ArrayList<java.util.ArrayList<java.lang.String>> generateAtomicFeatures​(org.openscience.cdk.interfaces.IAtomContainerSet containers)
      This function is used to generate the atomic features for the molecules contained in the input file.
      java.lang.String[] generateAtomType​(org.openscience.cdk.interfaces.IAtomContainer molecule)
      This function is used to generate the atom types for a single molecule.
      java.lang.String generateExtendedMolecularFeatures​(org.openscience.cdk.interfaces.IAtomContainer molecule)
      This function generates extended molecular features for a single molecule.
      java.lang.String generateMolecularFeatures​(org.openscience.cdk.interfaces.IAtomContainer molecule)
      This function generates molecular features for a single molecule.
      java.util.ArrayList<java.lang.String> generateMolecularFeatures​(org.openscience.cdk.interfaces.IAtomContainerSet set, java.lang.String pathToInputFile)
      This function is used to generate the molecular features for the molecules contained in the input file.
      java.lang.String[] generatePath​(org.openscience.cdk.interfaces.IAtomContainer molecule, int depth)
      This function is used to generate the environmental features for a single molecule.
      org.openscience.cdk.interfaces.IAtomContainerSet readFile​(java.lang.String pathToInputFile)
      This function reads molecules from a chemical file and preprocesses them.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • str

        public static final java.lang.String[] str
        Definitions of all the adopted SYBYL atom types.
    • Constructor Detail

      • FeatureGenerator

        public FeatureGenerator()
    • Method Detail

      • readFile

        public org.openscience.cdk.interfaces.IAtomContainerSet readFile​(java.lang.String pathToInputFile)
                                                                  throws java.io.FileNotFoundException,
                                                                         org.openscience.cdk.exception.CDKException
        This function reads molecules from a chemical file and preprocesses them.
        Parameters:
        pathToInputFile - path of the file that contains the input molecules
        Returns:
        an IAtomContainerSet containing all the input molecules
        Throws:
        java.io.FileNotFoundException
        org.openscience.cdk.exception.CDKException
      • generateMolecularFeatures

        public java.util.ArrayList<java.lang.String> generateMolecularFeatures​(org.openscience.cdk.interfaces.IAtomContainerSet set,
                                                                               java.lang.String pathToInputFile)
                                                                        throws java.io.IOException,
                                                                               org.openscience.cdk.exception.CDKException
        This function is used to generate the molecular features for the molecules contained in the input file.
        Parameters:
        set - the IAtomContainerSet that contains the input molecules
        pathToInputFile - path of the file that contains the input molecules
        Returns:
        an ArrayList containing the molecular features of the input molecules
        Throws:
        java.io.IOException
        org.openscience.cdk.exception.CDKException
      • generateMolecularFeatures

        public java.lang.String generateMolecularFeatures​(org.openscience.cdk.interfaces.IAtomContainer molecule)
                                                   throws org.openscience.cdk.exception.CDKException
        This function generates molecular features for a single molecule.
        Parameters:
        molecule -
        Returns:
        String representation of molecular features where each feature is separated by a comma
        Throws:
        org.openscience.cdk.exception.CDKException
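        The comma-separated string returned by generateMolecularFeatures can be split back into individual values. The following is a minimal sketch, assuming every feature is numeric. The class FeatureStringParser, the method parseFeatures, and the sample string are hypothetical and are not part of this API.

```java
import java.util.ArrayList;
import java.util.List;

public class FeatureStringParser {

    // Splits a comma-separated feature string into numeric values.
    // The -1 limit keeps trailing empty fields instead of dropping them.
    static List<Double> parseFeatures(String csvLine) {
        List<Double> values = new ArrayList<>();
        for (String field : csvLine.split(",", -1)) {
            // Assumption: an empty or non-numeric field is recorded as NaN.
            try {
                values.add(Double.parseDouble(field.trim()));
            } catch (NumberFormatException e) {
                values.add(Double.NaN);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical feature string; real input would come from
        // generateMolecularFeatures(molecule).
        String features = "180.16,1.31,3,,0";
        System.out.println(parseFeatures(features));
    }
}
```

        Recording unparsable fields as NaN keeps every value at the same position as in the original string, which matters if the columns are later matched to feature names.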
      • generateExtendedMolecularFeatures

        public java.lang.String generateExtendedMolecularFeatures​(org.openscience.cdk.interfaces.IAtomContainer molecule)
                                                           throws org.openscience.cdk.exception.CDKException
        This function generates extended molecular features for a single molecule.
        Parameters:
        molecule -
        Returns:
        String representation of molecular features where each feature is separated by a comma
        Throws:
        org.openscience.cdk.exception.CDKException
      • generateAtomicFeatures

        public java.util.ArrayList<java.util.ArrayList<java.lang.String>> generateAtomicFeatures​(org.openscience.cdk.interfaces.IAtomContainerSet containers)
                                                                                          throws org.openscience.cdk.exception.CDKException,
                                                                                                 java.io.IOException
        This function is used to generate the atomic features for the molecules contained in the input file.
        Parameters:
        containers - the IAtomContainerSet that contains the input molecules
        Returns:
        a list with one entry per molecule, each entry holding the string representations of that molecule's atomic features, where each feature is separated by a comma
        Throws:
        org.openscience.cdk.exception.CDKException
        java.io.IOException
      • generateAtomType

        public java.lang.String[] generateAtomType​(org.openscience.cdk.interfaces.IAtomContainer molecule)
                                            throws org.openscience.cdk.exception.CDKException
        This function is used to generate the atom types for a single molecule.
        Parameters:
        molecule -
        Returns:
        String representation of the atom types of a single molecule where each feature is separated by a comma
        Throws:
        org.openscience.cdk.exception.CDKException
      • generatePath

        public java.lang.String[] generatePath​(org.openscience.cdk.interfaces.IAtomContainer molecule,
                                               int depth)
                                        throws org.openscience.cdk.exception.CDKException
        This function is used to generate the environmental features for a single molecule.
        Parameters:
        molecule -
        depth - the distance, counted in bonds, between an atom and the central atom
        Returns:
        String representation of the environmental features for a single molecule where each feature is separated by a comma
        Throws:
        org.openscience.cdk.exception.CDKException
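        The depth parameter of generatePath can be read as a bond distance from a central atom. The following sketch illustrates that idea only: it collects the atoms that lie exactly depth bonds away from a chosen atom using a breadth-first search. It is not the implementation used by generatePath; the class AtomEnvironment, the method atomsAtDepth, and the adjacency-list representation of a molecule are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class AtomEnvironment {

    // Returns the indices of the atoms exactly `depth` bonds away from `center`.
    // bonds.get(i) lists the atoms bonded to atom i (hypothetical representation).
    static List<Integer> atomsAtDepth(List<List<Integer>> bonds, int center, int depth) {
        int[] distance = new int[bonds.size()];
        Arrays.fill(distance, -1);      // -1 marks an atom that has not been reached
        distance[center] = 0;

        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(center);
        List<Integer> result = new ArrayList<>();

        while (!queue.isEmpty()) {
            int atom = queue.remove();
            if (distance[atom] == depth) {
                result.add(atom);
                continue;               // do not search past the requested depth
            }
            for (int neighbor : bonds.get(atom)) {
                if (distance[neighbor] == -1) {
                    distance[neighbor] = distance[atom] + 1;
                    queue.add(neighbor);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Heavy-atom skeleton of ethanol: C0 - C1 - O2
        List<List<Integer>> ethanol = List.of(List.of(1), List.of(0, 2), List.of(1));
        System.out.println(atomsAtDepth(ethanol, 0, 1));
        System.out.println(atomsAtDepth(ethanol, 0, 2));
    }
}
```

        Breadth-first search visits atoms in order of increasing bond distance, so the first time an atom is reached gives its shortest distance to the central atom, even in ring systems.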
      • generateAtomicFeatures

        public java.lang.String[] generateAtomicFeatures​(org.openscience.cdk.interfaces.IAtomContainer molecule)
                                                  throws org.openscience.cdk.exception.CDKException,
                                                         org.openscience.cdk.exception.NoSuchAtomTypeException
        This function is used to generate the atomic features for a single molecule.
        Parameters:
        molecule -
        Returns:
        String representation of the atomic features for a single molecule where each feature is separated by a comma
        Throws:
        org.openscience.cdk.exception.CDKException
        org.openscience.cdk.exception.NoSuchAtomTypeException
      • generateAllFeatures

        public java.lang.String generateAllFeatures​(java.util.ArrayList<java.lang.String> molf,
                                                    java.util.ArrayList<java.util.ArrayList<java.lang.String>> atomf,
                                                    java.lang.String pathToInputFile)
                                             throws java.lang.Exception
        This function is used to generate all the features for the molecules contained in the input file.
        Parameters:
        molf - the molecular features of all the molecules
        atomf - the atomic features of all the molecules
        pathToInputFile - path of the file that contains the input molecules
        Returns:
        the path of the output CSV file that contains all the features for the molecules
        Throws:
        java.lang.Exception
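        generateAllFeatures takes the molecular features (molf) and the atomic features (atomf) and writes them to a CSV file. The row layout of that file is not documented here, so the following is only a sketch of one plausible way to join the two inputs: one row per atom, prefixed with the features of its molecule. The class FeatureRows, the method combine, the row layout, and the sample values are all assumptions, not the behavior of this class.

```java
import java.util.ArrayList;
import java.util.List;

public class FeatureRows {

    // Builds one CSV row per atom by prefixing each atomic feature string
    // with the feature string of the molecule the atom belongs to.
    // Assumption: molf.get(i) and atomf.get(i) describe the same molecule.
    static List<String> combine(List<String> molf, List<? extends List<String>> atomf) {
        if (molf.size() != atomf.size()) {
            throw new IllegalArgumentException("molf and atomf must have one entry per molecule");
        }
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < molf.size(); i++) {
            for (String atomFeatures : atomf.get(i)) {
                rows.add(molf.get(i) + "," + atomFeatures);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical values: the first molecule has two atoms, the second has one.
        List<String> molf = List.of("46.07,-0.31", "18.02,-1.38");
        List<List<String>> atomf = List.of(List.of("C.3,0.1", "O.3,0.4"), List.of("O.3,0.2"));
        combine(molf, atomf).forEach(System.out::println);
    }
}
```

        The wildcard parameter type lets the method accept the ArrayList<ArrayList<String>> returned by generateAtomicFeatures as well as other list implementations.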