INDEX file used ONLY for SMARTS substructure pattern!!!


Basic filters for multimolecule file (like CHEMBL):
1) -s molecules that match a SMARTS string
2) -v molecules that DON't match a SMARTS string
3) -f and -l molecules in a certain range
4) --unique only unique molecules
5) --filter molecules that meet specified criteria

for the --filter option::

babel chembl.fs --filter "MW<130" output.smi
       multimol file             filter
babel chembl.fs --filter "s='CN' s!='[N+]'"  output.smi


• Substructure:
babel index.fs outfile.yyy -s SMILES
or
babel index.fs outfile.yyy -s filename.xxx

where yyy is a format id known to OpenBabel, e.g. sdf
• Molecular similarity based on Tanimoto coefficient:
babel index.fs outfile.yyy -at15 -sSMILES # best 15 molecules
babel index.fs outfile.yyy -at0.7 -sSMILES # tanimoto > 0.7
babel index.fs outfile.yyy -at0.7,0.9 -sSMILES # 0.7 < tanimoto < 0.9


Available filters
---------------------
abonds    Number of aromatic bonds
atoms    Number of atoms
bonds    Number of bonds
cansmi    Canonical SMILES
cansmiNS    Canonical SMILES without isotopes or stereo
dbonds    Number of double bonds
formula    Chemical formula
HBA1    Number of Hydrogen Bond Acceptors 1 (JoelLib)
HBA2    Number of Hydrogen Bond Acceptors 2 (JoelLib)
HBD    Number of Hydrogen Bond Donors (JoelLib)
InChI    IUPAC InChI identifier
InChIKey    InChIKey
L5    Lipinski Rule of Five
logP    octanol/water partition coefficient
MR    molar refractivity
MW    Molecular Weight filter
nF    Number of Fluorine Atoms
s    SMARTS filter
sbonds    Number of single bonds
smarts    SMARTS filter
tbonds    Number of triple bonds
title    For comparing a molecule's title
TPSA    topological polar surface area

================================================
================================================
Similarity search and molecular fingerprints

Available fingerprints:
FP2    Indexes linear fragments up to 7 atoms.
FP3    SMARTS patterns specified in the file patterns.txt
FP4    SMARTS patterns specified in the file SMARTS_InteLigand.txt
MACCS    SMARTS patterns specified in the file MACCS.txt

babel referencemol.smi chembl.sdf -ofpt -xfFP2  # gives you the tanimoto for all molecules in chembl.sdf with respect to referencemol
babel chembl.sdf -sreferencemol.smi  -ofpt  -xfFP2  # gives you the tanimoto for all molecules in chembl.sdf with respect to referencemol
babel referencemol.smi chembl.sdf -ofpt  -xfFP2 -s c1cccc1Br # tanimoto for matching SMARTS (referencemol should match as well!!!!)
babel chembl.fs -xfFP2 -squery.mol -at5 # 5 most similar to query.mol
obabel chembl_02.fs -O out.svg -s first.sdf # prints results in 2D depiction



================================================
================================================
Sorting:

obabel  infile.xxx  outfile.xxx  --sort desc

If the descriptor desc provides a numerical value, the molecule with the smallest value is output first. For descriptors that provide a string output the order is alphabetical, but for the InChI descriptor a more chemically informed order is used (e.g. “CH4” is before than “C2H6”, “CH4” is less than “ClH” hydrogen chloride).

The order can be reversed by preceding the descriptor name with ~, e.g.:

obabel  infile.xxx  outfile.yyy  --sort ~logP





