Wolfram Function Repository
Instant-use add-on functions for the Wolfram Language
Function Repository Resource: MoleculeFingerprintSimilarity
Measure the similarity between two molecules
ResourceFunction["MoleculeFingerprintSimilarity"][mol1, mol2] returns the fingerprint similarity between molecules mol1 and mol2.
"FingerprintType"  "RDKit"  the algorithm to use when encoding the molecule 
"SimilarityMeasure"  "Tanimoto"  the bit vector similarity measure to use 
Possible settings for "FingerprintType":
"AtomPairs"  atoms are typed by atomic number, number of pi electrons and vertex degree; every pair of atom types, together with the distance between them, is hashed and the corresponding bits in the fingerprint are set
"MACCSKeys"  166-bit structural key descriptors in which each bit is associated with a SMARTS pattern
"MorganConnectivity"  extended-connectivity fingerprints; atoms are typed by atomic number, heavy-atom degree, mass number and ring membership, and the neighborhood around each atom is used to set the bits
"MorganFeatures"  atoms are typed by chemical features, such as H-bond acceptor/donor, aromaticity, acidity, etc.
"TopologicalTorsions"  similar to "AtomPairs", but rather than pairs of atoms, all sets of four consecutively bonded atoms are used to generate the bits
"RDKit"  identifies all subgraphs within a particular range of sizes, hashes each subgraph to generate a raw bit ID, reduces that raw bit ID modulo the assigned fingerprint size and then sets the corresponding bit
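Possible settings for "SimilarityMeasure", where $a_o$ and $b_o$ are the numbers of on bits in fingerprints $a$ and $b$, and $(a \& b)_o$ is the number of on bits they have in common:
"Asymmetric"  $(a \& b)_o / \min(a_o, b_o)$
"BraunBlanquet"  $(a \& b)_o / \max(a_o, b_o)$
"Cosine"  $(a \& b)_o / \sqrt{a_o b_o}$
"Dice"  $2 (a \& b)_o / (a_o + b_o)$
"Kulczynski"  $(a \& b)_o (a_o + b_o) / (2 a_o b_o)$
"McConnaughey"  $((a \& b)_o (a_o + b_o) - a_o b_o) / (a_o b_o)$
"Russel"  $(a \& b)_o / a_o$
"Sokal"  $(a \& b)_o / (2 a_o + 2 b_o - 3 (a \& b)_o)$
"Tanimoto"  $(a \& b)_o / (a_o + b_o - (a \& b)_o)$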
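To make the bit-count notation concrete, here is a minimal sketch that evaluates three of the measures on two small bit vectors, using the standard definitions Tanimoto = c/(a_o + b_o - c), Dice = 2c/(a_o + b_o) and Cosine = c/sqrt(a_o b_o), where c is the number of shared on bits. The helper names (onBits, commonBits, tanimoto, dice, cosine) and the example vectors are hypothetical; they are not part of the resource function, which computes fingerprints internally.

```wolfram
(* Hypothetical helpers for illustration only; u and v are equal-length lists of 0s and 1s *)
onBits[u_] := Total[u]                       (* number of on bits in u *)
commonBits[u_, v_] := Total[BitAnd[u, v]]    (* number of on bits shared by u and v *)

tanimoto[u_, v_] := commonBits[u, v]/(onBits[u] + onBits[v] - commonBits[u, v])
dice[u_, v_] := 2 commonBits[u, v]/(onBits[u] + onBits[v])
cosine[u_, v_] := commonBits[u, v]/Sqrt[onBits[u] onBits[v]]

(* Two made-up 8-bit fingerprints: 4 on bits each, 3 of them shared *)
a = {1, 1, 0, 1, 0, 0, 1, 0};
b = {1, 0, 0, 1, 1, 0, 1, 0};

{tanimoto[a, b], dice[a, b], cosine[a, b]}
(* {3/5, 3/4, 3/4} *)
```

All three measures equal 1 when the two vectors are identical and 0 when they share no on bits, but they weight partial overlap differently, which is why the same pair of molecules can score differently under different measures.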
Get the fingerprint similarity between two similar molecules:
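A minimal sketch of such a comparison, assuming ethanol and 1-propanol (given as SMILES strings) as the similar pair. The choice of molecules and the variable names are illustrative assumptions; the call uses the default fingerprint type and similarity measure.

```wolfram
(* Assumed example pair: two short-chain alcohols, built from SMILES strings *)
ethanol = Molecule["CCO"];
propanol = Molecule["CCCO"];

(* Default settings: "RDKit" fingerprints compared with the "Tanimoto" measure *)
ResourceFunction["MoleculeFingerprintSimilarity"][ethanol, propanol]
```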
Get the fingerprint similarity between two dissimilar molecules:
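As a sketch, an aliphatic alcohol can be compared with an aromatic hydrocarbon. Ethanol and benzene are assumed here purely for illustration, and the result should be lower than for two closely related molecules.

```wolfram
(* Assumed example pair with little structural overlap, built from SMILES strings *)
ethanol = Molecule["CCO"];
benzene = Molecule["c1ccccc1"];

ResourceFunction["MoleculeFingerprintSimilarity"][ethanol, benzene]
```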
MoleculeFingerprintSimilarity works on molecules created from any source, from MoleculeRecognize to Entity:
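For instance, a molecule built from a chemical Entity can be compared with one built from a SMILES string. This is a minimal sketch; the specific compounds (ethanol and 1-propanol) and variable names are assumptions chosen for illustration.

```wolfram
(* One molecule from the curated chemical entity store, one from a SMILES string *)
fromEntity = Molecule[Entity["Chemical", "Ethanol"]];
fromSMILES = Molecule["CCCO"];

ResourceFunction["MoleculeFingerprintSimilarity"][fromEntity, fromSMILES]
```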
The fingerprint method and similarity measure used can greatly affect the calculated similarity. Take two nominally similar molecules:
Measure the similarity using all available fingerprint types and similarity measures:
Visualize the results in a table:
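The sweep over fingerprint types and similarity measures, and the resulting table, can be sketched as follows. The molecule pair (toluene and ethylbenzene, given as SMILES strings) and the variable names are assumptions for illustration; the option values are the ones listed for "FingerprintType" and "SimilarityMeasure" above.

```wolfram
(* Assumed pair of nominally similar molecules *)
mol1 = Molecule["Cc1ccccc1"];    (* toluene *)
mol2 = Molecule["CCc1ccccc1"];   (* ethylbenzene *)

fingerprintTypes = {"AtomPairs", "MACCSKeys", "MorganConnectivity",
   "MorganFeatures", "TopologicalTorsions", "RDKit"};
similarityMeasures = {"Asymmetric", "BraunBlanquet", "Cosine", "Dice",
   "Kulczynski", "McConnaughey", "Russel", "Sokal", "Tanimoto"};

(* One row per fingerprint type, one column per similarity measure *)
results = Table[
   ResourceFunction["MoleculeFingerprintSimilarity"][mol1, mol2,
    "FingerprintType" -> fp, "SimilarityMeasure" -> sm],
   {fp, fingerprintTypes}, {sm, similarityMeasures}];

(* Label rows and columns with the option values used *)
TableForm[results, TableHeadings -> {fingerprintTypes, similarityMeasures}]
```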
MoleculeFingerprintSimilarity returns 0 for completely dissimilar molecules:
MoleculeFingerprintSimilarity returns 1 as the result if the two given molecules are identical:
The presence or absence of explicit hydrogens in the molecular graph can influence the computed similarity:
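One way to probe this effect is to compare a reference molecule against the same structure with and without explicit hydrogens. The sketch below assumes ethanol and 1-propanol as example molecules and uses MoleculeModify with "AddHydrogens" to make the implicit hydrogens explicit; the variable names are illustrative.

```wolfram
(* Ethanol from SMILES has implicit hydrogens; the modified copy carries them explicitly *)
implicitH = Molecule["CCO"];
explicitH = MoleculeModify[implicitH, "AddHydrogens"];

(* Assumed reference molecule to compare both forms against *)
reference = Molecule["CCCO"];

{ResourceFunction["MoleculeFingerprintSimilarity"][implicitH, reference],
 ResourceFunction["MoleculeFingerprintSimilarity"][explicitH, reference]}
```

If the two values differ, the difference comes only from how hydrogens are represented in the molecular graph, since the underlying compound is the same.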
Create molecules from the list of central nervous system (CNS) agents obtained from PubChem:
Find the five nearest molecules to the tranquilizer diazepam:
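A nearest-molecule search of this kind can be sketched by ranking candidates by their similarity to the query. The candidate list below is a small hypothetical stand-in, not the PubChem CNS agent list, and the sketch keeps the three best matches rather than five because the stand-in list is so short.

```wolfram
(* Query molecule: diazepam, given as a SMILES string *)
diazepam = Molecule["CN1C(=O)CN=C(C2=C1C=CC(=C2)Cl)C3=CC=CC=C3"];

(* Hypothetical candidate set for illustration only *)
candidates = Molecule /@ {"CCO", "c1ccccc1", "Clc1ccccc1", "CNC(C)=O",
    "O=C(c1ccccc1)c1ccccc1"};

(* Keep the candidates with the highest fingerprint similarity to the query *)
TakeLargestBy[candidates,
 ResourceFunction["MoleculeFingerprintSimilarity"][diazepam, #] &, 3]
```

Ranking by similarity directly avoids having to convert the similarity into a distance, which a Nearest-based approach would require.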
This work is licensed under a Creative Commons Attribution 4.0 International License