Wolfram Function Repository
Instant-use add-on functions for the Wolfram Language
Function Repository Resource: MACCSKeys
Compute the 166-bit MACCS (Molecular ACCess System) key
ResourceFunction["MACCSKeys"][molecule] returns the MACCS key for the Molecule molecule.
ResourceFunction["MACCSKeys"][smiles] returns the MACCS key for a molecule specified by the SMILES string smiles.
MACCSKeys can take either a SMILES string or a Molecule as input. By default, it returns a SparseArray containing the 166 bits:
In[1]:= |
Out[1]= |
In[2]:= |
Out[2]= |
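As a minimal sketch of the two input forms described above: the SMILES string "CCO" (ethanol) is an illustrative choice and is not taken from the original examples.

```wolfram
(* Illustrative input: "CCO" is the SMILES string for ethanol *)
keyFromSMILES = ResourceFunction["MACCSKeys"]["CCO"];

(* The same compound supplied as a Molecule object *)
keyFromMolecule = ResourceFunction["MACCSKeys"][Molecule["CCO"]];

(* Per the description above, the result should hold 166 bits *)
Dimensions[keyFromSMILES]

(* Both input forms are expected to describe the same compound, so the keys should agree *)
Normal[keyFromSMILES] === Normal[keyFromMolecule]
```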
MACCSKeys is a Listable function:
In[3]:= |
Out[3]= |
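Because the function is Listable, a list of inputs should produce one key per element. A short sketch, using three illustrative SMILES strings that are assumptions and not the inputs of the original example:

```wolfram
(* Illustrative SMILES strings: ethanol, benzene, acetic acid *)
smilesList = {"CCO", "c1ccccc1", "CC(=O)O"};

(* A single call threads over the list *)
keys = ResourceFunction["MACCSKeys"][smilesList];

(* One key per input string *)
Length[keys]
```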
Option values include "SparseArray", "OnBits", "MoleculePlot", "Function" and "SMARTS". The default setting of "SparseArray" returns the 166-bit vector:
In[4]:= |
Out[4]= |
The "OnBits" setting returns a list of the active (non-zero) bits in the MACCS key. These are 1-indexed (as is conventional in the Wolfram Language):
In[5]:= |
Out[5]= |
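The relationship between the bit vector and the list of on bits can be illustrated with a toy SparseArray. This is only a sketch using made-up bit positions; it does not claim to show how the "OnBits" setting is implemented.

```wolfram
(* Toy 166-bit vector with bits 3, 7 and 166 switched on (made-up data) *)
key = SparseArray[{{3} -> 1, {7} -> 1, {166} -> 1}, {166}];

(* "NonzeroPositions" gives {{3}, {7}, {166}}; Flatten turns it into a flat list *)
onBits = Flatten[key["NonzeroPositions"]]
```

The positions are 1-indexed, so the last of the 166 bits is reported as 166, not 165.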
The "MoleculePlot" setting returns an association whose keys are the active bits and whose values are the MoleculePlots corresponding to the MoleculePattern that was matched for that key. Here we take the first three, for brevity:
In[6]:= |
Out[6]= |
The "Function" setting returns an association whose values are pure functions responsible for generating each key:
In[7]:= |
Out[7]= |
The "SMARTS" setting returns an Association whose values are the SMARTS specifications of the patterns. Not all MACCS keys can be defined as SMARTS patterns (these return a “?”), and some MACCS keys require the number of matches to exceed a threshold, so the SMARTS specification alone is not always a complete description of a key:
In[8]:= |
Out[8]= |
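To illustrate why a SMARTS string alone may not fully describe a key, the sketch below pairs a pattern with a separate count rule. The rule "more than two oxygen atoms" and the glycerol example are hypothetical, chosen for illustration; they are not presented as the definition of any particular MACCS key. The sketch assumes the built-in functions MoleculePattern and MoleculeSubstructureCount are available in your version of the Wolfram Language.

```wolfram
(* Glycerol, chosen for illustration because it contains three oxygen atoms *)
mol = Molecule["OCC(O)CO"];

(* SMARTS pattern matching any oxygen atom *)
oxygen = MoleculePattern["[#8]"];

(* The SMARTS string says what to look for; the count rule is a separate step *)
count = MoleculeSubstructureCount[mol, oxygen];

(* Hypothetical threshold: the bit would be on only for more than two matches *)
count > 2
```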
Compare the structural similarity of six common statin drugs using the JaccardDissimilarity of their MACCS keys (one minus the Jaccard dissimilarity equals the Tanimoto similarity):
In[9]:= |
Out[527]= |
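The pairwise comparison can be sketched with toy bit vectors standing in for MACCS keys. The vectors below are made-up data, not the keys of the statins.

```wolfram
(* Toy bit vectors standing in for MACCS keys (made-up data) *)
keys = {
   {1, 1, 0, 1, 0},
   {1, 0, 0, 1, 1},
   {0, 0, 1, 0, 0}
   };

(* Tanimoto similarity is one minus the Jaccard dissimilarity, for every pair *)
similarity = Table[1 - JaccardDissimilarity[u, v], {u, keys}, {v, keys}];

MatrixForm[similarity]
```

The first two vectors share two on bits out of four that are on in either vector, so their similarity is 1/2; the third vector shares no on bits with the others.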
Empirically, less than 3% of randomly selected molecules have a MACCS Tanimoto similarity above 0.6. Use 0.6 as a threshold to visualize which of the statins are similar to one another:
In[528]:= |
Out[528]= |
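One possible way to turn a similarity matrix into a picture is to draw an edge between every pair whose similarity exceeds the threshold. The names and matrix below are made-up placeholders, and the original page may use a different visualization.

```wolfram
(* Placeholder labels and a made-up symmetric similarity matrix *)
names = {"a", "b", "c"};
sim = {{1, 0.8, 0.2}, {0.8, 1, 0.65}, {0.2, 0.65, 1}};

(* All unordered pairs of indices, keeping those above the 0.6 threshold *)
pairs = Subsets[Range[Length[names]], {2}];
similarPairs = Select[pairs, sim[[#[[1]], #[[2]]]] > 0.6 &];

(* Convert index pairs into labeled undirected edges *)
edges = UndirectedEdge @@@ (names[[#]] & /@ similarPairs);

Graph[names, edges, VertexLabels -> "Name"]
```

Here "a" and "c" are not joined directly because their similarity of 0.2 falls below the threshold.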
Generate random compound IDs and SMILES strings:
In[529]:= |
Get their MACCS keys:
In[530]:= |
Define a function for the Tanimoto similarity:
In[531]:= |
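A minimal sketch of such a function, written from the standard definition of the Tanimoto coefficient for bit vectors (shared on bits divided by bits on in either vector). The name tanimoto is an assumption and this is not necessarily the definition used in the original cell.

```wolfram
(* Tanimoto similarity of two equal-length 0/1 vectors:
   bits on in both, divided by bits on in at least one *)
tanimoto[u_, v_] := With[{both = Total[u v]},
   both/(Total[u] + Total[v] - both)]

(* Toy vectors (made-up data): two shared on bits, four on bits overall *)
example = tanimoto[{1, 1, 0, 1, 0}, {1, 0, 0, 1, 1}]
```

For vectors with at least one on bit this agrees with one minus JaccardDissimilarity. The quotient is undefined when both vectors are entirely zero, so that case would need separate handling.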
Use the function to compute the Tanimoto similarities of the random compounds:
In[532]:= |
Generate a histogram of the similarity scores:
In[533]:= |
Out[533]= |
The mean Tanimoto similarity score for randomly selected molecules is approximately 0.35:
In[534]:= |
Out[534]= |
Only about 3% of randomly chosen molecules will have a Tanimoto similarity score above 0.6:
In[535]:= |
Out[535]= |
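The two summary statistics can be sketched as follows. The scores are made-up values for illustration only and do not reproduce the results reported above.

```wolfram
(* Made-up similarity scores for illustration only *)
scores = {0.2, 0.35, 0.7, 0.4, 0.1};

(* Average similarity score *)
meanScore = Mean[scores];

(* Fraction of scores strictly above the 0.6 threshold *)
fractionAbove = Count[scores, s_ /; s > 0.6]/Length[scores]
```

With these toy values, one score out of five exceeds 0.6, so the fraction is 1/5.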
This work is licensed under a Creative Commons Attribution 4.0 International License