Wolfram Language Paclet Repository
Community-contributed installable additions to the Wolfram Language
SELF-referencIng Embedded Strings
Contributed By: Jason Biggs, Wolfram Research
SELFIES (SELF-referencIng Embedded Strings) is a string encoding for a chemical structure, like the "SMILES" or "InChI" formats. What sets SELFIES apart is its robustness with respect to token rearrangement: rearrange the tokens of any SELFIES string and the result is still a valid SELFIES string, without exception. This is in contrast to the SMILES representation, where an unmatched parenthesis or ring closure leads to a syntactically invalid string. The aim of SELFIES is to provide a framework for generative machine learning models that create entirely new molecules by mixing and matching components of existing molecules.
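The fragility of SMILES mentioned above can be checked with built-in functions alone, without this paclet. The following is a minimal sketch; the specific SMILES strings are illustrative choices and do not come from the paclet documentation.

```wolfram
(* Built-in Molecule and MoleculeQ only; the SMILES strings are illustrative. *)
smiles = <|
   "balanced"            -> "CC(C)O",   (* isopropanol *)
   "unmatched paren"     -> "CC(CO",    (* closing parenthesis dropped *)
   "unmatched ring bond" -> "C1CCCC"    (* ring opened with 1 but never closed *)
   |>;

(* Quiet suppresses the parser messages issued for malformed input;
   MoleculeQ reports whether each string was interpreted as a valid molecule. *)
Map[MoleculeQ[Quiet[Molecule[#]]] &, smiles]
```

Only the balanced string is expected to pass the validity test. A SELFIES string is designed so that no analogous rearrangement or truncation of its tokens can produce an invalid result.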
To install this paclet in your Wolfram Language environment, evaluate this code:
PacletInstall["WolframChemistry/Selfies"]
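After installation it can help to confirm what was installed and to list the symbols the paclet provides. The sketch below uses only built-in functions; the context name "WolframChemistry`Selfies`" is an assumption based on the usual publisher/paclet naming convention and may differ for this paclet.

```wolfram
(* PacletInstall returns a PacletObject that can be queried for properties. *)
paclet = PacletInstall["WolframChemistry/Selfies"];
paclet["Version"]

(* ASSUMPTION: the paclet's context follows the Publisher`Name` convention. *)
Needs["WolframChemistry`Selfies`"]

(* List the symbols defined in that (assumed) context. *)
Names["WolframChemistry`Selfies`*"]
```

If `Needs` reports that the context cannot be found, the paclet's documentation page gives the actual context to load.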
Generate a SELFIES string from a Molecule:
In[1]:= |
Out[1]= |
Convert the output back to a SMILES string:
In[2]:= |
Out[2]= |
Find all the tokens contained within the SELFIES string:
In[3]:= |
Out[3]= |
Take the list of tokens and combine them to create new molecules:
In[4]:= |
Out[4]= |
Wolfram Language Version 12.3