Wolfram Language Paclet Repository

Community-contributed installable additions to the Wolfram Language

WolframChemistry`Selfies`
SelfiesEncoding

SelfiesEncoding[selfies,alphabet]
converts a SELFIES string to a list of integer labels, one per token, indexed by alphabet.

SelfiesEncoding[selfies,alphabet,"OneHot"]
returns a list of one-hot (0/1) vectors, one for each token in the SELFIES string.

Examples (1)

Basic Examples (1)
Create an alphabet from a SELFIES string:
In[1]:= alphabet=SelfiesAlphabet[selfies=ToSelfies["CC1(C)CC(C1)[C@H]1O[C@@H]1C1(C)CC1"]]
Out[1]= {[O],[C@@H1],[Ring2],[Branch1],[Ring1],[C@H1],[C]}
Use the alphabet to encode the string to a numeric vector:
In[2]:= SelfiesEncoding[selfies,alphabet,"Label"]
Out[2]= {6,6,3,6,6,6,6,3,2,6,4,3,5,0,1,4,4,6,3,6,6,6,6,4,2}
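The labels in Out[2] appear to be zero-based positions of each token in alphabet ([O] maps to 0, [C] maps to 6). A minimal sketch of that mapping, using only built-in functions, might look like the following. It is an illustration and not the paclet's implementation: the names tokens, index, and labels are introduced here for the example, the alphabet is written as a list of strings, and the splitting pattern assumes every token is a well-formed bracketed string.

```wolfram
(* Illustrative sketch only; not the paclet's implementation. *)
(* Assumption: alphabet entries and tokens are plain strings. *)
alphabet = {"[O]", "[C@@H1]", "[Ring2]", "[Branch1]", "[Ring1]", "[C@H1]", "[C]"};

(* Split a short SELFIES string into its bracketed tokens. *)
tokens = StringCases["[C][C][Branch1][C][O]", "[" ~~ Shortest[__] ~~ "]"];

(* Map each alphabet entry to its zero-based position. *)
index = AssociationThread[alphabet -> Range[0, Length[alphabet] - 1]];

(* Look up the label of every token. *)
labels = Lookup[index, tokens]
(* {6, 6, 3, 6, 0} *)
```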
Create a one-hot encoding:
In[3]:= SelfiesEncoding[selfies,alphabet,"OneHot"]//MatrixForm
Out[3]//MatrixForm=
0 0 0 0 0 0 1
0 0 0 0 0 0 1
0 0 0 1 0 0 0
0 0 0 0 0 0 1
0 0 0 0 0 0 1
0 0 0 0 0 0 1
0 0 0 0 0 0 1
0 0 0 1 0 0 0
0 0 1 0 0 0 0
0 0 0 0 0 0 1
0 0 0 0 1 0 0
0 0 0 1 0 0 0
0 0 0 0 0 1 0
1 0 0 0 0 0 0
0 1 0 0 0 0 0
0 0 0 0 1 0 0
0 0 0 0 1 0 0
0 0 0 0 0 0 1
0 0 0 1 0 0 0
0 0 0 0 0 0 1
0 0 0 0 0 0 1
0 0 0 0 0 0 1
0 0 0 0 0 0 1
0 0 0 0 1 0 0
0 0 1 0 0 0 0
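Each row of the one-hot matrix has a single 1 in the column that corresponds to the token's label, so a label k lands in column k+1. A small sketch of that relationship with built-in functions follows. It is only an illustration, not the paclet's code; the names labels, n, and oneHot are hypothetical, and the zero-based labels are an assumption read off the example output.

```wolfram
(* Illustrative sketch only; not the paclet's implementation. *)
labels = {6, 6, 3, 6, 0};  (* assumed zero-based labels for five tokens *)
n = 7;                     (* length of the alphabet *)

(* Row i is a unit vector with its 1 at position labels[[i]] + 1. *)
oneHot = UnitVector[n, # + 1] & /@ labels;

(* One row per token, one column per alphabet entry. *)
Dimensions[oneHot]
(* {5, 7} *)
```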
Convert the encoding back to a SELFIES string:
In[4]:= EncodingToSelfies[%,alphabet,"OneHot"]
Out[4]= [C][C][Branch1][C][C][C][C][Branch1][Ring2][C][Ring1][Branch1][C@H1][O][C@@H1][Ring1][Ring1][C][Branch1][C][C][C][C][Ring1][Ring2]
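Decoding can be thought of as the reverse lookup: find the position of the 1 in each row, pick the alphabet entry at that position, and join the tokens. The sketch below shows this idea with built-in functions only. It is not how EncodingToSelfies is implemented; the three-row oneHot matrix and the string-valued alphabet are assumptions made for the example.

```wolfram
(* Illustrative sketch only; not the implementation of EncodingToSelfies. *)
alphabet = {"[O]", "[C@@H1]", "[Ring2]", "[Branch1]", "[Ring1]", "[C@H1]", "[C]"};

(* A small hand-written one-hot matrix: rows for [C], [Branch1], [O]. *)
oneHot = {
   {0, 0, 0, 0, 0, 0, 1},
   {0, 0, 0, 1, 0, 0, 0},
   {1, 0, 0, 0, 0, 0, 0}};

(* Position of the 1 in each row selects the alphabet entry. *)
decoded = alphabet[[First[FirstPosition[#, 1]]]] & /@ oneHot;

StringJoin[decoded]
(* "[C][Branch1][O]" *)
```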
Tech Notes
▪ Generating Molecules With SELFIES
Related Guides
▪ Selfies Functions
Related Links
▪ Self-Referencing Embedded Strings (SELFIES): A 100% robust molecular string representation. Mario Krenn, Florian Haese, AkshatKumar Nigam, Pascal Friederich, Alan Aspuru-Guzik
See Also
FromSelfies ▪ SelfiesAlphabet ▪ SelfiesEncoding

© 2025 Wolfram. All rights reserved.