Wolfram Function Repository
Instant-use add-on functions for the Wolfram Language
Function Repository Resource: StringAlign
Find interleaved, common and different substrings in a pair of strings
ResourceFunction["StringAlign"][s1,s2] aligns the characters of strings s1 and s2 so that long common consecutive substrings are preserved.
Find the common substrings and difference pairs for two strings:
In[1]:= | ![]() |
Out[3]= | ![]() |
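As a minimal sketch of such a call, the resource function can be evaluated next to the built-in SequenceAlignment. The input strings here are illustrative assumptions, not the inputs of the example above. SequenceAlignment returns the common substrings interleaved with pairs of differing parts, which gives a reference point for reading the StringAlign result:

```wolfram
(* Illustrative inputs (assumed, not the original example's strings) *)
s1 = "abcXabcXabc";
s2 = "abcYabcYabc";

(* Resource function documented on this page; fetched from the repository on first use *)
ResourceFunction["StringAlign"][s1, s2]

(* Built-in counterpart: common substrings interleaved with {part of s1, part of s2} pairs *)
SequenceAlignment[s1, s2]
```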
Create a pseudorandom string with several hundred characters and make random substitutions to form a second string:
In[4]:= | ![]() |
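One way to build such a pair of strings is sketched below. The seed, the length of 300, the 20 substitutions, the lowercase alphabet and all variable names are assumptions chosen for illustration; the original input may differ:

```wolfram
SeedRandom[1234];                        (* arbitrary seed, for reproducibility *)
n = 300;                                 (* assumed string length *)
nSubs = 20;                              (* assumed number of substitutions *)
alphabet = CharacterRange["a", "z"];

chars1 = RandomChoice[alphabet, n];
positions = RandomSample[Range[n], nSubs];   (* distinct positions to overwrite *)

(* A replacement can coincide with the original character,
   so at most nSubs characters actually change *)
chars2 = ReplacePart[chars1, Thread[positions -> RandomChoice[alphabet, nSubs]]];

str1 = StringJoin[chars1];
str2 = StringJoin[chars2];
```

Because the substitutions keep the length fixed, HammingDistance[str1, str2] reports how many characters actually differ.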
Compute the alignment of these two strings:
In[5]:= | ![]() |
Out[5]= | ![]() |
SequenceAlignment gives a similar but not identical result:
In[6]:= | ![]() |
Out[6]= | ![]() |
Create a pseudorandom string with two hundred characters and make random substitutions to form a second string:
In[7]:= | ![]() |
Compute the alignment of these two strings:
In[8]:= | ![]() |
Out[8]= | ![]() |
SequenceAlignment gives a result that is similar but not identical:
In[9]:= | ![]() |
Out[9]= | ![]() |
Create a pair of strings of 30000 characters, with 10 alterations of four characters each:
In[10]:= | ![]() |
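A pair of strings matching this description could be constructed roughly as follows. This is a sketch under assumptions: the seed, the alphabet, the "####" filler and the variable names are hypothetical. The block starts are drawn from a grid with step 4 so that the altered blocks never overlap, although two chosen blocks may be adjacent:

```wolfram
SeedRandom[5678];                        (* arbitrary seed *)
len = 30000;
big1 = StringJoin[RandomChoice[CharacterRange["a", "z"], len]];

(* Ten distinct block starts on a grid of step 4, so the four-character blocks cannot overlap *)
starts = Sort[RandomSample[Range[1, len - 3, 4], 10]];

(* Overwrite each block with characters that do not occur in big1 *)
big2 = StringReplacePart[big1, "####", {#, # + 3} & /@ starts];
```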
SequenceAlignment takes several seconds to find the optimal alignments:
In[11]:= | ![]() |
Out[11]= | ![]() |
StringAlign is several times faster:
In[12]:= | ![]() |
Out[12]= | ![]() |
The byte counts of the two results are similar, and both are close to the length of the strings, indicating that both functions represent the string differences compactly:
In[13]:= | ![]() |
Out[13]= | ![]() |
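The timing and size comparison can be reproduced along the following lines. The test strings, the fixed alteration positions and the variable names are assumptions, and absolute timings depend on the machine. The first call to a resource function may include download and load time, so it is worth evaluating it once before timing:

```wolfram
SeedRandom[5678];                        (* arbitrary seed *)
a = StringJoin[RandomChoice[CharacterRange["a", "z"], 30000]];

(* Ten four-character alterations at fixed, well-separated positions *)
b = StringReplacePart[a, "####", {#, # + 3} & /@ Range[1000, 28000, 3000]];

(* AbsoluteTiming returns {seconds, result} *)
{tSeq, resSeq} = AbsoluteTiming[SequenceAlignment[a, b]];
{tAlign, resAlign} = AbsoluteTiming[ResourceFunction["StringAlign"][a, b]];

{tSeq, tAlign}                           (* elapsed seconds for each function *)
{ByteCount[resSeq], ByteCount[resAlign], StringLength[a]}
```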
Wolfram Language 12.3 (May 2021) or above
This work is licensed under a Creative Commons Attribution 4.0 International License