Wolfram Function Repository
Instant-use add-on functions for the Wolfram Language
Function Repository Resource: StringAlign
Find interleaved, common and different substrings in a pair of strings
ResourceFunction["StringAlign"][s1,s2] aligns characters in strings s1 and s2 so that long common consecutive substrings are preserved.
Find the common substrings and difference pairs for two strings:
In[1]:=
Out[3]=
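As a rough illustration of what "common substrings and difference pairs" means, the following minimal sketch aligns two short strings. The inputs are made up for illustration and are not those of the original example. The built-in SequenceAlignment returns shared substrings as plain strings and differing segments as two-element lists; the StringAlign call uses the signature given in the usage above, and its exact output for these inputs is not asserted here.

```wolfram
(* Hypothetical inputs chosen for illustration; not from the original example *)
s1 = "the quick brown fox";
s2 = "the quiet brown fox";

(* Built-in alignment: shared substrings appear as strings,
   differing segments as two-element lists {partOfS1, partOfS2} *)
SequenceAlignment[s1, s2]

(* The resource function is called the same way, per the usage line above *)
ResourceFunction["StringAlign"][s1, s2]
```

Taking the first element of every pair in the SequenceAlignment result and joining the pieces recovers s1; taking the second element recovers s2.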
Create a pseudorandom string with several hundred characters and make random substitutions to form a second string:
In[4]:=
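One possible way to build such a pair of strings is sketched below. The seed, alphabet, string length, and number of substitutions are all assumptions made for illustration, and the variable names are hypothetical; the original input cell may have used a different construction.

```wolfram
(* Assumed parameters: arbitrary seed, lowercase alphabet, 300 characters, 20 substitutions *)
SeedRandom[1234];
alphabet = CharacterRange["a", "z"];
n = 300; k = 20;

(* First string: n characters drawn at random from the alphabet *)
s1 = StringJoin[RandomChoice[alphabet, n]];

(* Choose k distinct positions and overwrite each with a random character *)
pos = Sort[RandomSample[Range[n], k]];
s2 = StringReplacePart[s1, RandomChoice[alphabet, k], {#, #} & /@ pos];

(* Lengths match; the Hamming distance is at most k, because a random
   replacement can coincide with the original character *)
{StringLength[s1], StringLength[s2], HammingDistance[s1, s2]}
```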
Compute the alignment of these two strings:
In[5]:=
Out[5]=
SequenceAlignment gives a similar but not identical result:
In[6]:=
Out[6]=
Create a pseudorandom string with two hundred characters and make random substitutions to form a second string:
In[7]:=
Compute the alignment of these two strings:
In[8]:=
Out[8]=
SequenceAlignment gives a result that is similar but not identical:
In[9]:=
Out[9]=
Create a pair of 30,000-character strings that differ by 10 alterations of four characters each:
In[10]:=
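A minimal sketch of one way to construct such a pair follows. The seed, the alphabet, the evenly spaced alteration positions, and the replacement text "####" are assumptions chosen so that the alterations are easy to verify; they are not taken from the original input cell.

```wolfram
(* Assumed parameters: arbitrary seed and a lowercase alphabet *)
SeedRandom[42];
alphabet = CharacterRange["a", "z"];
long1 = StringJoin[RandomChoice[alphabet, 30000]];

(* Ten non-overlapping four-character spans, starting at 1000, 4000, ..., 28000 *)
starts = Range[1000, 28000, 3000];
spans = {#, # + 3} & /@ starts;

(* Overwrite each span with "####", which cannot occur in long1,
   so every altered character is guaranteed to differ *)
long2 = StringReplacePart[long1, "####", spans];

(* Number of spans and number of differing characters: 10 spans of 4 characters each *)
{Length[spans], HammingDistance[long1, long2]}
```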
SequenceAlignment takes several seconds to find the optimal alignments:
In[11]:=
Out[11]=
StringAlign is several times faster:
In[12]:=
Out[12]=
The byte counts of the two results are similar, and both are close to the length of the common string, indicating that both functions represent the string differences compactly:
In[13]:=
Out[13]=
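The timing and size comparison can be sketched end to end as follows. This is a self-contained illustration under assumed inputs (the seed, alphabet, and alteration positions are hypothetical), so the measured times and byte counts will not match the original outputs. The resource function is called once beforehand so that its initial download does not distort the timing.

```wolfram
(* Assumed test strings: 30000 random lowercase characters, 10 altered spans of 4 *)
SeedRandom[42];
long1 = StringJoin[RandomChoice[CharacterRange["a", "z"], 30000]];
long2 = StringReplacePart[long1, "####", {#, # + 3} & /@ Range[1000, 28000, 3000]];

(* Warm-up call so that fetching the resource function is not included in the timing *)
ResourceFunction["StringAlign"]["abc", "abd"];

(* Time each alignment and keep the results *)
{tSA, resSA} = AbsoluteTiming[SequenceAlignment[long1, long2]];
{tStr, resStr} = AbsoluteTiming[ResourceFunction["StringAlign"][long1, long2]];

(* Elapsed seconds for each function *)
{tSA, tStr}

(* Byte counts of the two results, next to the string length for reference *)
{ByteCount[resSA], ByteCount[resStr], StringLength[long1]}
```

A single AbsoluteTiming measurement is noisy; RepeatedTiming gives a steadier estimate if the comparison matters.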
Wolfram Language 12.3 (May 2021) or above
This work is licensed under a Creative Commons Attribution 4.0 International License