Function Repository Resource:

ReconstituteSequenceFromReferenceDifferences

Reconstitute a sequence given positional differences with a reference sequence

Contributed by: John Cassel, Wolfram|Alpha Scientific Content

ResourceFunction["ReconstituteSequenceFromReferenceDifferences"][diff,ref]

reconstitutes a sequence by applying the positional substitutions in diff to the reference sequence ref.

Details

The positional differences used by this function can be produced with the resource function AlignmentToPositionDifferences.
The reference sequence can be a string or a BioSequence object.
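The following is a minimal sketch of how such positional differences might be applied to a string reference. It is not the repository function's actual source. The helper names applyOne and applyPositionDifferences are hypothetical, and the sketch assumes that each difference has the form {pos, old -> new}, that pos is a 1-based position in the reference, and that differences do not overlap:

```wolfram
(* Hypothetical helper: replace the substring old starting at position pos with new. *)
(* It does not check that the reference actually contains old at pos. *)
applyOne[seq_String, {pos_Integer, old_String -> new_String}] :=
 StringJoin[
  StringTake[seq, pos - 1],                    (* text before the difference *)
  new,                                         (* replacement text *)
  StringDrop[seq, pos - 1 + StringLength[old]] (* text after the replaced span *)
 ]

(* Apply differences from the highest position down, so that edits which change
   the length of the sequence do not shift the positions still to be applied. *)
applyPositionDifferences[diffs_List, ref_String] :=
 Fold[applyOne, ref, Reverse[SortBy[diffs, First]]]

applyPositionDifferences[{{2, "B" -> "D"}}, "ABC"]
(* "ADC" *)
```

Working from the end of the sequence toward the start is one simple way to keep reference positions valid when a difference deletes or inserts characters.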

Examples

Basic Examples (1) 

Reconstitute a simple sequence from a positional difference with a reference sequence:

In[1]:=
ResourceFunction[
 "ReconstituteSequenceFromReferenceDifferences"][{{2, "B" -> "D"}}, "ABC"]
Out[1]=

Scope (1) 

A BioSequence can also serve as the reference for a reconstituted sequence:

In[2]:=
ResourceFunction[
 "ReconstituteSequenceFromReferenceDifferences"][{{2, "G" -> "A"}, {5,
    "TG" -> "AC"}}, BioSequence["DNA", "GGGTTGCTC"]]
Out[2]=

Properties and Relations (1) 

This function inverts the combination of SequenceAlignment and the resource function AlignmentToPositionDifferences, recovering the aligned sequence from the differences and the reference sequence:

In[3]:=
refSeq = "ABC";
exampleDiffs = ResourceFunction["AlignmentToPositionDifferences"][
  SequenceAlignment[refSeq, "ADC"]]
Out[3]=
In[4]:=
ResourceFunction[
 "ReconstituteSequenceFromReferenceDifferences"][exampleDiffs, refSeq]
Out[4]=

Neat Examples (1) 

Use the SARS-CoV-2 reference sequence to reconstitute another SARS-CoV-2 sequence from previously computed positional differences:

In[5]:=
reconstitutedSeq = ResourceFunction[
  "ReconstituteSequenceFromReferenceDifferences"][{{1, "ATT" -> ""}, {241, "C" -> "T"}, {1059, "C" -> "T"}, {3037, "C" -> "T"}, {8083, "G" -> "A"}, {10319, "C" -> "T"}, {14408, "C" -> "T"}, {14805, "C" -> "T"}, {18424, "A" -> "G"}, {21304, "C" -> "T"}, {23403, "A" -> "G"}, {25111, "T" -> "A"}, {25563, "G" -> "T"}, {25907, "G" -> "T"}, {27964, "C" -> "T"}, {28472, "C" -> "T"}, {28869, "C" -> "T"}, {29402, "G" -> "T"}, {29871, "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" -> ""}},
  ResourceData["Genetic Sequences for the SARS-CoV-2 Coronavirus", "ReferenceBioSequence"]]
Out[5]=

Compare the result with the sequence imported using the resource function ImportFASTA:

In[6]:=
BioSequence["DNA", ResourceFunction["ImportFASTA"]["MW850352"][[2, 1]]] === reconstitutedSeq
Out[6]=

Version History

  • 1.0.0 – 13 April 2021
