Function Repository Resource:

FCGRImage


Produce a frequency chaos game representation image from a string of nucleotides

Contributed by: Daniel Lichtblau

ResourceFunction["FCGRImage"][str]

gives the frequency chaos game representation (FCGR) image of a DNA nucleotide sequence str composed of the characters "A", "T", "G" and "C".

ResourceFunction["FCGRImage"][str,k]

gives the FCGR image at resolution 2^k.

ResourceFunction["FCGRImage"][str,k,bases]

gives the FCGR image using a square with corners given by bases, beginning at the lower right and proceeding counterclockwise.

Details and Options

ResourceFunction["FCGRImage"][str] uses a default pixelation level of 7 and works with the square having "A" as the lower-right vertex, proceeding counterclockwise with "T", "G" and "C".
ResourceFunction["FCGRImage"][str,k] gives an image of dimension 2^k×2^k.
For ResourceFunction["FCGRImage"][str,k,bases], the third argument must be a permutation of the list {"A","C","G","T"}.
Any occurrence of the character "U" in str is converted to "T".
All characters in str are converted to uppercase.
All characters in the converted input str other than {"A","C","G","T"} will be removed.
Because the grids have side lengths that are powers of 2, the nucleotide positions are slightly off-center. The examples illustrate this asymmetry. It can be removed by enlarging the grids by one unit in each dimension.
By default, the asymmetry is not removed. The FCGR literature does not describe such a correction. Perhaps more importantly, empirical evidence suggests that results are better with the slightly off-centered version.
ResourceFunction["FCGRImage"][bioseq] gives the FCGR image of a BioSequence expression that is of type "DNA", "RNA", "CircularDNA" or "CircularRNA". The BioSequence expression should not have any degenerate letters.
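
The input cleanup and grid sizes described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation used by FCGRImage: the helper names normalizeSequence and kmerCounts are hypothetical, and the k-mer counting reflects the usual formulation of the FCGR, in which a level-k image has one pixel for each of the 4^k possible k-mers.

```wolfram
(* Hypothetical helpers for illustration only; not FCGRImage's internals. *)

(* Uppercase the input, map U to T, then keep only A, C, G and T. *)
normalizeSequence[str_String] := StringJoin[
  StringCases[StringReplace[ToUpperCase[str], "U" -> "T"],
   "A" | "C" | "G" | "T"]]

(* Count overlapping k-mers of the cleaned sequence. *)
kmerCounts[str_String, k_Integer?Positive] :=
 Counts[StringPartition[normalizeSequence[str], k, 1]]

normalizeSequence["acgu-NNxT"]
(* "ACGTT" *)

kmerCounts["acgu-NNxT", 2]
(* <|"AC" -> 1, "CG" -> 1, "GT" -> 1, "TT" -> 1|> *)

(* Side length 2^k and total pixel count 4^k for k = 7, 8, 9. *)
Table[{k, 2^k, 4^k}, {k, 7, 9}]
(* {{7, 128, 16384}, {8, 256, 65536}, {9, 512, 262144}} *)
```

The default level of 7 therefore corresponds to a 128×128 grid.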

Examples

Basic Examples (1) 

Create a "random" FCGR image using a pseudorandom string of nucleotides:

In[1]:=
SeedRandom[12345678];
charstring = StringJoin[RandomChoice[{"A", "T", "C", "G"}, 2000]];
ResourceFunction["FCGRImage"][charstring, ImageSize -> 200]
Out[1]=

Scope (4) 

Select a human gene from Wolfram Language's curated data:

In[2]:=
gene = GenomeData["SCNN1A", "FullSequence"];

Show the chaos game representation image for this gene:

In[3]:=
ResourceFunction["FCGRImage"][gene]
Out[3]=

Show this gene at a higher level of pixelation:

In[4]:=
ResourceFunction["FCGRImage"][gene, 8]
Out[4]=

The images have a fractal character, with similar patterns recurring as the resolution level increases:

In[5]:=
ResourceFunction["FCGRImage"][gene, 9]
Out[5]=

Select a human gene from Wolfram Language's curated data:

In[6]:=
gene = GenomeData["SCNN1A", "FullSequence"];

Compute the image with respect to a square having the purines positioned at the bottom vertices and the pyrimidines at the top:

In[7]:=
ResourceFunction["FCGRImage"][gene, 8, {"A", "G", "C", "T"}]
Out[7]=

Reverse the order of the purines from the default ordering:

In[8]:=
ResourceFunction["FCGRImage"][gene, 8, {"G", "T", "A", "C"}]
Out[8]=
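
Since the third argument must be a permutation of {"A","C","G","T"}, there are 24 possible vertex arrangements. The following sketch enumerates them and checks a candidate before use. The helper validBasesQ and the short test string seq are hypothetical names introduced only for this illustration, and the final line needs access to the Function Repository:

```wolfram
(* All admissible values for the third argument. *)
allBases = Permutations[{"A", "C", "G", "T"}];
Length[allBases]
(* 24 *)

(* Hypothetical validity check: a permutation sorts to the canonical list. *)
validBasesQ[bases_List] := Sort[bases] === {"A", "C", "G", "T"}

validBasesQ[{"G", "T", "A", "C"}]
(* True *)

validBasesQ[{"A", "A", "C", "G"}]
(* False *)

(* Render the first four arrangements for a short illustrative sequence. *)
seq = "ACGTTGCAAGCTTAGCCGATATGC";
Map[ResourceFunction["FCGRImage"][seq, 3, #] &, Take[allBases, 4]]
```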

Make a BioSequence expression from a "Gene" entity:

In[9]:=
gene = BioSequence[
  Entity["Gene", {"BRCA1", {"Species" -> "HomoSapiens"}}]]
Out[9]=

Show its chaos game representation image:

In[11]:=
ResourceFunction["FCGRImage"][gene]
Out[11]=

Import a FASTA file representing DNA from the human β-globin region on chromosome 11 from the NCBI Nucleotide Database:

In[13]:=
gene = First[
   URLExecute[
    "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi", <|
     "db" -> "nuccore", "id" -> "U01317.1", "rettype" -> "fasta", "retmode" -> "text"|>, "FASTA"]];

Show its chaos game representation image:

In[14]:=
ResourceFunction["FCGRImage"][gene, 9]
Out[14]=
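
The same workflow can be tried without network access to NCBI by importing FASTA text held in a string. This is a minimal sketch: the record below is made up for illustration, and the variable names fasta and seqs are hypothetical. It assumes the "FASTA" importer returns one entry per record, as the First[...] call in the example above suggests:

```wolfram
(* Made-up FASTA record with the sequence split over two lines. *)
fasta = ">example_record\nACGTTGCAAGCT\nTAGCCGATATGC\n";

(* Import the text as FASTA; the result has one entry per record. *)
seqs = ImportString[fasta, "FASTA"];
Length[seqs]

(* Pass the first record to FCGRImage at a low resolution level. *)
ResourceFunction["FCGRImage"][First[seqs], 3]
```

A file on disk could be read the same way with Import and the "FASTA" format.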

Options (3) 

Show simple images using sequences of the same nucleotide repeated three times:

In[15]:=
Grid[Partition[
  Map[ResourceFunction["FCGRImage"][#, 4, ImageSize -> 200] &, {"AAA",
     "TTT", "CCC", "GGG"}], 2]]
Out[15]=

The off-center asymmetry seen previously is perhaps more apparent when sequences of length 4 are used:

In[16]:=
Grid[Partition[
  Map[ResourceFunction["FCGRImage"][#, 4, ImageSize -> 200] &, {"AAAA", "TTTT", "CCCC", "GGGG"}], 2]]
Out[16]=

The asymmetry can be removed using the option setting "Centered" -> True:

In[17]:=
Grid[Partition[
  Map[ResourceFunction["FCGRImage"][#, 4, "Centered" -> True, ImageSize -> 200] &, {"AAAA", "TTTT", "CCCC", "GGGG"}], 2]]
Out[17]=

Requirements

Wolfram Language 11.3 (March 2018) or above

Version History

  • 2.1.0 – 10 February 2023
  • 2.0.1 – 19 April 2022
  • 2.0.0 – 14 November 2019
  • 1.0.0 – 14 February 2019
