Wolfram Function Repository
Instant-use add-on functions for the Wolfram Language
Function Repository Resource:
Retrieve information on reference SNPs from the NCBI database
ResourceFunction["NCBIGenomicSNPData"][snp, "VariantDetails"] gives the dataset of variant information for a specified snp.
ResourceFunction["NCBIGenomicSNPData"][snp, "FrequencyData"] gives the dataset of allele frequencies for a specified snp.
ResourceFunction["NCBIGenomicSNPData"][snp, "ALFAFrequencyData"] gives the dataset of ALFA allele frequencies for a specified snp.
ResourceFunction["NCBIGenomicSNPData"][snp, "ClinicalSignificance"] gives the dataset of associated diseases for a specified snp.
| "Gene" | associated gene ID |
| "GeneSymbol" | associated gene symbol |
| "Orientation" | orientation of the genomic sequence |
| "NucleotideSeqAccession" | NCBI nucleotide sequence accession ID |
| "NucleotidePosition" | position of the allele on the nucleotide sequence |
| "DeletedSequence" | sequence of deleted nucleotides or the codon |
| "InsertedSequence" | sequence of inserted nucleotides or the codon |
| "NucleotideVarSequenceOntologyAccession" | accession ID of the Sequence Ontology (SO) concept describing the nucleotide sequence variation |
| "NucleotideVarSequenceOntologyTerm" | name of the Sequence Ontology (SO) concept describing the nucleotide sequence variation |
| "HGVS" | Human Genome Variation Society (HGVS) notation |
| "ProteinSeqAccession" | NCBI protein sequence accession ID |
| "ProteinPosition" | position of the amino acid change on the protein sequence |
| "DeletedAminoAcid" | letter of the deleted amino acid |
| "InsertedAminoAcid" | letter of the inserted amino acid |
| "ProteinVarSequenceOntologyAccession" | accession ID of the Sequence Ontology (SO) concept describing the protein sequence variation |
| "ProteinVarSequenceOntologyTerm" | name of the Sequence Ontology (SO) concept describing the protein sequence variation |
| "StudyName" | name of the study |
| "RefSeqAccession" | NCBI reference sequence accession ID |
| "Position" | position of the allele on the reference sequence |
| "RefAllele" | reference allele |
| "AltAllele" | alternate allele |
| "RefAlleleFrequency" | reported reference allele frequency |
| "AltAlleleFrequency" | reported alternate allele frequency |
| "TotalCount" | total sample size |
| "BioSampleID" | BioSample ID |
| "ID" | population ID |
| "Name" | population name |
| "Group" | population group |
| "Description" | population description |
| "RefAllele" | reference allele |
| "AltAllele" | alternate allele |
| "RefAlleleFrequency" | reported reference allele frequency |
| "AltAlleleFrequency" | reported alternate allele frequency |
| "TotalCount" | total sample size |
| "AssociatedGenes" | associated gene IDs |
| "ClinicalSignificance" | reported clinical significance |
| "DiseaseNames" | names of associated diseases |
| "MedGen" | associated MedGen concepts |
| "ClinVarID" | associated ClinVar ID |
| "AlleleID" | assigned allele ID reported in ClinVar |
| "ReviewStatus" | assigned review status |
For SNP RS429358, which is a genetic variation found in the APOE gene associated with a risk of Alzheimer's disease, list variant details:
| In[1]:= |
| Out[1]= | ![]() |
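The original input cell is not preserved here, so the following is only a minimal sketch of what such a query could look like, based on the usage patterns listed at the top of this page. It assumes the SNP can be given as its rsID string; the function may instead expect an entity or another identifier form. The column names used in the selection come from the column list above.

```wl
(* Assumption: the SNP is identified by its rsID string. *)
snp = "rs429358";

(* Query the variant details; the usage section says this returns a dataset. *)
variantDetails = ResourceFunction["NCBIGenomicSNPData"][snp, "VariantDetails"]

(* Keep only a few columns; the names are taken from the column list above. *)
variantDetails[All, {"GeneSymbol", "HGVS", "NucleotidePosition"}]
```

The call needs network access to NCBI, so the result depends on the current state of the database.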
Retrieve clinical significance information for a given SNP:
| In[2]:= |
| Out[2]= | ![]() |
Retrieve allele frequency data for a given SNP entity:
| In[3]:= |
| Out[3]= | ![]() |
Retrieve aggregated allele frequency data for diverse populations:
| In[4]:= |
| Out[4]= | ![]() |
Compare the reference and the alternate sequences associated with a given SNP:
| In[5]:= |
| Out[5]= |
Use the resource function ImportFASTA to retrieve the reference sequence:
| In[6]:= |
| In[7]:= |
| Out[7]= | ![]() |
Compute the alternate sequence using "NucleotidePosition", "DeletedSequence" and "InsertedSequence" information:
| In[8]:= | ![]() |
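The substitution step described above can be sketched as a small string operation. The helper name applyVariant is hypothetical and not part of the resource function, and the sketch assumes that the position is 1-based and marks the first deleted nucleotide. NCBI's SPDI notation counts positions from 0, so check which convention "NucleotidePosition" follows and adjust by one if needed.

```wl
(* Hypothetical helper: replace the deleted sequence found at a 1-based position
   with the inserted sequence. Returns $Failed if the position is out of range
   or the reference does not contain the deleted sequence at that position. *)
applyVariant[ref_String, pos_Integer, deleted_String, inserted_String] :=
 With[{len = StringLength[deleted]},
  If[pos < 1 || pos + len - 1 > StringLength[ref] ||
    (len > 0 && StringTake[ref, {pos, pos + len - 1}] =!= deleted),
   $Failed,
   StringTake[ref, pos - 1] <> inserted <> StringDrop[ref, pos + len - 1]]]

(* Toy data, not taken from NCBI: delete the G at position 3. *)
applyVariant["ACGTACGT", 3, "G", ""]
(* "ACTACGT" *)
```

Checking the deleted sequence against the reference before editing catches an off-by-one position or a mismatched accession early, instead of silently producing a wrong alternate sequence.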
Use the DNAAlignmentPlot function to visualize the allele position:
| In[9]:= |
| Out[9]= | ![]() |
Next, explore how this change affects the translated peptide sequence. Apply the BioSequenceTranslate function to the reference sequence to obtain its sequence of amino acids:
| In[10]:= | ![]() |
| Out[10]= |
Notice that the reading frame is shifted in the alternate sequence, which introduces a stop codon four amino acids downstream:
| In[11]:= | ![]() |
| Out[11]= |
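To see why a single deleted nucleotide shifts the reading frame, a toy example may help. The sequences below are made up for illustration and are not the SNP data used above, and completeCodons is a hypothetical helper that trims a sequence to whole codons before translation. Unlike the example above, this toy deletion does not introduce a stop codon; it only shows how every codon after the deletion changes.

```wl
(* Toy reference sequence; codons: ATG GCT GAA TTC AAA TAG *)
refDNA = "ATGGCTGAATTCAAATAG";

(* Delete the 4th base; codons become ATG CTG AAT TCA AAT, with AG left over *)
altDNA = StringDrop[refDNA, {4}];

(* Hypothetical helper: keep only complete codons *)
completeCodons[s_String] := StringTake[s, 3 Quotient[StringLength[s], 3]];

(* Translate both sequences and compare the peptides *)
BioSequenceTranslate[BioSequence["DNA", completeCodons[refDNA]]]
BioSequenceTranslate[BioSequence["DNA", completeCodons[altDNA]]]
```

Every codon after the deletion point differs between the two sequences, which is the frameshift effect described above.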
Compare the molecule plots:
| In[12]:= |
| Out[12]= | ![]() |
Wolfram Language 13.0 (December 2021) or above
This work is licensed under a Creative Commons Attribution 4.0 International License