Wolfram Function Repository
Instant-use add-on functions for the Wolfram Language
Function Repository Resource: EnsemblGenomeRegion
Retrieve features that overlap a given genomic region
ResourceFunction["EnsemblGenomeRegion"][chromosome,{start,end}] retrieves the genomic features that overlap the chromosome region given by the start and end positions and returns the information as a Dataset.
ResourceFunction["EnsemblGenomeRegion"][chromosome,{start,end},format] retrieves the genomic sequence that overlaps the chromosome region given by the start and end positions and returns it in the specified format.
Possible values for format:
| Format | Description |
| --- | --- |
| "Dataset" | genomic features returned as a Dataset (default) |
| "BioSequence" | genomic sequence returned as a BioSequence object; applicable when "Feature" is "sequence" |
| "FASTA" | genomic sequence returned in FASTA format; applicable when "Feature" is "sequence" |
The following options can be given:
| Option | Default | Description |
| --- | --- | --- |
| "Species" | "Homo sapiens" | species to query; genome assemblies of the following species are supported: Homo sapiens (human), Mus musculus (mouse), Danio rerio (zebrafish), Caenorhabditis elegans (nematode), Saccharomyces cerevisiae (yeast) |
| "Assembly" | None | assembly version from which to query the sequence; if not specified, the latest available version is used |
| "Mask" | None | whether to return the sequence masked for repeats; "Hard" masks all repeats as N's and "Soft" masks repeats as lowercase characters |
| "Strand" | 1 | strand of the nucleotide sequence to retrieve; allowed values are 1 and -1 |
| "Feature" | "sequence" | type of genomic feature to retrieve; a list of several values is also accepted; allowed values include: "sequence", "band", "gene", "transcript", "cds", "exon", "repeat", "simple", "misc", "variation", "somatic_variation", "structural_variation", "somatic_structural_variation", "constrained", "regulatory", "motif", "mane" |
| "BioType" | None | functional classification of the "gene" or "transcript" features to fetch; allowed values include "protein_coding" |
| "DBType" | "core" | database type from which to retrieve features; allowed values include: "core", "otherfeatures" |
| "SOTerm" | None | Sequence Ontology term used to restrict the variants found |
| "TrimDownstream" | False | whether to exclude features that overlap the downstream end of the region |
| "TrimUpstream" | False | whether to exclude features that overlap the upstream end of the region |
| "VariantSet" | None | short name of a variant set used to restrict the variants found, such as "ClinVar" or "ph_uniprot"; a list of short names is available from Ensembl |
Retrieve a nucleotide sequence for a specified region of human chromosome 13:
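A minimal sketch of such a call, following the two-argument usage above. The chromosome is assumed to be passed as the string "13", and the coordinates are illustrative values rather than the ones from the original example:

```wolfram
(* Sketch only: the options table lists "sequence" as the default "Feature"
   and human as the default species. *)
(* Assumptions: chromosome given as a string; coordinates chosen for illustration. *)
ResourceFunction["EnsemblGenomeRegion"]["13", {32315086, 32315186}]
```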
Get the result as a BioSequence object:
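The three-argument form might be used along these lines. The string chromosome and the coordinates are again assumptions; `BioSequenceQ` is the built-in predicate for checking that a valid BioSequence came back:

```wolfram
(* Request the region as a BioSequence instead of the default Dataset. *)
(* Assumptions: string chromosome, illustrative coordinates. *)
seq = ResourceFunction["EnsemblGenomeRegion"]["13", {32315086, 32315186}, "BioSequence"];

(* Check that the result is a valid BioSequence before passing it on to
   other sequence functions. *)
BioSequenceQ[seq]
```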
Find genes that overlap a specified region of human chromosome 17:
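A gene lookup of this kind could be written with the "Feature" option from the table above. This is a sketch under the same assumptions (string chromosome, illustrative coordinates), not the original input:

```wolfram
(* Ask for gene features rather than the raw sequence. *)
(* Assumptions: string chromosome, illustrative coordinates. *)
ResourceFunction["EnsemblGenomeRegion"]["17", {43044295, 43125364}, "Feature" -> "gene"]
```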
Find genomic variations that overlap a specified region of human chromosome 6:
Use the NCBIGenomicSNPData resource function to retrieve more information on a selected variation:
Find its clinical significance:
Use the Species option to specify the organism whose genomic features are queried:
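For instance, a mouse query might look like the following. The species name is one of the supported values listed in the options table; the chromosome and coordinates are hypothetical placeholders:

```wolfram
(* Query the mouse genome instead of the default human genome. *)
(* Assumptions: string chromosome, illustrative coordinates. *)
ResourceFunction["EnsemblGenomeRegion"]["11", {101000000, 101000100},
 "Species" -> "Mus musculus"]
```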
Use the Assembly option to specify the version of the genome assembly:
Use the Mask option to retrieve the masked genome sequence, in which repeats are shown as lowercase characters:
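A soft-masked request can be sketched as follows, using the "Soft" value described in the options table. The region is an illustrative choice and may or may not contain repeats:

```wolfram
(* "Soft" is documented to return repeats in lowercase;
   "Hard" would replace them with N's. *)
(* Assumptions: string chromosome, illustrative coordinates. *)
ResourceFunction["EnsemblGenomeRegion"]["13", {32315086, 32315586}, "Mask" -> "Soft"]
```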
Use the Strand option to retrieve the sequence of the complementary strand:
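One way to write this, assuming the same call shape as in the usage lines and an illustrative region:

```wolfram
(* Strand 1 is the default; -1 selects the opposite strand. *)
(* Assumptions: string chromosome, illustrative coordinates. *)
ResourceFunction["EnsemblGenomeRegion"]["13", {32315086, 32315186}, "Strand" -> -1]
```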
Use the Feature option to selectively retrieve bands and transcripts associated with the given genomic region:
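Since the options table states that "Feature" also accepts a list, a combined query might look like this (a sketch with illustrative coordinates):

```wolfram
(* Request two feature types in a single call. *)
(* Assumptions: string chromosome, illustrative coordinates. *)
ResourceFunction["EnsemblGenomeRegion"]["17", {43044295, 43125364},
 "Feature" -> {"band", "transcript"}]
```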
Use the BioType option to retrieve protein-coding genes associated with the given genomic region:
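Because "BioType" applies to "gene" or "transcript" features, it would presumably be combined with "Feature", roughly as follows (the coordinates are illustrative):

```wolfram
(* Restrict gene features to the "protein_coding" biotype. *)
(* Assumptions: string chromosome, illustrative coordinates. *)
ResourceFunction["EnsemblGenomeRegion"]["17", {43044295, 43125364},
 "Feature" -> "gene", "BioType" -> "protein_coding"]
```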
Use the DBType option to retrieve additional gene features associated with the given genomic region:
Use the SOTerm option to retrieve missense variants (SO:0001583) associated with the given genomic region:
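A sketch of such a query is shown below. It assumes that the term is passed as its accession string and that "Feature" must be set to "variation" for the filter to apply; neither detail is confirmed by the text above:

```wolfram
(* Restrict variants to missense variants (SO:0001583). *)
(* Assumptions: accession-string form of the term, "variation" feature,
   string chromosome, illustrative coordinates. *)
ResourceFunction["EnsemblGenomeRegion"]["6", {31575000, 31576000},
 "Feature" -> "variation", "SOTerm" -> "SO:0001583"]
```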
Use the TrimDownstream option to retrieve genes that overlap the given genomic region but do not extend past its downstream end:
Use the TrimUpstream option to retrieve genes that overlap with the given genomic region, but not with the upstream region:
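The two trimming options could be exercised side by side like this. It is a sketch with an illustrative region; whether any gene actually crosses either end of that region is not known in advance:

```wolfram
(* Hypothetical helper name `region`, used only to avoid repeating the coordinates. *)
region = {43044295, 43125364};

(* Genes in the region, with the downstream-end trimming option switched on. *)
ResourceFunction["EnsemblGenomeRegion"]["17", region,
 "Feature" -> "gene", "TrimDownstream" -> True]

(* The same query, with the upstream-end trimming option switched on instead. *)
ResourceFunction["EnsemblGenomeRegion"]["17", region,
 "Feature" -> "gene", "TrimUpstream" -> True]
```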
Use the VariantSet option to retrieve variants with ClinVar annotation associated with the given genomic region:
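Using the "ClinVar" short name from the options table, a call of this kind might be written as follows. Pairing it with the "variation" feature is an assumption, and the coordinates are illustrative:

```wolfram
(* Keep only variants that belong to the ClinVar set. *)
(* Assumptions: "variation" feature, string chromosome, illustrative coordinates. *)
ResourceFunction["EnsemblGenomeRegion"]["17", {43044295, 43125364},
 "Feature" -> "variation", "VariantSet" -> "ClinVar"]
```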
Find regulatory regions on human chromosome 1:
Group regions by the type of regulatory features:
Visualize the regulatory regions in a circular diagram that shows their positions on the chromosome:
Wolfram Language 13.0 (December 2021) or above
This work is licensed under a Creative Commons Attribution 4.0 International License