Function Repository Resource:

PANTHERTreeGrafter

Integrate a protein sequence into an existing phylogenetic tree of a protein family

Contributed by: Keiko Hirayama

ResourceFunction["PANTHERTreeGrafter"][seq, "Dataset"]

gives the annotation result for the input sequence seq as a dataset.

ResourceFunction["PANTHERTreeGrafter"][seq, "TreeGraphic"]

gives the protein family tree graphic highlighting the location of the grafted seq.

ResourceFunction["PANTHERTreeGrafter"][seq, "Tree"]

gives the protein family tree highlighting the location of the grafted seq.

ResourceFunction["PANTHERTreeGrafter"]["Species"]

gives the dataset of supported species.

Details and Options

ResourceFunction["PANTHERTreeGrafter"] is based on the PANTHER TreeGrafter tool, which classifies the input protein sequence by determining its most likely position in the protein family tree.
ResourceFunction["PANTHERTreeGrafter"] also annotates the input protein sequence with Gene Ontology terms, based on its position in the family tree.
ResourceFunction["PANTHERTreeGrafter"] accepts a protein sequence as input, either in string format or as a BioSequence object.
The following option can be given:
"Species"    None    species used to filter the result; accepted values include the scientific name, common name or NCBI Taxonomy ID of any species found in the dataset retrieved by ResourceFunction["PANTHERTreeGrafter"]["Species"]
ResourceFunction["PANTHERTreeGrafter"][seq] is equivalent to ResourceFunction["PANTHERTreeGrafter"][seq,"TreeGraphic"].
Retrieving protein data from online sources can take more than a few seconds.

Examples

Basic Examples (3) 

Retrieve the annotation result for a protein sequence:

In[1]:=
ResourceFunction[
 "PANTHERTreeGrafter"]["MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQTLSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVYQAGAREGAERGLSAIRERLGPLVEQGRVRAATVGSLAGQPLQERAQAWGERLRARMEEMGSRTRDRLDEVKEQVAEVRAKLEEQAQQIRLQAEAFQARLKSWFEPLVEDMQRQWAGLVEKVQAAVGTSAAPVPSDNH", "Dataset"]
Out[1]=

Highlight the location of the grafted sequence on the protein family phylogenetic tree graphic. Hover over any node to see the associated species information:

In[2]:=
ResourceFunction[
 "PANTHERTreeGrafter"]["MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQTLSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVYQAGAREGAERGLSAIRERLGPLVEQGRVRAATVGSLAGQPLQERAQAWGERLRARMEEMGSRTRDRLDEVKEQVAEVRAKLEEQAQQIRLQAEAFQARLKSWFEPLVEDMQRQWAGLVEKVQAAVGTSAAPVPSDNH", "TreeGraphic"]
Out[2]=

Retrieve the protein family tree, highlighting the location of the grafted sequence. Hover over any tree element to see the associated species information:

In[3]:=
ResourceFunction[
 "PANTHERTreeGrafter"]["MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQTLSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVYQAGAREGAERGLSAIRERLGPLVEQGRVRAATVGSLAGQPLQERAQAWGERLRARMEEMGSRTRDRLDEVKEQVAEVRAKLEEQAQQIRLQAEAFQARLKSWFEPLVEDMQRQWAGLVEKVQAAVGTSAAPVPSDNH", "Tree"]
Out[3]=
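
The usage section also lists a "Species" form that takes no sequence. A minimal sketch of that call follows; it assumes network access, and the columns of the returned dataset are not shown here because they are not described in this page:

```wolfram
(* Retrieve the dataset of supported species, as listed in the usage section. *)
(* The names and identifiers in this dataset are the values accepted by the "Species" option. *)
ResourceFunction["PANTHERTreeGrafter"]["Species"]
```

Browsing this dataset first is a convenient way to check how a species name or NCBI Taxonomy ID is written before passing it to the "Species" option.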

Scope (4) 

Retrieve the annotation result for a protein sequence given as a BioSequence object:

In[4]:=
pantherdataset = ResourceFunction["PANTHERTreeGrafter"][BioSequence[
  "Peptide", "ILQHNQNMSGLEKVSKISPCDVSLETSDICKCSIGKLHKSVSSANTCGIFSTASGKSVQVSDASLQNARQVFSEIEDSTKQVFSKVLFKSNEHSD", {}], "Dataset"]
Out[4]=

Explore annotated terms associated with the grafted sequence:

In[5]:=
annotation = pantherdataset[Select[MatchQ[#Accession, "ANGRAFTED"] &]][1, "PANTHER_GO_SLIM_BP"]
Out[5]=

Find the associated human gene:

In[6]:=
genesymbol = pantherdataset[Select[! FreeQ[#, "Homo sapiens"] &], "GeneSymbol"]
Out[6]=

Use the BioDBnetGeneData resource function to get more information on the gene:

In[7]:=
ResourceFunction["BioDBnetGeneData"][genesymbol[1]]
Out[7]=

Options (1) 

Species (1) 

Retrieve the result of the tree analysis filtered for the species Mus musculus:

In[8]:=
ResourceFunction[
 "PANTHERTreeGrafter"]["MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQTLSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVYQAGAREGAERGLSAIRERLGPLVEQGRVRAATVGSLAGQPLQERAQAWGERLRARMEEMGSRTRDRLDEVKEQVAEVRAKLEEQAQQIRLQAEAFQARLKSWFEPLVEDMQRQWAGLVEKVQAAVGTSAAPVPSDNH", "Tree", "Species" -> "Mus musculus"]
Out[8]=
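
The same filter can presumably be combined with the other output forms. The sketch below pairs the "Species" option with the "Dataset" form; this combination is an assumption, since the page only demonstrates the option with "Tree", and the variable name seq is introduced here for illustration:

```wolfram
(* Same protein sequence as in the example above, stored in an illustrative variable. *)
seq = "MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQTLSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVYQAGAREGAERGLSAIRERLGPLVEQGRVRAATVGSLAGQPLQERAQAWGERLRARMEEMGSRTRDRLDEVKEQVAEVRAKLEEQAQQIRLQAEAFQARLKSWFEPLVEDMQRQWAGLVEKVQAAVGTSAAPVPSDNH";

(* Assumption: the "Species" option filters the "Dataset" form the same way it filters "Tree". *)
ResourceFunction["PANTHERTreeGrafter"][seq, "Dataset", "Species" -> "Mus musculus"]
```

If the option behaves differently for this form, the unfiltered dataset can still be narrowed afterwards with Select, as in the Scope examples.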

Requirements

Wolfram Language 14.0 (January 2024) or above

Version History

  • 1.0.0 – 02 June 2025
