Function Repository Resource:

PANTHERTreeGrafter

Integrate a protein sequence into an existing phylogenetic tree of a protein family

Contributed by: Keiko Hirayama

ResourceFunction["PANTHERTreeGrafter"][seq, "Dataset"]

gives the annotation result for the input seq in dataset form.

ResourceFunction["PANTHERTreeGrafter"][seq, "Tabular"]

gives the annotation result for the input seq in tabular form.

ResourceFunction["PANTHERTreeGrafter"][seq, "TreeGraphic"]

gives the protein family tree graphic highlighting the location of the grafted seq.

ResourceFunction["PANTHERTreeGrafter"][seq, "Tree"]

gives the protein family tree highlighting the location of the grafted seq.

ResourceFunction["PANTHERTreeGrafter"]["Species"]

gives the dataset of supported species.

Details and Options

ResourceFunction["PANTHERTreeGrafter"] is based on the PANTHER TreeGrafter tool, which classifies the identified protein sequence by determining its most likely position in the protein family tree.
ResourceFunction["PANTHERTreeGrafter"] also annotates the identified protein sequence with relevant information based on its position in the family tree, using Gene Ontology terms.
ResourceFunction["PANTHERTreeGrafter"] accepts a protein sequence as input, either in string format or as a BioSequence object.
The following option can be given:
"Species"    None    species by which to filter the result; accepted values are the scientific names, common names or NCBI Taxonomy IDs of the species listed in the dataset returned by ResourceFunction["PANTHERTreeGrafter"]["Species"]
ResourceFunction["PANTHERTreeGrafter"][seq] is equivalent to ResourceFunction["PANTHERTreeGrafter"][seq,"TreeGraphic"].
Because protein data is retrieved from online sources, a call can take more than a few seconds to return.
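As a minimal, unevaluated sketch of how the values accepted by the "Species" option can be looked up, the supported species can be retrieved first. This uses only the call form documented above; the variable name species is a hypothetical choice made for illustration:

(* retrieve the dataset of supported species; requires network access *)
species = ResourceFunction["PANTHERTreeGrafter"]["Species"]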

Examples

Basic Examples (4) 

Retrieve the annotation result for a protein sequence:

In[1]:=
ResourceFunction[
 "PANTHERTreeGrafter", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"]["MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQTLSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVYQAGAREGAERGLSAIRERLGPLVEQGRVRAATVGSLAGQPLQERAQAWGERLRARMEEMGSRTRDRLDEVKEQVAEVRAKLEEQAQQIRLQAEAFQARLKSWFEPLVEDMQRQWAGLVEKVQAAVGTSAAPVPSDNH", "Dataset"]
Out[1]=

Retrieve the annotation result in a tabular form instead:

In[2]:=
ResourceFunction[
 "PANTHERTreeGrafter", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"]["MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQTLSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVYQAGAREGAERGLSAIRERLGPLVEQGRVRAATVGSLAGQPLQERAQAWGERLRARMEEMGSRTRDRLDEVKEQVAEVRAKLEEQAQQIRLQAEAFQARLKSWFEPLVEDMQRQWAGLVEKVQAAVGTSAAPVPSDNH", "Tabular"]
Out[2]=

Highlight the location of the grafted sequence on the phylogenetic tree graphic of the protein family. Each node is labeled with the corresponding accession from the dataset. Hover over or click on any node to find the associated species information:

In[3]:=
ResourceFunction[
 "PANTHERTreeGrafter", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"]["MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQTLSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVYQAGAREGAERGLSAIRERLGPLVEQGRVRAATVGSLAGQPLQERAQAWGERLRARMEEMGSRTRDRLDEVKEQVAEVRAKLEEQAQQIRLQAEAFQARLKSWFEPLVEDMQRQWAGLVEKVQAAVGTSAAPVPSDNH", "TreeGraphic"]
Out[3]=

Retrieve the protein family tree, highlighting the location of the grafted sequence. Hover over or click on any tree element to find the associated species information:

In[4]:=
ResourceFunction[
 "PANTHERTreeGrafter", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"]["MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQTLSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVYQAGAREGAERGLSAIRERLGPLVEQGRVRAATVGSLAGQPLQERAQAWGERLRARMEEMGSRTRDRLDEVKEQVAEVRAKLEEQAQQIRLQAEAFQARLKSWFEPLVEDMQRQWAGLVEKVQAAVGTSAAPVPSDNH", "Tree"]
Out[4]=

Scope (4) 

Retrieve the annotation result for a protein sequence:

In[5]:=
pantherdataset = ResourceFunction[
  "PANTHERTreeGrafter", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"][BioSequence[
  "Peptide", "ILQHNQNMSGLEKVSKISPCDVSLETSDICKCSIGKLHKSVSSANTCGIFSTASGKSVQVSDASLQNARQVFSEIEDSTKQVFSKVLFKSNEHSD", {}], "Dataset"]
Out[5]=

Explore annotated terms associated with the grafted sequence:

In[6]:=
annotation = pantherdataset[Select[MatchQ[#Accession, "ANGRAFTED"] &]][1, "PANTHER_GO_SLIM_BP"]
Out[6]=

Find the associated human gene:

In[7]:=
genesymbol = pantherdataset[Select[! FreeQ[#, "Homo sapiens"] &], "GeneSymbol"]
Out[7]=

Use the BioDBnetGeneData resource function to get more information on the gene:

In[8]:=
ResourceFunction["BioDBnetGeneData"][genesymbol[1]]
Out[8]=

Options (1) 

Species (1) 

Retrieve the result of the tree analysis filtered for the Mus musculus species:

In[9]:=
ResourceFunction[
 "PANTHERTreeGrafter", ResourceSystemBase -> "https://www.wolframcloud.com/obj/resourcesystem/api/1.0"]["MKVLWAALLVTFLAGCQAKVEQAVETEPEPELRQQTEWQSGQRWELALGRFWDYLRWVQTLSEQVQEELLSSQVTQELRALMDETMKELKAYKSELEEQLTPVAEETRARLSKELQAAQARLGADMEDVCGRLVQYRGEVQAMLGQSTEELRVRLASHLRKLRKRLLRDADDLQKRLAVYQAGAREGAERGLSAIRERLGPLVEQGRVRAATVGSLAGQPLQERAQAWGERLRARMEEMGSRTRDRLDEVKEQVAEVRAKLEEQAQQIRLQAEAFQARLKSWFEPLVEDMQRQWAGLVEKVQAAVGTSAAPVPSDNH", "Tree", "Species" -> "Mus musculus"]
Out[9]=
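
The same option can presumably be combined with the other output forms. The following unevaluated sketch assumes that "Species" filters the "Dataset" result in the same way it filters the "Tree" result (the example above only demonstrates it with "Tree"), and reuses the peptide from the Scope section; the variable name seq is hypothetical:

In[10]:=
(* assumption: the "Species" option also applies to the "Dataset" form *)
seq = BioSequence["Peptide", "ILQHNQNMSGLEKVSKISPCDVSLETSDICKCSIGKLHKSVSSANTCGIFSTASGKSVQVSDASLQNARQVFSEIEDSTKQVFSKVLFKSNEHSD"];
ResourceFunction["PANTHERTreeGrafter"][seq, "Dataset", "Species" -> "Mus musculus"]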

Requirements

Wolfram Language 13.0 (December 2021) or above

Version History

  • 1.1.0 – 01 October 2025
  • 1.0.0 – 02 June 2025
