Function Repository Resource:

PDBImport

Import protein data in the Protein Data Bank (PDB) format

Contributed by: Jihyeon Je, Yury Polyachenko and John Cassel

ResourceFunction["PDBImport"][src]

imports the source src using the PDB format.

ResourceFunction["PDBImport"][src,"String"]

imports protein data from the source src as a string in PDB format.

ResourceFunction["PDBImport"][src, "AtomAssociation"]

imports a protein atom Association from the source src in the RCSB Protein Data Bank format.

ResourceFunction["PDBImport"][src,row]

imports a protein atom Association from the source src, starting from the specified row.

ResourceFunction["PDBImport"][provider -> identifier, …]

imports the protein data from the given provider, using the identifier appropriate to that provider.

Details and Options

The string src in ResourceFunction["PDBImport"][src, …] can refer to a local file or a URL.
ResourceFunction["PDBImport"] with "AtomAssociation" or row returns an Association with the keys "Atom", "Mass", "XCoordinate", "YCoordinate" and "ZCoordinate", giving the atom types, atomic masses and coordinates.
ResourceFunction["PDBImport"] with "String" returns the text of the associated PDB file, suitable for later use in ImportString.
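
As a rough illustration of the return shape described above, the following sketch builds a small hand-written Association with the documented keys and combines the coordinate lists into points. The name mock and all of its values are hypothetical and not taken from any PDB entry; a real result may nest these lists (for example, one sublist per chain), which this flat sketch does not attempt to reproduce.

```wolfram
(* Hypothetical stand-in for a PDBImport result; all values are made up *)
mock = <|
   "Atom" -> {"N", "C", "O"},
   "Mass" -> {14.007, 12.011, 15.999},
   "XCoordinate" -> {1.0, 2.0, 3.0},
   "YCoordinate" -> {0.5, 1.5, 2.5},
   "ZCoordinate" -> {0.0, 0.1, 0.2}|>;

(* Combine the three coordinate lists into {x, y, z} triples *)
points = Transpose[
   Lookup[mock, {"XCoordinate", "YCoordinate", "ZCoordinate"}]];

(* Count atoms by element symbol *)
Counts[mock["Atom"]]
```

Working with the keys by name, rather than by position, keeps such code independent of the order in which the Association lists its elements.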

Examples

Basic Examples (1) 

Import a PDB file corresponding to an RCSB ID:

In[1]:=
ResourceFunction["PDBImport"]["RCSB" -> "3J3I"]
Out[1]=

Scope (3) 

Import an Association of atomic properties corresponding to an RCSB ID:

In[2]:=
data = ResourceFunction["PDBImport"]["RCSB" -> "3J3I", "AtomAssociation"]
Out[2]=

Extract the x-coordinates:

In[3]:=
data["XCoordinate"] // Short
Out[3]=

Extract the atom types:

In[4]:=
data["Atom"] // Short
Out[4]=

Extract the atomic masses:

In[5]:=
data["Mass"] // Short
Out[5]=

Import an RCSB data file starting from row 154:

In[6]:=
data = ResourceFunction["PDBImport"][
  "https://files.rcsb.org/download/3J3I.pdb1.gz", 154]
Out[6]=

Extract the x-coordinates:

In[7]:=
data["XCoordinate"] // Short
Out[7]=

Extract the first entry of the atom types:

In[8]:=
data["Atom"][[1]] // Short
Out[8]=
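
The row argument used above has to be chosen by the caller. One possible way to pick a candidate is to locate the first ATOM record in the file's lines, as in this minimal sketch. The lines list is a hypothetical hand-written fragment, and whether ResourceFunction["PDBImport"] counts rows in exactly this way is an assumption, not something the documentation above states.

```wolfram
(* Hypothetical three-line PDB fragment; a real file is much longer *)
lines = {
   "HEADER    EXAMPLE",
   "REMARK    hand-written sample",
   "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N"};

(* 1-based index of the first line that starts an ATOM record *)
firstAtomRow = First[
   FirstPosition[lines, _String?(StringStartsQ[#, "ATOM"] &)]]
```

For a real file, the lines could be obtained by splitting the result of the "String" form on newlines.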

Import an RCSB data file as a PDB string:

In[9]:=
pdbString = ResourceFunction["PDBImport"][
  "https://files.rcsb.org/download/3J3I.pdb1.gz", "String"]
Out[9]=

Extract the x-coordinates using the built-in PDB importer (which converts to picometers):

In[10]:=
First[Transpose[
   ImportString[pdbString, {"PDB", "VertexCoordinates"}]]] // Short
Out[10]=

Extract the atom types:

In[11]:=
Flatten[ImportString[pdbString, {"PDB", "ResidueAtoms"}]] // Short
Out[11]=
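
PDB files store coordinates in ångströms, so values read by the built-in importer (in picometers) differ from the raw file values by a factor of 100. The sketch below shows that conversion on hypothetical numbers. It assumes that the "XCoordinate" values returned by ResourceFunction["PDBImport"] are the raw ångström values, which the text above implies but does not state outright.

```wolfram
(* Hypothetical x-coordinates in ångströms, as stored in a PDB file *)
xAngstrom = {27.34, 24.43, 2.614};

(* 1 Å = 100 pm *)
xPicometer = 100 xAngstrom

(* The same conversion for a single value, with explicit units *)
UnitConvert[Quantity[27.34, "Angstroms"], "Picometers"]
```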

Neat Examples (2) 

Extract a protein chain and calculate the center of mass of its subunits:

In[12]:=
data = ResourceFunction["PDBImport"][
   "https://files.rcsb.org/download/3J3I.pdb1.gz", "AtomAssociation"];
In[13]:=
(* Mass-weighted mean of one coordinate over a list of atoms *)
centermass[mass_, coor_] := mass . coor/Total[mass]
In[14]:=
xset = MapThread[centermass, {data["Mass"], data["XCoordinate"]}];
yset = MapThread[centermass, {data["Mass"], data["YCoordinate"]}];
zset = MapThread[centermass, {data["Mass"], data["ZCoordinate"]}];
In[15]:=
cmass = Transpose[{xset, yset, zset}];

Plot the center of mass of the subunits:

In[16]:=
Graphics3D[ {PointSize[0.02], Pink, Point[cmass]}]
Out[16]=
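
The center of mass used above is the mass-weighted mean of each coordinate. As a small sanity check, this sketch evaluates that quantity on a toy subunit. The helper name centerOfMass1D and the input values are hypothetical and chosen only so the result is easy to verify by hand.

```wolfram
(* Mass-weighted mean of one coordinate; the same quantity as centermass *)
centerOfMass1D[mass_List, coor_List] := mass . coor/Total[mass]

(* Hypothetical subunit: masses 12 and 16 at x = 0 and x = 1 *)
centerOfMass1D[{12, 16}, {0, 1}]
(* 4/7 *)
```

The result lies closer to the heavier atom, as expected: (12*0 + 16*1)/(12 + 16) = 4/7.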

Version History

  • 2.0.0 – 29 July 2020
  • 1.0.0 – 27 December 2019
