Customize Your RCSB PDB Homepage
RCSB PDB adds customized widgets to their homepage.
(if you are having trouble seeing this video try the full screen option)

BRENDA is a gold mine for those studying enzymes! The database bills itself as the comprehensive enzyme information system, and with 5010 enzymes that looks to be the case. Here is a screenshot of the navigation bar. As you can see, BRENDA brings together many different categories, such as IC50 values, pH stability range and crystallization.

My only suggestion so far is to change ‘Recommended Name’ to ‘Enzyme name’. I think it would save some confusion in the search entry.
I have never seen another database bring together this much information about a class of proteins. If you have a colleague working in enzymology, this site is definitely worth passing along.
1) Open the desired coordinate files in Coot (click here if you need some help)
(You know you have been looking at structures too long when they start to look like faces.)
2) Under Calculate you have two methods of superimposing:
SSM Superpose (we will go with this option in this example: Calculate -> SSM Superpose…) or LSQ Superpose
Note: SSM Superpose stands for secondary-structure matching and if you need to do it outside of Coot there is a server.
3) Select which PDB you would like to move and apply:

The structures should now be superimposed:

The repetitive nature of editing a PDB file can consume hours of your time and leave you feeling unfulfilled.
What if you could simply and quickly edit a PDB file without hacking together a solution using vim?
The PDB Editor has the ability to do just that and can be downloaded here for free! The manual is really great in that it explains the program’s various functions using screen shots.
The ability to delete certain aspects of a PDB file would have saved me so much time in the past, it’s sick.
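For the single case of deleting records, a few lines of script can stand in for hand-editing. The sketch below is not the PDB Editor and does not use its code; it is a minimal Python illustration that drops water records, assuming waters are named HOH and the file follows the standard fixed-width PDB layout. The function name `strip_waters` and the sample records are made up for the example.

```python
def strip_waters(pdb_text):
    """Return pdb_text with water coordinate records removed.

    Assumption: waters carry the residue name HOH in columns 18-20.
    """
    kept = []
    for line in pdb_text.splitlines():
        is_coordinate = line.startswith(("ATOM", "HETATM"))
        if is_coordinate and line[17:20] == "HOH":
            continue  # skip this water
        kept.append(line)
    return "\n".join(kept) + "\n"


# Made-up records for illustration only.
sample = (
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67\n"
    "HETATM  101  O   HOH A 201      10.000  10.000  10.000  1.00 30.00\n"
    "END\n"
)
print(strip_waters(sample), end="")
```

This leaves atom serial numbers untouched and ignores any ANISOU lines, so it is a starting point rather than a replacement for a proper editor.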

You can also edit secondary structure, which can be output in PDB format.

Happy editing!
On Dec. 8th, the University of Alabama at Birmingham (UAB) requested the removal of 12 structures from the PDB. The 12 structures are spread across 9 journals including Nature, Cell and the Journal of Molecular Biology.
Here are some highlights from the request concerning these structures:
1BEF is a synthetic version of 1JXP (I wrote about 1NS3 being the structure used, 1JXP and 1NS3 contain the same origin, orientation, space group and unit cell).
My favorite:
1CMW had its B factors derived by subtracting 16.00 from values found in the structure 1TAQ
1DF9 was replaced by 2QID; the update removed 155 waters, some of which had hydrogen bonding distances of less than 1 Angstrom.
1G40 is without any water despite diffracting to 2 Angstroms; there is also a strange update to the unit cell parameters in Feb. of 2007 (fits into the time line previously posted)
1G44 contains 36 chemically impossible close contacts
1L6L has either 2036 residues or 2366 residues, 1011 waters or 1522 waters, etc…
2OU1 also involved heavy drinking
1RID and 1Y8E contain poor geometry, strange B factors and very high solvent content
2A01 has great electron density that corresponds to physically impossible features
2HR0 has already been trashed.
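The 1CMW observation is the kind of thing that can be checked mechanically. As a minimal sketch (not necessarily how the investigators found it), the Python below tabulates the difference in B factor for every atom that two PDB files have in common; a single dominant value would point to a constant offset. The function names are my own, atoms are matched by chain, residue number and atom name (an assumption that only holds if both files share numbering), and the B factors in the demo are made up rather than taken from 1TAQ or 1CMW.

```python
from collections import Counter


def b_factors(pdb_text):
    """Map (chain, residue number, atom name) -> B factor for ATOM records."""
    table = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            key = (line[21], int(line[22:26]), line[12:16].strip())
            table[key] = float(line[60:66])  # B factor: columns 61-66
    return table


def offset_histogram(reference, suspect):
    """Count B(reference) - B(suspect), rounded to 0.01, over shared atoms."""
    ref, sus = b_factors(reference), b_factors(suspect)
    return Counter(round(ref[k] - sus[k], 2) for k in ref.keys() & sus.keys())


def atom_line(serial, name, res, chain, resseq, b):
    """Demo helper: build a fixed-width ATOM record with dummy coordinates."""
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b:6.2f}")


# Made-up B factors, NOT taken from 1TAQ or 1CMW.
reference = "\n".join([atom_line(1, "N", "MET", "A", 1, 41.50),
                       atom_line(2, "CA", "MET", "A", 1, 38.20)])
suspect = "\n".join([atom_line(1, "N", "MET", "A", 1, 25.50),
                     atom_line(2, "CA", "MET", "A", 1, 22.20)])
print(offset_histogram(reference, suspect).most_common(3))
```

For two independently refined structures you would expect the differences to be spread out, not piled onto one number.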
————–
I noted that there were 449 citations that referred to these dubious publications on Dec. 10th. A week has passed since the original request made by UAB. Currently, only 1 structure (1BEF) along with the corresponding paper has been retracted. Now according to Google Scholar, not even a week later, 6 more citations have been added.
I hope that other publishers are able to review the data involving these structures quickly and act accordingly as to prevent the further spread of inaccurate information.
Note: there is going to be a delay between when this information was announced and when papers (patents, books, etc.) work their way through the system, but the sooner the better.
Covering your tracks is a good idea if you think people are on to you. Eric points us to the communication in Nature that brought into question PDB entry 2HR0 (background posts: here and here). The reply was also published in Nature with the authors standing by their data. The 30-40 Angstrom gaps are explained by citing a personal communication with W. A. Hendrickson as well as the presence of protein fragments that serve as disordered lattice contacts.
The possible location of the protein fragment is shown in the reply.
Below is a display of the protein packing without the ‘protein fragments’ that served as the lattice contacts.

The figure below shows additional symmetry-related proteins; gaps occur vertically (in the c-direction) and are marked with a red circle.

Artem notes that they could have done a better job fabricating the data. Most of us try to learn from our mistakes and correct them, if possible.
Learning from your mistakes:
A glaring problem with the 2HR0 structure was the existence of these gaps. You can come to your own conclusions on whether you believe there are disordered lattice contacts. However, what would greatly undermine their credibility would be if they had deposited another structure that contained these unusual gaps.
This brings us to the structure 2OU1, but wait. This structure doesn’t have very large gaps…
They learned.
If you take a close look at the PDB entry you will notice that this structure was updated (see Deposition Summary: right side, under the picture of the molecule). The structure was initially deposited as PDB entry 1L6K.
Here is the textual comparison:

The major change is shown with the red arrow above noting that the c-axis was nearly cut in half.
Why update?
Covering their Tracks:
The crystal packing of 1L6K:
The gaps… ~30 Angstroms, sound familiar?
The crystal packing of 2OU1:
The gaps have been significantly reduced.
They learned from their mistake in the 2HR0 entry. The large reduction in the length of the c-axis results in more reasonable crystal packing.
Time line:
The PDB entry 2HR0 was initially released at the end of October in 2006. The initial correspondence questioning the structure and reply were published in August 2007.
The original structure PDB entry 1L6K was deposited in 2002. The update of this structure was in February of 2007. It would be interesting to know when the authors were contacted about 2HR0.
One explanation is that once they were contacted about the 2HR0 structure, they realized there was a similar issue with 1L6K and replaced it with 2OU1.
As with the hypothesis about the entry 1BEF, I do not have any proof that this is what is going on, but definitely thought it was worth mentioning.
What do you think? Sound reasonable?
A tragic week in the crystallographic community (see: 449 Citations maybe Effected by Retracted Structures). The Birmingham News article mentions researchers finding a preponderance of evidence that the structures were incorrect. I do not have any direct proof that the crystallographic data were falsified or fabricated, but let’s take a walk.
How could only one person publish these structures without others knowing?
A lab produces crystals, collects data, but unfortunately is unable to process the data. The grad student is frustrated, the post-doc can’t figure it out and so the data is handed to the PI. The PI works on the data set in their office (over the weekend, at home, etc…) and emerges successful! The paper is written and only one person knows exactly how the structure was solved.
How could the data have been fabricated?
Let us take a look at the one structure that has already been removed from PDB: 1BEF.
The data could have been back generated from a desired protein structure using a tool like mlfsom.
However, I believe there is a better explanation.
Another method to fake the data would be to perform an isomorphous replacement using a related protein. A number of residues would be different and with some help from a homology server you could tweak the structure.
The problem is that reviewers could be experts on those structures and would notice small anomalies. In addition, the protein would need to fold ‘perfectly’ in order to be used for isomorphous replacement so that proper crystal packing is maintained.
To avoid this scenario you could find a structure that is crystallographically unrelated (different space group and unit cell) to the protein of interest and use it as a template.
In order for this hypothesis to be supported, we would need to find the unrelated structure in the PDB.
Needle in a hay stack type of problem.
To save you some time, we are going to tell you the structure used: 1NS3
The figure at right shows 1BEF in light green and 1NS3 in aqua green:

Here is the PyMol script for those that like to play along at home:
fetch 1bef
fetch 1ns3
select 1ns3_A, 1ns3 and chain A+C
hide everything
show ribbon, 1bef
show ribbon, 1ns3_A
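The script above shows the overlap visually. If you would rather have a number, one option is the RMSD over C-alpha atoms computed with no superposition step at all, so the coordinates are compared exactly as deposited. The Python below is a minimal sketch of that idea; the function names are mine, residues are paired by residue number (an assumption I have not verified for 1BEF and 1NS3, where a sequence alignment may be needed), and the two tiny models in the demo are made up.

```python
import math


def ca_coords(pdb_text, chain):
    """Map residue number -> (x, y, z) for C-alpha atoms of one chain."""
    coords = {}
    for line in pdb_text.splitlines():
        if (line.startswith("ATOM") and line[21] == chain
                and line[12:16].strip() == "CA"):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            coords[int(line[22:26])] = xyz
    return coords


def unaligned_rmsd(coords_a, coords_b):
    """Return (RMSD, pair count) over shared residues, with no fitting."""
    shared = coords_a.keys() & coords_b.keys()
    if not shared:
        raise ValueError("no residue numbers in common")
    total = sum(math.dist(coords_a[r], coords_b[r]) ** 2 for r in shared)
    return math.sqrt(total / len(shared)), len(shared)


def atom_line(serial, name, res, chain, resseq, x, y, z):
    """Demo helper: build a fixed-width ATOM record."""
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{20.0:6.2f}")


# Two made-up models; model_b is model_a shifted by 0.3 Angstroms along x.
model_a = "\n".join([atom_line(1, "CA", "GLY", "A", 1, 10.0, 5.0, 1.0),
                     atom_line(2, "CA", "ALA", "A", 2, 13.8, 5.0, 1.0)])
model_b = "\n".join([atom_line(1, "CA", "GLY", "A", 1, 10.3, 5.0, 1.0),
                     atom_line(2, "CA", "ALA", "A", 2, 14.1, 5.0, 1.0)])
rmsd, pairs = unaligned_rmsd(ca_coords(model_a, "A"), ca_coords(model_b, "A"))
print(f"{rmsd:.2f} Angstroms over {pairs} C-alpha pairs")
```

Structures solved independently in different crystal forms would normally sit in unrelated frames, so a small value here without any fitting is what makes the overlap suspicious.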
The following is the crystal information from the PDB headers:
1BEF:
CRYST1 48.800 62.400 39.600 90.00 96.70 90.00 P 1 21 1
1NS3:
CRYST1 96.960 96.960 167.100 90.00 90.00 120.00 P 63 2 2
1NS3 was used as the starting model and with the addition of some water and noise, bingo.
The unit cells and space groups are totally different and yet the two structures have a nearly identical origin and orientation.
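To put a number on how different the two cells are, the CRYST1 records quoted above can be parsed and their volumes compared. This is a small Python sketch of my own, not part of the original analysis: it splits on whitespace because that is how the records appear in this post (real PDB files use fixed columns and usually carry a trailing Z value, which this parser does not handle), and it uses the standard formula for the volume of a general unit cell.

```python
import math


def parse_cryst1(record):
    """Parse a whitespace-separated CRYST1 record as quoted in this post."""
    fields = record.split()
    if not fields or fields[0] != "CRYST1":
        raise ValueError("not a CRYST1 record")
    cell = tuple(float(v) for v in fields[1:7])  # a, b, c, alpha, beta, gamma
    return cell, " ".join(fields[7:])            # remainder: space group


def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a general unit cell in cubic Angstroms (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg
                                 + 2 * ca * cb * cg)


records = {
    "1BEF": "CRYST1 48.800 62.400 39.600 90.00 96.70 90.00 P 1 21 1",
    "1NS3": "CRYST1 96.960 96.960 167.100 90.00 90.00 120.00 P 63 2 2",
}
for pdb_id, record in records.items():
    cell, space_group = parse_cryst1(record)
    print(pdb_id, space_group, f"{cell_volume(*cell):.0f} A^3")
```

The 1NS3 cell comes out roughly an order of magnitude larger than the 1BEF cell, in a hexagonal rather than a monoclinic space group.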
What are the chances of two crystallographically unrelated structures having the same origin and orientation?
Zero.
The structures still don’t look close enough for your liking? Take 1BEF and put it into a homology server like MODELLER then compare.
Follow up: Covering your Tracks
The Birmingham News just reported that a former researcher, H.M. Krishna Murthy, may have falsified or fabricated data. The Journal of Biological Chemistry has already retracted the paper in question, which contains PDB entry 1BEF.
If other journals follow suit the impact will be significant. According to Google Scholar, a total of 449 publications cite the papers in which these structures appear.
The University of Alabama at Birmingham announced that 12 structures were falsified or fabricated. The 12 questionable structures that have been deposited into the PDB are as follows: 1BEF, 1CMW, 1DF9, 2QID, 1G40, 1G44, 1L6L, 2OU1, 1RID, 1Y8E, 2A01, and 2HR0.
The publications involve a wide range of topics including: dengue viruses, serine proteases, Taq DNA polymerase, heparan sulfate proteoglycans, apolipoprotein A-II and A-I, suramin in heparin binding, and complement component 3.
The following table contains links to the structures, pdf of the journal articles and citations. The table is worth exploring if you believe you may draw conclusions based on these structures.
| PDB ID: | Journal: | Cited By: |
| --- | --- | --- |
| 1BEF | Journal of Biological Chemistry | 94 |
| 1CMW | Acta. D | 2 |
| 1DF9 | Journal of Molecular Biology | 75 |
| 2QID | To be Published | 0 |
| 1G44, 1G40 | Cell | 89 |
| 2OU1, 1L6L | Biochemistry | 29 |
| 1Y8E | Biochemistry | 4 |
| 1RID | PNAS | 36 |
| 2A01 | PNAS | 95 |
| 2HR0 | Nature | 25 |
Dear Protein Data Bank,
It’s not you, it’s me.
We’ve been inseparable for what seems like forever, we have been through a lot. Unfortunately, I don’t think that our relationship is going to work out.
I’ve done my best to be patient and even offered suggestions on how we could make things better. I know that you have been improving and even updated your site. I just feel that I need to be better connected to other resources.
Maybe I’m giving up on you too soon.
I’ll miss you,
Sean
P.S. I thought you should know that I’ve been seeing PDBsum lately.
The Catalytic Site Atlas (CSA) is a database that displays active sites and catalytic residues of enzymes (ref). The database is regularly maintained, although it has not been updated since August of this year. It currently contains 25,537 entries based on 968 literature references.
The site has a number of search options (located at the top of their page) that include PDB, Swiss-Prot code and EC number.
Below is an example of some of the results that are produced by the CSA. The catalytic residue in this case is an aspartate located at position 93. I would like to see the catalytic residues highlighted differently so that they can be identified quickly.

The CSA also performs a homology search using PSI-BLAST. This information is very helpful if you are looking for evolutionary relationships between proteins (ref).
The CSA help page is really good (although a number of links to it are dead) if you would like more information or are having trouble interpreting the results.