PDBalert

28 February 2009

PDBalert is a free, web-based, automatic system that alerts users as soon as a PDB structure with homology to a protein of interest becomes available. Users simply need to upload their protein sequences of interest. Once a week (on Wednesdays), when new structures are released to the PDB, PDBalert compares the new structures, even those that are still on hold, with the users’ sequence(s). When a significant match is found, the user is alerted by an email containing a link to the search results. Reminder: have your protein sequence(s) of interest ready when registering.

Why is this application helpful to a crystallographer?

This web application will automatically email you if a protein related to one of your current projects has been deposited in the PDB. PDBalert removes the need to regularly check new PDB entries, and it will also highlight relationships that are more distant than those found with a simple keyword search of the PDB. This information could help you with phasing via molecular replacement or let you know if you have been scooped.

How is the search performed?

The search is done using HHpred, a remote homology detection server (reference). A pairwise comparison is performed using profile hidden Markov models (HMMs) (reference).

One drawback may be that you have to register to use this web application. However, since the system needs to email you when a structure with a similar sequence is deposited, it makes sense that some sort of registration process is required.

The PDBalert system was developed at the Gene Center of the University of Munich. The most recent publication on PDBalert can be found here in pdf format.

As a side note, it is nice to see some programs related to crystallography using Ruby instead of Fortran.

 | Posted by Sean | Categories: Uncategorized | Tagged: , |

How can I quickly find an annotated sequence of my protein?

Try using Pfam 23.0

Click on ‘VIEW A STRUCTURE’
Enter the PDB code (if you would like to simply test this without a code, you can click on ‘example’)
Click on ‘Sequence mapping’

Click on the link under ‘UniProt ID’. This will take you to the UniProt summary page.
To get all the information about your protein of interest, click on the ID shown on the summary page.
On the UniProt entry page, look through the general and sequence annotation information and scan the references.

Why is this tool great?

1) It will save time by taking advantage of the PDB ID, which you probably already have
2) UniProt searches both Swiss-Prot and TrEMBL (so you don’t have to)

You can find out more information about UniProt here.


Basic Linux Commands

26 February 2009

First, you need to open a terminal. You can do this by right-clicking on the desktop and selecting Open Terminal, by going to Applications -> System Tools -> Terminal, or by clicking on the terminal icon that is usually located at the top of the screen (if you hold your cursor over an icon, it will probably tell you what that icon represents).

cd
cd is used to change directories, hence ‘cd’. If you want to enter a specific directory, follow the command with the name of that directory (folder).
For example: cd Crystallography (press Enter after each command) would move you into the directory named Crystallography
-as a side note, inputs in Linux ARE case sensitive
-if you have typed a unique portion of a file or directory name, you can fill in the rest of the name by pressing the Tab key
For example: cd Crys (press Tab)
will result in the following, assuming that Crys matches only one name in that directory:
cd Crystallography

You are always in a directory as soon as you open a terminal (the window in which you type commands). You can navigate to directories hierarchically above or below. If you want to move up one directory, enter the following:
cd ..
If you would like to move up 2 directories, use:
cd ../..
If you would like to return to your home directory (where you are located when you first open the terminal), enter the following:
cd ~ (the ~ key is located at the extreme upper left of most keyboards)
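
The navigation commands above can be tried safely in a throwaway directory. This is only a sketch: the directory names Crystallography and Data are made up for the example, and mktemp is used so that none of your real files are touched.

```shell
# Build a small practice tree in a temporary location.
# "Crystallography" and "Data" are example names, not real project folders.
start=$(mktemp -d)
mkdir -p "$start/Crystallography/Data"

cd "$start/Crystallography/Data"   # move down into Data
cd ..                              # up one level: now in Crystallography
up_one=$(basename "$(pwd)")
echo "After 'cd ..' we are in: $up_one"

cd Data                            # back down into Data
cd ../..                           # up two levels: now in the practice root
up_two=$(pwd)
echo "After 'cd ../..' we are in: $up_two"

cd ~                               # return to your home directory
pwd                                # print where you are now

rm -rf "$start"                    # remove the practice tree
```

The pwd command prints the directory you are currently in, which is a handy check whenever you lose track of where you are.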

If you would like to make a directory use the command:
mkdir XYZ
XYZ stands for the name you would like your directory to have.

If you would like to list the files and folders in a particular directory type this:
ls
-you may notice that files and directories are colored differently.
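
If your terminal does not use colors, the -F option of ls is another way to tell directories from files. The sketch below works in a temporary directory; notes.txt is just an example file name.

```shell
# Work in a temporary directory so nothing real is changed.
scratch=$(mktemp -d)
cd "$scratch"

mkdir XYZ          # make a directory called XYZ
touch notes.txt    # make an empty file to compare against

ls                 # plain listing of the names
ls -F              # same listing, but directories get a trailing /
listing=$(ls -F)   # keep a copy of the marked listing

cd ..              # step out before cleaning up
rm -r "$scratch"   # remove the temporary directory
```

Here XYZ is shown as XYZ/ by ls -F, while notes.txt is shown unchanged.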

alias
Lists your aliases, which are abbreviations for longer commands
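
You can also define your own alias. In this sketch the name ll is only an example (any unused name works), and it assumes a bash-like shell.

```shell
alias ll='ls -l'   # define ll as a short name for "ls -l"
alias ll           # show the definition of just this alias
alias              # list every alias currently defined
```

In an interactive terminal, typing ll will now run ls -l. An alias defined this way lasts only until you close the terminal; with bash, a common way to keep it is to add the alias line to the .bashrc file in your home directory.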

If you need to stop a running program (such as simulated annealing in CNS, which can take hours), hold down the following keys at the same time (similar to capitalizing a letter with the Shift key):
“control” c

You should be able to solve your structures using only the above commands. I don’t want to overwhelm anyone who is new to Linux, but if there are other commands that you feel are critical, please leave me a comment.

Crystallographic Movies

26 February 2009

A number of important crystallographic concepts such as phasing, R factor, resolution, completeness, overloads, low resolution, oscillation and refinement have been brought to life with movies.

The movies will greatly help anyone who is looking to quickly understand the implications of a given parameter. They may also help in understanding why certain value ranges have been deemed acceptable for publication, for example, why a completeness of under 20 percent in the outermost shell is unreasonable.

Finally, I would like to thank James Holton for taking the time to create and provide these works of art so that you too may enjoy them here.

A good method for determining the number of macromolecules in the asymmetric unit is to calculate the Matthews coefficient. This can be done easily using a jiffy program available in the CCP4i suite of programs, aptly called MATTHEWS_COEF.

You need the following information to be able to do this calculation:

1) unit cell parameters

2) space group

3) an estimate of the molecular weight of your macromolecule

What to look for? As a general rule, about 50 percent of the volume of the unit cell is composed of solvent. Look for the number of macromolecules in the asymmetric unit that leaves about 50 percent of the volume as solvent.

For example:

unit cell: a = b = c = 80 Å
space group: P 2 3
molecular weight: 20 kDa
This would yield 42 percent solvent for 1 molecule in the asymmetric unit.
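
The number above can be checked by hand. The sketch below is not the MATTHEWS_COEF program itself; it assumes the usual Matthews formula, Vm = V / (Z x MW x n), with solvent fraction = 1 - 1.23/Vm, where the constant 1.23 is the conventional value for a typical protein. For this example, V = a x a x a because the cell is cubic, and Z = 12 is the number of symmetry operators in P 2 3. The function name matthews is made up for illustration.

```shell
# matthews CELL_EDGE Z MOL_WEIGHT N
#   CELL_EDGE   cubic cell edge in Angstrom (this sketch handles cubic cells only)
#   Z           number of symmetry operators in the space group (12 for P 2 3)
#   MOL_WEIGHT  molecular weight in Da
#   N           number of molecules in the asymmetric unit
matthews() {
    awk -v a="$1" -v z="$2" -v mw="$3" -v n="$4" 'BEGIN {
        volume  = a * a * a                  # cubic unit cell volume, A^3
        vm      = volume / (z * mw * n)      # Matthews coefficient, A^3/Da
        solvent = (1 - 1.23 / vm) * 100      # solvent content, percent
        printf "%d molecule(s): Vm = %.2f A^3/Da, solvent = %.0f%%\n", n, vm, solvent
    }'
}

matthews 80 12 20000 1   # the example above: 1 molecule of 20 kDa
matthews 80 12 20000 2   # the same cell with 2 molecules, for comparison
```

For 1 molecule this gives Vm = 2.13 and 42 percent solvent, in agreement with the example. For 2 molecules the solvent content comes out negative, which is physically impossible, so only 1 molecule fits in the asymmetric unit of this cell.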

Figure: How to find the number of macromolecules in the asymmetric unit? The CCP4i GUI (Program List -> matthews_coef) displaying the appropriate inputs and outputs.

Check out this site if you want to play around with this calculation without using real data or CCP4.
In addition, the site compares your calculation against what has been published in the Protein Data Bank.

The original paper by B. W. Matthews describing this calculation can be found here.