Team ratchets up accuracy
for identifying protein bits
Posted March 30, 2011

Anyone who has tried to match an unfamiliar bird’s features to its field guide portrait knows that reality rarely provides a perfect comparison to the ideal specimen.
Scientists have faced a similar problem when attempting to decode protein patterns found in living cells – a field known as proteomics. Using mass spectrometry, the technology of choice for protein identification, scientists try to match protein fragments, or peptides, against idealized patterns in peptide databases. The correspondence is often poor: the industry-standard rate of positive peptide identification is usually a dismal 15 to 20 percent.
But using bioinformatics techniques, researchers at Pacific Northwest National Laboratory (PNNL) have developed a pattern-matching algorithm that improves the accuracy of peptide identification by between 50 and 150 percent, compared with standard approaches.
The key to the method, outlined this month in the online edition of the Journal of Proteome Research, was to deconstruct the pattern-matching problem using principles of statistical physics, which mathematically connects the behavior of individual atoms to the observable, measurable behavior of large groups of molecules. The new method allows researchers to compare unknown peptide samples with both a peptide database of ideal samples and a library of experimental peptide samples.
The method is somewhat like having a field guide’s idealized bird plus numerous photographs of real birds in various poses when trying to identify an unfamiliar bird among very similar ones. In the case of mass spectra, the PNNL scientists used a standard procedure for breaking apart proteins into component peptides, then separating the peptides from one another by their mass and charge. The resulting mass spectra are a series of lines of various heights: each line’s position reflects a peptide fragment’s mass-to-charge ratio, and its height signifies the amount of that fragment in the sample.
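
The idea of scoring an unknown spectrum against both an idealized pattern and a real, experimentally observed one can be sketched in a few lines of Python. This is only an illustration and not the PNNL algorithm: it swaps in plain cosine similarity for the statistical-physics scoring described in the paper, and the function names, the equal weighting of the two scores, the bin width, and the toy peak lists are all assumptions made up for the example.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Collapse (m/z, intensity) peaks into {bin index: summed intensity}."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(a, b):
    """Cosine similarity between two binned spectra (0.0 if either is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, theoretical, library, weight=0.5):
    """Score each candidate peptide against the query spectrum.

    `theoretical` holds idealized spectra (the "field guide portrait");
    `library` holds previously observed spectra (the "photographs").
    `weight` is a hypothetical knob for blending the two scores.
    """
    q = bin_spectrum(query)
    scores = {}
    for peptide in set(theoretical) | set(library):
        ideal = cosine_similarity(q, bin_spectrum(theoretical.get(peptide, [])))
        observed = cosine_similarity(q, bin_spectrum(library.get(peptide, [])))
        scores[peptide] = weight * ideal + (1.0 - weight) * observed
    return max(scores, key=scores.get), scores


# Toy data, invented for illustration only: (m/z, relative intensity) pairs.
theoretical = {
    "PEPTIDE_A": [(100.2, 1.0), (200.4, 1.0), (300.1, 1.0)],
    "PEPTIDE_B": [(100.2, 1.0), (250.3, 1.0), (300.1, 1.0)],
}
library = {
    "PEPTIDE_A": [(100.3, 0.3), (200.5, 1.0), (300.2, 0.6)],
    "PEPTIDE_B": [(100.1, 0.9), (250.4, 0.4), (300.3, 0.8)],
}
query = [(100.25, 0.35), (200.45, 0.9), (300.15, 0.55)]

best, scores = best_match(query, theoretical, library)
print(best, {name: round(score, 3) for name, score in scores.items()})
```

In this toy case the library spectrum for the correct peptide resembles the query far more closely than the idealized one does, because it carries realistic peak heights. That is the intuition behind consulting both sources rather than the idealized database alone.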
