LAHMA

Local Annotation of Homology-Matched Amino acids

LAHMA info and usage


The Local Annotation of Homology-Matched Amino acids (LAHMA) website is a tool to help you get the most knowledge out of a protein structure model. LAHMA compares every protein chain in your model of interest with all homologous chains in the PDB-REDO data bank at a sequence similarity level of 70% or more. It also computes many metrics that do not depend on homologs but give additional information on the protein.

Usage:
  1. Input a PDB identifier in the input field on the home page. Only PDB-REDO entries are in the LAHMA database.
  2. Information is displayed below the protein sequence(s). Hovering over the question marks shows additional information on the parameters; a more extensive description of the parameters is given to the right.
  3. Clicking on any residue in the protein sequence will pop up a new window that compares that residue with its homologous residues.
  4. On the residue page, mouseovers show more information on the residues and highlight the residue in the other plots. Again, hovering over the question marks shows what is plotted.

Some parameters in LAHMA have been derived from DSSP and HSSP. See the credits page for references to these sources.

FAQs


The PDB entry I am looking for is not in the database. Why?

The LAHMA database only contains PDB-REDO entries. Check if your entry is in PDB-REDO here. If the entry was released by the PDB less than a week ago, the entry may not yet have been added to the database.

I cannot solve my issue. How can I contact the developers?

LAHMA was created in the PDB-REDO group of Robbie Joosten and Anastassis Perrakis. They can be contacted at r.joosten at nki.nl and a.perrakis at nki.nl, respectively.

Which browsers are supported?

Currently, this website should work on Firefox, Chrome, Safari and Microsoft Edge. Internet Explorer and Opera are not (yet) supported. Also, the website is not designed for use on small (phone) screens.

This service is awesome. How can I cite it?

At this point, there is no citation available. It will be added once LAHMA has been published.

Explanation of LAHMA parameters


Ramachandran and Rotamer Z-score

The Z-score of the phi/psi or chi1/chi2 torsion angles, adapted from the original algorithm in WHAT_CHECK (Hooft et al., 1997). As this is a Z-score for one residue, it indicates if the angles are in a well-populated part of the distribution or not.

Ramachandran classification

The Ramachandran angles of all homologous residues are classified using k-means clustering. Big outliers are more than 5 sigma from the cluster center, small outliers more than 3 sigma. There are also minimal absolute differences for outliers and between classes: these are given in the paper describing LAHMA.
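
The sigma thresholds above can be sketched as a small function. This is only an illustration: the function name and its arguments are hypothetical, the way LAHMA measures the deviation and sigma of a cluster is not described here, and the minimal absolute differences mentioned above are left out.

```python
def outlier_class(deviation, sigma):
    """Label a residue by its distance from the cluster center, in units of sigma.

    Assumption: `deviation` and `sigma` are in the same unit (e.g. degrees) and
    sigma > 0. The extra minimal absolute differences used by LAHMA are NOT
    applied here.
    """
    n_sigma = deviation / sigma
    if n_sigma > 5.0:
        return "big outlier"
    if n_sigma > 3.0:
        return "small outlier"
    return "no outlier"


if __name__ == "__main__":
    for deviation in (20.0, 35.0, 60.0):
        print(deviation, outlier_class(deviation, sigma=10.0))
```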

RSCC scores

The real-space correlation coefficient (RSCC) Z-score indicates how well a residue fits the density compared to the other residues in the protein. It is calculated as described by Joosten et al., 2014.

Relative Z-scores (Ramachandran, rotamer, and RSCC)

The initial Z-scores are compared across homologs. If a particular residue has, for instance, a low Ramachandran Z-score, but the homologous residues also have a poor Z-score, it becomes more likely that the protein is in a somewhat strained conformation at this position and less likely that the model is wrong. A high relative Z-score thus indicates that a residue scores well compared to its homologous residues.

Rotamer percentages

The rotamers are defined according to the MolProbity convention (Lovell et al., 2000), in which chi angles around 60° are named p (gauche plus), around -60° m (gauche minus), and around 180° t (trans). We display the percentage of homologous residues with the same rotamer. A second, homology-based percentage is also given, which takes the number of different rotamers at a position into account. For instance, if a Lys rotamer is found 25% of the time but there are 20 different rotamers across homologs, it occurs relatively often and this number is high. The opposite is true for a Ser rotamer found in 25% of the cases where only 2 different rotamers are found at that position.
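
As an illustration of the p/m/t naming and of the first percentage, here is a minimal sketch. It assumes that each chi angle is simply assigned to the nearest of the three ideal values, which is a simplification; the function names are hypothetical, and the second, homology-based percentage is not reproduced because its formula is not given here.

```python
def chi_label(chi_deg):
    """Name one chi angle p, m, or t by the nearest of 60, -60, and 180 degrees."""
    centers = {"p": 60.0, "m": -60.0, "t": 180.0}

    def circular_diff(a, b):
        # Smallest absolute difference between two angles, in degrees.
        return abs((a - b + 180.0) % 360.0 - 180.0)

    return min(centers, key=lambda name: circular_diff(chi_deg, centers[name]))


def rotamer_name(chi_angles):
    """Concatenate the labels of chi1, chi2, ... into a rotamer name such as 'mt'."""
    return "".join(chi_label(chi) for chi in chi_angles)


def same_rotamer_percentage(query, homolog_rotamers):
    """Percentage of homologous residues that have the same rotamer name as the query."""
    if not homolog_rotamers:
        return 0.0
    same = sum(1 for name in homolog_rotamers if name == query)
    return 100.0 * same / len(homolog_rotamers)


if __name__ == "__main__":
    query = rotamer_name([-65.0, 175.0])
    print(query, same_rotamer_percentage(query, ["mt", "mt", "tt", "pt"]))
```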

Cis/trans angles

If the omega angle is more than 30° from the ideal cis or trans value, it is given as distorted. Otherwise it is cis (close to 0°) or trans (close to 180°). By counting the percentage of homologous residues that is either cis or trans, we find which conformation is most common and thus assign minority conformations, or, if it is very rare, outlier conformations.
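
The 30° rule can be written down as follows. This is a minimal sketch with hypothetical function names; it covers only the cis/trans/distorted label of a single residue, not the minority and outlier assignment across homologs.

```python
def angle_diff(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def classify_omega(omega_deg, tolerance=30.0):
    """Label an omega angle as cis, trans, or distorted."""
    if angle_diff(omega_deg, 0.0) <= tolerance:
        return "cis"
    if angle_diff(omega_deg, 180.0) <= tolerance:
        return "trans"
    # More than `tolerance` degrees away from both ideal values.
    return "distorted"


if __name__ == "__main__":
    for omega in (-175.0, 10.0, 120.0):
        print(omega, classify_omega(omega))
```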

C-alpha torsion angles

The torsion angle over four consecutive C-alpha atoms is computed as a measure of the local backbone conformation. For residue i, the four C-alphas involved are those of residues i-2, i-1, i, and i+1. As in the Ramachandran classification, the angles are clustered with k-means clustering to determine outliers and minority conformations.
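
For readers who want to reproduce such an angle, the sketch below computes a torsion angle from four points with the standard atan2 formula and applies it to the C-alphas of residues i-2, i-1, i, and i+1. The function names and the input layout (a list of C-alpha coordinates, one per residue, in chain order) are assumptions made for this example.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion(p0, p1, p2, p3):
    """Torsion angle in degrees (-180 to 180) defined by four points."""
    u1 = _sub(p1, p0)
    u2 = _sub(p2, p1)
    u3 = _sub(p3, p2)
    n23 = _cross(u2, u3)
    y = math.sqrt(_dot(u2, u2)) * _dot(u1, n23)
    x = _dot(_cross(u1, u2), n23)
    return math.degrees(math.atan2(y, x))


def ca_torsion(ca_coords, i):
    """C-alpha torsion for residue index i, or None near the chain ends."""
    if i - 2 < 0 or i + 1 >= len(ca_coords):
        return None
    return torsion(ca_coords[i - 2], ca_coords[i - 1], ca_coords[i], ca_coords[i + 1])


if __name__ == "__main__":
    # Toy coordinates, not taken from a real structure.
    toy_chain = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
    print(ca_torsion(toy_chain, 2))
```

Residues within two positions of the start of the chain or at its very end have no complete set of four C-alphas, so the sketch returns None for them.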

Hydrogen bonds

Hydrogen bonds are computed as described in van Beusekom et al., 2018. Note that we only compute protein-protein hydrogen bonds. The H-bonds donated and accepted are summed to get the total H-bond count.

Symmetry contacts

A symmetry contact between two residues is defined here as a distance smaller than 3.5 Å between any two atoms of the residue pair, one of which is generated by performing symmetry operations.

Secondary structure

The secondary structure is obtained directly from DSSP.

Surface accessibility

The surface accessibility is also obtained from DSSP. The number that is shown is a percentage of the maximum accessibility of that residue (obtained by getting the surface accessibility from DSSP from a PDB file containing that single residue).

Post-translational modifications

Post-translational modifications are found by checking the residue name (in case of phosphorylation, methylation, acetylation, carboxylation, hydroxylation, sulphation, oxidation, and pyroglutamation) and by checking for the presence of LINK records (in case of glycosylation). An example of the former is the residue TPO, which is a phosphorylated THR. An example of the latter is an NAG that is linked to ASN in N-glycosylation.

PDB sequence conservation and orderedness

These numbers simply represent the percentage of homologous residues that have the same residue type and that are modeled, respectively.

HSSP sequence conservation and entropy

HSSP performs a BLAST search on PDB entries to search for (remote) homologs both in and out of the PDB. The percentage of homologs with the same residue type is given per position. HSSP also computes an entropy measure, which shows how variable the residue composition is. The entropy is maximal if each of the 20 amino acid types is found in 5% of the homologs; the entropy is far lower if one residue type occurs 81% of the time and the others all 1% of the time.
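
The two cases mentioned above can be checked numerically. The sketch below assumes a plain Shannon entropy with the natural logarithm; the exact scaling used by HSSP is not stated here, so the absolute numbers are illustrative only, and the function name is hypothetical.

```python
import math


def shannon_entropy(fractions):
    """Shannon entropy (natural log) of residue-type fractions that sum to 1."""
    return -sum(p * math.log(p) for p in fractions if p > 0.0)


if __name__ == "__main__":
    uniform = [0.05] * 20           # all 20 residue types at 5%
    skewed = [0.81] + [0.01] * 19   # one type at 81%, the other 19 at 1%
    print(round(shannon_entropy(uniform), 3))  # equals ln(20), about 2.996
    print(round(shannon_entropy(skewed), 3))   # about 1.046
```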

Ligand contacts

A contact is given if any two atoms in a given residue+ligand pair are within 3.5 Å of one another.
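
A minimal sketch of this distance test, assuming atoms are given as (x, y, z) tuples in Å; the function name and the input layout are hypothetical.

```python
import math


def in_contact(residue_atoms, ligand_atoms, cutoff=3.5):
    """True if any residue atom lies within `cutoff` Å of any ligand atom."""
    return any(
        math.dist(a, b) <= cutoff
        for a in residue_atoms
        for b in ligand_atoms
    )


if __name__ == "__main__":
    # Toy coordinates, not taken from a real structure.
    residue = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    ligand = [(4.5, 0.0, 0.0), (9.0, 0.0, 0.0)]
    print(in_contact(residue, ligand))
```

For large structures, an all-against-all loop like this becomes slow; a spatial grid or k-d tree is the usual remedy, but that is beyond this sketch.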

B-factor ratio

The B-factor ratio for a particular amino acid is the average B factor of the atoms of that amino acid divided by the average B factor of the protein structure model.
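
As a worked example of this ratio, the sketch below uses hypothetical B factors; the function name and the flat lists of per-atom B factors are assumptions made for illustration.

```python
def b_factor_ratio(residue_b_factors, model_b_factors):
    """Average B factor of one residue divided by the average over the whole model."""
    residue_mean = sum(residue_b_factors) / len(residue_b_factors)
    model_mean = sum(model_b_factors) / len(model_b_factors)
    return residue_mean / model_mean


if __name__ == "__main__":
    # Hypothetical values: a residue with mean B of 40 in a model with mean B of 30.
    print(b_factor_ratio([40.0, 40.0], [20.0, 20.0, 40.0, 40.0]))
```

A ratio above 1 means the residue has higher B factors than the model average; a ratio below 1 means lower.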