Structural motif
In an unbranched, chain-like biological molecule, such as a protein or a strand of RNA, a structural motif is a three-dimensional structural element or fold within the chain that also appears in a variety of other molecules. Motifs combine secondary and tertiary structure, and can often be described as a particular arrangement of secondary structure elements. Such descriptions are the basis for many of the names that structural biologists give to particular motifs, such as the helix-turn-helix motif. Not every motif name follows this pattern, however; the EF-hand is one exception.
Because the relationship between primary structure and tertiary structure is not straightforward, two biopolymers may share the same motif yet lack appreciable homology to one another in terms of primary structure. In other words, a structural motif need not be associated with a sequence motif. Conversely, the existence of a sequence motif does not necessarily imply a distinctive structure. For most DNA motifs, for example, it is assumed that DNA of that sequence does not deviate from the normal double-helical structure.
Structural motifs in proteins
In proteins, structural motifs usually consist of just a few elements; the helix-turn-helix, for example, has just three. Note that while the spatial sequence of elements is the same in all instances of a motif, they may be encoded in any order within the underlying gene. Protein structural motifs often include loops of variable length and unspecified structure, which in effect create the "slack" necessary to bring together in space two elements that are not encoded by immediately adjacent DNA sequences in a gene. Note also that even when two genes encode the secondary structural elements of a motif in the same order, they may still specify somewhat different sequences of amino acids. This is true not only because of the complicated relationship between tertiary and primary structure, but also because the size of the elements varies from one protein to the next.
Helix-turn-helix
This example comes from the paper by Matsuda and colleagues cited below.
The E. coli lactose operon repressor LacI (PDB id 1lccA) and E. coli catabolite gene activator (PDB id 3gapA) both have a helix-turn-helix motif, but their amino acid sequences do not show much similarity, as shown in the table below.
Matsuda and colleagues devised a code called the 3D chain code for representing a protein structure as a string of letters. This encoding scheme reveals the similarity between the proteins much more clearly than the amino acid sequence:
PDB id | 3D chain code | Amino acid sequence |
1lccA | TWWWWWWWKCLKWWWWWWG | LYDVAEYAGVSYQTVSRVV |
3gapA | KWWWWWWGKCFKWWWWWWW | RQEIGQIVGCSRETVGRIL |
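The contrast in the table can be checked with a simple position-by-position comparison. The Python sketch below counts identical positions between the two 19-letter strings in each column. The function name `identity` is illustrative and is not taken from the paper by Matsuda and colleagues, and plain percent identity is only a rough stand-in for a proper similarity measure.

```python
def identity(a: str, b: str) -> tuple:
    """Return (matching positions, length) for two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches, len(a)


# Strings copied from the table above.
chain_1lcc = "TWWWWWWWKCLKWWWWWWG"
chain_3gap = "KWWWWWWGKCFKWWWWWWW"
seq_1lcc = "LYDVAEYAGVSYQTVSRVV"
seq_3gap = "RQEIGQIVGCSRETVGRIL"

for label, a, b in [
    ("3D chain code", chain_1lcc, chain_3gap),
    ("amino acid sequence", seq_1lcc, seq_3gap),
]:
    m, n = identity(a, b)
    print(f"{label}: {m}/{n} identical ({100 * m / n:.0f}%)")
```

With these strings, 15 of 19 chain-code positions agree, against 5 of 19 amino acid positions.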
The DSSP Code
The DSSP code is widely used to denote the secondary structure of each residue with a single letter. The name derives from the "Dictionary of Protein Secondary Structure" compiled by Kabsch and Sander.
- B = residue in isolated beta-bridge
- E = beta sheet (extended strand, participates in beta ladder)
- G = 3-turn helix (3₁₀ helix)
- H = alpha helix
- I = 5-turn helix (pi helix)
- T = hydrogen bonded turn
- S = bend
Residues that fall into none of these classes are sometimes designated with C (coil) or L (loop).
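As a minimal sketch of how such a per-residue string relates to the description of a motif as an arrangement of secondary structure elements, the Python example below collapses a DSSP-style string into runs of identical letters and looks each letter up in the list above. The input string `example` is made up for illustration and is not taken from any real PDB entry; the names `DSSP_CODES`, `describe`, and `segments` are likewise hypothetical.

```python
from itertools import groupby

# Single-letter codes as listed above.
DSSP_CODES = {
    "B": "residue in isolated beta-bridge",
    "E": "extended strand in beta ladder",
    "G": "3-turn helix",
    "H": "alpha helix",
    "I": "5-turn helix",
    "T": "hydrogen-bonded turn",
    "S": "bend",
}


def describe(code: str) -> str:
    """Return a description of one code; unlisted letters are treated as coil/loop."""
    return DSSP_CODES.get(code, "other (coil or loop)")


def segments(dssp: str) -> list:
    """Collapse a per-residue string into (code, run length) pairs."""
    return [(code, len(list(run))) for code, run in groupby(dssp)]


# Made-up string resembling two helices joined by a short turn.
example = "CHHHHHHHHTTCHHHHHHHHC"

for code, length in segments(example):
    print(f"{code} x{length}: {describe(code)}")
```

Reading the runs in order (helix, turn, helix) gives the kind of element-by-element description from which names such as helix-turn-helix are drawn.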
Compare: structural domain
References
- H. Matsuda, F. Taniguchi and A. Hashimoto. An Approach to Detection of Protein Structural Motifs using an Encoding Scheme of Backbone Conformations. Proc. of 2nd Pacific Symposium on Biocomputing: 280-291 (1997).
- W. Kabsch and C. Sander. Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen Bonded and Geometrical Features. Biopolymers 22: 2577-2637 (1983).
- PROSITE Database of protein families and domains