## Secondary Structures of Minimum Free Energies

All nucleic acids share a universal driving force of structure formation: the stacking of hydrogen-bonded base pairs. Single- and double-stranded RNA and DNA are shaped alike by base pairing and base pair stacking. Single-stranded molecules have coarse-grained structures, so-called secondary structures (Fig. 2.2), which are physically meaningful as folding intermediates (Thirumalai et al., 2001) and at the same time accessible to analysis by means of mathematical tools. This contrasts with the situation in protein structure formation, where secondary structures play a much less important role. Base pairing, indeed, turns structure prediction into a problem of discrete mathematics (Schuster and Stadler, 2004): two nucleotides either do or do not form a base pair. This fact makes the calculation and discussion of RNA folds fairly easy. Counting problems, for example the determination of the number of secondary structures sharing one or more structural elements, can be solved exactly by means of recursion formulae (Hofacker et al., 1998). For long sequences asymptotic expressions are available (an example is given in Section 2.3).

[Figure panels: Sequence, Secondary structure, Spatial structure]

Fig. 2.2 Structure prediction for tRNA^Phe. Prediction is a mapping from sequence space into shape space that is done in two distinct steps: (a) from sequence to secondary structure and then (b) from secondary structure to the full three-dimensional structure. The secondary structure, in essence, is a listing of base pairs in a planar structure graph, which is free of knots and pseudo-knots. The secondary structure can be represented by an equivalent symbolic parenthesis notation (shown below the secondary structure) that allows for mathematical analysis by means of combinatorics and other techniques of discrete mathematics. RNA secondary structures were identified as folding intermediates of RNA molecules (Thirumalai et al., 2001) and can be seen as analogs to molten globules in protein folding (Brion and Westhof, 1997).
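The parenthesis notation and the counting recursions mentioned above can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the text: every pair of positions is allowed to pair (the count is sequence-independent), a hairpin loop must contain at least `min_loop` unpaired bases, and the function names are purely illustrative. The recursion used is the standard one for secondary structures, S(n) = S(n-1) + Σ_k S(k-1)·S(n-k-1), and is not necessarily the exact formula of Hofacker et al. (1998).

```python
from functools import lru_cache


def pairs_from_dot_bracket(structure):
    """Convert a parenthesis (dot-bracket) string into a sorted list of base pairs.

    '(' and ')' mark the two partners of a pair, '.' marks an unpaired base.
    Positions are 0-based.
    """
    stack = []
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError("unexpected symbol %r" % symbol)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def count_structures(n, min_loop=3):
    """Count secondary structures on a chain of n nucleotides.

    Assumes that any two positions may pair, provided at least `min_loop`
    unpaired bases lie between them; the open chain is counted as a structure.
    """

    @lru_cache(maxsize=None)
    def s(length):
        if length <= min_loop + 1:
            return 1  # too short for any pair: only the open chain
        total = s(length - 1)  # last base unpaired
        # Last base paired with position k (1-based); the chain splits into
        # k-1 bases to the left of the pair and length-k-1 bases enclosed by it.
        for k in range(1, length - min_loop):
            total += s(k - 1) * s(length - k - 1)
        return total

    # Fill the cache from short to long chains to keep the recursion shallow.
    for length in range(n + 1):
        s(length)
    return s(n)
```

For example, `pairs_from_dot_bracket("((...))")` lists the two nested pairs of a small hairpin, and `count_structures(n)` grows exponentially with `n`, which is why asymptotic expressions are useful for long chains.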

Computational prediction of RNA minimum free energy secondary structures is an old problem that was solved by means of dynamic programming in the 1980s (Zuker and Stiegler, 1981). The secondary structure is built from substructures, which are assumed to contribute additively to the free energy of the molecule. The free energies are computed from extensive tables of parameters that are derived from the results of thermodynamic and kinetic investigations of model compounds. Steady updates of the empirical parameters used in this approach have improved the quality of the predictions (Matthews et al., 1999, 2004).
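The dynamic programming idea can be sketched in a few lines of Python. This is deliberately not the algorithm of Zuker and Stiegler (1981): in place of the loop-based energy tables it uses a much simpler, hypothetical energy model in which every base pair contributes the same amount, so that minimizing the energy reduces to maximizing the number of pairs (a Nussinov-type recursion). The allowed pairs (Watson-Crick plus GU), the `min_loop` parameter, and the function name are assumptions made for illustration only.

```python
def fold_max_pairs(seq, min_loop=3):
    """Return (number_of_pairs, dot_bracket) for one optimal structure of seq.

    Simplified stand-in for a minimum free energy fold: each allowed base pair
    counts equally, and hairpin loops need at least `min_loop` unpaired bases.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    # best[i][j] = maximum number of pairs within seq[i..j] (inclusive);
    # entries with j - i <= min_loop stay 0.
    best = [[0] * n for _ in range(n)]

    def split_score(i, k, j):
        """Pairs gained if k pairs with j: left part + the pair + enclosed part."""
        left = best[i][k - 1] if k > i else 0
        return left + 1 + best[k + 1][j - 1]

    # Fill the table by increasing subsequence length (additive substructures).
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in allowed:
                    score = max(score, split_score(i, k, j))
            best[i][j] = score

    # Trace back one optimal structure in parenthesis notation.
    structure = ["."] * n
    todo = [(0, n - 1)]
    while todo:
        i, j = todo.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            todo.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in allowed and split_score(i, k, j) == best[i][j]:
                structure[k], structure[j] = "(", ")"
                if k > i:
                    todo.append((i, k - 1))
                todo.append((k + 1, j - 1))
                break

    return (best[0][n - 1] if n else 0), "".join(structure)
```

Calling `fold_max_pairs("GGGAAACCC")` yields a three-pair hairpin. A realistic predictor keeps the same table-filling scheme but scores stacks and loops with measured free energy parameters rather than counting pairs.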
