Branched polymers can be represented as tree graphs. A one-to-one correspondence exists between a tree graph comprising $N$ labeled vertices and a sequence of $N-2$ integers, known as the Prüfer sequence. Permutations of this sequence yield sequences corresponding to tree graphs with the same vertex-degree distribution but (generally) different branching patterns. By repeatedly shuffling the Prüfer sequence, we have generated large ensembles of random tree graphs, all with the same degree distribution. We also present and apply an efficient algorithm to determine graph distances directly from their Prüfer sequences. From the (Prüfer-sequence-derived) graph distances, 3D size metrics, e.g., the polymer's radius of gyration, $R_g$, and average end-to-end distance, were then calculated using several different theoretical approaches. When our method is applied to ideal randomly branched polymers of different vertex-degree distributions, all their 3D size measures are found to obey the usual $N^{1/4}$ scaling law. Among the branched polymers analyzed are RNA molecules comprising equal proportions of the four (randomly distributed) nucleotides. Prior to Prüfer shuffling of the vertices of their representative tree graphs, these ``random-sequence'' RNAs exhibit an $R_g \sim N^{1/3}$ scaling.
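The correspondence between Prüfer sequences and labeled trees, and the fact that shuffling preserves the degree distribution, can be illustrated with a minimal Python sketch. It uses the textbook decoding procedure with vertices labeled 0 to N-1; the function names (`prufer_to_edges`, `degree_distribution`, `shuffled_tree`) are hypothetical, and the sketch decodes to an explicit edge list rather than reproducing the abstract's algorithm for obtaining graph distances directly from the sequence.

```python
import heapq
import random
from collections import Counter


def prufer_to_edges(seq):
    """Decode a Prüfer sequence (labels 0..N-1, length N-2) into N-1 tree edges."""
    n = len(seq) + 2
    # A vertex that appears k times in the sequence has degree k + 1.
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)  # smallest-labeled current leaf
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; they form the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


def degree_distribution(edges):
    """Sorted list of vertex degrees of the tree given by an edge list."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return sorted(deg.values())


def shuffled_tree(seq, rng):
    """Tree decoded from a random permutation of the Prüfer sequence."""
    shuffled = list(seq)
    rng.shuffle(shuffled)
    return prufer_to_edges(shuffled)


if __name__ == "__main__":
    seq = [3, 3, 3, 4]                 # illustrative sequence, N = 6
    original = prufer_to_edges(seq)
    permuted = shuffled_tree(seq, random.Random(0))
    print(original)
    print(degree_distribution(original) == degree_distribution(permuted))
```

Because a vertex's degree depends only on how often its label occurs in the sequence, any permutation keeps the degree multiset fixed while generally changing which vertices are connected.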
Long RNA molecules are at the core of gene regulation across all kingdoms of life, while also serving as genomes in RNA viruses. Few studies have addressed the basic physical properties of long single-stranded RNAs. Long RNAs with non-repeating sequences usually adopt highly ramified secondary structures and are better described as branched polymers. To test whether a branched polymer model can estimate the overall sizes of large RNAs, we employed fluorescence correlation spectroscopy to examine the hydrodynamic radii of a broad spectrum of biologically important RNAs, ranging from viral genomes to long noncoding regulatory RNAs. The relative sizes of long RNAs measured at low ionic strength correspond well to those predicted by two theoretical approaches that treat the effective branching associated with secondary structure formation: one employing the Kramers theorem for calculating radii of gyration, and the other featuring the metric of maximum ladder distance. Upon addition of multivalent cations, most RNAs are found to be compacted as compared with their original, low ionic-strength sizes. These results suggest that sizes of long RNA molecules are determined by the branching pattern of their secondary structures. We also experimentally validate the proposed computational approaches for estimating hydrodynamic radii of single-stranded RNAs, which use generic RNA structure prediction tools and thus can be universally applied to a wide range of long RNAs.
To optimize binding (and packaging) by their capsid proteins (CP), single-stranded (ss) RNA viral genomes often have local secondary/tertiary structures with high CP affinity, with these ``packaging signals'' serving as heterogeneous nucleation sites for the formation of capsids. Under typical in vitro self-assembly conditions, however, and in particular for the case of many ssRNA viruses whose CP have cationic N-termini, the adsorption of CP by RNA is nonspecific because the CP concentration exceeds the largest dissociation constant for CP-RNA binding. Consequently, the RNA is saturated by bound protein before lateral interactions between CP drive the homogeneous nucleation of capsids. But, before capsids are formed, the binding of protein remains reversible and introduction of another RNA species with a different length and/or sequence is found experimentally to result in significant redistribution of protein. Here we argue that, for a given RNA mass, the sequence with the highest affinity for protein is the one with the most compact secondary structure arising from self-complementarity; similarly, a long RNA steals protein from an equal mass of shorter ones. In both cases, it is the lateral attractions between bound proteins that determine the relative CP affinities of the RNA templates, even though the individual binding sites are identical. We demonstrate this with Monte Carlo simulations, generalizing the Rosenbluth method for excluded-volume polymers to include branching of the polymers and their reversible binding by protein.
For many viruses, the packaging of a single-stranded RNA (ss-RNA) genome is spontaneous, driven by capsid protein (CP)-CP and CP-RNA interactions. Furthermore, for some multipartite ss-RNA viruses, copackaging of two or more RNA molecules is a common strategy. Here we focus on RNA copackaging in vitro by using cowpea chlorotic mottle virus (CCMV) CP and an RNA molecule that is short (500 nucleotides (nts)) compared to the lengths (≈3000 nts) packaged in wild-type virions. We show that the degree of cooperativity of virus assembly depends not only on the relative strength of the CP-CP and CP-RNA interactions but also on the RNA being short: a 500-nt RNA molecule cannot form a capsid by itself, so its packaging requires the aggregation of multiple CP-RNA complexes. By using fluorescence correlation spectroscopy (FCS), we show that at neutral pH and sufficiently low concentrations RNA and CP form complexes that are smaller than the wild-type capsid and that four 500-nt RNAs are packaged into virus-like particles (VLPs) only upon lowering the pH. Further, a variety of bulk-solution techniques confirm that fully ordered VLPs are formed only upon acidification. On the basis of these results, we argue that the observed high degree of cooperativity involves equilibrium between multiple CP/RNA complexes.
The comment by Stephen Harvey in this issue of the Biophysical Journal concludes with two statements regarding my recent letter about DNA packaging into viral capsids. Harvey agrees with my interpretation of the origin of the large confinement entropy predicted by the molecular-dynamics simulations of his group, and its sensitive dependence on the molecular parameters of their wormlike chain model of double-stranded DNA. On the other hand, he doubts my assertion that the confinement entropy is already included in the interstrand repulsion free energy derived from osmotic stress measurements, which constitutes the major contribution to the packaging free energy used in recent continuum theories of this process. Harvey suggests instead that the confinement entropy should be added to this free energy as a separate term (using, for instance, the method described in my letter). I will argue that this addition is redundant, and, in a brief discussion of continuum theories, will also discuss his comments as they relate to the work of other researchers.
A majority of viruses are composed of long single-stranded genomic RNA molecules encapsulated by protein shells with diameters of just a few tens of nanometers. We examine the extent to which these viral RNAs have evolved to be physically compact molecules to facilitate encapsulation. Measurements of equal-length viral, non-viral, coding and non-coding RNAs show viral RNAs to have among the smallest sizes in solution, i.e., the highest gel-electrophoretic mobilities and the smallest hydrodynamic radii. Using graph-theoretical analyses we demonstrate that their sizes correlate with the compactness of branching patterns in predicted secondary structure ensembles. The density of branching is determined by the number and relative positions of 3-helix junctions, and is highly sensitive to the presence of rare higher-order junctions with 4 or more helices. Compact branching arises from a preponderance of base pairing between nucleotides close to each other in the primary sequence. The density of branching represents a degree of freedom optimized by viral RNA genomes in response to the evolutionary pressure to be packaged reliably. Several families of viruses are analyzed to delineate the effects of capsid geometry, size and charge stabilization on the selective pressure for RNA compactness. Compact branching has important implications for RNA folding and viral assembly.
When two oppositely charged macroions are brought into contact, a large fraction of the mobile counterions that previously surrounded each isolated macromolecule is released into the bulk solution, thereby increasing the counterions' translational entropy. The entropy gain associated with this counterion release mechanism is the driving force for various macroion binding processes, such as protein-membrane, protein-DNA, and DNA-membrane complexation. In this review we focus on the role of counterion release in the interaction between charged macromolecules and oppositely charged lipid membranes. The electrostatic interaction is generally coupled to other degrees of freedom of the membrane, or of the adsorbed macroion. Thus, for example, when a basic protein adsorbs onto a binary fluid membrane comprising anionic and neutral lipids then, in addition to the release of the mobile counterions to the bulk solution, the protein polarizes the membrane composition by attracting the charged lipids to its immediate vicinity. This process, which enhances the electrostatic attraction, is partly hampered by the concomitant loss of two-dimensional (2D) lipid mixing entropy, so that the resulting lipid distribution reflects the balance between these opposing tendencies. In membranes containing both monovalent and multivalent lipids, as is often the case with biological membranes, the peripheral protein preferentially interacts with (and thus immobilizes) the multivalent lipids, because a smaller number of these lipids are needed to neutralize its charge. The monovalent ``counterlipids'' are thus free to translate in the remaining area of the membrane. This entropy-driven counterlipid release mechanism in 2D is analogous to the extensively studied phenomenon of DNA condensation by polyvalent cations in 3D. Being self-assembled fluid aggregates, lipid bilayers can respond to interactions with peripheral or integral (whether charged or neutral) macromolecules in various ways. 
Of particular interest in this review is the interplay between electrostatic interactions, the lipid composition degrees of freedom mentioned above, and the membrane curvature elasticity, as will be discussed in some detail in the context of the thermodynamic stability and phase behavior of lipid-DNA complexes (also known as ``lipoplexes''). This article is primarily theoretical, but the systems and phenomena considered are directly related to and motivated by specific experiments. The theoretical modeling is generally based on mean-field level approaches, specifically the Poisson-Boltzmann theory for electrostatic interactions, sometimes in conjunction with coarse-grained computer simulations.
Inspired by novel single-molecule and bulk solution measurements, the physics underlying the forces and pressures involved in DNA packaging into bacteriophage capsids became the focus of numerous recent theoretical models. These fall into two general categories: continuum-elastic theories (CT) and simulation studies, mostly of the molecular dynamics (MD) genre. Both types of models account for the dependence of the force, and hence the packaging free energy ($\Delta F$), on the loaded DNA length, but differ markedly in interpreting its origin. While DNA confinement entropy is a dominant contribution to $\Delta F$ in the MD simulations, in the continuum theories this role is fulfilled by interstrand repulsion, and there is no explicit entropy term. The goal of this letter is to resolve this apparent contradiction, elucidate the origin of the entropic term in the MD simulations, and point out its tacit presence in the CT treatments.
The equilibrium constants of trans and cis dimerization of membrane-bound (2D) and freely moving (3D) adhesion receptors are expressed and compared using elementary statistical thermodynamics. Both processes are mediated by the binding of extracellular subdomains whose range of motion in the 2D environment is reduced upon dimerization, defining a thin reaction shell where dimer formation and dissociation take place. We show that the ratio between the 2D and 3D equilibrium constants can be expressed as a product of individual factors describing, respectively, the spatial ranges of motions of the adhesive domains, and their rotational freedom within the reaction shell. The results predicted by the theory are compared to those obtained from a novel, to our knowledge, dynamical simulation methodology, whereby pairs of receptors perform realistic translational, internal, and rotational motions in 2D and 3D. We use cadherins as our model system. The theory and simulations explain how the strengths of cis and trans interactions of adhesive receptors are affected both by their presence in the constrained intermembrane space and by the 2D environment of membrane surfaces. Our work provides fundamental insights as to the mechanism of lateral clustering of adhesion receptors after cell-cell contact and, more generally, to the formation of lateral microclusters of proteins on cell surfaces.
We show on general theoretical grounds that the two ends of single-stranded (ss) RNA molecules (consisting of roughly equal proportions of A, C, G and U) are necessarily close together, largely independent of their length and sequence. This is demonstrated to be a direct consequence of two generic properties of the equilibrium secondary structures, namely that the average proportion of bases in pairs is ~60% and that the average duplex length is ~4. Based on mfold and Vienna computations on large numbers of ssRNAs of various lengths (1000-10,000 nt) and sequences (both random and biological), we find that the 5'-3' distance, defined as the sum of H-bond and covalent (ss) links separating the ends of the RNA chain, is small, averaging 15-20 for each set of viral sequences tested. For random sequences this distance is ~12, consistent with the theory. We discuss the relevance of these results to evolved sequence complementarity and specific protein binding effects that are known to be important for keeping the two ends of viral and messenger RNAs in close proximity. Finally we speculate on how our conclusions imply indistinguishability in size and shape of equilibrated forms of linear and covalently circularized ssRNA molecules.
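One way to make the 5'-3' distance concrete is to treat a secondary structure as a graph whose edges are backbone (covalent) links between consecutive nucleotides and H-bond links between paired bases, and to count the links on the shortest path between the two ends. The Python sketch below does this for a structure in dot-bracket notation. Reading the definition as a shortest path, the pseudoknot-free dot-bracket input, and the function names are all assumptions made for illustration; the abstract's own counting procedure may differ in detail.

```python
from collections import deque


def pair_table(dot_bracket):
    """Map each paired position to its partner (pseudoknot-free input assumed)."""
    stack, partner = [], {}
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i] = j
            partner[j] = i
    return partner


def end_to_end_distance(dot_bracket):
    """Fewest links (backbone or H-bond) between the first and last nucleotide."""
    n = len(dot_bracket)
    if n == 0:
        raise ValueError("empty structure")
    partner = pair_table(dot_bracket)
    dist = {0: 0}
    queue = deque([0])
    while queue:  # breadth-first search from the 5' end
        i = queue.popleft()
        if i == n - 1:
            break
        neighbors = [i - 1, i + 1]
        if i in partner:
            neighbors.append(partner[i])
        for j in neighbors:
            if 0 <= j < n and j not in dist:
                dist[j] = dist[i] + 1
                queue.append(j)
    return dist[n - 1]


if __name__ == "__main__":
    # Two unpaired bases, one crossed stem, two unpaired bases.
    print(end_to_end_distance("..((...)).."))
```

In this picture a stem in the exterior loop costs a single H-bond link no matter how many nucleotides it encloses, which is the intuition behind the ends staying close as the chain grows.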
We introduce a simple model for folding random-sequence RNA molecules, arguing that it provides a direct route to predicting and rationalizing several average properties of RNA secondary structures. The first folding step involves identifying the longest possible duplex, thereby dividing the molecule into a pair of daughter loops. Successive steps involve identifying similarly the longest duplex in each new pair of daughter loops, with this process proceeding sequentially until the loops are too small for a viable duplex to form. Approximate analytical solutions are found for the average fraction of paired bases, the average duplex length, and the average loop size, all of which are shown to be independent of sequence length for long enough molecules. Numerical solutions to the model provide estimates for these average secondary structure properties that agree well with those obtained from more sophisticated folding algorithms. We also use the model to derive the asymptotic power law for the dependence of the maximum ladder distance on chain length.
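The recursive longest-duplex procedure described above can be sketched in a few lines of Python. This is a simplified illustration under several stated assumptions, not the authors' implementation: only Watson-Crick pairs are allowed, a hairpin must enclose at least `min_loop` unpaired bases, a duplex is "viable" only if it has at least `min_duplex` pairs, ties are broken by scan order, and the outer daughter loop is treated as the linear concatenation of the two flanking segments. All names and default values are hypothetical.

```python
# Assumption: Watson-Crick pairs only (no G-U wobble pairs).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def longest_duplex(seq, min_loop=3):
    """Return (k, i, j): the longest run of stacked pairs (i, j), (i+1, j-1), ..."""
    n = len(seq)
    d = [[0] * n for _ in range(n)]  # d[i][j] = stack length starting at pair (i, j)
    best = (0, 0, 0)
    for span in range(min_loop + 1, n):  # span = j - i, shortest spans first
        for i in range(n - span):
            j = i + span
            if (seq[i], seq[j]) in PAIRS:
                d[i][j] = 1 + d[i + 1][j - 1]
                if d[i][j] > best[0]:
                    best = (d[i][j], i, j)
    return best


def fold(seq, min_duplex=3, min_loop=3):
    """Lengths of all duplexes found by repeatedly splitting loops at the longest duplex."""
    duplexes, loops = [], [seq]
    while loops:
        loop = loops.pop()
        k, i, j = longest_duplex(loop, min_loop)
        if k < min_duplex:
            continue  # loop too small (or too non-complementary) for a viable duplex
        duplexes.append(k)
        loops.append(loop[i + k : j - k + 1])   # inner daughter loop
        loops.append(loop[:i] + loop[j + 1 :])  # outer daughter loop (simplified as linear)
    return duplexes


if __name__ == "__main__":
    seq = "GGGGAAAACCCC"  # illustrative sequence
    duplexes = fold(seq)
    print(duplexes, 2 * sum(duplexes) / len(seq))  # duplex lengths, paired fraction
```

From the returned duplex lengths one can estimate the average duplex length and the fraction of paired bases (twice the summed duplex lengths divided by the sequence length), the kinds of averages the model is meant to predict.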
Because of the branching arising from partial self-complementarity, long single-stranded (ss) RNA molecules are significantly more compact than linear arrangements (e.g., denatured states) of the same sequence of monomers. To elucidate the dependence of compactness on the nature and extent of branching, we represent ssRNA secondary structures as tree graphs which we treat as ideal branched polymers, and use a theorem of Kramers for evaluating their root-mean-square radius of gyration, $\hat{R}_g = \sqrt{\langle R_g^2 \rangle}$. We consider two sets of sequences, random and viral, with nucleotide sequence lengths ($N$) ranging from 100 to 10,000. The RNAs of icosahedral viruses are shown to be more compact (i.e., to have smaller $\hat{R}_g$) than the random RNAs. For the random sequences we find that $\hat{R}_g$ varies as $N^{1/3}$. These results are contrasted with the scaling of $\hat{R}_g$ for ideal randomly branched polymers ($N^{1/4}$), and with that from recent modeling of (relatively short, $N \le 161$) RNA tertiary structures ($N^{2/5}$). © 2011 American Institute of Physics. [doi: 10.1063/1.3652763]
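For an ideal branched polymer represented as a tree with $N$ vertices and bond length $b$, Kramers' theorem gives $\langle R_g^2 \rangle = (b^2/N^2) \sum_{\mathrm{bonds}} n\,(N-n)$, where cutting a bond splits the tree into pieces of $n$ and $N-n$ vertices. The Python sketch below evaluates this sum from an edge list; the function name, the unit bond length, and the edge-list input format are assumptions made for illustration, and mapping an RNA secondary structure onto such a tree is not shown.

```python
from collections import defaultdict


def kramers_rg2(edges, bond_length=1.0):
    """Mean-square radius of gyration of an ideal tree (N >= 2) via Kramers' theorem."""
    n = len(edges) + 1  # a tree with N vertices has N - 1 bonds
    adjacency = defaultdict(list)
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)

    # Breadth-first order from an arbitrary root; parents precede children.
    root = edges[0][0]
    parent = {root: None}
    order = [root]
    for v in order:  # the list grows while we iterate over it
        for w in adjacency[v]:
            if w not in parent:
                parent[w] = v
                order.append(w)

    # Accumulate subtree sizes from the leaves inward; the bond above vertex v
    # separates size[v] vertices from the remaining n - size[v].
    size = {v: 1 for v in order}
    total = 0
    for v in reversed(order):
        p = parent[v]
        if p is not None:
            total += size[v] * (n - size[v])
            size[p] += size[v]
    return bond_length ** 2 * total / n ** 2


if __name__ == "__main__":
    chain = [(0, 1), (1, 2), (2, 3)]  # linear 4-vertex tree
    star = [(0, 1), (0, 2), (0, 3)]   # 3-arm star with the same N
    print(kramers_rg2(chain), kramers_rg2(star))
```

For a linear chain the sum reduces to $b^2 (N^2-1)/(6N)$, the familiar ideal-chain result, and a star with the same number of vertices gives a smaller value, which is the sense in which branching makes a polymer more compact.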
Membrane-bound receptors often form large assemblies resulting from binding to soluble ligands, cell-surface molecules on other cells and extracellular matrix proteins(1). For example, the association of membrane proteins with proteins on different cells (trans-interactions) can drive the oligomerization of proteins on the same cell(2) (cis-interactions). A central problem in understanding the molecular basis of such phenomena is that equilibrium constants are generally measured in three-dimensional solution and are thus difficult to relate to the two-dimensional environment of a membrane surface. Here we present a theoretical treatment that converts three-dimensional affinities to two dimensions, accounting directly for the structure and dynamics of the membrane-bound molecules. Using a multiscale simulation approach, we apply the theory to explain the formation of ordered, junction-like clusters by classical cadherin adhesion proteins. The approach features atomic-scale molecular dynamics simulations to determine interdomain flexibility, Monte Carlo simulations of multidomain motion and lattice simulations of junction formation(3). A finding of general relevance is that changes in interdomain motion on trans-binding have a crucial role in driving the lateral, cis-, clustering of adhesion receptors.
Intercellular junctions formed by cadherins, including desmosomes and adherens junctions, comprise two-dimensional arrays of ``trans'' dimers formed between monomers emanating from opposing cell surfaces. Lateral ``cis'' interfaces between cadherins from the same cell surface have been proposed to play a role in cadherin clustering. Although the molecular details of cis interactions remain uncertain, they must define an anisotropic arrangement where binding is favorable only in certain orientations. Here we report Monte Carlo simulations performed on a 2D lattice constructed to account for the anisotropy in cadherin cis interactions. A crucial finding is that the ``phase transition'' between freely diffusing cadherin monomers and dimers and a condensed ordered 2D junction formed by dimers alone is a cooperative process involving both trans and cis interactions. Moreover, cis interactions, despite being too weak to be measured in solution, are critical to the formation of an ordered junction structure. We discuss these results in light of available experimental information on cadherin binding free energies that are transformed from their bulk solution values to interaction energies on a 2D lattice.
Cross-linking proteins can mediate the emergence of rigid bundles from a dense branched network of actin filaments. To enable their binding, the filaments must first bend towards each other. We derive an explicit criterion for the onset of bundling, in terms of the initial length of filaments L, their spacing b, and cross-linker concentration f, reflecting the balance between bending and binding energies. Our model system contains actin, the branching complex Arp2/3 and the bundling protein fascin. In the first distinct stage, during which only actin and Arp2/3 are active, an entangled aster-like mesh of actin filaments is formed. Tens of seconds later, when filaments at the aster periphery are long and barely branched, a sharp transition takes place into a star-like structure, marking the onset of bundling. Now fascin and actin govern bundle growth; Arp2/3 plays no role. Using kinetic Monte Carlo simulations we calculate the temporal evolution of b and L, and predict the onset of bundling as a function of f. Our predictions are in good qualitative agreement with several new experiments that are reported herein and demonstrate how f controls the aster-star transition and bundle length. We also present two models for aster growth corresponding to different experimental realizations. The first treats filament and bundle association as an irreversible sequence of elongation-association steps. The second, applicable for low f, treats bundling as a reversible self-assembly process, where the optimal bundle size is dictated by the balance between surface and bending energies. Finally, we discuss the relevance of our conclusions for the lamellipodium to filopodia transition in living cells, noting that bundles are more likely nucleated by ``tip complex'' cross-linkers (e.g. mDia2 or Ena/VASP), whereas fascin is mainly involved in bundle maintenance.
Many cell-cell adhesive events are mediated by the dimerization of cadherin proteins presented on apposing cell surfaces. Cadherin-mediated processes play a central role in the sorting of cells into separate tissues in vivo, but in vitro assays aimed at mimicking this behavior have yielded inconclusive results. In some cases, cells that express different cadherins exhibit homotypic cell sorting, forming separate cell aggregates, whereas in other cases, intermixed aggregates are formed. A third pattern is observed for mixtures of cells expressing either N- or E-cadherin, which form distinct homotypic aggregates that adhere to one another through a heterotypic interface. The molecular basis of cadherin-mediated cell patterning phenomena is poorly understood, in part because the relationship between cellular adhesive specificity and intermolecular binding free energies has not been established. To clarify this issue, we have measured the dimerization affinities of N-cadherin and E-cadherin. These proteins are similar in sequence and structure, yet are able to mediate homotypic cell patterning behavior in a variety of tissues. N-cadherin is found to form homodimers with higher affinity than does E-cadherin and, unexpectedly, the N/E-cadherin heterophilic binding affinity is intermediate in strength between the two homophilic affinities. We can account for observed cell aggregation behaviors by using a theoretical framework that establishes a connection between molecular affinities and cell-cell adhesive specificity. Our results illustrate how graded differences between different homophilic and heterophilic cadherin dimerization affinities can result in homotypic cell patterning and, more generally, show how proteins that are closely related can, nevertheless, be responsible for highly specific cellular adhesive behavior.