Table of contents for Biological modeling and simulation : a survey of practical models, algorithms, and numerical methods / Russell Schwartz.

Bibliographic record and links to related information available from the Library of Congress catalog.

Note: Contents data are machine generated based on pre-publication provided by the publisher. Contents may have variations from the printed book or be incomplete or contain other coding.


Counter
Contents
1 Introduction 1
1.1 Overview of Topics . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Examples of Problems in Biological Modeling . . . . . . . . . 3
1.2.1 Optimization . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.2 Simulation and Sampling . . . . . . . . . . . . . . . . . 8
1.2.3 Parameter Tuning . . . . . . . . . . . . . . . . . . . . . 15
iii
I Models for Optimization 23
2 Classic Discrete Optimization Problems 25
2.1 Graph Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.1.1 Minimum Spanning Trees . . . . . . . . . . . . . . . . 29
2.1.2 Shortest Path Problems . . . . . . . . . . . . . . . . . 31
2.1.3 Max Flow/Min Cut . . . . . . . . . . . . . . . . . . . . 36
2.1.4 Matching . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.2 String and Sequence Problems . . . . . . . . . . . . . . . . . . 42
2.2.1 Longest Common Subsequence . . . . . . . . . . . . . . 43
2.2.2 Longest Common Substring . . . . . . . . . . . . . . . 45
2.2.3 Exact Set Matching . . . . . . . . . . . . . . . . . . . . 47
2.3 Mini Case Study: Intraspecies Phylogenetics . . . . . . . . . . 48
3 Hard Discrete Optimization Problems 63
iv
3.1 Graph Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.1.1 Traveling Salesman Problems . . . . . . . . . . . . . . 66
3.1.2 Hard Cut Problems . . . . . . . . . . . . . . . . . . . . 68
3.1.3 Vertex Cover, Independent Set, and k-Clique . . . . . . 69
3.1.4 Graph Coloring . . . . . . . . . . . . . . . . . . . . . . 72
3.1.5 Steiner Trees . . . . . . . . . . . . . . . . . . . . . . . 73
3.1.6 Maximum Subgraph or Induced Subgraph with Property _ 75
3.2 String and Sequence Problems . . . . . . . . . . . . . . . . . . 76
3.2.1 Longest Common Subsequence . . . . . . . . . . . . . . 76
3.2.2 Shortest Common Supersequence/Superstring . . . . . 78
3.3 Set Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.3.1 Minimum Test Set . . . . . . . . . . . . . . . . . . . . 80
3.3.2 Minimum Set Cover . . . . . . . . . . . . . . . . . . . 82
3.4 Hardness Reductions . . . . . . . . . . . . . . . . . . . . . . . 83
v
3.5 What to Do with Hard Problems . . . . . . . . . . . . . . . . 85
4 Case Study: Sequence Assembly 105
4.1 Sequencing Technologies . . . . . . . . . . . . . . . . . . . . . 106
4.1.1 Maxam-Gilbert . . . . . . . . . . . . . . . . . . . . . . 106
4.1.2 Sanger Dideoxy . . . . . . . . . . . . . . . . . . . . . . 110
4.1.3 Automated Sequencing . . . . . . . . . . . . . . . . . . 112
4.1.4 What About Bigger Sequences? . . . . . . . . . . . . . 115
4.2 Computational Approaches . . . . . . . . . . . . . . . . . . . . 117
4.2.1 Sequencing by Hybridization . . . . . . . . . . . . . . . 117
4.2.2 Eulerian Path Method . . . . . . . . . . . . . . . . . . 120
4.2.3 Shotgun Sequencing . . . . . . . . . . . . . . . . . . . 123
4.2.4 Double-Barreled Shotgun . . . . . . . . . . . . . . . . . 127
4.3 The Future? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
vi
4.3.1 SBH Revisited . . . . . . . . . . . . . . . . . . . . . . . 130
4.3.2 New Sequencing Technologies . . . . . . . . . . . . . . 133
5 General Continuous Optimization 137
5.1 Bisection Method . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.2 Secant Method . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.3 Newton-Raphson . . . . . . . . . . . . . . . . . . . . . . . . . 146
5.4 Newton-Raphson with Black-Box Functions . . . . . . . . . . 152
5.5 Multivariate Functions . . . . . . . . . . . . . . . . . . . . . . 154
5.6 Direct Methods for Optimization . . . . . . . . . . . . . . . . 160
5.6.1 Steepest Descent . . . . . . . . . . . . . . . . . . . . . 161
5.6.2 The Levenberg-Marquardt Method . . . . . . . . . . . 163
5.6.3 Conjugate Gradient . . . . . . . . . . . . . . . . . . . . 166
6 Constrained Optimization 171
vii
6.1 Linear Programming . . . . . . . . . . . . . . . . . . . . . . . 173
6.1.1 The Simplex Method . . . . . . . . . . . . . . . . . . . 175
6.1.2 Interior Point Methods . . . . . . . . . . . . . . . . . . 188
6.2 Primals and Duals . . . . . . . . . . . . . . . . . . . . . . . . 194
6.3 Solving Linear Programs in Practice . . . . . . . . . . . . . . . 196
6.4 Non-linear Programming . . . . . . . . . . . . . . . . . . . . . 197
II Simulation and Sampling 203
7 Sampling from Probability Distributions 205
7.1 Uniform Random Variables . . . . . . . . . . . . . . . . . . . . 206
7.2 The Transformation Method . . . . . . . . . . . . . . . . . . . 209
7.2.1 Transformation Method for Joint Distributions . . . . . 213
7.3 The Rejection Method . . . . . . . . . . . . . . . . . . . . . . 216
7.4 Sampling From Discrete Distributions . . . . . . . . . . . . . . 220
viii
8 Markov Models 225
8.1 Time Evolution of Markov Models . . . . . . . . . . . . . . . . 228
8.2 Stationary Distributions and Eigenvectors . . . . . . . . . . . 234
8.3 Mixing Times . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
9 Markov Chain Monte Carlo Sampling 245
9.1 Metropolis Method . . . . . . . . . . . . . . . . . . . . . . . . 247
9.1.1 Generalizing the Metropolis Method . . . . . . . . . . 256
9.1.2 Metropolis as an Optimization Method . . . . . . . . . 257
9.2 Gibbs Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . 261
9.2.1 Gibbs Sampling as an Optimization Method . . . . . . 266
9.3 Importance Sampling . . . . . . . . . . . . . . . . . . . . . . . 270
9.3.1 Umbrella Sampling . . . . . . . . . . . . . . . . . . . . 274
9.3.2 Generalizing to Other Samplers . . . . . . . . . . . . . 275
ix
10 Mixing Times of Markov Models 281
10.1 Formalizing Mixing Time . . . . . . . . . . . . . . . . . . . . . 284
10.2 The Canonical Path Method . . . . . . . . . . . . . . . . . . . 285
10.3 The Conductance Method . . . . . . . . . . . . . . . . . . . . 294
10.4 Final Comments . . . . . . . . . . . . . . . . . . . . . . . . . 301
11 Continuous-Time Markov Models 305
11.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
11.2 Properties of CTMMs . . . . . . . . . . . . . . . . . . . . . . 310
11.3 The Kolmogorov Equations . . . . . . . . . . . . . . . . . . . 315
12 Case Study: Molecular Evolution 323
12.1 DNA Base Evolution . . . . . . . . . . . . . . . . . . . . . . . 324
12.1.1 The Jukes-Cantor (One Parameter) Model . . . . . . . 324
12.1.2 Kimura (Two Parameter) Model . . . . . . . . . . . . . 329
x
12.2 Simulating a Strand of DNA . . . . . . . . . . . . . . . . . . . 333
12.3 Sampling from Whole Populations . . . . . . . . . . . . . . . . 336
12.4 Extensions of the Coalescent . . . . . . . . . . . . . . . . . . . 342
12.4.1 Variable Population Sizes . . . . . . . . . . . . . . . . 343
12.4.2 Population Substructure . . . . . . . . . . . . . . . . . 345
12.4.3 Diploid Organisms . . . . . . . . . . . . . . . . . . . . 345
12.4.4 Recombination . . . . . . . . . . . . . . . . . . . . . . 346
13 Discrete Event Simulation 351
13.1 Generalized Discrete Event Modeling . . . . . . . . . . . . . . 354
13.2 Improving Efficiency . . . . . . . . . . . . . . . . . . . . . . . 356
13.3 Real-world Example: Hard-sphere Model of Molecular Collision
Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
13.4 Supplementary Material:
Calendar Queues . . . . . . . . . . . . . . . . . . . . . . . . . 365
xi
14 Numerical Integration 1: Ordinary Differential Equations 369
14.1 Finite Difference Schemes . . . . . . . . . . . . . . . . . . . . 373
14.2 Forward Euler . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
14.3 Backward Euler . . . . . . . . . . . . . . . . . . . . . . . . . . 381
14.4 Higher-Order Single-Step Methods . . . . . . . . . . . . . . . 385
14.5 Multi-Step Methods . . . . . . . . . . . . . . . . . . . . . . . . 389
14.6 Step Size Selection . . . . . . . . . . . . . . . . . . . . . . . . 393
15 Numerical Integration 2: Partial Differential Equations 399
15.1 Problems of One Spatial Dimension . . . . . . . . . . . . . . . 401
15.2 Initial Conditions and Boundary Conditions . . . . . . . . . . 404
15.3 Aside on Step Sizes . . . . . . . . . . . . . . . . . . . . . . . . 409
15.4 Multiple Spatial Dimensions . . . . . . . . . . . . . . . . . . . 411
15.5 Reaction-Diffusion Equations . . . . . . . . . . . . . . . . . . 412
xii
15.6 Convection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
16 Numerical Integration 3: Stochastic Differential Equations 423
16.1 Modeling Brownian Motion . . . . . . . . . . . . . . . . . . . 424
16.2 Stochastic Integrals and Differential Equations . . . . . . . . . 427
16.3 Integrating SDEs . . . . . . . . . . . . . . . . . . . . . . . . . 431
16.4 Accuracy of Stochastic Integration Methods . . . . . . . . . . 435
16.5 Stability of Stochastic Integration Methods . . . . . . . . . . . 438
17 Case Study: Simulating Cellular Biochemistry 445
17.1 Differential Equation Models . . . . . . . . . . . . . . . . . . . 446
17.2 Markov Models Methods . . . . . . . . . . . . . . . . . . . . . 451
17.3 Hybrid Models . . . . . . . . . . . . . . . . . . . . . . . . . . 456
17.4 Handling Very Large Reaction Networks . . . . . . . . . . . . 460
17.5 The Future of Whole-Cell Models . . . . . . . . . . . . . . . . 462
xiii
III Parameter Tuning 467
18 Parameter Tuning as Optimization 469
18.1 General Optimization . . . . . . . . . . . . . . . . . . . . . . . 471
18.2 Constrained Optimization . . . . . . . . . . . . . . . . . . . . 475
18.3 Evaluating an Implicitly Specified Function . . . . . . . . . . . 478
19 Expectation Maximization 485
19.1 â¿¿The Expectation Maximization Algorithmâ¿¿ . . . . . . . . . . 488
19.2 EM Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 490
19.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
20 Hidden Markov Models 513
20.1 Applications of HMMs . . . . . . . . . . . . . . . . . . . . . . 516
20.2 Algorithms for HMMs . . . . . . . . . . . . . . . . . . . . . . 520
20.2.1 Problem 1: Optimizing State Assignments . . . . . . . 521
xiv
20.2.2 Problem 2: Evaluating Output Probability . . . . . . . 525
20.2.3 Problem 3: Training the Model . . . . . . . . . . . . . 528
20.3 Parameter Tuning Example: Motif Finding by HMM . . . . . 535
21 Linear System Solving 547
21.1 Gaussian Elimination . . . . . . . . . . . . . . . . . . . . . . . 550
21.1.1 Pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . 555
21.2 Iterative Methods . . . . . . . . . . . . . . . . . . . . . . . . . 562
21.3 Krylov Subspace Methods . . . . . . . . . . . . . . . . . . . . 564
21.3.1 Preconditioners . . . . . . . . . . . . . . . . . . . . . . 568
21.4 Over-determined and Under-determined Systems . . . . . . . . 570
22 Interpolation and Extrapolation 577
22.1 Polynomial Interpolation . . . . . . . . . . . . . . . . . . . . . 582
22.1.1 Nevilleâ¿¿s Algorithm . . . . . . . . . . . . . . . . . . . . 583
xv
22.2 Fitting to Lower-Order Polynomials . . . . . . . . . . . . . . . 587
22.3 Rational Function Interpolation . . . . . . . . . . . . . . . . . 589
22.4 Splines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
22.5 Multidimensional Interpolation . . . . . . . . . . . . . . . . . 595
22.6 Interpolation with Arbitrary Families of Curves . . . . . . . . 596
22.7 Extrapolation . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
22.7.1 Richardson Extrapolation . . . . . . . . . . . . . . . . 601
22.7.2 Aitkenâ¿¿s _2 Process . . . . . . . . . . . . . . . . . . . . 604
23 Case Study: Inferring Gene Regulatory Networks 609
23.1 Coexpression Models . . . . . . . . . . . . . . . . . . . . . . . 612
23.1.1 Measures of Similarity . . . . . . . . . . . . . . . . . . 613
23.1.2 Finding a Union-of-Cliques Graph . . . . . . . . . . . . 616
23.2 Bayesian Graphical Models . . . . . . . . . . . . . . . . . . . . 620
xvi
23.2.1 Defining a Probability Function . . . . . . . . . . . . . 622
23.2.2 Finding the Network . . . . . . . . . . . . . . . . . . . 625
23.3 Kinetic Models . . . . . . . . . . . . . . . . . . . . . . . . . . 630
24 Model Validation 635
24.1 Measures of Goodness . . . . . . . . . . . . . . . . . . . . . . 636
24.2 Accuracy, Sensitivity, and Specificity . . . . . . . . . . . . . . 641
24.3 Cross Validation . . . . . . . . . . . . . . . . . . . . . . . . . . 646
24.4 Sensitivity Analysis . . . . . . . . . . . . . . . . . . . . . . . . 649
24.5 Modeling and the Scientific Method . . . . . . . . . . . . . . . 652
Index

Library of Congress Subject Headings for this publication:

Biology -- Simulation methods.
Biology -- Mathematical models.