Table of Contents for DNA and Protein Sequence Analysis
Chapter/Section Title | Page # | Page Count
List of contributors | xxi | 2
Abbreviations | xxiii |
1. Molecular biology databases | 1 | 30
Christian Burks
1. Overview | 1 | 15
Summary | 1 | 1
Molecular biology databases | 2 | 1
Sequence databases | 2 | 14
What are the uses of databases? | 16 | 1
2. Contributing data to the databases | 16 | 3
Community pipelines | 16 | 1
Direct, electronic submission | 17 | 1
Timeliness of release of data to databanks | 17 | 1
Promulgating data revisions and extensions | 18 | 1
3. Retrieving data from the databases | 19 | 5
Finding databases of interest | 19 | 3
Media | 22 | 1
Mechanisms | 22 | 1
Which databases should I get? | 23 | 1
4. Using the data | 24 | 2
Do I have a current version of the database? | 24 | 1
How often should I repeat routine queries? | 24 | 1
How redundant is the database? | 25 | 1
Are there errors in the database? | 25 | 1
How did I get that result? | 26 | 1
5. Queries across multiple databases | 26 | 2
6. Keeping up and going further | 28 | 1
Acknowledgements | 29 | 1
References | 29 | 2
2. The NCBI software tools | 31 | 14
J. M. Ostell
1. Introduction | 31 | 1
2. The software toolkit | 31 | 1
Portable core library | 31 | 1
Data encoding in ASN.1 | 32 | 1
3. The NCBI data model | 32 | 7
Introduction | 32 | 1
Pub | 33 | 1
Bioseq | 34 | 5
4. Technical aspects of the NCBI toolkit | 39 | 3
ASN.1 libraries | 39 | 1
Object loader layer | 39 | 1
Utilities layer | 39 | 1
Access libraries | 40 | 1
Vibrant portable graphical interface | 40 | 1
Network client/server libraries | 41 | 1
5. NCBI toolkit applications | 42 | 1
Entrez | 42 | 1
BLAST | 42 | 1
BankIt | 42 | 1
Sequin | 42 | 1
Others | 42 | 1
6. Summary | 42 | 3
3. EBI databases and tools | 45 | 14
Rainer Fuchs
Graham N. Cameron
1. EBI information products | 45 | 1
2. Databases and software on the EBI CD-ROM | 46 | 5
EBI software for DOS computers | 48 | 1
EBI retrieval software for Macintosh computers | 49 | 2
Other software | 51 | 1
3. Network information services | 51 | 5
EBI database and information servers | 51 | 3
On-line database access | 54 | 2
Remote database searches | 56 | 1
4. Contacting the EBI | 56 | 2
Acknowledgements | 58 | 1
References | 58 | 1
4. Networked services | 59 | 16
G. Williams
1. Introduction | 59 | 1
Logging in to the system | 59 | 1
Computer names | 59 | 1
2. Electronic mail | 60 | 1
E-mail | 60 | 1
E-mail servers | 60 | 1
3. File transfer protocol (FTP) | 61 | 3
FTP | 61 | 2
File formats | 63 | 1
Archie | 64 | 1
4. Remote log in | 64 | 2
Telnet | 64 | 1
BIDS | 65 | 1
MEDLARS | 65 | 1
MSDN | 66 | 1
5. Mailing lists and network news | 66 | 4
Mailing lists | 67 | 1
Usenet/network news | 68 | 2
6. Information servers | 70 | 4
Gopher | 70 | 1
WWW | 70 | 4
7. Further information | 74 | 1
References | 74 | 1
5. DNA sequencing methodology and software | 75 | 24
William D. Rawlinson
Barclay G. Barrell
1. Introduction | 75 | 1
2. DNA sequencing methods | 76 | 2
3. Sequence handling software and sequence project design | 78 | 4
Conventions | 79 | 1
Display of trace data from within the database | 80 | 1
Software created to make design of sequencing reactions easier | 80 | 2
4. The software for assembling sequence data | 82 | 12
The database assembly and handling program (xbap) | 82 | 11
Alternative packages | 93 | 1
5. Assessment of sequencing projects | 94 | 1
Recording information about the sequencing templates | 94 | 1
Assessment of the sequence data during assembly | 94 | 1
6. Discussion | 95 | 2
Acknowledgements | 97 | 1
References | 97 | 2
6. Molecular biology software for the Apple Macintosh | 99 | 38
M. Ginsburg
M. P. Mitchell
1. Introduction | 99 | 1
2. GeneWorks | 100 | 11
Overview | 100 | 1
DNA analysis | 101 | 3
Protein analysis | 104 | 4
Special analyses | 108 | 3
3. MacVector suite | 111 | 8
Overview | 111 | 1
DNA analysis | 112 | 3
Protein analysis | 115 | 2
Special analyses | 117 | 2
AssemblyLIGN | 119 | 1
4. DNAStar | 119 | 7
Overview | 119 | 1
Sequence editing | 120 | 1
Pattern analysis | 121 | 1
Protein analysis | 122 | 1
Special analyses | 123 | 3
5. Sequencher | 126 | 2
Overview | 126 | 1
Entering sequences | 127 | 1
Assembling the data | 127 | 1
Editing the data | 128 | 1
6. Amplify | 128 | 2
Overview | 128 | 1
Running the program | 129 | 1
7. MacPattern | 130 | 2
Overview | 130 | 1
Running the program | 130 | 2
8. Other programs | 132 | 2
Suppliers | 133 | 1
Internet Sources | 134 | 1
References | 134 | 3
Further reading | 135 | 2
7. Sequence comparison and alignment | 137 | 32
Stephen F. Altschul
1. Introduction | 137 | 1
2. Global sequence alignment | 137 | 4
Algorithms | 137 | 2
Substitution and gap scores | 139 | 1
Statistics | 140 | 1
3. Global multiple alignment | 141 | 2
Scores | 141 | 1
Algorithms | 142 | 1
4. Local sequence alignment | 143 | 8
Algorithms | 143 | 1
Local alignment statistics | 144 | 4
Local alignment scoring systems | 148 | 3
5. Database search methods | 151 | 3
Parallel architectures | 151 | 1
Heuristic algorithms | 152 | 2
Vector-based comparison methods | 154 | 1
6. Local multiple alignment | 154 | 4
Consensus word methods | 155 | 1
Template methods | 156 | 1
Progressive alignment methods | 156 | 1
Pairwise comparison methods | 157 | 1
Statistically-based methods | 157 | 1
General issues | 158 | 1
7. Sequence motifs | 158 | 4
Weight matrices | 159 | 3
Generalizations | 162 | 1
Acknowledgements | 162 | 1
References | 162 | 7
8. Simple sequences of protein and DNA | 169 | 16
John C. Wootton
1. Introduction | 169 | 1
2. Some practical guidelines to a complex body of theory | 170 | 5
Complexity, pattern, and periodicity are distinct properties of simple sequences | 170 | 1
Terminology | 171 | 1
Local compositional complexity | 171 | 2
Low complexity is more clear-cut for proteins than DNA | 173 | 1
Unbiased inference | 173 | 1
Sources for mathematical background | 173 | 1
Visual inspection is complementary to mathematical analysis | 174 | 1
3. Software and examples of applications | 175 | 5
Available software | 175 | 1
Comparison of different algorithms and programs | 175 | 5
Future software developments | 180 | 1
4. Masking of low-complexity sequences for searching databases | 180 | 1
The problem | 180 | 1
Masking methods | 180 | 1
5. Complexity definitions and segmentation algorithm | 181 | 1
Definition 1 | 181 | 1
Definition 2 | 181 | 1
Probabilities of complexity states | 182 | 1
Segmentation algorithm based on compositional complexity | 182 | 1
References | 182 | 3
9. Repetitive sequences in DNA | 185 | 12
Jorg T. Epplen
Olaf Riess
1. Introduction | 185 | 2
2. Types of repetitive sequences | 187 | 2
Satellite DNA | 187 | 1
Simple repetitive DNA sequences | 188 | 1
Short and long interspersed nucleotide elements (SINEs and LINEs) | 188 | 1
Minisatellites | 189 | 1
3. Repeats in genomic DNA (and protein) databanks | 189 | 2
Evolutionary aspects | 190 | 1
Expression of repeats | 190 | 1
Repeats as tools | 191 | 1
4. Short consensus motifs for the identification of functional sequences in DNA which appear repetitively in and around genes | 191 | 1
5. Diseases caused by expansion of simple nucleotide repeats | 192 | 1
6. Conclusions | 193 | 1
References | 193 | 4
10. Isochores and synonymous substitutions in mammalian genes | 197 | 12
Giorgio Bernardi
Dominique Mouchiroud
Christian Gautier
1. Introduction | 197 | 1
2. Methods | 198 | 1
3. Results | 198 | 5
4. Discussion | 203 | 4
The frequencies of synonymous substitutions do not exhibit differences related to regions of the mammalian genome | 203 | 1
Differences in repair efficiency do not cause differences in the rates of synonymous substitutions of genes located in different isochore families | 204 | 1
Differences in the process of mutation associated with replication timing do not affect the rates nor the biases of synonymous substitutions of genes located in different isochore families | 205 | 1
5. Conclusions | 206 | 1
References | 207 | 2
11. Identifying genes in genomic DNA sequences | 209 | 16
Eric E. Snyder
Gary D. Stormo
1. Introduction | 209 | 3
Low-level motif identification | 210 | 1
Assembling complete genes using multiple pieces of evidence | 211 | 1
2. Programs | 212 | 3
GeneModeler | 212 | 1
GeneID | 213 | 1
GRAIL | 214 | 1
GeneParser | 215 | 1
3. Performance statistics | 215 | 5
Test data | 216 | 2
Comparison of currently available programs | 218 | 1
Results | 219 | 1
4. Recommendations for users | 220 | 3
5. Conclusions | 223 | 1
References | 223 | 2
12. Prediction of mRNA sequence function | 225 | 6
Keith Vass
1. Introduction | 225 | 1
2. Analysis of sequence data | 226 | 4
Short sequence patterns | 226 | 1
Repeated sequences | 226 | 1
Conserved sequences | 227 | 1
Database searching | 228 | 1
Secondary structure | 229 | 1
Secondary structure searches of sequence databases | 230 | 1
3. Summary | 230 | 1
References | 230 | 1
13. Forecasting protein function | 231 | 24
T. C. Hodgman
1. Introduction | 231 | 1
2. Structure/function relationships | 232 | 1
3. General strategy | 233 | 1
4. Pairwise domain matches | 234 | 5
FASTA and BLAST | 235 | 2
MPSRCH and PROSRCH | 237 | 1
DFLASH | 238 | 1
Assessing retained sequences | 238 | 1
5. Weak domain matches | 239 | 3
General points | 239 | 1
SBASE | 240 | 1
PRODOM | 240 | 1
PLSEARCH | 240 | 2
BLAST3 | 242 | 1
6. Motif matches | 242 | 4
Sources | 242 | 1
Definitions | 242 | 2
PROSEARCH | 244 | 1
BLOCKS | 244 | 1
BLA | 245 | 1
LUPES | 246 | 1
7. URF alignments | 246 | 2
PROFILESEARCH | 247 | 1
PIPL | 247 | 1
PTNSRCH | 247 | 1
SCRUTINEER | 247 | 1
8. Assessing candidate matches | 248 | 1
9. Single sequence analyses | 249 | 2
Repeats | 249 | 1
Biased composition | 249 | 1
Secondary structure | 250 | 1
10. Software sources | 251 | 1
Acknowledgements | 252 | 1
References | 253 | 2
14. DNA and RNA structure prediction | 255 | 24
Eric Westhof
Pascal Auffinger
Christine Gaspin
1. Introduction | 255 | 2
2. Molecular mechanics and molecular dynamics methods | 257 | 4
The potential energy function | 257 | 2
Molecular dynamics simulation protocols | 259 | 2
Modelling large nucleic acids | 261 | 1
Analysis of the trajectories | 261 | 1
3. Fine structure and the search for specific regions in DNA | 261 | 1
4. RNA secondary structure prediction | 262 | 10
Representation | 263 | 1
Data necessary for folding RNA molecules | 264 | 2
Methods of prediction | 266 | 6
Limits | 272 | 1
5. RNA tertiary structure construction | 272 | 1
6. Conclusions | 273 | 1
Acknowledgements | 273 | 2
References | 275 | 4
15. Phylogenetic estimation | 279 | 34
Nick Goldman
1. Introduction | 279 | 2
2. Common ground | 281 | 6
Trees | 281 | 1
Data | 282 | 1
Models of evolutionary change | 283 | 3
Estimation | 286 | 1
Heuristics | 287 | 1
3. Phylogenetic estimation methods based on sequences | 287 | 6
Maximum likelihood methods | 287 | 4
Parsimony methods | 291 | 2
4. Phylogenetic estimation methods based on distances | 293 | 4
Sequence distances | 294 | 1
Phylogenetic trees from distance matrices | 295 | 2
5. Comparison of methods | 297 | 2
6. Other phylogenetic estimation methods | 299 | 2
Lake's method of invariants | 300 | 1
Hein's method of simultaneous alignment and phylogenetic tree estimation | 300 | 1
Minimum message length coding | 301 | 1
7. Measuring uncertainty | 301 | 5
Statistical fluctuation | 302 | 3
Systematic errors | 305 | 1
8. The future of phylogenetic estimation | 306 | 2
9. Appendix: computer programs | 308 | 3
PHYLIP | 308 | 1
MEGA | 309 | 1
PAUP | 309 | 1
BASEML and BASEMLG | 310 | 1
PROTML | 310 | 1
TREEALIGN | 310 | 1
Minimum message length encoding | 310 | 1
FASTDNAML | 310 | 1
References | 311 | 2
16. Evolution and relationships of protein families | 313 | 28
William R. Taylor
1. Introduction | 313 | 1
2. Sequence similarity | 314 | 8
Pairwise sequence alignment | 314 | 4
Multiple sequence alignments | 318 | 1
Structure biased alignment | 319 | 2
Sequence threading | 321 | 1
3. Structural comparison | 322 | 6
Recent comparison methods | 323 | 1
Fold classification | 324 | 2
How many protein folds? | 326 | 2
4. Molecular evolution | 328 | 7
Genetic algorithm model | 328 | 1
Gene duplication and fusion | 329 | 1
Introns and exons | 330 | 2
Evolution of function | 332 | 3
5. Theory | 335 | 1
6. Conclusions | 336 | 1
References | 336 | 5
A1. List of suppliers | 341 | 2
Glossary | 343 | 6
Index | 349 |