Contents
Chapter 1 Introduction
1.1 What is Biopython?
1.1.1 What can I find in the Biopython package?
1.2 Installing Biopython
1.3 FAQ
Chapter 2 Quick Start – What can you do with Biopython?
2.1 General overview of what Biopython provides
2.2 Working with sequences
2.3 A usage example
2.4 Parsing sequence file formats
2.4.1 Simple FASTA parsing example
2.4.2 Simple GenBank parsing example
2.4.3 I love parsing – please don’t stop talking about it!
2.5 Connecting with biological databases
2.6 What to do next
Chapter 3 Sequence objects
3.1 Sequences and Alphabets
3.2 Sequences act like strings
3.3 Slicing a sequence
3.4 Turning Seq objects into strings
3.5 Nucleotide sequences and (reverse) complements
3.6 Concatenating or adding sequences
3.7 MutableSeq objects
3.8 Transcribing and Translation
3.9 Working with strings directly
Chapter 4 Sequence Input/Output
4.1 Parsing or Reading Sequences
4.1.1 Reading Sequence Files
4.1.2 Iterating over the records in a sequence file
4.1.3 Getting a list of the records in a sequence file
4.1.4 Extracting data
4.2 Parsing sequences from the net
4.2.1 Parsing GenBank records from the net
4.2.2 Parsing SwissProt sequences from the net
4.3 Sequence files as Dictionaries
4.3.1 Specifying the dictionary keys
4.3.2 Indexing a dictionary using the SEGUID checksum
4.4 Writing Sequence Files
4.4.1 Converting between sequence file formats
4.4.2 Converting a file of sequences to their reverse complements
Chapter 5 Sequence Alignment Input/Output
5.1 Parsing or Reading Sequence Alignments
5.1.1 Single Alignments
5.1.2 Multiple Alignments
5.1.3 Ambiguous Alignments
5.2 Writing Alignments
5.2.1 Converting between sequence alignment file formats
Chapter 6 BLAST
6.1 Running BLAST locally
6.2 Running BLAST over the Internet
6.3 Saving BLAST output
6.4 Parsing BLAST output
6.5 The BLAST record class
6.6 Deprecated BLAST parsers
6.6.1 Parsing plain-text BLAST output
6.6.2 Parsing a file full of BLAST runs
6.6.3 Finding a bad record somewhere in a huge file
6.7 Dealing with PSIBlast
Chapter 7 Accessing NCBI’s Entrez databases
7.1 Entrez Guidelines
7.2 EInfo: Obtaining information about the Entrez databases
7.3 ESearch: Searching the Entrez databases
7.4 EPost
7.5 ESummary: Retrieving summaries from primary IDs
7.6 EFetch: Downloading full records from Entrez
7.7 ELink
7.8 EGQuery: Obtaining counts for search terms
7.9 ESpell: Obtaining spelling suggestions
7.10 Examples
7.10.1 Searching and downloading Entrez Nucleotide records
7.10.2 Finding the lineage of an organism
7.10.3 Using the history and WebEnv
Chapter 8 Swiss-Prot, Prosite, Prodoc, and ExPASy
8.1 Bio.SwissProt: Parsing Swiss-Prot files
8.1.1 Parsing Swiss-Prot records
8.1.2 Parsing the Swiss-Prot keyword and category list
8.2 Bio.Prosite: Parsing Prosite records
8.3 Bio.Prosite.Prodoc: Parsing Prodoc records
8.4 Bio.ExPASy: Accessing the ExPASy server
8.4.1 Retrieving a Swiss-Prot record
8.4.2 Searching Swiss-Prot
8.4.3 Retrieving Prosite and Prodoc records
Chapter 9 Cookbook – Cool things to do with it
9.1 PubMed
9.1.1 Sending a query to PubMed
9.1.2 Retrieving a PubMed record
9.2 GenBank
9.2.1 Retrieving GenBank entries from NCBI
9.2.2 Parsing GenBank records
9.2.3 Iterating over GenBank records
9.3 Dealing with alignments
9.3.1 ClustalW
9.3.2 Calculating summary information
9.3.3 Calculating a quick consensus sequence
9.3.4 Position Specific Score Matrices
9.3.5 Information Content
9.3.6 Translating between Alignment formats
9.4 Substitution Matrices
9.4.1 Using common substitution matrices
9.4.2 Creating your own substitution matrix from an alignment
9.5 BioSQL – storing sequences in a relational database
9.6 BioCorba
9.7 Going 3D: The PDB module
9.7.1 Structure representation
9.7.2 Disorder
9.7.3 Hetero residues
9.7.4 Some random usage examples
9.7.5 Common problems in PDB files
9.7.6 Other features
9.8 Bio.PopGen: Population genetics
9.8.1 GenePop
9.8.2 Coalescent simulation
9.8.3 Other applications
9.8.4 Future Developments
9.9 InterPro
Chapter 10 Advanced
10.1 The SeqRecord and SeqFeature classes
10.1.1 Sequence ids and Descriptions – dealing with SeqRecords
10.1.2 Features and Annotations – SeqFeatures
10.2 Regression Testing Framework
10.2.1 Writing a Regression Test
10.3 Parser Design
10.4 Substitution Matrices
10.4.1 SubsMat
10.4.2 FreqTable
Chapter 11 Where to go from here – contributing to Biopython
11.1 Maintaining a distribution for a platform
11.2 Bug Reports + Feature Requests
11.3 Contributing Code
Chapter 12 Appendix: Useful stuff about Python
12.1 What the heck is a handle?
12.1.1 Creating a handle from a string