PyPop User Guide

User Guide for Python for Population Genomics

Alex K. Lancaster

University of California, Berkeley
Department of Integrative Biology

Mark P. Nelson

University of California, Berkeley
Department of Integrative Biology

Diogo Meyer

University of California, Berkeley
Department of Integrative Biology

Richard M. Single

University of Vermont
Department of Medical Biostatistics

Owen D. Solberg

University of California, Berkeley
Department of Integrative Biology

Documenting version 0.7.0 of PyPop

License terms for PyPop documentation

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in Section A.2, “GNU Free Documentation License”.

11 May 2009


Table of Contents

Preface
1. Introduction
2. How to use this guide
3. Recent changes to PyPop
4. Authors of software components
1. Installing PyPop
1.1. Installing standalone binary
1.1.1. Installing on GNU/Linux
1.1.2. Installing on Windows
1.2. Installing from source
1.2.1. System requirements
1.2.2. Installation
1.2.3. Test suite
1.2.4. Contributions, bug reports
1.2.5. Distribution structure
2. Getting started with PyPop
2.1. Introduction
2.1.1. Interactive mode
2.1.2. Batch mode
2.1.3. What happens when you run PyPop?
2.2. The data file
2.2.1. Sample files
2.2.2. Missing data
2.3. The configuration file
2.3.1. A minimal configuration file
2.3.2. Advanced options
3. Interpreting PyPop output
3.1. Population summary
3.2. Single locus analyses
3.2.1. Basic allele count information
3.2.2. Chi-square test for deviation from Hardy-Weinberg proportions (HWP)
3.2.3. Exact test for deviation from HWP
3.2.4. The Ewens-Watterson homozygosity test of neutrality
3.3. Multi-locus analyses
3.3.1. All pairwise LD
3.3.2. Haplotype frequency estimation
References
A. License terms
A.1. GNU General Public License
A.2. GNU Free Documentation License

List of Examples

2.1. Multi-locus allele-level genotype data
2.2. Multi-locus allele-level HLA genotype data with sample information
2.3. Multi-locus allele-level HLA genotype data with sample and header information
2.4. Multi-locus allele-level HLA genotype and microsatellite genotype data with header information
2.5. Sequence genotype data with header information
2.6. Allele count data
2.7. Minimal config.ini file
3.1. Population summary sample output
3.2. Basic locus information sample output
3.3. Sample output of Hardy-Weinberg genotype table
3.4. Sample output of HW genotype classes
3.5. Sample output for exact test using gthwe
3.6. Sample output for exact test using the Arlequin implementation
3.7. Sample output of homozygosity test from Monte-Carlo implementation
3.8. Sample output of homozygosity test from simulation look-up tables (disabled by default)
3.9. Sample output of all pairwise LD
3.10. Sample output of haplotype estimation parameters
3.11. Sample output of estimated haplotype frequencies
