Popgen for dummies

Posted by Pico on January 26, 2011

Posted in: Uncategorized.

http://dorakmt.tripod.com/evolution/popgen.html

Lecture Notes, Short Course in Evolutionary Quantitative Genetics

Bruce Walsh, University of Arizona. From June 2006 Postgraduate course in Evolutionary Quantitative Genetics, Roenbjerg field station, University of Aarhus, Denmark

pdfs of lecture notes

Table of Contents (10 pages)
Lectures
1. Basic Statistical Machinery (11 pages) minor update on 8 July 2006
2. Linear Algebra and Linear Models (25 pages)
3. Basic Concepts in Mendelian, Population and Quantitative Genetics (23 pages)
4. Resemblance Between Relatives (15 pages)
5. Basic Designs for Estimation of Genetic Parameters (18 pages) minor update on 28 July 2006
6. Inbreeding and Crossbreeding (17 pages)
7. Genetic Drift (26 pages)
8. Tests for Molecular Signatures of Selection (15 pages)
9. Short-term Selection Response (21 pages)
10. Analysis of Short-term Selection Experiments (22 pages)
11. Long-Term Response and Selection Limits (25 pages)
12. Individual Fitness and Measures of Univariate Selection (19 pages)
13. Genetic Correlations and Multivariate Selection Response (26 pages)
14. Measuring Multivariate Selection (20 pages)
15. Phenotypic Evolution Models (15 pages)
16. Major Genes, Polygenes, and QTLs (17 pages)
17. QTL Mapping (17 pages)
18. Quantitative Analysis of Regulatory Variation (16 pages)
19. Power calculations in R. pdf file of powerpoint presentation (29 pages)
References (still missing a few) (9 pages)

Prepare for the risk: Real conservatives ignoring their own principles

Posted by Pico on November 8, 2010

Posted in: Uncategorized.

A terrific op-ed in the Washington Post from CAP’s Bracken Hendricks. I stopped reading the WP a long time ago, but from time to time it surprises me in a good way. I hope more people wake up and listen to him. Japan should do more on this.

The best science available suggests that without taking action to fundamentally change how we produce and use energy, we could see temperatures rise 9 to 11 degrees Fahrenheit over much of the United States by 2090. These estimates have sometimes been called high-end predictions, but the corresponding low-end forecasts assume we will rally as a country to shift course. That hasn’t happened, so the worst case must become our best guess….

Today’s conservatives would do well to start thinking more like military planners, reexamining the risks inherent in their strategy. If, instead, newly elected Republicans do nothing, they will doom us all to bigger government interventions and a large dose of suffering – a reckless choice that’s anything but conservative.
Few causes unite the conservatives of the newly elected 112th Congress as unanimously as their opposition to government action on climate change. In September, the Center for American Progress Action Fund surveyed Republican candidates in congressional and gubernatorial races and found that nearly all disputed the scientific consensus on global warming, and none supported measures to mitigate it. For example, Robert Hurt, who won Tom Perriello’s House seat in Virginia, says clean-energy legislation would fail to “do anything except harm people.” The tea party’s “Contract From America” calls proposed climate policies “costly new regulations that would increase unemployment, raise consumer prices, and weaken the nation’s global competitiveness with virtually no impact on global temperatures.” Even conservatives who once argued for action on climate change, such as Sen. John McCain (Ariz.) and Rep. Mark Kirk (Ill.), have run for cover.
But it’s conservatives who should fear climate change the most. To put it simply, if you hate big government, try global warming on for size. Many conservatives say they oppose clean-energy policies because they want to keep government off our backs. But they have it exactly backward. Doing nothing will set our country on a course toward narrower choices for businesses and individuals, along with an expanded role for government. When catastrophe strikes – and yes, the science is quite solid that it will – it will be the feds who are left conducting triage.
My economic views are progressive, and I think government has an important role in tackling big problems. But I admire many cherished conservative values, from personal responsibility to thrift to accountability, and I worry that conservatives’ lock-step posture on climate change is seriously out of step with their professed priorities.
A strong defense of our national interests, rigorous cost-benefit analysis, fiscal discipline and the ability to avoid unnecessary intrusions into personal liberty will all be seriously compromised in a world marked by climate change. In fact, far from being conservative, the Republican stance on global warming shows a stunning appetite for risk. When faced with uncertainty and the possibility of costly outcomes, smart businessmen buy insurance, reduce their downside exposure and protect their assets. When confronted with a disease outbreak of unknown proportions, front-line public health workers get busy producing vaccines, pre-positioning supplies and tracking pathogens. And when military planners assess an enemy, they get ready for a worst-case encounter. When it comes to climate change, conservatives are doing none of this. Instead, they are recklessly betting the farm on a single, best-case scenario: That the scientific consensus about global warming will turn out to be wrong. This is bad risk management and an irresponsible way to run anything, whether a business, an economy or a planet.

Atmospheric CO2: Principal Control Knob

Posted by Pico on October 18, 2010

Posted in: Uncategorized.

Atmospheric CO₂: Principal Control Knob Governing Earth’s Temperature

http://www.sciencemag.org/cgi/content/short/330/6002/356

Andrew A. Lacis, Gavin A. Schmidt, David Rind, Reto A. Ruedy

Ample physical evidence shows that carbon dioxide (CO₂) is the single most important climate-relevant greenhouse gas in Earth’s atmosphere. This is because CO₂, like ozone, N₂O, CH₄, and chlorofluorocarbons, does not condense and precipitate from the atmosphere at current climate temperatures, whereas water vapor can and does. Noncondensing greenhouse gases, which account for 25% of the total terrestrial greenhouse effect, thus serve to provide the stable temperature structure that sustains the current levels of atmospheric water vapor and clouds via feedback processes that account for the remaining 75% of the greenhouse effect. Without the radiative forcing supplied by CO₂ and the other noncondensing greenhouse gases, the terrestrial greenhouse would collapse, plunging the global climate into an icebound Earth state.

Leishmania genome 1

Posted by Pico on June 10, 2010

Posted in: Uncategorized.

PNAS March 16, 1999 vol. 96 no. 6 2902-2906

Leishmania major Friedlin chromosome 1 has an unusual distribution of protein-coding genes

Abstract

Leishmania are evolutionarily ancient protozoans (Kinetoplastidae) and important human pathogens that cause a spectrum of diseases ranging from the asymptomatic to the lethal. The Leishmania genome is relatively small [≈34 megabases (Mb)], lacks substantial repetitive DNA, and is distributed among 36 chromosome pairs ranging in size from 0.3 Mb to 2.5 Mb, making it a useful candidate for complete genome sequence determination. We report here the nucleotide sequence of the smallest chromosome, chr1. The sequence of chr1 has a 257-kilobase region that is densely packed with 79 protein-coding genes. This region is flanked by telomeric and subtelomeric repetitive elements that vary in number and content among the chr1 homologs, resulting in an ≈27.5-kilobase size difference. Strikingly, the first 29 genes are all encoded on one DNA strand, whereas the remaining 50 genes are encoded on the opposite strand. Based on the gene density of chr1, we predict a total of ≈9,800 genes in Leishmania, of which 40% may encode unknown proteins.
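
The gene-count prediction in the abstract comes from scaling the chr1 gene density up to the whole genome. A rough sketch of that kind of arithmetic, using only the figures quoted above, might look like this. The plain linear scaling is an assumption made for illustration, not the authors' actual calculation, and the function name is hypothetical.

```python
# Naive extrapolation of total gene number from the chr1 gene density.
# All inputs are figures quoted in the abstract; the linear scaling is an
# illustrative assumption, not the method used in the paper.

CHR1_GENES = 79        # protein-coding genes found on chr1
CHR1_REGION_KB = 257   # size of the gene-dense region, in kilobases
GENOME_MB = 34         # approximate haploid genome size, in megabases


def naive_gene_estimate(genes, region_kb, genome_mb):
    """Scale the gene density of one region up to the whole genome."""
    density_per_kb = genes / region_kb
    return density_per_kb * genome_mb * 1000  # 1 Mb = 1000 kb


estimate = naive_gene_estimate(CHR1_GENES, CHR1_REGION_KB, GENOME_MB)
print(f"density: {CHR1_GENES / CHR1_REGION_KB:.3f} genes per kb")
print(f"naive genome-wide estimate: {estimate:.0f} genes")
```

This naive scaling gives roughly 10,450 genes, somewhat above the ≈9,800 reported. The sketch applies no correction for repetitive or gene-poor sequence, which may account for part of the gap.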

The Kinetoplastidae are flagellated protozoans found in terrestrial and aquatic environments that cause diseases in organisms ranging from plants to vertebrates. These diseases result in widespread human suffering and death, as well as considerable economic loss from infection of livestock, wildlife, and crops. In addition, kinetoplastids have been particularly valuable for the study of fundamental molecular and cellular phenomena, such as RNA editing (1), mRNA transsplicing (2), glycosylphosphatidylinositol-anchoring of proteins (3), antigenic variation (4), and telomere organization (5). The early evolutionary divergence of these organisms makes comparison of their sequences with those of other eukaryotes, as well as prokaryotes, useful for the identification of ancient conserved motifs, and their protein sequences may be a useful source of diversity for protein engineering.

The numerous human-infective Leishmania spp. cause a spectrum of diseases with pathologies ranging from the asymptomatic to the lethal, and there are correlations between species and disease type and severity (6). The Leishmania haploid genome content is ≈34 megabases (Mb; ref. 7), consisting of 36 chromosomes ranging in size from 0.3 Mb to 2.5 Mb (8). It contains ≈30% repeated sequence (9), half of which is a series of telomeric hexamer repeats, whereas the remainder comprises other simple sequence repeats, transposons, as well as tandem and dispersed gene families such as rRNA, spliced-leader, tubulin, and gp63. The Leishmania molecular karyotype is conserved between Leishmania strains and species (10) with most genes syntenic among species (8). There are modest chromosome size polymorphisms between strains and larger size polymorphisms between species. Thus, this organism is an ideal candidate for a genome-sequencing project to elucidate its full genetic complement. The Leishmania Genome Network, established with the support of the World Health Organization, initiated a coordinated effort to map and sequence the Leishmania genome (see www.ebi.ac.uk/parasites/leish.html). Leishmania major MHOM/IL/81/Friedlin (LmjF) was selected as the reference strain to be sequenced and a first-generation contig map of the LmjF genome was constructed by cosmid fingerprinting (7). We report here the complete sequencing of chromosome 1 (chr1), the smallest chromosome.

References

1. Stuart K (1991) Annu Rev Microbiol 45:327–344, pmid:1720609.
2. Perry K, Agabian N (1991) Experientia 47:118–128, pmid:2001714.
3. Krakow J L, Hereld D, Bangs J D, Hart G W, Englund P T (1986) J Biol Chem 261:12147–12153, pmid:3745182.
4. Borst P, Rudenko G (1994) Science 264:1872–1873, pmid:7516579.
5. Blackburn E H (1991) Nature (London) 350:569–573, pmid:1708110.
6. Shaw J J, Lainson R (1987) in The Leishmaniases in Biology and Medicine, eds Peters W, Killick-Kendrick R (Academic, London), 1, pp 291–361.
7. Ivens A C, Lewis S M, Bagherzadeh A, Zhang L, Chang H M, Smith D F (1998) Genome Res 8:135–145, pmid:9477341.
8. Wincker P, Ravel C, Blaineau C, Pages M, Jauffret Y, Dedet J, Bastien P, Dedet J P (1996) Nucleic Acids Res 24:1688–1694, pmid:8649987.
9. Ellis J, Crampton J (1989) in Leishmaniasis: The Current Status and New Strategies for Control, ed Hart D T (Plenum, New York), pp 589–596.
10. Bastien P, Blaineau C, Pagès M (1992) Subcell Biochem 18:131–187, pmid:1485351.
11. Ryan K A, Dasgupta S, Beverley S M (1993) Gene 131:145–150, pmid:8370535.
12. Ravel C, Macari F, Bastien P, Pages M, Blaineau C (1995) Mol Biochem Parasitol 69:1–8, pmid:7723776.
13. Bouffard G G, Idol J R, Braden V V, Iyer L M, Cunningham A F, Weintraub L A, Touchman J W, Mohr-Tidwell R M, Peluso D C, Fulton R S, et al. (1997) Genome Res 7:673–692, pmid:9253597.
14. Wilson R, Ainscough R, Anderson K, Baynes C, Berks M, Bonfield J, Burton J, Connell M, Copsey T, Cooper J, et al. (1994) Nature (London) 368:32–38, pmid:7906398.
15. Fu G, Barker D C (1998) Nucleic Acids Res 26:2161–2167, pmid:9547275.
16. Fu G, Barker D C (1998) BioTechniques 24:386–390, pmid:9526644.
17. Myler P J, Lodes M J, Merlin G, deVos T, Stuart K D (1994) Mol Biochem Parasitol 66:11–20, pmid:7984172.
18. Myler P J, Venkataraman G M, Lodes M J, Stuart K D (1994) Gene 148:187–193, pmid:7958944.
19. LeBowitz J H, Smith H Q, Rusche L, Beverley S M (1993) Genes Dev 7:996–1007, pmid:8504937.
20. Ullu E, Matthews K R, Tschudi C (1993) Mol Cell Biol 13:720–725, pmid:8417363.
21. Matthews K R, Tschudi C, Ullu E (1994) Genes Dev 8:491–501, pmid:7907303.
22. Swindle J, Tait A (1996) in Molecular Biology of Parasitic Protozoa, eds Smith D F, Parsons M (Oxford Univ. Press, Oxford), pp 6–34.
23. Wong A K C, Curotto de Lafaille M A, Wirth D F (1994) J Biol Chem 269:26497–26502, pmid:7929372.
24. Lee M G S (1996) Mol Cell Biol 16:1220–1230, pmid:8622666.
25. Dresel A, Clos J (1997) Exp Parasitol 86:206–212, pmid:9225771.
26. Pays E, Vanhamme L (1996) in Molecular Biology of Parasitic Protozoa, eds Smith D F, Parsons M (Oxford Univ. Press, Oxford), pp 88–114.
27. Ravel C, Wincker P, Bastien P, Blaineau C, Pagès M (1995) Mol Biochem Parasitol 74:31–41, pmid:8719243.
28. Myler P J, Tripp C A, Thomas L, Venkataraman G M, Merlin G, Stuart K D (1993) Mol Biochem Parasitol 62:147–152, pmid:8114820.
29. Soto M, Requena J M, Garcia M, Gómez L C, Navarrete I, Alonso C (1993) J Biol Chem 268:21835–21843, pmid:8408038.
30. Goffeau A, Barrell B G, Bussey H, Davis R W, Dujon B, Feldmann H, Galibert F, Hoheisel J D, Jacq C, Johnston M, et al. (1996) Science 274:546, 563–567, pmid:8849441.
31. Gardner M J, Tettelin H, Carucci D J, Cummings L M, Aravind L, Koonin E V, Shallom S, Mason T, Yu K, Fujii C, et al. (1998) Science 282:1126–1132, pmid:9804551.
32. The C. elegans Sequencing Consortium (1998) Science 282:2012–2018, pmid:9851916.
33. Tait A (1983) Parasitol 86:29–57, pmid:6346233.

A Genealogical Interpretation of Principal Components Analysis

Posted by Pico on April 7, 2010

Posted in: Uncategorized.

Author(s): McVean G (McVean, Gil)
Source: PLOS GENETICS Volume: 5 Issue: 10 Article Number: e1000686 Published: OCT 2009

Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genet 2: e190.
Novembre J, Johnson T, Bryc K, Kutalik Z, Boyko AR, et al. (2008) Genes mirror geography within Europe. Nature 456: 98–101.
Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, et al. (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38: 904–909.
Cavalli-Sforza LL, Menozzi P, Piazza A (1994) The History and Geography of Human Genes. New Jersey: Princeton.
Reich D, Price AL, Patterson N (2008) Principal component analysis of genetic data. Nat Genet 40: 491–492.
Klopfstein S, Currat M, Excoffier L (2006) The fate of mutations surfing on the wave of a range expansion. Mol Biol Evol 23: 482–490.
Barbujani G, Sokal RR, Oden NL (1995) Indo-European origins: a computer-simulation test of five hypotheses. Am J Phys Anthropol 96: 109–132.
Fix AG (1997) Gene frequency clines produced by kin-structured founder effects. Hum Biol 69: 663–673.
Chikhi L, Nichols RA, Barbujani G, Beaumont MA (2002) Y genetic data support the Neolithic demic diffusion model. Proc Natl Acad Sci USA 99: 11008–11013.
Currat M, Excoffier L (2005) The effect of the Neolithic expansion on European molecular diversity. Proc Biol Sci 272: 679–688.
Novembre J, Stephens M (2008) Interpreting principal component analyses of spatial population genetic variation. Nat Genet 40: 646–649.
Slatkin M (1991) Inbreeding coefficients and coalescence times. Genet Res 58: 167–175.
Wilkinson-Herbots HM (1998) Genealogy and subpopulation differentiation under various models of population structure. J Math Biol 37: 535–585.
McVean GA (2002) A genealogical interpretation of linkage disequilibrium. Genetics 162: 987–991.
Baik J, Ben Arous G, Péché S (2005) Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. Ann Probability 33: 1643–1697.
Paul D (2007) Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statistica Sinica 17: 1617–1642.
Schaffner S, Foo C, Gabriel S, Reich D, Daly MJ, et al. (2005) Calibrating a coalescent simulation of human genome sequence variation. Genome Res 15: 1576–1583.
The International HapMap Consortium (2005) A haplotype map of the human genome. Nature 437: 1299–1320.

Posted by Pico on April 7, 2010

Posted in: Uncategorized.

Tackling the population genetics of clonal and partially clonal organisms

Trends in Ecology & Evolution
Volume 20, Issue 4, April 2005, Pages 194-201

http://www.maths.lancs.ac.uk/~fearnhea/Software.html

Programs Available

Below are details of code developed as part of my research. Currently this covers methods for estimating recombination rates from population genetic data; algorithms for perfect simulation of non-neutral population genetic models; and a range of computational statistical algorithms.

Population Genetics

Computational Statistics

McVean Group

We are researching various aspects of statistical genetics, population genetics and evolutionary biology at the University of Oxford. The site offers downloadable software, a list of selected publications, and links to various teaching resources such as slides and handouts.

Posted by Pico on April 7, 2010

Posted in: Uncategorized.

Bio and Geo Informatics

Matplotlib

Regular Expression HOWTO: An introduction to using regular expressions and the re module to process text.
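
As a small taste of what the re module does, here is a minimal sketch that pulls the volume, issue, and page range out of a citation string like the PNAS one quoted elsewhere on this blog. The pattern and the function name are my own illustration and are not taken from the HOWTO.

```python
import re

# Pattern for citation strings of the form "... vol. 96 no. 6 2902-2906".
# Named groups make the extracted fields self-describing.
CITATION = re.compile(
    r"vol\.\s*(?P<volume>\d+)\s+no\.\s*(?P<issue>\d+)\s+"
    r"(?P<first_page>\d+)-(?P<last_page>\d+)"
)


def parse_citation(text):
    """Return volume, issue and page range as integers, or None if no match."""
    match = CITATION.search(text)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


print(parse_citation("PNAS March 16, 1999 vol. 96 no. 6 2902-2906"))
```

Compiling the pattern once and reusing it is the usual choice when the same expression is applied to many lines.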

Sorting Mini-HOWTO, by Andrew Dalke: A little tutorial showing a half dozen ways to sort a list with the built-in sort() method.
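
For a quick reminder of what sorting looks like in current Python, here is a short sketch. It is not taken from the HOWTO itself. The gene names come from the Leishmania post, but the lengths are made up for illustration.

```python
# A few ways to sort a list of (name, length) records.
# The lengths are invented values, used only to have something to sort on.
genes = [("gp63", 1800), ("tubulin", 1350), ("rRNA", 2200)]

# 1. In place, by natural tuple order: name first, then length.
by_name = list(genes)
by_name.sort()

# 2. In place, with a key function: shortest gene first.
by_length = list(genes)
by_length.sort(key=lambda record: record[1])

# 3. A new sorted list, longest gene first; the original is untouched.
longest_first = sorted(genes, key=lambda record: record[1], reverse=True)

# 4. Case-insensitive sort on the name only.
by_name_ci = sorted(genes, key=lambda record: record[0].lower())

for label, result in [("by name", by_name), ("by length", by_length),
                      ("longest first", longest_first)]:
    print(label, [name for name, _ in result])
```

list.sort() changes the list in place and returns None, while sorted() returns a new list and leaves the original as it was.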

http://www.amk.ca/python/howto/

http://www.astro.cornell.edu/staff/loredo/statpy/

http://www.nmr.mgh.harvard.edu/Neural_Systems_Group/gary/python.html ==> stats.py

Pysam

The quick start guide is here. More detailed documentation is available here

Comparative genomics and population genetics

Posted by Pico on March 27, 2010

Posted in: Uncategorized.

Comparative genomics is the study of the relationship of genome structure and function across different biological species or strains. Comparative genomics is an attempt to take advantage of the information provided by the signatures of selection to understand the function and evolutionary processes that act on genomes.

Comparative genomics exploits both similarities and differences in the proteins, RNA, and regulatory regions of different organisms to infer how selection has acted upon these elements. Those elements that are responsible for similarities between different species should be conserved through time (stabilizing selection), while those elements responsible for differences among species should be divergent (positive selection).
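
The idea of conserved versus divergent elements can be made concrete with a toy calculation: the fraction of identical positions in consecutive windows of two aligned sequences. This is only a sketch. The sequences are invented, the function name is hypothetical, and raw identity is a crude proxy for conservation that real analyses replace with proper substitution models.

```python
def window_identity(seq_a, seq_b, window):
    """Fraction of identical positions in consecutive non-overlapping windows."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    scores = []
    for start in range(0, len(seq_a) - window + 1, window):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y)
        scores.append(matches / window)
    return scores


# Two made-up aligned sequences: identical at the start, diverging later.
species_1 = "ATGGCCAAGTTTACGCTA"
species_2 = "ATGGCCAAGTCAGTGCTT"

for index, score in enumerate(window_identity(species_1, species_2, window=6)):
    print(f"window {index}: identity {score:.2f}")
```

Windows with identity near 1 would be candidates for conserved elements, and windows with low identity would be candidates for divergent ones.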

Population genetics is the study of allele frequency distribution and change under the influence of the four main evolutionary processes: natural selection, genetic drift, mutation and gene flow. It also takes into account the factors of population subdivision and population structure. It attempts to explain such phenomena as adaptation and speciation.
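
Of the four processes, genetic drift is the easiest to simulate. Below is a minimal sketch of the Wright-Fisher model for a single allele with no selection, mutation, or gene flow. The function name, parameter values, and seed are arbitrary choices for illustration.

```python
import random


def wright_fisher(pop_size, start_freq, generations, seed=None):
    """Track one allele's frequency under pure drift (Wright-Fisher model)."""
    rng = random.Random(seed)
    copies = 2 * pop_size  # diploid population: 2N gene copies
    freq = start_freq
    trajectory = [freq]
    for _generation in range(generations):
        # Each gene copy in the next generation is drawn independently and
        # carries the allele with probability equal to its current frequency.
        count = sum(1 for _copy in range(copies) if rng.random() < freq)
        freq = count / copies
        trajectory.append(freq)
    return trajectory


trajectory = wright_fisher(pop_size=50, start_freq=0.5, generations=100, seed=1)
print(f"start: {trajectory[0]:.2f}, end: {trajectory[-1]:.2f}")
```

Running it with different seeds shows the frequency wandering at random, and once it reaches 0 or 1 the allele is lost or fixed and stays there. Smaller values of pop_size make the wandering faster.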

Subversion is a version control system, useful for managing a large project.

Online subversion book: Home

A more recent pdf file http://svnbook.red-bean.com/en/1.5/svn-book.pdf

Sequencing small bugs

Illumina, 454 and parasites

Popgen for dummies

http://dorakmt.tripod.com/evolution/popgen.html

Sequence news

At AGBT, 454, Illumina, ABI Vow to Improve Speed, Yield, Quality

Subversion, please.
