The biggest genome sequencing projects: the uber-list!
03 Dec 2013
I am just writing a short presentation for a meeting in Hinxton. I wanted to demonstrate the profound effect that whole-genome sequencing is having on the study of biology, and the size and scope of recent studies.
So I thought it would be fun to catalogue the largest - in terms of samples - genome projects that have been published so far.
A few things are notable here. As expected, many of the biggest studies in terms of sample numbers are bacterial, enabled partly by their smaller genome sizes.
Update: My attention has just been drawn to a study of 2,007 C. elegans genomes!
I found it interesting that all the bacterial studies listed hail from the UK; we are clearly blazing a trail in this field of study!
A PhD for sequencing a gene? A single genome? A hundred genomes? How about a thousand genomes? A million?
So, what's coming up that could potentially knock these studies off their perch?
- The Million Human Genome Project
- 100,000 foodborne pathogen genome project
- Up to 100,000 NHS patients
- 50,000 Faroe Islanders Project
- 20,000 Global pneumococcal project
- 10,000 Genome 10k vertebrate sequencing project
- UK 10K human genome project
- 10,000 autism genome project
- 5,000 arthropod genome sequencing project
- 3,000 NCTC culture collection sequencing
Did I miss a study? Please drop a comment below.
Rules for inclusion:
- whole-genome sequencing at >10X average coverage per sample (no exome, target capture)
- at least one library per sample (e.g. no pooled species, quasispecies)
- not a meta-analysis, fresh data for the paper
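For anyone wanting to check a candidate study against these rules, here is a minimal sketch. It assumes average coverage is estimated the usual back-of-envelope way, as total sequenced bases divided by genome size. The `Sample` class, its fields, and the `qualifies` helper are all hypothetical names made up for illustration; they do not come from any of the papers listed.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One sequenced sample. All fields are hypothetical, for illustration."""
    read_count: int     # number of reads sequenced for this sample
    read_length: int    # read length in bases
    genome_size: int    # genome size in bases
    own_library: bool   # True if the sample had its own library (not pooled)
    whole_genome: bool  # False for exome or target-capture data
    fresh_data: bool    # False if the data were reused from an earlier paper


def mean_coverage(sample: Sample) -> float:
    """Estimate average depth as total sequenced bases / genome size."""
    return sample.read_count * sample.read_length / sample.genome_size


def qualifies(sample: Sample, min_coverage: float = 10.0) -> bool:
    """Apply the three inclusion rules to a single sample."""
    return (
        sample.whole_genome
        and mean_coverage(sample) > min_coverage
        and sample.own_library
        and sample.fresh_data
    )


# A made-up bacterial sample: 1,000,000 reads of 100 bases on a 2 Mb genome.
example = Sample(
    read_count=1_000_000,
    read_length=100,
    genome_size=2_000_000,
    own_library=True,
    whole_genome=True,
    fresh_data=True,
)
print(mean_coverage(example), qualifies(example))
```

The small genome in the example is the point made earlier: the same sequencing output that gives a bacterium 50X would give a human genome well under 1X.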
Thanks to: Casey Bergman, Scott Edmunds, Prashant, Liz Batty, Craig Duffy, Cui Yujun, Lex Nederbragt for suggestions!
Update 10-02-2014: Added Chewapreecha et al. and Casali et al., now occupying positions 1 and 4 respectively in the uber-list!
Update 15-04-2014: Added Nasser et al, new position 1!
Update 29-05-2014: Added 3,000 rice genome project, new position 3!
Name | Number | Reference |
---|---|---|
S. pyogenes | 3,615 | Nasser et al. 2014 |
S. pneumoniae | 3,085 | Chewapreecha et al. 2014 |
Rice (Oryza sativa) | 3,000 | The 3,000 rice genomes project |
C. elegans | 2,007 | Thompson et al. 2013 |
Clostridium difficile | 1,250 | Eyre et al. 2013 |
The 1000 Genomes Project (human) | 1,092 | 1000 Genomes Project Consortium 2013 |
Mycobacterium tuberculosis | 1,000 | Casali et al. 2014 |
Plasmodium falciparum | 825 | Miotto et al. 2013 |
Streptococcus pneumoniae | 616 | Croucher et al. 2013 |
Mycobacterium tuberculosis | 390 | Walker et al. 2013 |
Salmonella in cattle and humans | 373 | Mather et al. 2013 |
Shigella sonnei | 263 | Holt et al. 2013 |
Mycobacterium tuberculosis | 259 | Comas et al. 2013 |
Streptococcus pneumoniae | 240 | Croucher et al. 2011 |
Methicillin-resistant Staphylococcus aureus | 193 | Holden et al. 2013 |
Campylobacter jejuni | 192 | Sheppard et al. 2013 |
Mycobacterium abscessus in CF | 170 | Bryant et al. 2013 |