Sample preparation and DNA extraction in the field for nanopore sequencing

The nanoporati are currently thrilling to a bevy of new announcements from Oxford Nanopore Technologies (ONT). More information over at the “wafer-thin update” and insightful commentary from Keith Robison on his blog.

But amongst the noise and excitement of future products, there are three important updates we are focusing on right now:

  • the release of the 5-10 minute 1D rapid prep (Mu transposase based)
  • coupled with the new R9 (now R9.4) chemistry that produces usable and high accuracy 1D reads (both discussed in this previous post)
  • and, new updates to the pore, membrane, motor and loading protocol which suggest 5-10Gb output may now be achievable.

We just received our first R9.4 double-speed (450 b/s) kits and so we will see how it looks soon, but as of now, we are able to get up to 3Gb of output on the vanilla R9.

The significance for our work: we can now start to consider using MinION for metagenomic sequencing (previously we have restricted our ambitions to sequencing individual viruses and bacterial cultures due to relatively low outputs).

Ultimately our research group would like to get to culture-free diagnosis of infectious diseases, with full genomic coverage, as a near-patient assay. There have been a few proof of principle papers here including use on Ebola and chikungunya (from Charles Chiu) and on bacterial urinary tract infections (from Justin O’Grady).

However, for portable metagenomics sequencing to really become a viable prospect, the sample needs to be rapidly prepared at point of collection (from near the patient, in diagnostics, or from water, food, animals, the natural environment, etc.).

In my view, local sample preparation, DNA extraction and local bioinformatics analysis are now the major open issues for portable sequencing.

To illustrate this point, we recently saw the exciting news that the nanopore had been run on the International Space Station - surely a landmark moment in genomics. Yet the sample was still not prepared, nor the DNA extracted, in space.

And sadly, you cannot get away without proper DNA extraction. We saw in the past few days the bizarre spectacle of David Eccles and Chris Mason at a conference in Australia attempting to sequence various food samples (coffee, strawberries and cream, etc.) on the nanopore, using the 1D prep. A valiant experiment, but the output was effectively noise because the DNA had not been purified. We experienced similar results when attempting to sequence from a virtually DNA-free sample on the beach in Cornwall.

So, DNA extraction remains a fact of life for sequencing.

For single molecule sequencing it’s even trickier: you need high purity, high molecular weight, high concentration DNA to get good results from single molecule sequencers like the nanopore (current input 500ng for the transposon prep).

This should not be problematic for many environmental samples. When dealing with low concentration samples, the easiest way of reaching the required input is via PCR (targeted or untargeted WGA, although fragment length can suffer without significant optimisation).

Solutions for portable sample preparation

Whilst presumably not a big market yet, a few companies have started producing solutions for ‘in-field’ sample preparation and DNA extraction. In the rest of this post I want to explore some of the available options for portable sample preparation and DNA extraction.

Just as a reminder, the steps for sample preparation are to a) make the sample safe (particularly important e.g. in Ebola) b) homogenise the sample and lyse cells c) extract DNA and then d) make a sequencing library.

Microbiome maven Elizabeth Bik has a very nice review of this in a recent article which is focused around microbiome studies but applies equally to other types of study:

Bead Beating/Tissue Lysis

Many samples, and particularly environmental samples, need homogenisation and cellular disruption before DNA extraction can proceed efficiently. One of the most popular methods is bead beating, which usually requires a benchtop instrument. Luckily, there is a portable, battery-powered method available in the form of the TerraLyzer (we have one). This device, available from Zymo Research, uses a converted power tool to act as a portable bead beater. It’s a solid bit of kit, but costs about $1000. If that’s too rich for you, Russell Neches has developed a template that can be 3D printed to turn a Craftsman automatic hammer into a portable bead beater.

Here is a video of Russell using the TerraLyzer to extract DNA from cat poo:

DNA extraction

DNA can be extracted very simply from a variety of foods like bananas or strawberries, using a method probably familiar to school children. First the fruit is blended to break up the tissues; washing-up liquid is then added to break down the cell membranes, and the mixture is strained to remove solids. DNA is then precipitated by adding alcohol and spooled off using a toothpick. More detailed instructions here: http://biology.about.com/od/biologylabhowtos/ht/dnafromabanana.htm.

100% ethanol is a problematic substance to ship (it is banned from aircraft), so it is an open research question whether a readily available lower-proof alcohol, e.g. vodka, would be an acceptable substitute. Please fund this important project.

Portable devices

Claire Lonsdale brought along an interesting device to Porecamp in Cornwall called the PureLyse from Claremont Bio. This device combines bead-beating with DNA capture using silica beads which are agitated by a small motor. They have built a small disposable device which combines a syringe and a reusable battery pack. The sample, ideally bacterial culture, is aspirated via the syringe then the motor is turned on for a minute to burst the cells/bind the DNA. Claire presented results at London Calling demonstrating that the DNA extracted is probably suitable for PCR but may be too fragmented for single-molecule sequencing.

Screenshot from Claire’s London Calling talk, showing rapid extraction on the right compared to a regular spin-column extraction on the left.

The announced but not currently available Zumbador from ONT looks to take this syringe concept further, by including reagents for lysis, purification and potentially library preparation in a single pre-loaded cartridge. This looks appealing, but the worry for those of us who do a lot of DNA extractions from different organisms is whether any one cell lysis solution can be applied universally to organisms with quite different cell wall compositions. Gram positives and spore-forming bacteria are notoriously tough shells to crack, so this may need to be combined with the bead beating step above.

Zymo also offer the Xpedition kit range, which is designed with field work in mind and contains a stabilisation solution which will preserve your DNA (after bead beating with the TerraLyzer) for up to a month at room temperature.

However, you can also use traditional column-based extraction methods in the field, if you have a:

Portable centrifuge

From my research, microcentrifuges are nearly all mains powered, which limits their utility in the field.

A homebrew solution is simply to modify a cordless drill with a 3D printed centrifuge adaptor, one example being the DremelFuge, that offers up to 52,000g/rcf acceleration.

However, one should be extremely careful here because a solid object flying off at these rotation speeds could cause serious harm. Please take appropriate safety precautions if you are thinking of using this solution. Disclaimer! More generally, if in doubt about any safety aspects of field sample preparation, please first get in contact with your local safety officer for advice.
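For anyone sizing up a homebrew rotor, the acceleration figure follows from the rotor speed and the distance of the tube from the axis: RCF = ω²r / g. The sketch below is a minimal illustration of that relationship. The radius and speeds are hypothetical values chosen for the example, not DremelFuge specifications.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (in multiples of g) at a given speed and radius."""
    omega = 2.0 * math.pi * rpm / 60.0  # angular velocity in rad/s
    return omega ** 2 * (radius_cm / 100.0) / STANDARD_GRAVITY


def rpm_for_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a target RCF at a given radius."""
    omega = math.sqrt(rcf * STANDARD_GRAVITY / (radius_cm / 100.0))
    return omega * 60.0 / (2.0 * math.pi)


if __name__ == "__main__":
    # Hypothetical rotor radius and speeds, for illustration only.
    radius_cm = 4.0
    for rpm in (5_000, 15_000, 30_000):
        print(f"{rpm:>6} rpm at {radius_cm} cm -> {rcf_from_rpm(rpm, radius_cm):,.0f} x g")
```

Because RCF scales with the square of the speed, doubling the rpm quadruples the force on the tube (and on anything that comes loose), which is worth bearing in mind with the safety warning above.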

An alternative is to adapt a regular lab microcentrifuge that can take DC input, as that means it can be easily powered from a Lithium-Ion battery pack.

Portable PCR Thermocyclers

The MiniPCR is a fantastic (we’ve got one) biohacker/kickstarter product which costs £500 from Cambio in the UK. It is programmed via a laptop or phone but then must be plugged into a mains adapter or battery pack to start the program. We bought a LiPo powerbank off Amazon for £70 which can provide the 19V, 3.7A power requirement. They also produce a small electrophoresis and visualisation system to go with it.
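When choosing a power bank, a quick back-of-envelope calculation helps. The sketch below turns the 19V, 3.7A requirement into a peak power figure and a worst-case runtime. The 100 Wh capacity and 85% conversion efficiency are assumptions for illustration, not figures for the pack we bought. A thermocycler does not draw peak current continuously, so real runtimes should be longer.

```python
def peak_power_w(volts, amps):
    """Peak electrical power in watts."""
    return volts * amps


def worst_case_runtime_h(capacity_wh, volts, amps, efficiency=0.85):
    """Hours of runtime if the device drew peak power continuously.

    capacity_wh and efficiency are properties of the battery pack; the
    default efficiency is an assumed value, not a measured one.
    """
    return capacity_wh * efficiency / peak_power_w(volts, amps)


if __name__ == "__main__":
    volts, amps = 19.0, 3.7      # requirement quoted above
    capacity_wh = 100.0          # hypothetical power bank capacity
    print(f"Peak power: {peak_power_w(volts, amps):.1f} W")
    print(f"Worst-case runtime: {worst_case_runtime_h(capacity_wh, volts, amps):.2f} h")
```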

An alternative is the Bento Lab from Bento Bio. This device caught many people’s attention with its Fisher-Price toy looks and intriguing functionality - it is a PCR thermocycler, gel visualiser block and minifuge all in one! Although mains powered, it should draw sufficiently little power that it could be powered via a car battery or possibly a Lithium pack. We had the pleasure of seeing a prototype box and it kicks ass - the only problem at the moment is that it’s still not available to buy. I hope it will ship soon and we’ll be first in the queue to test it out.

Portable Liquid Handler

Pipetting is only accurate at relatively large volumes (>1 µl) which both increases reagent costs and can be a major source of errors with multi-step protocols. The VolTRAX is an interesting device that was announced by ONT at London Calling 2015 and has not yet been seen in the wild, although the access programme was recently announced. The basic principle is the movement of ultra low liquid volumes around a matrix through an applied electrical current - a process called electrowetting. The appeal of such a process is that complex pipetting and mixing steps could be automated (apparently via a scriptable Python interface).

There may well be more that I have not mentioned … feel free to drop your suggestions in the comments box below!

Conflict of interests

I have received an honorarium to speak at an Oxford Nanopore meeting, and travel and accommodation to attend London Calling 2015 and 2016. I have ongoing research collaborations with ONT although I am not financially compensated for this and hold no stocks, shares or options. ONT have supplied free-of-charge reagents as part of the MinION Access Programme and also generously supported our infectious disease surveillance projects with reagents. Cambio sent us some free reagents to go with the MiniPCR instrument we purchased.

Credits

Thanks to Josh Quick for contributing to this post, and to Matt Loose and John Tyson for reading a draft version.

Nanopore R9 rapid run data release

R9 data

A long promised addition to the nanopore sequencing repertoire is the rapid sequencing kit. This kit significantly reduces the effort required to make a sequencing library - down from 2-3 hours to a few minutes. We’ve actually played with this kit several times before, once very early on in the MAP (I think using R7 chemistry as long ago as July 2014). More recently, Matt Loose and I tried it out in a hotel room before a famous genomics conference in February of this year. We can both vouch for how easy it is to use - no specialist equipment is required other than pipettes and a source of heat to neutralise the transposase after a short incubation at room temperature. The recommended starting DNA input is 500ng. In our hotel room we used a freshly brewed cup of coffee which provided the required 70 degrees.

However, until recently this kit was mainly a curiosity rather than a serious proposition because it only produces so-called “1D” data. To remind you, 1D data is when only the template strand of the double-stranded molecule is read. With the 1D kit, because there is no hairpin ligation, the complement strand does not pass through the pore.

And for R7.3 data this was a significant drawback: sequence accuracy on the template strand is in the low 70s (percent), which makes basic tasks like de novo assembly and variant calling computationally very difficult (although probably not impossible, and assemblers like Canu can cope, with a bit of tweaking). It also makes polishing extremely slow.

The release a few months back of the R9 chemistry has changed the game – it’s a game-changer! – and suddenly made 1D reads very usable. This is ascribed to the more discriminatory read head of the CsgG pore employed, where fewer nucleotides in the pore abrogate the flow of ions across the membrane. The spread of electrical current levels is about twice as wide as seen in R7. However it is hard to know exactly how much of the improved accuracy is caused by the pore as this coincided with the introduction of a new style of basecaller that employs ‘deep learning’ (technically a recurrent neural network) rather than the Hidden Markov Model of before. A third change is the introduction of ‘fast mode’, currently running at 250 bases / second, or four times the translocation speed employed with the R7 chemistry. Because all these changes were introduced at once, it is hard to know the relative contribution of each. However, our early access experiences with R7.3 demonstrated that ‘fast mode’ did not seem to have a significant detrimental effect on quality. In fact, the theory is it may improve handling of long homopolymeric tracts by introducing more signal into the ‘dwell’ times.

Other changes: Notably, the sequencing files now record raw current sample data (at 5kHz) by default, and the previous process of linearising the signal into ‘events’ is now performed by the cloud base caller Metrichor rather than MinKNOW on the laptop. Excitingly there are now three local basecallers available - one is built into MinKNOW 1.0.0 (the next release). There is also a separate download called nanonet (available to MAPpers). We tried out nanonet during the ZiBRA bus trip and it worked well, albeit it could not quite keep up with data generation on a standard laptop. Jared Simpson and Matei David also have an open source basecaller called nanocall.
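To get a feel for what the sampling rate means for the basecaller, it helps to work out how many raw current samples are recorded per base. The sketch below assumes a constant translocation speed, which is a simplification: real dwell times vary from base to base.

```python
def samples_per_base(sample_rate_hz, bases_per_second):
    """Average number of raw current samples recorded per base."""
    return sample_rate_hz / bases_per_second


if __name__ == "__main__":
    sample_rate_hz = 5_000  # raw sampling rate quoted above
    for speed in (250, 450):  # translocation speeds mentioned in these posts
        n = samples_per_base(sample_rate_hz, speed)
        print(f"{speed} b/s -> {n:.1f} samples per base")
```

At 250 bases per second that is 20 samples per base on average. At 450 bases per second it drops to about 11.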

We’ve done two runs of this protocol. The first was on a flowcell that was delivered, erroneously frozen for 36 hours at -10 degrees in our Stores, and then left at room temperature for a week or so (we’d assumed it was completely knackered). We thought we’d just try it out for fun and to our surprise it actually generated a decent yield of data, around 600Mb. Data here is from a second flowcell that was correctly stored at fridge temperature.

The final new thing here is that this is a SpotON flowcell, which means the total volume loaded onto the flowcell is halved, and you in fact ‘drip, drip’ the library straight onto the flowcell surface via a small hole that is protected by a plastic clip. What difference this makes to performance is currently unknown.

The results from the better flowcell are presented here with links to data at the bottom:

E. coli stats


Type           Total Reads  Base Pairs  Mean  Median  Min  Max     N25    N50    N75
pass:template  164472       1.48Gb      9009  5944    117  131969  25244  14891  8074
fail:template  74465        467Mb       6271  3544    5    328471  21903  12033  6047

This is the highest yielding flowcell we’ve ever had, with just shy of 2Gb of base called sequence, and 1.48Gb in the pass bin. Over 99% of the reads map to the reference, meaning the goodput is equivalent to the output.

Read length

The transpososome method gives a very different size distribution to the Gaussian distribution expected with the traditional Covaris G-tube fragmentation. There are more shorter reads, but the N50 is improved to nearly 15kb (from around 8kb). The maximum length read in this dataset is 131kb and aligns completely to the reference genome at 85% identity.
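For readers less familiar with the N25/N50/N75 columns: the Nx value is the read length at which, working down from the longest read, the running total of bases first reaches x% of the dataset. The sketch below shows one common way to compute it. It is not necessarily the exact convention used by the tool that produced the table, and the read lengths are toy values.

```python
def nx(lengths, fraction):
    """Read length at which the longest reads cover `fraction` of all bases.

    nx(lengths, 0.5) is the N50, nx(lengths, 0.25) the N25, and so on.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no read lengths supplied")
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # only reachable through floating point rounding


if __name__ == "__main__":
    toy_lengths = [10, 9, 8, 7, 6, 5, 4, 3, 2]  # toy data, not from the run
    for label, fraction in (("N25", 0.25), ("N50", 0.50), ("N75", 0.75)):
        print(label, nx(toy_lengths, fraction))
```

This is why a long tail of short reads pulls down the mean and median much more than the N50, as in the table above.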

Read length (greater than 50kb)

Zooming into this plot it is obvious there are plenty of super long reads - 953 of the passing reads are greater than 50kb comprising 57.5Mb of sequence.
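Counts like these are simple to reproduce from a list of read lengths. The sketch below is a minimal illustration using toy lengths rather than the real dataset, and the function name is my own.

```python
def summarise_long_reads(lengths, min_length=50_000):
    """Return (number of reads, total bases) for reads longer than min_length."""
    long_reads = [n for n in lengths if n > min_length]
    return len(long_reads), sum(long_reads)


if __name__ == "__main__":
    toy_lengths = [1_200, 48_000, 50_000, 51_500, 131_969]  # toy data
    count, bases = summarise_long_reads(toy_lengths)
    print(f"{count} reads > 50kb, {bases / 1e6:.3f} Mb in total")
```

For the figures reported above, 57.5Mb spread over 953 reads works out to a mean of roughly 60kb per read in this tail.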


Gratifyingly the data gives a single contig assembly with miniasm and Canu without any custom parameterisation. We’ll pass it over to Jared to see what kind of consensus accuracy he can get out of nanopolish which now has alpha support for R9 data.

Accuracy

The 1D accuracy is a quantum leap from previous pores, with mean read accuracy at 83%.
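Read accuracy can be defined in more than one way. A common choice, assumed here rather than taken from the analysis notebook, is the number of matching bases divided by the total number of alignment columns (matches, mismatches, insertions and deletions). The counts in the example are invented to show the arithmetic.

```python
def alignment_identity(matches, mismatches, insertions, deletions):
    """Fraction of alignment columns that are exact matches."""
    columns = matches + mismatches + insertions + deletions
    if columns == 0:
        raise ValueError("alignment has no columns")
    return matches / columns


if __name__ == "__main__":
    # Hypothetical counts for a single aligned read.
    identity = alignment_identity(matches=830, mismatches=60, insertions=40, deletions=70)
    print(f"identity = {identity:.1%}")
```

Other definitions (for example, ignoring indels or dividing by read length) give different numbers for the same alignment, so it is worth checking which one is in use before comparing figures between datasets.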

We’ll do more analysis on this dataset and hope to write it up as a manuscript in future, but are releasing the dataset for the community to play with.

E. coli 2D kit data

We’ve also previously generated 2D data and this is available below.

Stats

668Mb of passing template and complement data results in 244Mb of 2D data.

pass stats

Type           Total Reads  Base Pairs  Mean     Median  Min  Max     N25    N50   N75
template       50277        328543190   6534.66  6448    9    78622   11688  9063  6665
complement     50277        340285012   6768.2   6427    5    144661  12555  9280  6732
twodirections  31858        244275647   7667.64  7603    99   64218   11754  9244  7135
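The headline figures can be checked directly against the table. The sketch below simply redoes the arithmetic with the values from the rows above; the variable names are my own.

```python
# Values copied from the pass stats table above.
template_bases = 328_543_190
complement_bases = 340_285_012
twod_bases = 244_275_647
template_reads = 50_277
twod_reads = 31_858

# Combined 1D yield (template plus complement) that went into 2D basecalling.
combined_1d_bases = template_bases + complement_bases

# Fraction of passing template reads that also produced a 2D read.
twod_read_fraction = twod_reads / template_reads

if __name__ == "__main__":
    print(f"template + complement: {combined_1d_bases / 1e6:.1f} Mb")
    print(f"2D yield:              {twod_bases / 1e6:.1f} Mb")
    print(f"reads with a 2D call:  {twod_read_fraction:.1%}")
```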

ipython notebook

I have posted up the IPython notebook detailing the commands to reproduce this analysis.

Credits

Josh Quick did the laboratory work and sequencing. We are grateful to John Tyson for supplying his tuning scripts for the 1D R9 run.

Conflict of interests

I have received an honorarium to speak at an Oxford Nanopore meeting, and travel and accommodation to attend London Calling 2015 and 2016. I have ongoing research collaborations with ONT although I am not financially compensated for this and hold no stocks, shares or options. ONT have supplied free-of-charge reagents as part of the MinION Access Programme and also generously supported our infectious disease surveillance projects with reagents.

Balti and Bioinformatics: 28th September 2016

Balti and Bioinformatics returns …

University of Birmingham 28th September 2016

How to get here

Location: Room WG04, Biosciences Building, University of Birmingham

(From University Station, turn left, walk down hill, Biosciences is 3 minutes walk and on your left. Walk in and follow the signs, we are on the ground floor).

Agenda

12.30 - Samosas and cha(a)t

1.30 - Science session

1.30 - tbc: Aaron Darling, iThree Institute, Sydney, Australia

2.00 - Doing bioinformatics: a user’s perspective: Lex Nederbragt, University of Oslo, Norway

2.30 - Tea and coffee

3.00 - Bioinformatics pipeline session

3.05 - Ansible versus Docker for packaging hard to run pipelines, Nick Loman, University of Birmingham

3.15 - Marius Bakke, University of Warwick, GUIX for Bioinformatics

3.25 - Shovill: The Spades Optimiser, Torsten Seemann, University of Melbourne

3.45 - Degust: RNA-Seq visualisation, David Powell, Monash

4.05 - Open discussion about pipelines

5.00 - Finish, taxis, balti at Dosa Mania, Harborne

Sign-up form here

Links for IMMEM talk

References for IMMEM 2016 talk

1. Joseph Bore operating a MinION in Nongo, Guinea.

2. Make research open access

3. Size of the MinION

4. Behind the scenes

5. Packing up

6. Lab-in-a-suitcase

7. Sierra Leone project

8. Portable Internet

9. Duration of sequencing runs

10. Validation

11. Real-time sequencing / Outbreak in context

12. Ebola.nextstrain.org by Trevor Bedford and Richard Neher

13. Sierra Leone analysis

14. Tracking chains of transmission

15. Frozen in time evolution

16. Real-time digital pathogen surveillance

17. Portable systems

18. Transposome / offline base calling

  • Simpson J, David M nanocall, in preparation
  • Data will be uploaded when I get a better Internet connection.

Thanks for listening ;)

Links for AGBT talk

References for AGBT 2016 talk

1. Joseph Bore operating a MinION in Nongo, Guinea.

2. Make research open access

3. Size of the MinION

4. Behind the scenes

5. Packing up

6. Lab-in-a-suitcase

7. Sierra Leone project

8. Portable Internet

9. Duration of sequencing runs

10. Validation

11. Real-time sequencing / Outbreak in context

12. Ebola.nextstrain.org by Trevor Bedford and Richard Neher

13. Sierra Leone analysis

14. Tracking chains of transmission

15. Frozen in time evolution

16. Real-time digital pathogen surveillance

17. Portable systems

18. Transposome / offline base calling

  • Simpson J, David M nanocall, in preparation
  • Data will be uploaded when I get a better Internet connection.

Thanks for listening ;)