Verily / Google

Google open source tool DeepVariant achieves unprecedented accuracy in human genome sequencing

From Google Is Giving Away AI That Can Build Your Genome Sequence | Wired:

On Monday, Google released a tool called DeepVariant that uses deep learning—the machine learning technique that now dominates AI—to assemble full human genomes.

And now, engineers at Google Brain and Verily (Alphabet’s life sciences spin-off) have taught [a neural network] to take raw sequencing data and line up the billions of As, Ts, Cs, and Gs that make you you.


Today, you can get your whole genome for just $1,000 (quite a steal compared to the $1.5 million it cost to sequence James Watson’s in 2008).

But today’s machines still produce only incomplete, patchy, and glitch-riddled genomes. Errors can be introduced at each step of the process, and that makes it difficult for scientists to distinguish the natural mutations that make you you from random artifacts, especially in repetitive sections of a genome.

See, most modern sequencing technologies work by taking a sample of your DNA, chopping it up into millions of short snippets, and then using fluorescently-tagged nucleotides to produce reads—the list of As, Ts, Cs, and Gs that correspond to each snippet. Then those millions of reads have to be grouped into abutting sequences and aligned with a reference genome.
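The read-and-align step described above can be illustrated with a toy sketch. This is not how production aligners work (they use indexed data structures rather than brute-force scanning); the function names, the read length, and the mismatch threshold below are all assumptions chosen for illustration.

```python
import random

def chop_into_reads(genome, read_len, n_reads, seed=0):
    """Toy shotgun sequencing: sample n_reads substrings of length read_len."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        reads.append(genome[start:start + read_len])
    return reads

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def align_read(read, reference, max_mismatches=1):
    """Brute-force alignment: return the first reference offset with the
    fewest mismatches, or None if every offset exceeds max_mismatches."""
    best_pos, best_dist = None, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        dist = hamming(read, reference[pos:pos + len(read)])
        if dist < best_dist:
            best_pos, best_dist = pos, dist
    return best_pos

# Hypothetical data: the sample differs from the reference at index 13 (T -> A).
reference = "ACGTTGCAAGGCTTACGATCCGTA"
sample = "ACGTTGCAAGGCTAACGATCCGTA"

reads = chop_into_reads(sample, read_len=8, n_reads=20)
alignments = [(read, align_read(read, reference)) for read in reads]
```

Each entry in `alignments` pairs a read with its best offset in the reference. Reads covering index 13 align with one mismatch, and deciding whether that mismatch is a real mutation or a sequencing error is the variant-calling problem the article goes on to describe.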

That’s the part that gives scientists so much trouble. Assembling those fragments into a usable approximation of the actual genome is still one of the biggest rate-limiting steps for genetics.


DeepVariant works by transforming the task of variant calling—figuring out which base pairs actually belong to you and not to an error or other processing artifact—into an image classification problem. It takes layers of data and turns them into channels, like the colors on your television set.
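To make the "channels" idea concrete, here is a minimal sketch of turning a pileup of aligned reads into an image-like grid. The three channels used here (base identity, base quality, mismatch flag) and their numeric encodings are assumptions for illustration only; they are not taken from DeepVariant's actual encoding.

```python
# Hypothetical numeric encoding for each base (channel 0).
BASE_CODE = {"A": 0.25, "C": 0.5, "G": 0.75, "T": 1.0}

def pileup_to_image(reference, reads):
    """Build image[row][col] = (base, quality, mismatch).

    reference: string for the window being examined; it becomes row 0.
    reads: list of (offset, bases, quals) already aligned to that window.
    """
    width = len(reference)
    blank = (0.0, 0.0, 0.0)  # pixel for positions a read does not cover
    image = [[(BASE_CODE[b], 1.0, 0.0) for b in reference]]
    for offset, bases, quals in reads:
        row = [blank] * width
        for i, (b, q) in enumerate(zip(bases, quals)):
            col = offset + i
            if 0 <= col < width:
                mismatch = 1.0 if b != reference[col] else 0.0
                # Quality is capped at 40 and scaled to the range 0..1.
                row[col] = (BASE_CODE[b], min(q, 40) / 40.0, mismatch)
        image.append(row)
    return image

def mismatch_support(image):
    """Count, per column, how many reads disagree with the reference."""
    width = len(image[0])
    return [sum(row[col][2] for row in image[1:]) for col in range(width)]

reference = "ACGTAC"
reads = [
    (0, "ACGT", [40, 40, 40, 40]),
    (2, "GAAC", [30, 10, 30, 30]),  # low-quality A where the reference has T
]
image = pileup_to_image(reference, reads)
support = mismatch_support(image)
```

A classifier would receive a grid like `image` and learn whether a column with mismatches, such as column 3 here, reflects a true variant or an artifact. In this toy case the single disagreeing base also has low quality, which is the kind of signal the separate channels expose.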

After the FDA contest, the team transitioned the model to TensorFlow, Google’s artificial intelligence engine, and continued tweaking its parameters by changing the three compressed data channels into seven raw data channels. That allowed them to reduce the error rate by a further 50 percent. In an independent analysis conducted this week by the genomics computing platform DNAnexus, DeepVariant vastly outperformed GATK, Freebayes, and Samtools, sometimes reducing errors by as much as 10-fold.

DeepVariant is now open source and available here:

Google competes with many other vendors on many fronts. But while its competitors are focused on battling for today’s market opportunities, Google is busy in a solitary race to control the battlefield of the future: the human body.

The human body is the ultimate data center.

Progress in Smart Contact Lenses

From Smart Contact Lenses – How Far Away Are They? – Nanalyze:

The idea of smart contact lenses isn’t as far away as you might think. The first problem that crops up is how exactly to power the electronics in a set of “smart” contact lenses. As it turns out, we can use the energy of motion, or kinetic energy. Every time the eye blinks, we get some power. Now that we have the power problem solved, there are several applications we can think of, ordered from easiest to hardest:

  • Level 1 – Multifocal contact lenses, such as those from Visioneering Technologies, Inc. (VTI), or lenses that correct color blindness, such as the smart contact lenses called Colormax
  • Level 2 – Gathering information from your body – like glucose monitoring for diabetics
  • Level 3 – Augmenting your vision with digital overlay
  • Level 4 – Complete virtual reality (not sure if this is possible based on the eye symmetry but we can dream a dream)

So when we ask the question “how far away are we from having smart contact lenses” the answer isn’t that simple. The first level we have already achieved.