Researchers Beef Up DNA Storage Density By Adding More Letters


We have become very good at storing data, with hard drives closing in on 20 terabytes, but even our best 21st-century engineering can’t come close to the elegance and density of DNA. Most of the cells in your body contain a complete genetic copy of what makes you a human being, and DNA is surprisingly durable compared to chips and spinning platters that will probably end up in a landfill inside of a decade. DNA might even be viable for storing digital data, and we’re not limited by the way human DNA works. Researchers from the University of Illinois Urbana-Champaign have expanded the capabilities of DNA data storage by adding more letters to its alphabet.

The genetic information in your cells relies on four primary bases, the variable part of the nucleotides that make up nucleic acids. There’s adenine, guanine, cytosine, and thymine: the A, G, C, and T you’ve seen when genetic information is written out. The human body also uses another base called uracil in place of thymine when transcribing genetic information into RNA to make proteins.

Even without any modifications, DNA is a very dense storage medium. The researchers note that the world creates several petabytes of new data every day, and a single gram of DNA could store it all. That’s what you get with the standard four-base system from life on Earth, but there are plenty more nucleotides in chemistry that can link up to form a DNA strand. The team created an encoding scheme relying on 11 different bases, which gives the synthetic DNA much higher data density than a system of just four bases.


So why aren’t we all using DNA hard drives? While DNA can last for thousands of years without irreparable data loss, it’s difficult to encode and decode that data. You need advanced laboratory equipment, and most tools can’t even interpret the 11-base DNA strands created in the new study. The team found that ring-like proteins known as MspA nanopores, which are commonly used in DNA sensing, could correctly read the synthetic and natural DNA. Interpreting the recovered data required machine learning and artificial intelligence, but the result is a system that correctly read all 77 different combinations of bases used in the study. They believe this system could roughly double the data density of DNA, which is already much higher than any technology we’ve devised.
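The "roughly double" estimate is consistent with simple counting: a position that can hold one of N distinct bases carries at most log2(N) bits. The short sketch below works through that arithmetic for 4 and 11 bases. It is an idealized upper bound, not the team's actual encoding scheme; it ignores error correction and any chemical or sequencing constraints, and the function name `bits_per_base` is made up for illustration.

```python
import math


def bits_per_base(alphabet_size: int) -> float:
    """Theoretical maximum information per position for a given alphabet size."""
    return math.log2(alphabet_size)


natural = bits_per_base(4)    # A, C, G, T
expanded = bits_per_base(11)  # the 11-base alphabet from the study

print(f"4 bases:  {natural:.2f} bits per base")
print(f"11 bases: {expanded:.2f} bits per base")
print(f"ratio:    {expanded / natural:.2f}x")
```

Under these assumptions, the 11-letter alphabet tops out at about 3.46 bits per base versus 2 bits for natural DNA, a gain of about 1.7 times for the same strand length.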

This work is still very early, but it’s a fascinating proof of concept. The addition of synthetic chemistry to natural biological storage mechanisms could unlock functionally unlimited data storage. And it works, with just a little AI assistance. Such a technology would be limited to long-term archival storage at first, but no one knows what the future may bring.
