Long-Read DNA Sequencing

updated: January 22, 2025

Definition

DNA sequencing technologies determine the order of the base pairs in fragments of DNA known as “reads”. Scientists must then piece these reads together to assemble the sequences of full chromosomes. While some sequencing technologies produce reads that are only a few hundred nucleotides long, long-read DNA sequencing methods can generate reads that are thousands to hundreds of thousands of nucleotides long. These long reads are easier to assemble because the sequence is broken into fewer fragments.
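
To give a rough sense of what "fewer fragments" means, the short sketch below counts the smallest number of pieces needed to cover a genome once, end to end, at two read lengths. It is only an illustration under stated assumptions: the genome length (about three billion bases, roughly the size of the human genome), the two example read lengths, and the helper name `min_reads_to_tile` are all chosen here for the example. Real sequencing runs produce overlapping reads of varying length and cover each position many times, so actual read counts are far higher than this lower bound.

```python
def min_reads_to_tile(genome_length: int, read_length: int) -> int:
    """Fewest non-overlapping, fixed-length reads needed to span the genome once.

    This is a lower bound for illustration, not a model of a real sequencing run.
    """
    if genome_length <= 0 or read_length <= 0:
        raise ValueError("lengths must be positive")
    # Integer ceiling division: a final partial stretch still needs one more read.
    return (genome_length + read_length - 1) // read_length


# Assumption: a genome of about three billion bases.
GENOME_LENGTH = 3_000_000_000

# Assumption: illustrative read lengths for a short-read and a long-read technology.
examples = [
    ("short reads (300 bases)", 300),
    ("long reads (100,000 bases)", 100_000),
]

for label, read_length in examples:
    pieces = min_reads_to_tile(GENOME_LENGTH, read_length)
    print(f"{label}: at least {pieces:,} pieces")
```

Under these assumptions, the short reads need at least 10,000,000 pieces and the long reads at least 30,000, which is the sense in which long reads leave a much smaller puzzle to assemble.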


Narration

Long-read DNA sequencing refers to a group of technologies that are very aptly named, allowing scientists to read much longer pieces of DNA. The human genome is extremely long. It's actually a three-billion-long continuous chain of DNA bases. The genome also has a sequence of DNA bases that is extremely complicated. Regions of the genome contain repeats of the same sequence or chunk of sequence over and over again, making determining the full sequence of a genome from end to end quite the puzzle for scientists. This is where long-read sequencing technologies play a major role. When we use technology to sequence DNA, the DNA must first be chopped up into many small pieces. These pieces are then sequenced on a DNA sequencer, generating what we call “reads”. The read length of a technology refers to the longest piece of DNA it can sequence. A typical next-gen sequencing read (sometimes referred to as “short-read sequencing”) is around 300 to 400 base pairs. If we think of a genome as a puzzle, individual sequencing reads are the pieces of the puzzle that must be assembled to create the full picture. With shorter reads, the number of pieces that we have to assemble is quite large, taking time and complicated algorithms to do the work. Long-read sequencing technologies allow for sequencing pieces of DNA that are much, much longer, from a thousand to hundreds of thousands of base pairs in one go. This means we have a much smaller set of puzzle pieces to assemble to create our full genome. Using long-read sequencing, scientists are fully uncovering the most complicated regions of the genome, filling in the gaps within our knowledge of genomics.

Ian C. Nova, Ph.D.

Program Director

Division of Genome Sciences