A Brief History of DNA Sequencing Technology


DNA sequencing is one of those technologies that quietly changed almost everything in biology.
At its simplest, sequencing means reading the order of DNA bases that make up our genome: A, T, C, and G. Those letters carry biological information, but for a long time, we could only read them slowly, one small piece at a time. Today, we can sequence entire genomes, compare thousands of individuals, track cancer evolution, identify infectious disease outbreaks, and study regions of the genome that were once nearly impossible to access.
The story of sequencing is really a story about scale: how biology moved from reading fragments of DNA to reading whole genomes.
Sequencing started with small pieces of DNA
Early DNA sequencing was dominated by Sanger sequencing, developed in 1977. Sanger sequencing was accurate and powerful, but it worked best for relatively short DNA fragments. It was not designed for reading an entire human genome in one pass.
Still, it gave scientists something revolutionary: the ability to determine the exact order of DNA bases in a controlled and reliable way. For many years, Sanger sequencing was the gold standard for reading DNA and validating genetic changes.
The limitation was not that Sanger sequencing was bad. The limitation was scale. A human genome contains about 3 billion bases in one representative copy, or about 6 billion bases if you consider both chromosome copies in a person. Reading that much DNA required a completely different level of organization, automation, and computation [1].
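To get a feel for that scale problem, here is a rough back-of-the-envelope sketch. The 800-base read length and the 8x coverage figure are illustrative assumptions of mine, not numbers from the cited sources, and the function name is made up for this example.

```python
import math

def reads_needed(genome_size, read_length, coverage=1):
    """Estimate how many reads are needed to cover a genome a given number of times."""
    return math.ceil(genome_size * coverage / read_length)

HAPLOID_BASES = 3_000_000_000       # one representative copy of the human genome
DIPLOID_BASES = 2 * HAPLOID_BASES   # both chromosome copies in a person

ASSUMED_READ_LENGTH = 800  # assumption: a rough length for a single Sanger-style read
ASSUMED_COVERAGE = 8       # assumption: each base read several times for reliability

single_pass = reads_needed(HAPLOID_BASES, ASSUMED_READ_LENGTH)
with_coverage = reads_needed(HAPLOID_BASES, ASSUMED_READ_LENGTH, ASSUMED_COVERAGE)

print(f"Reads for one pass:    {single_pass:,}")
print(f"Reads at {ASSUMED_COVERAGE}x coverage: {with_coverage:,}")
```

Even under these simplified assumptions, the count runs into the millions of individual reads, which is why automation and computation mattered so much.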
The Human Genome Project changed the question
The Human Genome Project was one of the first major attempts to sequence and assemble an entire human reference genome. Instead of asking, “What is the sequence of this one gene?”, scientists began asking, “Can we map and read the whole human genome?”
That was a huge shift.
The Human Genome Project officially began in 1990 and was completed in 2003. It required international collaboration, large sequencing centers, physical and genetic maps, computational assembly, and major technology development. The first human reference genome was not a simple readout from one machine. It was a massive scientific infrastructure project [1].
The cost also shows how large the challenge was. The first human genome sequence produced by the Human Genome Project cost at least hundreds of millions of dollars to generate, while the total U.S. contribution to the project was about $2.7 billion [1]. By comparison, modern sequencing has made it possible to generate human genome data at a tiny fraction of that cost.
The Human Genome Project did more than produce a reference sequence. It created the foundation for modern genomics. That said, the ethical issues raised by the project are, in my opinion, very important to consider.
Next-generation sequencing made sequencing massively parallel
After the Human Genome Project, sequencing changed quickly. New technologies, often called next-generation sequencing or NGS, made it possible to sequence millions or billions of DNA fragments at the same time.
This is the basic idea behind short-read sequencing.
Instead of reading one long DNA molecule from beginning to end, the genome is broken into many small fragments. Each fragment is sequenced, producing short reads. Then computational tools align those reads to a reference genome or assemble them together [1], [2].
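The fragment, sequence, and align idea can be sketched in a few lines of Python. This is a toy illustration only: the reference string is invented, the reads are error-free, and the aligner uses exact string matching, whereas real alignment tools use indexed, error-tolerant algorithms. The function names are hypothetical.

```python
import random

def make_reads(genome, read_len, n_reads, seed=0):
    """Sample fixed-length substrings ("reads") from random positions in the genome."""
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        reads.append(genome[start:start + read_len])
    return reads

def align_exact(read, reference):
    """Return every 0-based position where the read matches the reference exactly."""
    positions = []
    pos = reference.find(read)
    while pos != -1:
        positions.append(pos)
        pos = reference.find(read, pos + 1)
    return positions

# Invented toy reference; a real genome is billions of bases long.
reference = "ACGTTGCAAGCTTAGGCTAACCGTATCGGATCCA"
READ_LEN = 8

reads = make_reads(reference, read_len=READ_LEN, n_reads=5)
alignments = {read: align_exact(read, reference) for read in reads}

for read, positions in alignments.items():
    print(read, "->", positions)
```

Each read comes back with the position or positions where it fits on the reference, which is the basic output that downstream analysis builds on.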
This made sequencing much faster and cheaper. It also made genomics scalable. Researchers could now sequence many samples, compare genomes across people, study cancer mutations, identify inherited variants, and measure genetic variation at a population scale [2], [3].
Short-read sequencing became powerful because it is high-throughput, relatively accurate, and cost-effective. But it also has a weakness: short reads can struggle in repetitive or structurally complex regions of the genome. If the same sequence appears in many places, a short read may not clearly tell you where it came from.
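Here is a small sketch of that ambiguity. The reference below is invented and contains the same repeat twice; the helper name is hypothetical and uses exact matching purely for illustration.

```python
def find_all(read, reference):
    """Return every 0-based position where the read matches the reference exactly."""
    return [i for i in range(len(reference) - len(read) + 1)
            if reference.startswith(read, i)]

# Toy reference: unique flank, repeat, unique spacer, same repeat, unique flank.
repeat = "GATTACAGATTACA"
reference = "CCGTA" + repeat + "TTGCC" + repeat + "AGGTC"

read_in_repeat = "TTACAGAT"  # lies entirely inside the repeated sequence
read_at_edge = "ACATTGCC"    # overlaps the unique spacer after the first copy

print(find_all(read_in_repeat, reference))  # [7, 26]: two equally good placements
print(find_all(read_at_edge, reference))    # [16]: one unambiguous placement
```

The read that sits wholly inside the repeat fits in two places, so on its own it cannot tell you which copy it came from. The read that touches unique sequence has only one possible home.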
That matters because the genome is not just a neat list of genes. It contains repeats, duplications, structural variation, and difficult regions that are biologically important.
Long-read sequencing helped open the difficult parts of the genome
Long-read sequencing technologies, including Pacific Biosciences and Oxford Nanopore approaches, changed what scientists could see.
Instead of producing very short fragments, long-read sequencing can read much longer stretches of DNA. This makes it easier to assemble genomes, detect structural variants, resolve repetitive regions, and study parts of the genome that short reads often miss [2], [3].
Long reads are especially useful when the question depends on genomic structure. For example, they can help researchers study repeat expansions, large insertions or deletions, rearrangements, centromeres, and other regions where short reads may be ambiguous.
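Repeat expansions make a nice toy example. In the sketch below, a read can only report the number of repeat units if it spans the whole repeat tract and reaches unique sequence on both sides. The locus, flanking sequences, and function name are invented for illustration, and the reads are assumed to be error-free.

```python
import re

def count_repeat_units(read, unit, left_flank, right_flank):
    """Count copies of `unit` between the two flanks.

    Returns None when the read does not span the full repeat tract,
    because the copy number cannot be determined from that read alone.
    """
    pattern = (re.escape(left_flank)
               + "((?:" + re.escape(unit) + ")+)"
               + re.escape(right_flank))
    match = re.search(pattern, read)
    if match is None:
        return None
    return len(match.group(1)) // len(unit)

# Invented locus: 12 copies of CAG between two unique flanking sequences.
LEFT, RIGHT, UNIT = "ACCTG", "TTGAC", "CAG"
locus = LEFT + UNIT * 12 + RIGHT

long_read = locus        # spans both flanks and the whole tract
short_read = UNIT * 4    # falls inside the tract and sees no flanks

print(count_repeat_units(long_read, UNIT, LEFT, RIGHT))   # 12
print(count_repeat_units(short_read, UNIT, LEFT, RIGHT))  # None
```

The short read confirms that CAG repeats exist, but only the read that crosses the entire region can say how many there are.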
This does not mean long reads simply replace short reads. Short-read sequencing is still widely used because it is accurate, scalable, and cost-effective. Long-read sequencing offers different strengths, especially when researchers need continuity across complex DNA regions [2], [3].
Modern genomics often benefits from using both.
Sequencing is now about more than reading DNA
Today, sequencing is not just about producing DNA letters. It is also about interpretation.
A sequencing experiment creates data, but the biological meaning comes from analysis. Researchers ask questions like:
- Which variants are present?
- Are they inherited or acquired?
- Do they affect genes or regulatory regions?
- Are they associated with disease?
- Are they found in difficult regions of the genome?
- Do they help explain a patient’s diagnosis, tumour evolution, or treatment response?
This is why modern genomics is both experimental and computational. The lab generates the reads, but bioinformatics turns those reads into biological insight.
Sequencing now shapes cancer genomics, infectious disease surveillance, rare disease diagnosis, evolutionary biology, ancestry studies, and population-scale genomics [2], [3].
The future is more complete, more diverse, and more personal
The first human reference genome was a milestone, but it did not capture the full range of human genetic diversity. A single reference cannot represent all people equally.
That is why current genomics is moving toward more complete and diverse references, including pangenome approaches that better capture variation across populations. Institutions involved in the original Human Genome Project have emphasized that the future of genomics depends not only on sequencing more DNA, but on making genomic resources more representative and useful across global communities [4].
In that sense, sequencing has come full circle.
The early goal was to read the human genome. The modern goal is to understand human genomes: plural, diverse, structurally complex, and biologically dynamic.
DNA sequencing began as a way to read small fragments of genetic code. It has become one of the central tools for understanding life, disease, and evolution.
References
[1] National Human Genome Research Institute. The Cost of Sequencing a Human Genome. https://www.genome.gov/about-genomics/fact-sheets/Sequencing-Human-Genome-cost
[2] Satam, H., Joshi, K., Mangrolia, U., et al. (2023). Next-Generation Sequencing Technology: Current Trends and Advancements. Biology, 12(7), 997. https://doi.org/10.3390/biology12070997
[3] Akintunde, O., Tucker, T., & Carabetta, V. J. (2023). The evolution of next-generation sequencing technologies. https://pmc.ncbi.nlm.nih.gov/articles/PMC10246072/
[4] Gunn, S. (2025). How our beginning is shaping our future: 25 years on from the Human Genome Project. Wellcome Sanger Institute Blog. https://sangerinstitute.blog/2025/06/26/how-our-beginning-is-shaping-our-future-25-years-on-from-the-human-genome-project/