How many protein-coding genes (genes that encode proteins) are in the human genome?

Last updated: January 24, 2026

Number of Protein-Coding Genes in the Human Genome

The human genome was estimated to contain 20,000-25,000 protein-coding genes when the sequence was finished in 2004; current estimates place the number at slightly fewer than 20,000. 1, 2

Current Gene Count Estimates

The most recent and authoritative sources provide converging estimates:

  • GENCODE release 27 (2018) reports 19,836 protein-coding genes in the human genome, the lowest of the database counts cited here 3
  • The 2004 report of the finished euchromatic human genome sequence estimated 20,000-25,000 protein-coding genes 2
  • A 2023 review confirms that protein-coding genes are "currently estimated to number fewer than 20,000" 1

Database Variations and Why Numbers Differ

Gene annotation databases report different numbers because their inclusion criteria and validation standards differ:

  • Ensembl (2017): 26,998 protein-coding genes with 81,787 mRNA transcripts, using an inclusive approach 3
  • GenBank RefSeq (2017): 21,104 protein-coding genes with 34,799 mRNA transcripts, using more conservative criteria requiring peer-reviewed evidence 3
  • These discrepancies arise because gene models are often incomplete and change over time, with the precise structure of many genes still under debate 3
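The gap between the two databases is easier to see as transcripts per gene. The following is a minimal Python sketch that uses only the 2017 counts quoted above; the dictionary layout and the function name are illustrative and do not come from either database.

```python
# Counts as quoted in the list above (2017 figures); the structure is illustrative.
ANNOTATIONS = {
    "Ensembl": {"genes": 26_998, "transcripts": 81_787},
    "RefSeq": {"genes": 21_104, "transcripts": 34_799},
}


def transcripts_per_gene(counts):
    """Average number of mRNA transcripts annotated per protein-coding gene."""
    return counts["transcripts"] / counts["genes"]


for name, counts in ANNOTATIONS.items():
    print(f"{name}: {transcripts_per_gene(counts):.2f} transcripts per gene")

# Difference in protein-coding gene counts between the two databases
gene_gap = ANNOTATIONS["Ensembl"]["genes"] - ANNOTATIONS["RefSeq"]["genes"]
print(f"Ensembl lists {gene_gap:,} more protein-coding genes than RefSeq")
```

On these figures Ensembl averages about three transcripts per gene and RefSeq fewer than two, which is consistent with the more inclusive approach attributed to Ensembl above.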

Historical Context and Refinement

The gene count has been progressively refined downward from initial estimates:

  • Early estimates ranged from 50,000 to over 140,000 genes 4
  • The 2000 Exofish analysis estimated 28,000-34,000 genes 4
  • A 2007 analysis reduced the catalog to approximately 20,500 protein-coding genes by excluding non-conserved open reading frames (ORFs) that likely represent random occurrences rather than functional genes 5

The reduction in gene count reflects improved methodology for distinguishing true protein-coding genes from functionally meaningless open reading frames (ORFs) present by chance in RNA transcripts 5

Non-Coding RNA Genes

Beyond protein-coding genes, the genome contains a substantial number of non-coding RNA genes:

  • GENCODE release 27 reports 23,347 non-protein-coding RNA genes, with 15,778 classified as long non-coding RNAs (lncRNAs) 3
  • The total number of non-coding RNA genes now exceeds the number of protein-coding genes 3
  • However, for most non-coding RNAs, functional relevance remains unclear 1
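The relationship between these counts can be checked with simple arithmetic. A short Python sketch, using only the GENCODE release 27 figures quoted in this article (the variable names are illustrative):

```python
# GENCODE release 27 counts as quoted in this article.
PROTEIN_CODING = 19_836
NONCODING_RNA = 23_347
LNCRNA = 15_778

# Non-coding RNA genes that are not classified as lncRNAs
other_noncoding = NONCODING_RNA - LNCRNA

# Share of non-coding RNA genes that are lncRNAs
lncrna_share = LNCRNA / NONCODING_RNA

# Non-coding RNA genes per protein-coding gene
noncoding_ratio = NONCODING_RNA / PROTEIN_CODING

print(f"Other non-coding RNA genes: {other_noncoding:,}")
print(f"lncRNA share of non-coding RNA genes: {lncrna_share:.1%}")
print(f"Non-coding to protein-coding ratio: {noncoding_ratio:.2f}")
```

About two-thirds of the annotated non-coding RNA genes are lncRNAs, and there are roughly 1.2 non-coding RNA genes for every protein-coding gene.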

Clinical Implications

For whole exome sequencing (WES) applications:

  • The human exome comprises approximately 180,000 exons representing about 1% of the genome (approximately 30 million nucleotides) 6
  • Each exome contains about 13,500 single nucleotide variants affecting amino acid sequences 6
  • More than three-quarters of known disease-causing variants are located in the exome 6
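The "about 1%" figure follows from the exome size quoted above. The sketch below checks it in Python. The genome length of roughly 3.1 billion nucleotides is an assumption added for illustration and is not stated in this article; the variable names are likewise illustrative.

```python
# Figures quoted in the list above
EXON_COUNT = 180_000
EXOME_NUCLEOTIDES = 30_000_000

# Assumption (not from this article): haploid genome of roughly 3.1 billion nucleotides
GENOME_NUCLEOTIDES = 3_100_000_000

# Fraction of the genome covered by the exome
exome_fraction = EXOME_NUCLEOTIDES / GENOME_NUCLEOTIDES

# Average exon length implied by the two exome figures
mean_exon_length = EXOME_NUCLEOTIDES / EXON_COUNT

print(f"Exome share of genome: {exome_fraction:.2%}")
print(f"Mean exon length: {mean_exon_length:.0f} nucleotides")
```

Under that assumption the exome is just under 1% of the genome, and the quoted totals imply an average exon of roughly 170 nucleotides.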

Key Caveats

The human gene catalog remains incomplete despite decades of effort 1:

  • Gene models are incomplete and continue to evolve 3
  • The number of known protein-coding isoforms produced by alternative splicing continues to grow 1
  • No universal annotation standard exists that includes all medically significant genes 1
  • Different databases may place the same variant in a coding exon versus an intron depending on their gene models 3
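The last caveat can be illustrated with a toy example. The Python sketch below classifies one position under two gene models that differ by a single exon. All coordinates, model names, and the `classify_position` helper are hypothetical; real annotations also distinguish coding sequence from untranslated regions, which this sketch ignores.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end), 1-based inclusive coordinates


def classify_position(pos: int, exons: List[Exon]) -> str:
    """Return 'exon', 'intron', or 'intergenic' for a position under one gene model."""
    if not exons:
        return "intergenic"
    ordered = sorted(exons)
    # Inside any annotated exon
    if any(start <= pos <= end for start, end in ordered):
        return "exon"
    # Between the first exon start and the last exon end, but not in an exon
    gene_start = ordered[0][0]
    gene_end = max(end for _, end in ordered)
    if gene_start <= pos <= gene_end:
        return "intron"
    return "intergenic"


# Hypothetical models of the same gene from two hypothetical annotation sets
MODEL_A = [(100, 200), (400, 500), (800, 900)]  # includes a middle exon
MODEL_B = [(100, 200), (800, 900)]              # omits the middle exon

variant_position = 450
for label, model in (("model A", MODEL_A), ("model B", MODEL_B)):
    print(f"{label}: {classify_position(variant_position, model)}")
```

The same position falls in an exon under model A and in an intron under model B, which is why a variant's reported consequence can depend on the annotation used.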

References

  • Guideline Directed Topic Overview. Dr.Oracle Medical Advisory Board & Editors, 2025. (Guideline)
  • Distinguishing protein-coding and noncoding genes in the human genome. Proceedings of the National Academy of Sciences of the United States of America, 2007. (Research)
  • Sequencing your genome: what does it mean? Methodist DeBakey Cardiovascular Journal, 2014. (Research)
