A dataset of 31,110 descriptors from the 2026 edition of the NLM Medical Subject Headings (MeSH) thesaurus, with preferred terms, entry terms (synonyms), hierarchical tree numbers, scope notes, and category classifications. MeSH is the controlled vocabulary used to index PubMed and other NLM databases.

Format

A data frame with 31110 rows and 9 variables.

Source

NLM MeSH 2026 (https://www.nlm.nih.gov/mesh/); public domain. Courtesy of the U.S. National Library of Medicine.

Variables

  • term. preferred MeSH heading name

  • mesh_id. MeSH Descriptor UI (e.g., "D000001")

  • synonyms. entry terms / cross-references, comma-separated

  • tree_number. primary hierarchical tree number (e.g., "D02.092")

  • tree_numbers_all. all tree numbers, comma-separated

  • category. top-level category name (e.g., "Diseases", "Chemicals and Drugs")

  • category_name. duplicate of category, kept for consistency

  • scope_note. definition / scope note from MeSH (may be NA)

  • source. data source attribution
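
Because tree_numbers_all packs several tree numbers into one comma-separated string, a common first step is to split it and test whether a heading falls under a given branch of the hierarchy. The sketch below is illustrative only: mesh_toy is a hypothetical stand-in with made-up rows (not real MeSH records), and split_tree_numbers() and in_subtree() are helper names invented here, not functions provided by the package. It assumes the separator is a plain comma, optionally followed by a space.

```r
# Toy stand-in for the dataset: column names follow the documentation above,
# but every value is made up for illustration.
mesh_toy <- data.frame(
  term             = c("Example A", "Example B", "Example C"),
  mesh_id          = c("X000001", "X000002", "X000003"),
  tree_number      = c("D02.092", "C01.100", "D02.092.311"),
  tree_numbers_all = c("D02.092, D03.633", "C01.100", "D02.092.311,C01.100.500"),
  stringsAsFactors = FALSE
)

# Split a comma-separated column into a list of character vectors,
# trimming any whitespace left around each piece.
split_tree_numbers <- function(x) {
  lapply(strsplit(x, ",", fixed = TRUE), trimws)
}

# TRUE when any tree number equals `root` or sits below it.
# The trailing "." keeps "D02.09" from matching "D02.092".
in_subtree <- function(tree_numbers, root) {
  vapply(
    tree_numbers,
    function(tn) any(tn == root | startsWith(tn, paste0(root, ".")), na.rm = TRUE),
    logical(1)
  )
}

all_numbers <- split_tree_numbers(mesh_toy$tree_numbers_all)
hits <- in_subtree(all_numbers, "D02.092")

# Terms located at or below the D02.092 branch in the toy data
mesh_toy$term[hits]
```

The same split is less safe for the synonyms column: many MeSH terms contain commas themselves (inverted forms such as "Abortion, Induced"), so splitting synonyms on commas may break a single entry term into several pieces.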