Skip to contents

A corpus of 193,623 metal music lyrics filtered to English-language songs, with prepped lyrics and word-position annotations. Raw lyrics were collected from Genius, filtered to English using language detection, and prepped through transliteration, contraction expansion, lowercasing, punctuation and number removal, and section tag removal. Word-position annotations assign sequential position numbers to words, resetting at sentence boundaries.

Format

A data frame with 193,623 rows and 15 variables.

Variables

  • song_number. Numeric song identifier

  • song_title. Song title

  • song_lyrics_url. URL to the lyrics source (Genius)

  • artist. Artist name

  • album. Album name

  • lyrics. Raw lyrics text (including section tags like [Verse], [Chorus])

  • year. Release year

  • score. Genius score

  • subgenre. Detailed subgenre labels

  • score_number. Number of raters

  • genre. Broad genre category (e.g., Alt Metal, Death Metal)

  • language. Detected language code

  • lyrics_prepped. Prepped lyrics: transliterated, lowercased, punctuation and numbers removed, contractions expanded, section tags removed

  • title_prepped. Prepped song title: transliterated and lowercased

  • word_position. Word-position indexed lyrics (sequential position numbers resetting at sentence boundaries)