All surnames that occur at least 100 times in the 2010 data, with rank, frequency, and percentage by race and Hispanic-origin category. Variable descriptions are taken from the source dataset.

us_ssa_surnames

Format

A data frame with 162254 rows and 10 variables.

Source

https://www.census.gov/topics/population/genealogy/data/2010_surnames.html

Variables

  • name. surname

  • rank. rank of the surname by frequency in 2010

  • freq. number of occurrences of the surname in 2010

  • prop100k. occurrences per 100,000 population

  • pct_white. percent White alone

  • pct_black. percent Black or African American alone

  • pct_api. percent Asian and Native Hawaiian and other Pacific Islander alone

  • pct_aian. percent American Indian and Alaska Native alone

  • pct_multiracial. percent two or more races

  • pct_hispanic. percent Hispanic or Latino origin
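
Examples

A minimal sketch of working with the columns listed above, using base R only. Because the package that ships us_ssa_surnames is not named here, the code builds a small stand-in data frame called toy with the same column names. The surnames "AAA", "BBB" and "CCC" and every number in it are invented for illustration and are not real data. The same steps should apply to us_ssa_surnames once it is loaded, assuming its percentage columns are stored as numeric.

```r
# Stand-in for us_ssa_surnames: same column names, invented values.
toy <- data.frame(
  name            = c("AAA", "BBB", "CCC"),
  rank            = c(1L, 2L, 3L),
  freq            = c(3000L, 2000L, 1000L),
  prop100k        = c(30, 20, 10),
  pct_white       = c(70, 20, 5),
  pct_black       = c(20, 60, 1),
  pct_api         = c(1, 5, 2),
  pct_aian        = c(1, 1, 1),
  pct_multiracial = c(3, 4, 1),
  pct_hispanic    = c(5, 10, 90),
  stringsAsFactors = FALSE
)

# Collect the percentage columns by their shared "pct_" prefix.
pct_cols <- grep("^pct_", names(toy), value = TRUE)

# For each surname, record which category has the largest share.
# max.col() returns the column index of the row-wise maximum.
shares <- as.matrix(toy[pct_cols])
toy$largest <- pct_cols[max.col(shares, ties.method = "first")]

# Order by rank and keep a compact view.
top <- toy[order(toy$rank), c("name", "rank", "freq", "largest")]
print(top)
```

In the real data some percentage cells may be missing or suppressed, so it is worth checking for NA values in the pct_ columns before calling max.col().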