Namograph

Large-scale diversity estimation through surname origin inference

Supplemental materials to the article published in Bulletin of Sociological Methodology (July 2018)
by Antoine Mazières and Camille Roth.

Datasets (France)
Parliament Members (1958-2016)
Mayors
Veterinarians
Researchers at CNRS
Accountants
Pharmacists
École Polytechnique (1958-2016)
Brevet d'Études Professionnelles
Parisian Lawyers
Professional Baccalauréat
Certificat d'Aptitude Professionnelle
Brevet de Technicien Supérieur
Baccalauréat

For details on method and datasets, see the full paper (raw data).

The study of surnames as both linguistic and geographical markers of the past has proven valuable in several research fields, from biology and genetics to demography and social mobility. This article builds upon the existing literature to conceive and develop a surname origin classifier based on a data-driven typology. This enables us to explore a methodology for producing large-scale estimates of the relative diversity of social groups, especially when such data are scarce. We subsequently analyze the representativeness of surname origins for 15 socio-professional groups in France.
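
As a rough illustration of what "representativeness of surname origins" can mean in practice, the sketch below compares the share of each inferred origin in a group with its share in a reference population. It is a minimal sketch under stated assumptions, not the method of the paper: the function names `origin_shares` and `representativeness`, the ratio definition, and the toy labels "A", "B", "C" are all hypothetical, and the origin labels are assumed to have already been produced by some classifier.

```python
from collections import Counter


def origin_shares(origins):
    """Return the share of each origin label in a list of labels."""
    if not origins:
        raise ValueError("origins must not be empty")
    counts = Counter(origins)
    total = sum(counts.values())
    return {origin: n / total for origin, n in counts.items()}


def representativeness(group_origins, reference_origins):
    """Ratio of each origin's share in the group to its share in the reference.

    A value above 1 means the origin is over-represented in the group,
    below 1 under-represented (assumed definition, for illustration only).
    """
    group = origin_shares(group_origins)
    reference = origin_shares(reference_origins)
    return {
        origin: group.get(origin, 0.0) / share
        for origin, share in reference.items()
    }


# Made-up labels: one inferred origin per person.
reference = ["A"] * 80 + ["B"] * 15 + ["C"] * 5
group = ["A"] * 45 + ["B"] * 5

ratios = representativeness(group, reference)
for origin, ratio in sorted(ratios.items()):
    print(f"{origin}: {ratio:.2f}")
```

Only origins present in the reference population are reported, so an origin absent from the group simply gets a ratio of 0 rather than being dropped.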


DOWNLOAD FULL ARTICLE PDF


Mazières, A. and Roth, C. (2018). Large-scale diversity estimation through surname origin inference. Bulletin of Sociological Methodology, 139.

@article{mazieres2018names,
  title={Large-scale diversity estimation through surname origin inference},
  author={Mazieres, Antoine and Roth, Camille},
  journal={Bulletin of Sociological Methodology},
  number={139},
  year={2018},
}

( DOWNLOAD PYTHON NOTEBOOK )
