Libby Hemphill, PhD

Director, Resource Center for Minority Data at ICPSR
Associate Professor at UMSI

Me Elsewhere Online

Google Scholar
Lab GitHub
Personal GitHub
ORCID: 0000-0002-3793-7281

Headshot of Libby, a woman with short dark brown hair and glasses, wearing a burgundy top with tan polka dots


About Me

I study politicians, non-profit organizations, and television fans to understand how people use social media to organize, discuss, and enact social change. I also study data curation, especially how we evaluate (a) the impacts of data reuse and (b) investments in curating and disseminating research data. My work has been funded by the National Science Foundation, the Institute of Museum and Library Services, the Nayar Prize, Mozilla, the Anti-Defamation League, Amazon, and DiscoverText.

Social media has some very real toxicity challenges, like harassment and cyberbullying, that diminish its utility for collective and democratic efforts, so I work on curbing those too (Belfer Fellowship, Mozilla Grant, Nayar Prize).

I have a few titles at the University of Michigan:

From 2010 to 2017, I was a faculty member in the Humanities Department and directed the Collective Action and Social Media Lab at Illinois Institute of Technology.

Work with Me

Post Docs

I’m hiring a faculty research fellow. The Fellow will join a team of researchers from ICPSR and the University of Michigan School of Information (UMSI) working to understand the impacts of data archiving and sharing in social science. The position is funded by a grant from the National Science Foundation to (1) understand how to responsibly allocate resources to data archiving so that we can (2) articulate data archiving policies that efficiently and effectively achieve innovation and transparency in social science.

The project will construct two measures of data’s scholarly impact (secondary impact and diversity) that depend on citations of the data. The ICPSR Bibliography of Data-Related Literature (the “Bibliography”) links over 80,000 research publications to the ICPSR data on which they are based. Generating the Bibliography for a given study is currently a manual process, and datasets are often cited informally. The focus of the fellowship will be developing a predictive model that can assist staff in identifying informal and incomplete data citations. Given a set of publications, the model will (1) identify informal or incomplete dataset references and (2) determine whether the datasets match any in ICPSR’s collection.
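To make step (2) concrete, here is a minimal sketch of matching an informal dataset mention against a catalog of study titles. It is only an illustration under loud assumptions: it uses plain string similarity from Python's standard-library `difflib` rather than a predictive model, and the catalog entries, the identifiers, the `match_mention` helper, and the 0.6 threshold are all hypothetical. It does not address step (1), finding the mentions in the first place.

```python
import difflib
import re

# Hypothetical catalog: made-up study titles keyed by made-up identifiers.
CATALOG = {
    "study-001": "National Survey of Household Finances",
    "study-002": "Longitudinal Study of Adolescent Health Behaviors",
    "study-003": "Urban Neighborhood Crime Victimization Survey",
}


def normalize(text):
    """Lowercase the text and keep only letters and digits, space-separated."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def match_mention(mention, catalog=CATALOG, threshold=0.6):
    """Return (study_id, score) for the most similar title, or None.

    The score is difflib's similarity ratio between the normalized mention
    and each normalized title; the threshold is an arbitrary assumption.
    """
    best_id, best_score = None, 0.0
    for study_id, title in catalog.items():
        score = difflib.SequenceMatcher(
            None, normalize(mention), normalize(title)
        ).ratio()
        if score > best_score:
            best_id, best_score = study_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None


# An informal, slightly wrong mention still resolves to the closest title.
print(match_mention("the National Survey of Household Finance"))
```

A real system would likely need more than surface similarity, since informal citations often use acronyms or describe a dataset without naming it.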

Apply through Interfolio; review of applications begins September 1, 2019, and will continue until the position is filled.

PhD Students

I’m recruiting PhD students for fall 2020. Check out some of my research, and contact me if you’re interested.

PhD students will be admitted through the School of Information; applications are due December 1. I’m looking for 1-2 PhD students who are interested in

I’m especially interested in students who are curious about the impact of social media on democracy and civic engagement and how we can use computation and automation to make conversations and participation (in politics, in science, in society) more just and accessible. You should have stats and computational expertise or be willing to gain some quickly. You should also have expertise or a strong interest in political science, critical race and feminism studies, and/or communication.

For instance, one of the papers I’m writing now uses a classification model we developed in Python using scikit-learn and nltk. We label tweets with that model and then use multinomial logistic regression to understand differences between policy tweeting patterns in Congress. I use this approach (build and train a model, label some content, analyze the patterns, and explain the implications for political science and communication) in most of my work, and you should be interested in those steps and experienced in at least one of them.
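The first three steps of that workflow can be sketched roughly as follows. This is a toy illustration, not the model from the paper: it swaps in a tiny hand-written Naive Bayes classifier built on the standard library instead of scikit-learn and nltk, and the tweets, the topic labels, and the `party_a`/`party_b` groups are all made up. The analysis step here is just a cross-tabulation; fitting a multinomial logistic regression to the labeled data would come after it.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # class prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Step 1: build and train a model on hand-labeled (made-up) tweets.
train = [
    ("We must lower prescription drug costs for seniors", "health"),
    ("Hospitals need funding to expand rural care", "health"),
    ("Tax cuts will help small businesses grow jobs", "economy"),
    ("Raising wages strengthens the economy for workers", "economy"),
]
model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])

# Step 2: label new content with the trained model.
unlabeled = [
    ("party_a", "Seniors deserve affordable prescription drug care"),
    ("party_a", "Rural hospitals are closing and need funding"),
    ("party_b", "Small businesses create jobs when we cut taxes"),
    ("party_b", "Lower drug costs for seniors now"),
]
labeled = [(group, model.predict(text)) for group, text in unlabeled]

# Step 3: tabulate predicted topics by group as input to later analysis.
topic_by_group = defaultdict(Counter)
for group, topic in labeled:
    topic_by_group[group][topic] += 1

for group in sorted(topic_by_group):
    print(group, dict(topic_by_group[group]))
```

With real data, the training set would be far larger, the classifier would be evaluated on held-out tweets before labeling anything, and the resulting counts would feed a regression rather than a printout.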


I welcome undergrads in my research group! Right now, the best way to get involved is through the Undergraduate Research Opportunity Program (UROP) at Michigan. I’ll start interviewing for Fall/Winter on September 11, 2019. This year’s project is called “Using Machine Learning and Experiments to Detect and Address Extremism”:

Extremist groups are especially adept at hiding in plain sight by using language that differs only slightly from acceptable speech (e.g., “blame on both sides”) or employing thinly veiled phrases that mask nefarious intent (e.g., “preserve our culture”). The subtleties of white supremacist language, especially, are not effectively captured by existing computational approaches to detecting and addressing it online. This project attempts to address this challenge by building an adaptive language model to detect white supremacist speech online. We will build machine learning models that can detect extremist speech and its changes (either over time or from innocuous speech) and create an API to help partners and users rate content they encounter.

Current Funded Research Projects

Get in Touch

Email is best.


Here’s a PDF of my CV, and a few recent publications:

Working Papers

Some of my work that’s not (yet) published. These are workshop papers, papers under review, and/or longer versions of papers for conferences that review only abstracts during submission.

You can find more at