In this video from the 2016 HPC User Forum in Austin, Ruby Mendenhall from the University of Illinois presents: Rescuing Lost History: Using Big Data to Recover Black Women’s Lived Experiences.
“A lot of times when people think about big data, they think about it in ahistorical times…outside of this political context,” said Ruby Mendenhall, an associate professor of sociology at UIUC. “It’s really important to think about whose voice is digitized, in journals and newspapers. A lot of that for black women has been lost and you need to make a concerted effort to recover it.”
Mendenhall’s study employs Latent Dirichlet allocation (LDA) algorithms and comparative text mining to search 800,000 periodicals in JSTOR (Journal Storage) and HathiTrust from 1746 to 2014 to identify the types of conversations that emerge about Black women’s shared experience over time.
“We used MALLET to interrogate various genres of text (poetry, science, psychology, sociology, African American Studies, policy, etc.). We also used comparative text mining (CTM) to explore latent themes across collections written in different time periods by analyzing the common and expert models. We used data visualization techniques, such as tree maps, to identify spikes in certain topics during various historical contexts such as slavery, Reconstruction, Jim Crow, etc. We identified a subset of our corpus (20,000) comprised of articles about or by Black women and compared patterns of words in the subset against the larger 800,000 corpus. Preliminary findings indicate that when we pulled 300,000 volumes, about 80,000 (~27%) do not have subject metadata. This appears to suggest that if a researcher searched for volumes about Black women, they may not have access to a significant amount of data on the topic. When volumes are not tagged properly, researchers would have to know that these texts exist when they do their searches. The recovery nature of this project involves identifying these untagged volumes and making the corpus publicly available to librarians and others with copyright considerations.”
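The comparison of word patterns in a subset against a larger corpus, mentioned in the quote above, can be illustrated with a small sketch. This is not the project's actual pipeline, which the source describes only as using MALLET and comparative text mining. It is a minimal illustration that assumes a smoothed log-ratio of relative word frequencies as the comparison measure. The function names `word_counts` and `log_ratio` and the toy documents are hypothetical and invented for this example.

```python
from collections import Counter
import math


def word_counts(docs):
    """Count lowercase, whitespace-separated tokens across a list of documents."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts


def log_ratio(subset_docs, corpus_docs):
    """Return a smoothed log2 ratio of each word's relative frequency
    in the subset versus the full corpus (hypothetical helper).

    Positive scores mark words over-represented in the subset;
    negative scores mark words under-represented in it.
    """
    sub = word_counts(subset_docs)
    full = word_counts(corpus_docs)
    vocab = set(sub) | set(full)
    # Add-one smoothing so words absent from the subset do not cause log(0).
    sub_total = sum(sub.values()) + len(vocab)
    full_total = sum(full.values()) + len(vocab)
    scores = {}
    for word in vocab:
        p_sub = (sub[word] + 1) / sub_total
        p_full = (full[word] + 1) / full_total
        scores[word] = math.log2(p_sub / p_full)
    return scores


# Toy documents invented for illustration; they are not drawn from JSTOR or HathiTrust.
corpus = [
    "the club organized a meeting on suffrage",
    "the railroad opened a new line",
    "the market report on cotton prices",
    "women of the club spoke on suffrage and education",
]
subset = [corpus[0], corpus[3]]

scores = log_ratio(subset, corpus)
# Words most characteristic of the subset relative to the whole corpus.
top_words = sorted(scores, key=scores.get, reverse=True)[:3]
print(top_words)
```

At real scale, a measure like this would be computed over tokenized and cleaned text rather than raw whitespace splits, and topic models would replace single-word scores, but the underlying idea of contrasting a focused subset with the full collection is the same.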
Ruby Mendenhall is an Associate Professor in Sociology, African American Studies, Urban and Regional Planning, and Social Work at the University of Illinois at Urbana-Champaign. She is also an affiliate of the Institute for Genomic Biology and the Institute for Computing in Humanities, Arts and Social Sciences. In 2004, Mendenhall received her Ph.D. from the Human Development and Social Policy program at Northwestern University in Evanston, Illinois. For her dissertation, Black Women in Gautreaux’s Housing Desegregation Program: The Role of Neighborhoods and Networks in Economic Independence, she used administrative welfare and employment data, census information, and in-depth interviews to examine the long-run effects of placement neighborhood conditions and resources on economic independence.