Comparing Digital Humanities Tools

For my Introduction to Digital Humanities class, I have spent the past several weeks wrestling with a variety of digital humanities tools as a way to explore the WPA Slave Narratives. Using three different tools (Voyant, kepler.gl, and Palladio) with the same source base has offered a wonderful introduction to what each can offer a researcher. Each of them highlights different aspects of a source base. Voyant, a textual analysis tool, illustrates text patterns within a corpus and offers visualizations of those patterns. Kepler.gl, a mapping tool, offers visualization of spatial patterns within a dataset and allows users to create a variety of interactive maps based on the geographic information in their dataset. Palladio instead focuses on the relationships and connections between items in a dataset. This web-based platform develops visualizations of the networks that exist in data based on categories that you as a user define.

That last point brings me to my first big observation about using these three programs: Voyant offers the most open-ended way to explore a dataset. With Voyant, you can upload a corpus of works in a variety of formats (Word documents, PDFs, or .txt files). While you might need to do some advance preparation, such as taking out coding language if you are uploading text from the web, for the most part you can upload any text you want without formatting or preparing it in any special way. With Kepler.gl and Palladio, on the other hand, you have to do a lot of groundwork first to prepare data to be explored with these tools. Data for Kepler.gl must be uploaded as .csv files with categories related to the data already defined and with geographic locations already listed in terms of latitude and longitude coordinates (in short, you cannot enter a place name like Montgomery, Alabama in Kepler and expect it to translate that name into a geographic location). Similarly, data for Palladio must be prepared and formatted in a specific way, with separate categories for each attribute of a node in the dataset.
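For readers curious about what that groundwork might look like, here is a minimal sketch in Python of turning place names into the latitude and longitude columns a mapping tool needs. Everything specific in it is an assumption for illustration: the column names, the placeholder interviewee labels, the hand-entered lookup table, and the approximate coordinates for Montgomery. It is not the workflow used in the class.

```python
import csv
import io

# Hypothetical lookup table: place name -> (latitude, longitude).
# The coordinates are approximate and were entered by hand for illustration.
PLACE_COORDS = {
    "Montgomery, Alabama": (32.3668, -86.3000),
}


def build_rows(records, coords=PLACE_COORDS):
    """Add latitude/longitude columns to each record; skip unknown places."""
    rows = []
    for rec in records:
        place = rec["interview_place"]
        if place not in coords:
            continue  # a real workflow would flag these for manual lookup
        lat, lon = coords[place]
        rows.append({**rec, "latitude": lat, "longitude": lon})
    return rows


def to_csv(rows):
    """Serialize rows to CSV text with a header row."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Placeholder records, not real interview data.
records = [
    {"interviewee": "Interviewee A", "interview_place": "Montgomery, Alabama"},
    {"interviewee": "Interviewee B", "interview_place": "Unknown Place"},
]

csv_text = to_csv(build_rows(records))
print(csv_text)
```

The second record is dropped because its place name has no coordinates in the table, which is the same gap a researcher would have to fill by hand before uploading.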

What this means in terms that even a novice like me can understand is that Voyant is searching an entire document or corpus to find patterns in a text that do not relate to predetermined categories. Kepler and Palladio, in contrast, are mapping the data in relation to categories defined by the researcher. Our data set for Palladio, for example, included the following information for each WPA interview: Name of Interviewer; Name of Interviewee; Gender and Age of Interviewee; Gender of Interviewer; Location of Interview; Location where Interviewee was enslaved; Date of Interview; and whether the interviewee had been a house or a field slave. This information had to be scraped or culled from the interview by researchers before the data was uploaded to Palladio. Thus I could ask Palladio to graph the connections between any two of these categories (how did the location of the interview relate to where interviewees had been enslaved? which interviewers interviewed whom?), but Palladio cannot “read” the entire corpus in the way that Voyant does to suggest categories that might be interesting bases for comparison. If you want to try using all of these tools with a single data set, you might want to start with Voyant to get a better sense of what categories would be interesting to map geographically or as networks using Kepler or Palladio.
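To make "graphing the connections between any two categories" a little more concrete, the sketch below counts how often each pair of values occurs across two chosen columns. This is only an illustration of the idea in Python, not a description of how Palladio works internally. The field names and the placeholder rows are invented.

```python
from collections import Counter


def count_links(rows, source_field, target_field):
    """Count how often each (source, target) pair of values occurs."""
    links = Counter()
    for row in rows:
        source = row.get(source_field)
        target = row.get(target_field)
        if source and target:  # skip rows missing either value
            links[(source, target)] += 1
    return links


# Placeholder rows, not real interview data.
rows = [
    {"interviewer": "Interviewer 1", "enslaved_in": "State X", "interviewed_in": "State Y"},
    {"interviewer": "Interviewer 1", "enslaved_in": "State X", "interviewed_in": "State Y"},
    {"interviewer": "Interviewer 2", "enslaved_in": "State Z", "interviewed_in": "State Y"},
    {"interviewer": "Interviewer 2", "enslaved_in": "", "interviewed_in": "State Y"},
]

links = count_links(rows, "enslaved_in", "interviewed_in")
for (source, target), weight in links.most_common():
    print(f"{source} -> {target}: {weight}")
```

Each counted pair corresponds to one line in a network graph, and the count is the weight of that line. Swapping in a different pair of field names asks a different question of the same table.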

But all of these tools allow you to see your data in new ways and they help uncover patterns in any given dataset using different visualization tools. Voyant’s word clouds (which show the most frequently used words in any document or corpus of documents) and its trend graphs (which illustrate the use of a term or terms within a specific document) made it easy to explore the frequency of the use of different terms in these interviews and to see differences in term usage by region. You can also quickly explore the context in which a term was used throughout a document with Voyant’s context section, which shows the words that come before and after a chosen term. Kepler’s mapping capabilities, which allow users not only to make different kinds of maps (point maps, heat maps, cluster maps) but also to link space and time through a timeline map, allowed me to think about the significance of place and location in understanding and using WPA narratives. Mapping the interviews with Kepler made clear that more interviews were conducted in some regions than others within a state and showed the mobility of freedpeople by mapping their journey from where they had been enslaved to where they were interviewed. Palladio’s network analysis graphs helped me better understand those mobility patterns: by making a network graph that linked every interview site to every site where an interviewee had been enslaved, it became obvious that formerly enslaved people had moved to some places more than others. Palladio also offered a useful way to visualize the work done by particular interviewers: how many people had they interviewed, how many men or women, how many house or field slaves.
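The idea behind a context view like the one described above is often called keyword in context. Here is a minimal Python sketch of that idea, using a neutral sample sentence. It is not Voyant's implementation, and the tokenizing rule and window size are assumptions chosen to keep the example short.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_in_context(text, term, window=3):
    """Return (left context, keyword, right context) for each match of `term`."""
    tokens = tokenize(text)
    term = term.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok == term:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((" ".join(left), tok, " ".join(right)))
    return hits


sample = "The quick brown fox jumps over the lazy dog while the other fox sleeps."
hits = keyword_in_context(sample, "fox", window=2)
for left, keyword, right in hits:
    print(f"{left:>15} [{keyword}] {right}")
```

Reading the lines of output side by side is what makes patterns of usage visible: the same term lined up with whatever words tend to surround it.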

While none of these tools offered me concrete answers to a specific historical question, the patterns they identified raised new questions and the three tools offer complementary ways to explore those questions further. For example, exploring the WPA slave narratives in Voyant, I was struck that the term “whip” was used much more frequently by former slaves who were interviewed in Ohio than it was by former slaves who were interviewed in Mississippi. Had Ohio interviewees experienced more violence while enslaved? Or did they feel freer to speak about the violence they experienced than freedpeople who remained in the Deep South? Taking that question to Kepler, I could map where Ohio interviewees had been enslaved to get an overview of what region they came from. I could then return to Voyant to see whether freedpeople who still lived in those regions also used the term “whip” more frequently. With Palladio, I could create a category for the term “whip” that identified which specific interviewees used the term (Voyant would make that relatively easy to figure out). Then Palladio could be used to explore the connections between the use of the term “whip” and many other attributes of the dataset: it could graph whether the term was used more by men or women, whether it correlated to the age of the interviewee, or whether it had a relationship to the kind of work that the interviewee had done while enslaved.
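A comparison like the Ohio and Mississippi one only makes sense if the counts are adjusted for how much text each group contains. The Python sketch below shows one common way to do that, occurrences per 1,000 words. The documents are invented filler text, not excerpts from the narratives, and the choice of normalization is an assumption rather than a description of what Voyant computes.

```python
import re
from collections import defaultdict


def rate_per_thousand(documents, term):
    """Return occurrences of `term` per 1,000 words for each group label."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, text in documents:
        tokens = re.findall(r"[a-z']+", text.lower())
        totals[group] += len(tokens)
        hits[group] += sum(1 for tok in tokens if tok == term)
    return {g: 1000 * hits[g] / totals[g] for g in totals if totals[g]}


# Invented filler documents labeled by interview state, for illustration only.
documents = [
    ("Ohio", "filler whip filler filler whip"),
    ("Ohio", "filler filler filler filler filler"),
    ("Mississippi", "filler filler whip filler filler filler filler filler filler filler"),
]

rates = rate_per_thousand(documents, "whip")
for state, rate in sorted(rates.items()):
    print(f"{state}: {rate:.1f} per 1,000 words")
```

Note that this sketch matches the exact word only, so forms like "whipped" or "whipping" would not be counted unless the matching rule were widened.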

I don’t know where all of this research would lead, but using all three tools to further explore this question would lead to a better understanding of the sources and to new avenues for investigation. And I’d certainly have some fun along the way.

Published by admin
