Text Analysis with Voyant Tools

Voyant Tools is an easy-to-use, accessible web application for performing textual analysis. Users can upload their own corpus of documents in a variety of formats (plain text, PDFs, Word documents) and then use the program to get an overview of the text in the entire corpus or in a single document within that corpus. Voyant has five major tools which, used together, enable a user both to see the big textual patterns in the corpus and to explore specific uses of particular terms.

  • The Cirrus tool creates a word cloud based on the most frequently used words in a document or a corpus, with the size of each word in the cloud reflecting its frequency. Users can easily tell the program which words to exclude from the analysis (for example, “a,” “the,” or “http”), and can shift the scale of the word cloud from the 25 most frequently used terms to the 500 most frequently used. The tool also allows you to toggle between word clouds that cover the entire corpus and word clouds that cover a single document in the corpus.
  • The Summary tool offers an overview of basic information about the document or corpus, including the total number of words in a document, the average length of its sentences, and which words are distinctive within a particular document.
  • The Reader tool displays text for reading based on terms that you select or search for. It shows the frequency of any term that you hover over with the cursor, and arrows at the bottom of the Reader enable you to jump 1000 words backwards or forwards in the text. A prospect viewer at the bottom of the Reader pane offers an overview of the corpus, with each document represented by a bar that reflects its length. Clicking anywhere along the prospect viewer will bring up text from that location.
  • The Trends tool creates several types of graphs showing how frequently terms appear in each of the documents within a corpus or in different sections of a single document. Clicking on a word in the word cloud will display the graph for that term in the Trends tool. Users can also enter multiple search terms to compare their use across a corpus or document.
  • The Context tool offers an easy way to see how specific words are being used. It displays the words that appear before and after the search term. Repeated uses of the term appear in list form, so users can easily discern whether there are patterns in how certain terms are being used. Clicking on the “plus” icon next to any line will immediately bring up a longer excerpt of the text, making it even easier to understand the context surrounding the use of a specific term.
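
For readers curious about what sits underneath these tools, the basic ideas behind Cirrus, Trends, and Context can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions, not Voyant's actual code: the function names, the tiny stopword list, and the sample sentence are all invented for the example, and Voyant's own tokenization and stopword lists are more sophisticated.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration; Voyant uses its own, longer lists.
STOPWORDS = {"a", "an", "the", "and", "of", "to", "in", "was", "http"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_terms(text, n=25, stopwords=STOPWORDS):
    """Cirrus-style counts: the n most frequent terms, ignoring stopwords."""
    counts = Counter(t for t in tokenize(text) if t not in stopwords)
    return counts.most_common(n)


def segment_frequencies(text, term, segments=10):
    """Trends-style data: relative frequency of a term in each slice of the text."""
    tokens = tokenize(text)
    size = max(1, -(-len(tokens) // segments))  # ceiling division
    freqs = []
    for start in range(0, len(tokens), size):
        chunk = tokens[start:start + size]
        freqs.append(chunk.count(term.lower()) / len(chunk))
    return freqs


def keyword_in_context(text, term, window=5):
    """Context-style rows: (words before, keyword, words after) for each hit."""
    tokens = tokenize(text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok == term.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            rows.append((left, tok, right))
    return rows


# Invented sample text, used only to show how the functions are called.
sample = ("The river rose in the spring. The river fell in the autumn, "
          "and the town was quiet.")

if __name__ == "__main__":
    print(top_terms(sample, n=3))
    print(segment_frequencies(sample, "river", segments=2))
    for left, word, right in keyword_in_context(sample, "river", window=2):
        print(f"{left:>12} [{word}] {right}")
```

The word cloud is essentially a picture of what top_terms returns, the Trends graph plots numbers like those from segment_frequencies, and the Context list lines up rows like those from keyword_in_context. Voyant's value is that it does all of this, plus the visualization, without requiring any code.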

Voyant also offers a variety of ways to visualize results. The word cloud can easily be changed into a graphic that displays links between different terms, or into a bubble graph in which the size of each bubble reflects the frequency of the term. Trends graphs can be viewed as bar graphs, bubble lines, or knot lines. These visualizations can be exported as URLs or as HTML snippets that can be embedded on other sites.

Voyant can be glitchy: not every tool works every time in the web interface, and users may find themselves frustrated in trying to complete certain tasks. Yet the interface is so easy to use that just about anyone can engage in textual analysis. With Voyant, you can quickly identify patterns in texts that would be much harder, if not impossible, to recognize through close reading. I found Voyant particularly helpful in highlighting patterns that raised interesting questions for further research. Using Voyant’s tools to explore segments of the WPA Slave Narratives, for example, revealed that the use of certain terms varies widely by state. References to whippings are far more frequent in interviews done in Ohio than in those done in Mississippi. Large-scale textual analysis alone cannot explain why that is the case, but it offers a starting point for further research.


P.S. I could not get some of the Voyant graphs to download when I did the exercises, but I was able to capture screenshots of the ones that failed. I did not want to go back into the course system to upload these because they were from the very last step of one of the exercises, and I would have had to upload the graphs again on all of the pages. So I am attaching the link to the missing graphs here. I’m sorry to include them in this format: http://reneeromano.net/screenshots-for-voyant-graphs/

