As we’ve entered the second half of the semester, I’m increasingly thinking about how to arrange, present, analyze, and set up my project for the Digital History course. This document introducing tabular data analysis provides some interesting options and considerations. I’m working with about 30-35 magazine articles from the 1940s to 1960s; the ones I physically have are photocopied, photographed, or scanned. For this semester, I’m realistically going to set up a flat file database to host the material. Text analysis software is also an interesting option to explore, since my analysis will look at the discourse and language used to describe menopause in the postwar years.

However, my biggest barrier to using text analysis is the quality of my copies. When I first collected these articles, I was working on a standard seminar paper; my intent wasn’t to digitize them and build a small database. As a result, many copies are barely legible, sentences are split across different scans and photos, and, depending on my photocopying skills that day, the text closest to the magazine’s spine stretches and fades. One option is to put these documents through optical character recognition anyway and see what comes up. Another is to transcribe the articles, either by typing them out or by reading them into voice transcription software. I’ll have to brainstorm further options to move my project beyond a flat file database.
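The “OCR anyway and see what comes up” option could even be semi-automated: run every scan through OCR, then use a rough legibility score to decide which pages are usable and which need manual transcription. Below is a minimal sketch of that triage step in Python. The scoring heuristic, thresholds, and file names are my own assumptions, not part of any particular tool; the OCR call itself would require the Tesseract engine plus the pytesseract and Pillow packages.

```python
import re

def legibility_score(text: str) -> float:
    """Fraction of tokens that look like real words (letters, apostrophes,
    hyphens, maybe trailing punctuation). A crude proxy for OCR quality
    on a barely legible scan -- garbled output scores low."""
    tokens = text.split()
    if not tokens:
        return 0.0
    wordlike = [t for t in tokens if re.fullmatch(r"[A-Za-z'\-]+[.,;:!?]?", t)]
    return len(wordlike) / len(tokens)

def triage(pages: dict[str, str], threshold: float = 0.8) -> dict[str, list[str]]:
    """Split OCR'd pages into 'usable' and 'retranscribe' buckets.
    The 0.8 threshold is an arbitrary starting point to tune by eye."""
    buckets: dict[str, list[str]] = {"usable": [], "retranscribe": []}
    for name, text in pages.items():
        bucket = "usable" if legibility_score(text) >= threshold else "retranscribe"
        buckets[bucket].append(name)
    return buckets

# The OCR step feeding this would look roughly like (assumes
# pytesseract + Pillow are installed; the file name is hypothetical):
#   from PIL import Image
#   import pytesseract
#   text = pytesseract.image_to_string(Image.open("scan_01.png"))
```

Pages that land in the “retranscribe” bucket would be the candidates for typing or voice transcription, so the manual effort goes only where OCR genuinely fails.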
In the meantime, I played around with Voyant Tools and analyzed the text of this blog. The results are not as exciting as I’d hoped. My most frequently used words tend not to be significant ones: “and,” “the,” “to,” “a,” “my,” etc. But I do notice that “women” is moderately large in the word cloud. I also find it useful to select individual words and trace their frequency. Perhaps this blog isn’t the most useful document to analyze, but the exercise was worth doing to see the tool’s potential for my own research.
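The “and/the/to” problem is the classic stopword issue, and it’s straightforward to handle outside of Voyant too: filter a stopword list before counting, and content words like “women” rise to the top. A minimal sketch in Python with the standard library (the stopword list here is a small illustrative subset I made up, not Voyant’s actual list, and the sample sentence is invented):

```python
from collections import Counter
import re

# Illustrative stopword subset -- a real analysis would use a fuller list.
STOPWORDS = {"and", "the", "to", "a", "my", "of", "in", "is", "i", "it", "about"}

def word_frequencies(text: str, stopwords: set[str] = STOPWORDS) -> Counter:
    """Count lowercase word tokens, skipping stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)

freqs = word_frequencies("The women and the doctors wrote to women about menopause.")
# "women" now tops the list instead of "the"
print(freqs.most_common(3))
```

Voyant supports editable stopword lists as well, so the same effect is available inside the tool; the point of the sketch is just to show why the raw counts were dominated by function words.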