Metadata

Highlights

  • We’re now two years into the Generative AI Era in higher education. For those of us in the humanities, much of the conversation around Large Language Models (LLMs) has revolved around teaching - the tenor of which is captured in headlines like “The College Essay is Dead” or “Another Disastrous Year of ChatGPT School Is Beginning.” Which, fair enough! LLMs pose all sorts of new pedagogical challenges for writing-based disciplines. But their impact on humanities research has received far less attention. In history there are exceptions, but the balance of the conversation around Generative AI has tilted overwhelmingly towards teaching. (View Highlight)
  • Today’s post focuses on using LLMs to transcribe and interpret primary sources. For most of my career as a digital historian, computational analysis has focused on “machine-readable” sources, or documents that can be easily transformed into data. A typed government report can be automatically processed with Optical Character Recognition (OCR), whereas a handwritten letter has to be transcribed by a human. One dedicated platform offers a specialized AI-based handwriting transcription tool. While effective, it has a higher barrier to entry than more general tools like ChatGPT - it requires a paid account, can take a long time to process documents, and has an interface with a learning curve. Other projects like From the Page are (View Highlight)
  • But what counts as “machine-readable” is rapidly changing. To illustrate what I mean, I’ll use a concrete example from my own work. While researching my book, I spent a lot of time mapping large-scale spatial patterns in the 19th-century postal system as it spread across the American West; what was missing from this approach was a more qualitative, intimate look at how individual people and families actually used the US Post. (View Highlight)
  • Take a peek through any historian’s hard drive and you will probably find thousands of similar photos from archive trips. Transcribing these documents has long been a bottleneck in the research process: a tedious but necessary step between photographing a primary source and actually using it in our work. When I transcribed Benjamin’s letter at the Huntington Library, it wasn’t particularly difficult work, but it was laborious - especially after three days of hunching over the same desk squinting at letter after letter. (View Highlight)
  • Document transcription is exactly the sort of time-consuming, boring task we would want to hand off to a Large Language Model. But can it actually do the job? To find out, I chose (somewhat at random) a Custom GPT that was designed specifically for transcribing handwritten text. I uploaded my photos of Benjamin’s 1886 letter and, a few seconds later, had a shockingly accurate, well-formatted transcription. Here’s a sample of that transcription: (View Highlight)
  • This example points to the potential of using LLMs for historical research. Their growing ability to process both natural language (text) and “multimodal” sources (non-textual sources like photographs, video, or audio) makes them a powerful tool for working with archival material. And unlike other kinds of digital tools, ChatGPT doesn’t require a user to learn any sophisticated technical skills. (View Highlight)
  • Now for the caveats. Benjamin’s letter is in some ways an ideal source for this kind of analysis. I’ve tried using ChatGPT to transcribe more challenging documents (e.g., a 17th-century handwritten letter in Spanish) and the transcription quality degrades dramatically, sometimes to the point of gibberish. This plug-and-play approach won’t work for every type of source. It’s also not yet scalable. Don’t expect to be able to dump hundreds of documents into ChatGPT and receive flawless, neatly formatted transcriptions. (View Highlight)
  • The Curtis letter demonstrates why using LLMs for historical research requires careful attention and disciplinary expertise. These tools are potentially quite powerful for aiding historians with some of the mechanics of primary source analysis, but they work best when we actively guide and verify their output. Which is why I would encourage you to take the plunge and start experimenting with these tools for your own research. (View Highlight)
  • Think of an LLM as a research assistant. An extremely knowledgeable, well-read, and over-caffeinated research assistant. They can accomplish quite remarkable things but they will also make mistakes - like mixing up “Selia” and “Delia”. Unlike most research assistants, LLMs will occasionally hallucinate (i.e., make up) things that sound quite convincing. Don’t take their output at face value and always check their work. (View Highlight)
  • Start small and be specific. Begin with shorter, discrete sources. These are easier for the model to work with and easier for you to evaluate. Just as you would with a research assistant, provide detailed guidance to the LLM about what, exactly, you want it to do. Generally speaking, it helps to assign it a role (e.g., “You are a historical research assistant…”) before carefully spelling out the task that you want it to do. I tend to think prompt engineering is an overrated skill. But starting from one of the many prompt guides out there can help you get the hang of writing effective prompts. (View Highlight)
  • Choose the right tool. This is tricky, given how quickly things change in Generative AI. As of October 2024, if I had to choose one off-the-shelf model for working with primary sources I would probably go with the paid version of GPT-4o (which requires a $20/month subscription). This lets you analyze “multimodal” sources (like photographs) along with using Custom GPTs tailored for specific tasks (like handwriting transcription). Ethan Mollick wrote a nice breakdown of (View Highlight)
  • Keep the human in the loop. LLMs work best when you actively work alongside them as a form of “co-intelligence”. Remember: you are the expert. Iterate back and forth. If the model misses something, point out what it got wrong and ask it to revise. I’ve found that this process of iteration (the “chat” part of ChatGPT) isn’t just about fixing mistakes; it’s actually a useful way for me to refine my own thoughts and interpretations. (View Highlight)
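
Two of the tips above (assign a role before spelling out the task, and always check the model's work) can be made concrete with a small Python sketch. Everything here is an illustrative assumption rather than anything from the original post: the function names `build_prompt` and `flag_differences`, the guideline wording, and the sample sentence are all made up, and the comparison uses Python's standard `difflib` module on a transcription you have pasted in by hand, not any ChatGPT API.

```python
import difflib


def build_prompt(role: str, task: str, guidance: list[str]) -> str:
    """Assemble a prompt: the role first, then the task, then specific guidance."""
    lines = [f"You are {role}.", "", f"Task: {task}"]
    if guidance:
        lines.append("")
        lines.append("Guidelines:")
        lines.extend(f"- {item}" for item in guidance)
    return "\n".join(lines)


def flag_differences(model_text: str, human_text: str) -> list[tuple[str, str]]:
    """Return (model words, human words) pairs wherever two transcriptions disagree."""
    model_words = model_text.split()
    human_words = human_text.split()
    matcher = difflib.SequenceMatcher(a=model_words, b=human_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append(
                (" ".join(model_words[i1:i2]), " ".join(human_words[j1:j2]))
            )
    return differences


# Hypothetical prompt: the role phrasing echoes the tip above, the rest is invented.
prompt = build_prompt(
    role="a historical research assistant",
    task="Transcribe the handwritten letter in the attached photographs.",
    guidance=[
        "Preserve the original spelling and line breaks.",
        "Mark any word you cannot read as [illegible].",
    ],
)
print(prompt)

# Hypothetical sentence standing in for a real letter; only the Selia/Delia
# mix-up comes from the text above.
model_version = "Give my love to Selia"
human_version = "Give my love to Delia"
for model_words, human_words in flag_differences(model_version, human_version):
    print(f"model read {model_words!r}, human read {human_words!r}")
```

A word-level comparison like this only helps for passages you have already transcribed yourself, so it works best as a spot check on a sample of documents rather than a substitute for reading the output.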