Texts as Data—Data as Texts


Rather than get lost in the semantic battle of defining disciplines (What is/are the digital humanities?), this presentation explores how we as humanists can use data to help us think through our humanities questions, evidence, and arguments. Drawing on ‘digital’ and ‘data science’ methods of experimental design and operationalization, I share my data science project on the Library of Congress collection of Vietnamese materials.


Video of presentation

This talk was part of the “Texts as Data—Data as Texts” Seminar and Workshop at Yonsei University in Seoul on January 12, 2017.

Texts as Data—Data as Texts: This informal workshop is an opportunity for participants to consider how texts might be read as data and, conversely, how data might be read as texts. What opportunities for insight, discovery, and inspiration are presented by investigating the “weave” of what is “given” as data and the givens of what has been woven into texts? Presenters will offer ideas about the relationship between texts and data in the morning session and present new tools for de/constructing data and con/texts in the afternoon.



Sponsors: UIC History Research Institute • Korea Text Initiative, Cambridge Institute for the Study of Korea

Panel Presentations and Discussion



“Interactive Writing and Reading: Digital Humanities in the Classroom”

Chad Denton

Associate Professor of History, Underwood International College, Yonsei University

Hyunju Bae

Economics, BA (class of 2011), Underwood International College, Yonsei University


“Computational Bibliography and the Sociology of Data”

Wayne de Fremery

Associate Professor, Department of Global Korean Studies, Sogang University

Director, Korea Text Initiative, Cambridge Institute for the Study of Korea


“Operationalizing Historical Questions: Datafying the Library of Congress Vietnam Collection”

Cindy A. Nguyen

Ph.D. Candidate, History Department, University of California, Berkeley

Tools for De/constructing Data and Con/texts



Reading by Machines: Open Refine on OCR Output Data

Cindy A. Nguyen


Mo文oN_Cut&Search, Toward an Alternative to OCR

Wayne de Fremery


Creating an Interactive Paper: HTML with Notepad++

Chad Denton and Hyunju Bae
