Digital Humanities


1. Vietnamese Social Library

[Database, Visualization, Prosopography, History of the Book, Publishing and Library Data]

I am the principal investigator of the Vietnamese Social Library, an open database on twentieth-century Vietnamese intellectuals and their publications. This is an ambitious project to make legible the social and cosmopolitan world of writing, reading, and thinking. The Vietnamese Social Library compiles data on Vietnamese intellectuals (prosopography) in conversation with Vietnamese publishing and library data (history of the book). The project will have three components: an online Drupal database that can be queried by author, location, and date; example data visualizations built with Chart.js and Highcharts; and access to a comprehensive Git repository for further text and statistical analysis.

2. Invisible Authorship: Critical Analysis of Collaborative Colonial Texts

[Critical Digital Humanities, Indigenous Knowledge, Content Analysis, Visual Data, Pedagogy]

I am currently working on a research article analyzing a rare visual encyclopedia of Vietnamese crafts, cultural practices, and technologies produced in 1909 by Henri Oger, an unknown colonial administrator, and a team of Vietnamese woodblock engravers and contributors. The encyclopedia includes sketches of Vietnamese crafts and practices as well as annotations in both French and Vietnamese (in chữ nôm, a logographic script derived from Chinese characters and used to write Vietnamese). My critical analysis of the data reveals the complex, contested intersection of colonial knowledge and indigenous authorship.

Theoretical Intervention and Methodology: I employ content analysis of the annotations and sketches to reveal patterns of visual and textual description and classification within colonial ethnographic knowledge. I coded each entry by gender, age, profession, and semantic representation of object, person, or action. Furthermore, I identify the moments where the 4,462 French annotations and the 3,006 Vietnamese nôm annotations describe the same material differently, in order to explore the authorial contributions of the anonymous laborers (Vietnamese annotators, wood engravers, and draftsmen) involved in the production of the text. I contribute new findings on the collaborative, visible-invisible production of colonial knowledge. I have integrated this primary source data into an innovative critical digital humanities and data analysis teaching module, “Print and Power.”

Poster of Project – Presented at Digital Humanities Faire at UC Berkeley, 2015
Blog Post – Introduction to Primary Source and Historical Analysis


3. Virtual Angkor

[Virtual Reality, Pedagogy, Critical Digital Humanities, Multimedia Analysis, Experiential Learning]

The Virtual Angkor project (SensiLab, University of Texas, Monash University, Flinders University, Brown University) is an immersive virtual reality and 3D simulation of the 13th-century Angkor metropolis for teaching history, archaeology, and visual art. The project won the 2018 Rosenzweig Prize for Innovation in Digital History, awarded by the American Historical Association. Since Fall 2019, I have been an affiliated faculty member on the project and have worked with the Virtual Angkor team to bring the VR scenes into a teaching module on visual representation in my courses at Brown. Read and download my teaching module.

4. Deconstructing Libraries: Predicting Titles, Topics, and Publication City

[Computational Text Analysis, Library, NLP, Semantic Models, Experimental Design]

This project analyzes a complex non-English-language historical data source: the United States Library of Congress bibliographies of its Vietnamese-language collections, covering materials collected retrospectively up to 1979 and from 1979 to 1985. We employed a dual approach of 1) contextualized historical reading and 2) machine learning and statistical methods (frequency counts, topic models, Naive Bayes, permutation tests) to understand library collecting patterns, the relationship between topics and publication location, and change over time. The project originated as the final project for the “Deconstructing Data Science” course taught by Professor David Bamman (School of Information, UC Berkeley, 2016), where I collaborated with co-principal investigator Jordan Shedlock to examine the relationship between book titles and their city of publication.
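To illustrate one of the methods named above, here is a minimal Python sketch of a two-sided permutation test. It is not the project's code. It assumes, hypothetically, that each title has been reduced to a 0/1 indicator for whether it contains a given word, and it asks whether the difference in that word's rate between two groups of titles could plausibly arise by chance. The function name, the variable names, and the toy indicator lists are all invented for illustration.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the absolute difference in means.

    Returns the observed difference and an estimated p-value.
    """
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            extreme += 1

    # Add-one correction keeps the estimate strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Invented toy data: 1 if a title contains the word of interest, else 0.
hanoi_has_word = [1, 1, 1, 0, 1, 1, 0, 1]
saigon_has_word = [0, 0, 1, 0, 0, 0, 0, 0]

observed, p_value = permutation_test(hanoi_has_word, saigon_has_word)
print(f"observed difference = {observed:.3f}, p = {p_value:.4f}")
```

A small p-value here would only say that the difference is unlikely under random relabeling of the titles. Interpreting it still depends on the historical reading of how the collections were assembled.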

Research Findings: We used Naive Bayes to analyze the difference in word distributions between Saigon and Hanoi book titles. This approach let us answer the question: which words in a title most characterize its city of publication? We calculated the probability of each word's appearance conditioned on the city of publication. Among the most likely tokens for Hanoi were words associated with Communist rhetoric, such as cách mạng (revolution), nhân dân (people), xây dựng (build), and anh hùng (hero). In comparison, the Saigon tokens included more words that could be seen as democratic or nationalist, such as công dân (citizen), phật giáo (Buddhism), quê hương (homeland), and hiện đại (modern). For validation, we predicted the city for entries where it was unknown (due to OCR/regex) and checked those predictions against a human reading of the original bibliography. These results suggest a semantic model of Vietnamese titles: their content, style, and relationship to place of publication.
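The conditional probabilities described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code we ran. The four titles are invented, multi-syllable Vietnamese words are assumed to be pre-segmented and joined with underscores, add-one (Laplace) smoothing is my choice for the sketch, and the prediction step assumes a uniform prior over cities. All function names are hypothetical.

```python
import math
from collections import Counter

# Invented toy titles for illustration only; not taken from the bibliography.
titles = [
    ("Hanoi", "cách_mạng nhân_dân"),
    ("Hanoi", "anh_hùng cách_mạng"),
    ("Saigon", "công_dân hiện_đại"),
    ("Saigon", "quê_hương công_dân"),
]


def word_likelihoods(labeled_titles, alpha=1.0):
    """Estimate P(word | city) with add-alpha smoothing over a shared vocabulary."""
    counts = {}
    vocab = set()
    for city, title in labeled_titles:
        tokens = title.split()
        counts.setdefault(city, Counter()).update(tokens)
        vocab.update(tokens)
    probs = {}
    for city, counter in counts.items():
        total = sum(counter.values()) + alpha * len(vocab)
        probs[city] = {w: (counter[w] + alpha) / total for w in vocab}
    return probs


def log_odds(probs, city_a, city_b):
    """Rank words by log P(word | city_a) - log P(word | city_b), highest first."""
    scores = {
        w: math.log(probs[city_a][w]) - math.log(probs[city_b][w])
        for w in probs[city_a]
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def predict_city(probs, title):
    """Pick the city with the highest summed log-likelihood (uniform prior).

    Words outside the training vocabulary are ignored.
    """
    def score(city):
        return sum(
            math.log(probs[city][w]) for w in title.split() if w in probs[city]
        )
    return max(probs, key=score)


probs = word_likelihoods(titles)
ranked = log_odds(probs, "Hanoi", "Saigon")
print(ranked[:3])                       # words most characteristic of Hanoi
print(predict_city(probs, titles[2][1]))  # predicted city for a Saigon toy title
```

Ranking by log-odds rather than raw probability is a design choice for the sketch: it surfaces words that distinguish one city from the other instead of words that are simply frequent in both.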

Future work: This data science project was a proof of concept to demonstrate the value of experimental design, critical inquiry, and probabilistic thinking for my larger digital humanities research on the history of libraries, collections, and print control in Vietnam. I will continue to develop semantic models and statistical analysis in my ongoing Vietnamese Social Library Database project.


1. Vietnamese Intellectual Networks Database – Digital Humanities at Berkeley

Co-Principal Investigator

PBC and Cuong De

2. MSU Vietnam Group Archive – Collaborative digitization project funded by the National Endowment for the Humanities

Research Assistant, Translator, and Digital Humanities Consultant


MSU Vietnam Group Map Search Interface

Digital Mapping Consultant for MSU Vietnam Group Archive


MSU Vietnam Group Archive Timeline

Historical Content Developer for MSU Vietnam Group Archive


3. Detroit Digital – a data-intensive visualization team project created in the Cultural Heritage Informatics Field School
