CURRENT DIGITAL HUMANITIES PROJECTS
Vietnamese Visual Texts: Critical Analysis of Collaborative Colonial Texts
[Critical Digital Humanities, Computer Vision, Content Analysis, Virtual Reality, Pedagogy, Digital Reading]
I am the principal investigator on “Vietnamese Visual Texts,” which critically examines indigenous knowledge production within colonial visual texts. In the first phase of the project (2019-2020, Brown University), I led a team of undergraduate researchers in content coding a rare visual encyclopedia of Vietnamese crafts, cultural practices, and technologies, commissioned in 1909 by a French colonial administrator and produced by a team of unnamed Vietnamese contributors (draftsmen, researchers, annotators, translators, and woodblock printers). The encyclopedia includes sketches of Vietnamese crafts and social practices as well as annotations in both French and Vietnamese (in Chữ Nôm, an endangered logographic script, derived from Chinese characters, for writing the Vietnamese language). I apply content analysis alongside visual and textual analysis to investigate the encyclopedia's invisible authors and its representation of race, gender, and labor. In the current stage of research, I collaborate with computer science professor Dr. David Laidlaw (Brown University) to model descriptive patterns according to language (Vietnamese Nôm script, French), visual depiction, and aesthetic style (visual archetypes, emotion, modularity). I use these patterns to uncover a plurality of authorship and the production of racialized and gendered hierarchies of knowledge. Our team is also developing a virtual reality tool for visualizing multilingual visual texts and historic data in spatial, non-linear formats. As both a close cultural analysis and a computational investigation, this study offers novel contributions to the fields of science and technology studies, history of the book, Vietnamese history, labor history, and colonial studies. The virtual reality tool, in turn, aims to provide an environment for research and teaching through immersion and the spatial organization of historic data.
The Virtual Angkor project (SensiLab, University of Texas, Monash University, Flinders University, Brown University) is an immersive virtual reality and 3D simulation of the 13th-century Angkor metropolis for teaching history, archaeology, and visual art. The project won the 2018 Rosenzweig Prize for Innovation in Digital History from the American Historical Association. Since Fall 2019, I have been an affiliated faculty member on the project and have worked with the Virtual Angkor team to bring the VR scenes into a teaching module on visual representation in my courses at Brown and into workshops on Virtual World Building at UCSD. Read and download my teaching module
PREVIOUS DIGITAL HUMANITIES PROJECTS
Deconstructing Libraries: Predicting Titles, Topics, and Publication City
[Computational Text Analysis, Library, NLP, Semantic Models, Experimental Design]
This project analyzes a complex non-English historical data source: bibliographies of the United States Library of Congress collections of Vietnamese-language materials, covering items collected retrospectively up to 1979 and from 1979 to 1985. We employed a dual approach of 1) contextualized historical reading and 2) machine learning and statistical methods (frequency counts, topic models, Naive Bayes, permutation tests) to understand library collecting patterns, the relationship between topics and publication location, and change over time. The project originated as the final project for the “Deconstructing Data Science” course taught by Professor David Bamman (School of Information, UC Berkeley, 2016), where I collaborated with co-principal investigator Jordan Shedlock to examine the relationship between book titles and their city of publication.
Research Findings: We used Naive Bayes to analyze the difference in word distributions between Saigon and Hanoi book titles. This approach let us answer the question: which words from titles most characterize the city of publication? We calculated the probability of each word's appearance conditioned on the city of publication. Among the most likely tokens for Hanoi were words associated with Communist rhetoric, such as cách mạng (revolution), nhân dân (people), xây dựng (build), and anh hùng (hero). In comparison, the Saigon tokens included more words that could be seen as democratic or nationalist, such as công dân (citizen), phật giáo (Buddhism), quê hương (homeland), and hiện đại (modern). For validation, we predicted the city for titles whose place of publication was unknown (missing because of OCR or regex extraction) and cross-checked those predictions against a human reading of the original bibliography. These results suggest a semantic model of Vietnamese titles: their content, style, and relationship to place of publication.
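The analysis described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code (see the GitHub repository for that). It assumes titles are already segmented into word tokens, uses add-one smoothing, and ranks tokens by a log-likelihood ratio between the two cities. The function names and the four toy titles are hypothetical placeholders built from the example words above; they are not records from the bibliography.

```python
import math
from collections import Counter

# Toy placeholder data (NOT from the Library of Congress bibliography).
# Each title is assumed to be pre-segmented into Vietnamese word tokens.
TOY_TITLES = [
    (["cách mạng", "nhân dân"], "Hanoi"),
    (["anh hùng", "xây dựng"], "Hanoi"),
    (["công dân", "hiện đại"], "Saigon"),
    (["phật giáo", "quê hương"], "Saigon"),
]

def train(titles):
    """Count tokens per city and titles per city."""
    word_counts, doc_counts = {}, Counter()
    for tokens, city in titles:
        word_counts.setdefault(city, Counter()).update(tokens)
        doc_counts[city] += 1
    vocab = sorted({t for counts in word_counts.values() for t in counts})
    return word_counts, doc_counts, vocab

def log_likelihood(token, city, word_counts, vocab, alpha=1.0):
    """log P(token | city) with add-alpha (Laplace) smoothing."""
    counts = word_counts[city]
    total = sum(counts.values())
    return math.log((counts[token] + alpha) / (total + alpha * len(vocab)))

def characteristic_tokens(city, other, word_counts, vocab, top_n=4):
    """Tokens ranked by log P(token | city) - log P(token | other)."""
    def score(token):
        return (log_likelihood(token, city, word_counts, vocab)
                - log_likelihood(token, other, word_counts, vocab))
    return sorted(vocab, key=score, reverse=True)[:top_n]

def predict(tokens, word_counts, doc_counts, vocab):
    """Most probable city for a title; tokens outside the vocabulary are skipped."""
    n_docs = sum(doc_counts.values())
    known = set(vocab)
    scores = {}
    for city in word_counts:
        score = math.log(doc_counts[city] / n_docs)  # log prior
        for token in tokens:
            if token in known:
                score += log_likelihood(token, city, word_counts, vocab)
        scores[city] = score
    return max(scores, key=scores.get)

word_counts, doc_counts, vocab = train(TOY_TITLES)
hanoi_tokens = characteristic_tokens("Hanoi", "Saigon", word_counts, vocab)
guess = predict(["quê hương", "hiện đại"], word_counts, doc_counts, vocab)
```

With real data, `predict` would be applied to titles whose city is missing, and its guesses compared against a human reading of the source, as in the validation step described above.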
Future work: This data science project was a proof of concept to demonstrate the value of experimental design, critical inquiry, and probabilistic thinking for my larger digital humanities research on the history of libraries, collections, and print control in Vietnam. I will continue to develop semantic models and statistical analysis in my ongoing Vietnamese Social Library Database project.
- Research Report
- Blog Post Summary
- Slides and Images from the Project
- GitHub Repository
- Presentation Video – Demo of project as part of my public talk and workshop, “Operationalizing Historical Questions: Datafying the Library of Congress Vietnam Collection.” This talk was part of the “Texts as Data – Data as Texts” Seminar and Workshop at Yonsei University in Seoul on January 12, 2017, co-organized by Chad Denton, Wayne de Fremery, and myself.
Vietnamese Intellectual Networks Database – Digital Humanities at Berkeley
MSU Vietnam Group Archive – Collaborative digitization project funded by the National Endowment for the Humanities
Research Assistant, Translator, and Digital Humanities Consultant
MSU Vietnam Group Map Search Interface
Digital Mapping Consultant for MSU Vietnam Group Archive
MSU Vietnam Group Archive Timeline
Historical Content Developer for MSU Vietnam Group Archive