Rather than get lost in the semantic battle of defining disciplines (What is/are the digital humanities?), this presentation explores how we as humanists can use data to help us think through our humanities questions, evidence, and argument. Drawing from ‘digital’ and ‘data science’ methods of experimental design and operationalizing, I shared my data science project on the Library of Congress collection of Vietnamese materials.
I recently had the opportunity to present my research and research methods at my Fulbright host institution, Vietnam National University – Social Sciences & Humanities University (Đại học Quốc gia Hà Nội – Trường Đại học Khoa học Xã hội và Nhân văn). The audience included professors, lecturers, researchers, and students from the departments of history and of libraries and information, senior professors on libraries, and a few archives personnel from the Hán-Nôm Research Institute (Viện nghiên cứu Hán nôm).
In the spring of 2016, I enrolled in my first-ever graduate-level data science course at the School of Information at UC Berkeley. The course, ‘Deconstructing Data Science,’ investigated quantitative methods of machine learning and data analysis. Coming from a humanist background, I found that the course challenged me to think in drastically different ways about evidence, data, and argument. In the process of learning new data science methods, we reflected on experimental design and challenged the underlying assumptions of empirical methods. These critical reflections resonated with similar debates around the ‘scientific’ character of history and the social sciences and their capacity to draw informed conclusions about the past and society.
While writing about my data science course at the School of Information in the spring of 2016, I realized that I needed a long preface to explain why a historian of Vietnam was using computational methods in their research. My long engagement with the world of ‘tech’ has become less of a dabbling and more of a blurry (exciting) amalgamation where all of my work in history, digital humanities, quantitative methods, data science, and information science has converged.
Graduate school perpetuates a nebulous concept of ‘work.’ In the academy we are always working, from research to teaching, grant writing to meetings, emails to professional networking. But this concept of always working weighs me down. It is easy for me to forget why I’m doing this whole academy thing, and what it is I’m actually doing at the moment.
Thus, for the past two years of graduate school, I have quantified my labor. It started as a personal challenge to see whether I could maintain a ‘40-hour work week’ and have some semblance of work-life balance. But over the years, I found that quantifying my labor was both personally revelatory and an affirmation of my work. Much like the ‘quantified-self movement,’ I wanted to know what I do with my time, so that I could use it more efficiently. But most importantly, quantifying my labor reminded me why I was pursuing a Ph.D. in Vietnamese history.
Geoffrey Bowker and Susan Leigh Star undertake the challenging and encompassing topic of ‘classification’ in Sorting Things Out: Classification and Its Consequences. The authors argue that 1) classification is a ubiquitous human activity (“human artifacts”) and 2) the consequences of classificatory architecture influence and ‘torque’ human lives politically, socially, linguistically, and cognitively. The authors investigate the infrastructure of classification schemes in the medical and social realms, such as the International Classification of Diseases (ICD), the Nursing Interventions Classification (NIC), and racial classification in South Africa.