Reading Texts in Digital Environments: Applications of Translation Alignment for Classical Language Learning

Chiara Palladino, Furman University

Abstract

This paper illustrates the application of translation alignment technologies to empower a new approach to reading in digital environments. Digital platforms for manual translation alignment are designed to facilitate a particularly intensive and philological experience of the text, which is traditionally characteristic of the teaching and study of Classical languages. This paper presents the results of the experimental use of translation alignment in the context of Classical language teaching, and shows how the use of technology can empower a meaningful and systematic approach to information. We argue that translation alignment and similar technologies can open entirely new perspectives on reading practices, going beyond the opposed categories of “skimming” and traditional linear reading, and potentially overcoming the cognitive limitations connected with the consumption of digital content.

Reading and Digital Technologies: A New Challenge

It seems impossible to imagine a world where digital technologies are not a substantial part of our intellectual activities. The use of technology in teaching and learning is becoming increasingly prominent, all the more so now that the massive public health crisis of COVID-19 has created the need to access information without physical proximity. Yet, the way information is processed on digital platforms is substantially different from a cognitive standpoint, and not exempt from concerning consequences: recently, it has been emphasized that accessing content digitally stimulates superficial approaches and “skimming”, rather than reading, which may have a longstanding impact on the ways in which human brains understand, approach, and articulate complex information (Wolf 2018).

Therefore, we must ask ourselves if we are using digital technologies in the right way, and what can be done to address this problem. Instead of eliminating digital methods entirely (which in current times seems especially unrealistic), maybe the solution resides in using them to empower a different way of approaching information. In this paper, I will argue that the practices embedded in the study of Classical texts can offer a new perspective on reading as a cognitive operation, and that, if appropriately empowered through the use of technology, they can create a new and meaningful approach to reading on digital platforms.

The study of Classical languages implies a very peculiar approach to processing information (Crane 2019). The most relevant aspect of studying Classical texts is that we cannot consult a native speaker to verify our knowledge: instead, “communication” is achieved through written sources and their interaction with other carriers of information, such as material culture and visual representations. On the other hand, we must never forget that we are engaging with an alien culture to which we do not have direct access. This necessity of navigating uncertainty requires a much more flexible approach to information, and a very different way of engaging with written sources, where the focus is on mediated cultural understanding through reading, rather than immediate communication.

Engaging with an ancient text is a deeply philological operation: a scholar of an ancient language never simply goes from one word to another with a secure understanding of their meaning. Their reading mode is much more immersive. It is an operation of reconstruction through reflection, pause, and exploration, which requires several skills: from the ability of active abstraction of the language and its mechanics, to the recognition of linguistic patterns that coincide with given models, to the reflection on what a word or expression “really means” in etymological, stylistic, and cultural terms, to the philological reconstruction of “why” that word is there, as a result of a long process of transmission, translation, and error.[1] Yet, the implications of this intensive reading mode, in the broader context of the cognitive transformations to reading and learning, are often overlooked.

The operations embedded in the reading of Classical languages respond to a different cognitive process, one that goes beyond the opposed categories of “skimming” and traditional linear reading. Because of this peculiarity, some of the technologies designed in the domain of Classical languages are created specifically to empower this approach, bringing it to the center of the reader’s experience.

Translation Alignment: Principles and Technologies

Digital technologies are widely used in Classics for scholarship and teaching, thanks to the widespread use of digital libraries like Perseus (Crane et al. 2018) and the Thesaurus Linguae Graecae (2020), and to the consolidation of various methods for digital text analysis (Berti 2019) and pedagogy (Natoli and Hunt 2019). One of the most interesting recent developments in the field is the proliferation of platforms for manual and semi-automated translation alignment.

Translation alignment is a task derived from one of the most popular applications in Natural Language Processing. It is defined as the comparison of two or more texts in different languages, also called parallel texts or parallel corpora, by means of automated or semi-automated methods. The main purpose is to define which parts of a source text correspond to which parts of a second text. The result is often a list of pairs of items: words, sentences, or larger text chunks like paragraphs or documents. In Natural Language Processing, aligned corpora are used as training data for the implementation of machine translation systems, but also for many other purposes, such as information extraction, automated creation of bilingual lexica, or even text re-use (Dagan, Church, and Gale 1999).
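As a minimal sketch of the output described above, an alignment can be represented as a list of pairs linking token positions in a source text to token positions in a translation. The data structure, the names (`AlignedPair`, `render_pairs`), and the Latin and English example are illustrative assumptions, not the format of any particular tool.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AlignedPair:
    """One alignment link between token positions in two texts (hypothetical structure)."""
    source_idx: Tuple[int, ...]  # positions in the source text
    target_idx: Tuple[int, ...]  # positions in the translation


def render_pairs(source, target, pairs):
    """Turn index-based pairs into readable (source words, target words) tuples."""
    return [
        (" ".join(source[i] for i in p.source_idx),
         " ".join(target[j] for j in p.target_idx))
        for p in pairs
    ]


# Illustrative example: a Latin sentence and a literal English rendering.
source = "Gallia est omnis divisa in partes tres".split()
target = "All Gaul is divided into three parts".split()

pairs = [
    AlignedPair((0,), (1,)),      # Gallia -> Gaul
    AlignedPair((1, 3), (2, 3)),  # est ... divisa -> is divided (discontinuous group)
    AlignedPair((2,), (0,)),      # omnis -> All
    AlignedPair((4,), (4,)),      # in -> into
    AlignedPair((5,), (6,)),      # partes -> parts
    AlignedPair((6,), (5,)),      # tres -> three
]

for src_words, tgt_words in render_pairs(source, target, pairs):
    print(f"{src_words:12} <-> {tgt_words}")
```

Storing positions rather than strings keeps repeated words distinct and allows one pair to link groups of words that are not adjacent in the text.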

The alignment of texts in different languages, however, is an exceptionally complex task, because it is often difficult to find perfect overlap across languages, and machine-actionable systems are often inefficient in providing equivalences for more sophisticated rhetorical or literary devices. The creation of manually aligned datasets is especially useful for historical languages, where available indexes and digitized dictionaries often do not provide a sufficient corpus to develop reliable NLP pipelines, and are remarkably inefficient for automated translation. Therefore, creating aligned translations is also a way to engage with a larger scholarly community and to support important tasks in Computer Science.

In the past few years, three generations of digital tools for the creation and use of aligned corpora have been developed specifically with Classical languages in mind. First, Alpheios provides a system for creating aligned bilingual texts, which are then combined with other resources, such as dictionary entries and morphosyntactic information (Almas and Beaulieu 2016; “The Alpheios Project” 2020). Second, the Ugarit Translation Alignment Editor was inspired by Alpheios in providing a public online platform, where users could perform bilingual and trilingual alignments. Ugarit is currently the most used web-based tool for translation alignment in historical languages: since it went online in March 2017, it has registered an ever-increasing number of visits and new users. It currently hosts more than 370 users, 23,900 texts, 47,600 aligned pairs, and 39 languages, many of them ancient, including Ancient Greek, Latin, Classical Arabic, Classical Chinese, Persian, Coptic, Egyptian, Syriac, Parthian, Akkadian, and Sanskrit. Aligned pairs are collected in a large dynamic lexicon that can be used to extract translations of different words, but also as a training dataset for implementing automated translation (Yousef 2019).
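The idea of a lexicon built from aligned pairs can be sketched as follows. This is only an illustration under simple assumptions: pairs are plain (source, translation) strings, and the function name `build_lexicon` and the sample pairs are hypothetical. It does not reproduce how Ugarit stores or queries its data.

```python
from collections import Counter, defaultdict


def build_lexicon(aligned_pairs):
    """Map each source expression to a frequency count of its translations."""
    lexicon = defaultdict(Counter)
    for source_expr, target_expr in aligned_pairs:
        lexicon[source_expr.lower()][target_expr.lower()] += 1
    return lexicon


# Hypothetical pairs, as they might be collected from several alignments.
aligned_pairs = [
    ("λόγος", "word"),
    ("λόγος", "speech"),
    ("λόγος", "word"),
    ("λόγος", "reason"),
    ("ἄνθρωπος", "man"),
    ("ἄνθρωπος", "human being"),
]

lexicon = build_lexicon(aligned_pairs)

# Translations of one source word, most frequent first.
for translation, count in lexicon[aligned_pairs[0][0]].most_common():
    print(translation, count)
```

A lookup of this kind shows at a glance how many different renderings a single source word has received across the collected alignments.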

The alignment interface offered by Ugarit is simple and intuitive. Users can upload their own texts and manually align them by matching words or groups of words. Alignments are automatically displayed on the home page (although users can deactivate the option for public visibility). Corresponding aligned tokens are highlighted when the pointer hovers on them. The percentage of aligned tokens is displayed in the color bar below the text: the green indicates the rate of matching tokens, the red the rate of non-matching tokens. Resulting pairs are automatically stored in the database, and can be exported as XML or tabular data. For languages with non-Latin alphabets, Ugarit offers automatic transliteration, visible when the pointer hovers on the desired word.[2]

Figure 1. Overview of a trilingual alignment on Ugarit (Armenian, Greek, Latin), with active transliteration for Armenian. The mouse pointer triggers the highlighting of aligned pairs and activates the transliteration for the Armenian text. A color bar below the text shows the percentage of aligned pairs in green and of non-aligned tokens in red.
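The matching rate shown in the color bar can be approximated with a few lines of code. The sketch below assumes that an alignment is stored as a list of (source positions, translation positions) pairs and that the rate is simply the share of tokens that take part in at least one pair. The actual computation and export schema used by Ugarit may differ, and all names here are hypothetical. The second function writes the pairs as tab-separated data.

```python
import csv
import io


def match_rate(n_tokens, aligned_positions):
    """Share of tokens (between 0 and 1) that belong to at least one aligned pair."""
    if n_tokens == 0:
        return 0.0
    return len(set(aligned_positions)) / n_tokens


def pairs_to_tsv(source, target, pairs):
    """Serialize aligned pairs as tab-separated text, one pair per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["source", "translation"])
    for src_idx, tgt_idx in pairs:
        writer.writerow([
            " ".join(source[i] for i in src_idx),
            " ".join(target[j] for j in tgt_idx),
        ])
    return buffer.getvalue()


# Illustrative example: the article "the" in the translation is left unaligned.
source = "arma virumque cano".split()
target = "I sing of arms and the man".split()
pairs = [
    ((0,), (3,)),       # arma -> arms
    ((1,), (4, 6)),     # virumque -> and ... man
    ((2,), (0, 1, 2)),  # cano -> I sing of
]

source_rate = match_rate(len(source), [i for src, _ in pairs for i in src])
target_rate = match_rate(len(target), [j for _, tgt in pairs for j in tgt])

print(f"source aligned: {source_rate:.0%}, translation aligned: {target_rate:.0%}")
print(pairs_to_tsv(source, target, pairs))
```

Computing the rate separately for each text matters because source and translation rarely contain the same number of tokens, so the two figures usually differ.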

The structure of Ugarit was also used to display a manually aligned version of the Persian Hafez, in a study that tested how German and Persian speakers used translation alignment to study portions of Hafez using English as a bridge language. The results indicated that, with the appropriate scaffolding, users with no knowledge of the source language could generate word alignments with the same output accuracy generated by experts in both languages. The study showed that alignment could serve as a pedagogical tool with a certain effect on long-term retention (Palladino, Foradi, and Yousef forthcoming; Foradi 2019).

The third generation of digital tools is represented by DUCAT – Daughter of Ugarit Citation Alignment Tool, developed by Christopher Blackwell and Neel Smith (Blackwell and Smith 2019), which can be used for local text alignment and can be integrated with the interactive analysis of morphology, syntax, and vocabulary. The project “Furman University Editions” shows the potential of these interactive views, which are currently part of the curriculum of undergraduate Classics teaching at Furman and elsewhere.

This proliferation of tools shows that there is potential in the pedagogical application of this method: translation alignment can provide a new and imaginative way of using translations for the study of Classical texts, overcoming the hindrances normally associated with reading an ancient work through a modern-day version.

Text Alignment in the Classroom

The use of authorial translations to approach Classical texts is normally discouraged in the classroom, being perceived as “cheating” or as unproductive for a true, active engagement with the language. Part of this phenomenon is explained by the fact that, as “passive” readers, we don’t have any agency in assessing the relationship between a translation and the original, and reading them side by side on paper is rarely a systematic or intentional operation (Crane 2019). However, translations are an integral part of ancient cultures.[3] They are a crucial component of textual transmission, as they represent witnesses of the survival and fortune of Classical texts. Translations are also important testimonies of the scholarly problem of transferring an alien culture and its values onto a different one, to ensure effective communication, or to pursue a cultural and political agenda through the reshaping and recrafting of an important text (Lefevere 1992).

Translations are a medium between cultures, not just between languages. Engaging in an analytical comparison between a translation and the original means having a deeper experience of how a text was interpreted in a given time, what meanings were associated with certain words, and, at the same time, how certain expressions can display multifaceted semantics that are often not entirely captured by another language. This is also an exercise in cultural dialogue and reflection, not only upon the language(s), but upon the civilization that used it to reflect its values. In other words, it is a philological exploration that closely resembles the reading mode of a Classicist.

Digital platforms for translation alignment offer an immersive and visually powerful environment to perform this task, where the reader can analytically compare texts token by token, and at the same time observe the results through an interactive visualization. It is the reader who decides what is compared, how, and to what extent: the comparison of parallel texts becomes an analytical, systematic operation, which at the same time encourages reflection and debate regarding the (im)perfect matching of words and expressions. In this way, translation alignment provides a way to navigate between traditional linguistic mastery and the complete dependence upon a translation. Not only does this stimulate an active engagement with modern translations of ancient texts, but the public visibility of the result on a digital platform also provides a way to be part of a broader conversation on the reception and significance of an ancient text over time.

However, it is also important to apply this tool in the right way. For example, translation alignment needs to be coupled with some grammatical input, to encourage reflection on structural linguistic differences. Mechanical approaches, all too easy with the uncontrolled use of a clickable “matching tool”, should be discouraged by emphasizing the importance of focused word-by-word alignment. In practice, translation alignment needs to be used with caution and in meaningful ways, as a function of the goal and level of a course.

The following sections illustrate examples of application of translation alignment in the context of beginner, intermediate, and upper level classes in Ancient Greek and Latin. Translation alignment was structurally used during the courses to emphasize semantic and syntactic complexities through analytical comparison with English or other languages. Later, students were assigned various alignment tasks and exercises, designed to empower an analytical approach to the text.

Beginner Ancient Greek, first and second semester

The students were given two assignments, performed iteratively in two consecutive semesters, with variations in quantity (more words and sentences were assigned in the second semester):[4]

1. Identify specific given words in a chosen passage, and align them with the corresponding words in one translation. The goal of this exercise was to lay the groundwork for a rough understanding of the depth of word meanings, by assessing how the same word in the source text could appear in different ways across the same translation.
2. Use alignment to evaluate two translations of a shorter text chunk (1–3 sentences, or 10 verse lines). Identify precisely the corresponding sections of text in the source and in the translations. Assess which translation is more effective according to two criteria: (1) the combination of number and quality of matched tokens; (2) the treatment of particularly problematic words, which are looked up in a dictionary to assess their meanings. Compare the dictionary explanation with the general context of the passage, and assess how the translations relate to the dictionary entries and how closely they render the “original sense” of the word.

The results were two short essays where the students articulated their considerations. Grading was based on the ability to give insightful analysis of how word choices impacted the tone and meaning of the translations, and discuss the semantic depth of the words in the original language. Bonus points were provided if the student was able to identify tangential aspects, such as word order, changes in cases, and syntax. Minor weight was given to the overall accuracy of the alignment, in consideration of the level: the design of the exercises was deliberate in discouraging the creation of longer alignments, which often result in the student doing the work without thinking about their alignment choices. Essay questions focused instead on close-reading, analytical, in-depth investigation into the semantics of the source language.

Figure 2. Two aligned English translations of *Odyssey* 9.105–9.115. The Ancient Greek text is at the center, with the two translations at the sides. The translation on the left displays 75% aligned pairs, the translation on the right 73%.

Intermediate level of Ancient Greek and Latin, third semester

Students used translation alignment in the context of project-based learning. They were responsible for the alignment of a chosen text chunk against translations that they had selected, ranging from early modern to contemporary translations. The assignment was divided into two phases:

1. Alignment of the source text against two chosen translations in English, and systematic evaluation of both translations. The students were asked to focus on chosen phenomena of syntax, morphology, grammar, and semantics that were particularly relevant in the text: e.g. word order, participial constructs, adjectival constructs, passive/active constructions, changes in case, transposition of allusion, and semantic ambiguity. The students used their knowledge of syntax and grammar to critically assess the performance of different translators, focusing on the different ways in which complex linguistic phenomena can be rendered in another language. This assignment was combined with side analysis of morphology and syntax: for example, the students of Ancient Greek designed a morphological dataset containing 200 parsed words from the same passage.
2. Creation of a new, independent translation, with a discussion of where it distanced itself from the original, which aspects of it were retained, and how the problems identified in the authorial translations were approached by the student.

The result was a written report submitted at the midterm or end of the semester, indicating: the salient aspects of the text and its most relevant linguistic features; an analytical comparison of how those linguistic features appeared in the competing aligned translations, and an evaluation of the translator’s strategy; the student’s translation, with a critical assessment of the chosen strategy to approach the same problems. These aspects constituted the backbone of the grading strategy, with additional attention to the alignment accuracy.

Figure 3. Section of two aligned translations of Hesiod, *Works and Days* vv. 42–105, with the original Ancient Greek at the center and the English translations on the sides. The student compared two translations from the same period (Hugh G. Evelyn-White, 1914, and David W. Tandy and Walter C. Neale, 1996) but with very different styles, and used adjective-noun combinations and participle constructions to systematically evaluate them. The 1996 translation was judged more literal than the other, and more useful for a student.

Upper level Ancient Greek and Latin, fourth to seventh semester (graduate and undergraduate)

The exercises assigned for the upper level were a more articulated version of the project-based ones given to the intermediate level. The students were assigned a research-based project where alignment would be one component of an in-depth analysis of a chosen source. At an intermediate stage of the semester, the students would submit a research proposal indicating: an extensive passage they chose to investigate, and why they chose it; the topic they decided to investigate, and a short account of previous literature on it; methodologies applied to develop the project; desired outcomes. The final result would be a project report submitted at the end of the semester, indicating: if the desired outcomes had been reached, what kinds of challenges were not anticipated, and what new results were achieved; strategies implemented to apply the chosen methodology, e.g. which alignment strategy was applied to ensure that the research questions were answered; what the student learned about the source, its cultural context and/or language. The results were graded as proper projects: the students were evaluated according to their ability to clearly delineate motivation and methodology, use of existing resources, and critical discussion of the outcomes.

Many students creatively integrated alignment in their projects. For example:

- Creation of an aligned translation for non-expert readers, alongside a commentary and morphosyntactic annotations. To facilitate reading, the student developed a consistent alignment strategy that only matched words corresponding in meaning and grammatical function. This project was published on GitHub.
- Trilingual alignment of English-Latin-German to investigate the matching rate between two similarly inflected languages. The student noted that, even though their knowledge of German was weaker than their knowledge of English and Latin, matching Latin against German proved easier and more streamlined, while the English translation was approached with more criticism for its verbosity (Figure 4).[5]
- Trilingual alignment to compare different texts. The student conceived a project aimed at gathering systematic evidence of the verbatim correspondences between the so-called Fables of Loqman and the Aesopic fables: according to existing scholarship, the former would be an Arabic translation of the latter. The student used a French translation of the Loqman fables to help with the challenges of the Arabic, and examined the overall matching rate across the texts.

Figure 4. Sample passage of Tacitus, *Germania* 1.1, with two aligned translations in English and German, located on the left and at the center respectively. The German translation at the center displays the same matching rate as the Latin text on the right (93%), while the English translation on the left has a matching rate of only 89%.

Results

The students reported how alignment affected their understanding of the source and its linguistic features, and how approaching the original by comparing it against a modern translation gave them a deeper understanding of and respect for the content. While the alignment process often resulted in some criticism of available translations, the students who had to discuss the challenges faced by translators (or who had to translate themselves) gained a stronger understanding of the issues involved in “transferring” not only words and constructs but also underlying cultural implications and multiple meanings. The students who used alignment in the context of research projects also benefited from the publication of their aligned translations, and some presented them as research papers at undergraduate conferences. Many students even reported having used alignment independently afterwards in other courses, often to facilitate the study of new languages, both ancient and modern.

Some overarching tendencies in the evaluation of competing translations emerged, particularly at the Intermediate and Upper Level. This feedback was extremely interesting to observe, because most of it can only be explained as the result of a systematic comparison between target and source language, in a situation where the reader is an active operator and not a passive content consumer.

The students observed analytically the various ways in which translations are structurally unable to convey distinctive aspects of the original: for example, dialectal variants, metrical arrangement, wordplay, or syntactic constructs. Most of them were still able to appreciate skillful modern translations, and even to diagnose why translators would distance themselves from the original. They definitely understood the challenge by engaging in translation tasks themselves. For many, however, the discovery that they could not fully rely on translations to understand what is happening in a text was astonishing. Students tend to be taught that authorial translations are necessarily “right” (and therefore “faithful”[6]) renderings of Classical texts, to the point where they often trust them over their own understanding of the language. With this exercise, they learned that “right” and “faithful” may not be the same thing, and that the literature of an ancient civilization preserves a depth and complexity of meaning that cannot be fully encompassed in a translation.

Interestingly, students often had a more positive judgement of translations that rendered difficult syntactic constructs closely, without fundamentally altering the structure or shifting the emphasis (e.g. by changing subject-object relations or by altering verb voices). Students at the Intermediate level, in particular, judged such translations more “literal”, as they found them more helpful in understanding important linguistic structures: Figure 3 shows an alignment of Hesiod’s Works and Days, where the student extracted adjective-noun combinations and participle constructs to draw a systematic comparison between two competing translations. The translation that was judged “more literal”, and therefore more useful for a student, was the one that kept these structures closer to the way they appeared in the original. This phenomenon intensified with texts that had a high density of allusions and wordplay, which are often conveyed by means of very specific syntactic constructs: students who dealt with this kind of text were merciless judges of translations that completely altered the original syntax and recrafted the phrasing to adapt it to a modern audience. The students indicated how such alterations regularly failed to convey the depths of sophisticated wordplay, where the syntax itself is not an accessory, but a structural part of the meaning.

The omission of words from the source language was considered particularly unforgivable: even though some words, such as adverbs and conjunctions, are omitted in translations to avoid redundancy, some translations were found to leave out entire concepts or expressions for no apparent reason. The visualization of aligned texts on Ugarit certainly accentuated this aspect, as it tends to emphasize the relation between matched and non-matched tokens through the use of color, and it also provides matching rates to assess the discrepancy between texts. Almost all the students seem to have made intensive use of this aspect, emphasizing how translations missed entire expressions that appeared in the original and shaped its message: in other words, even if the omission concerned only one adjective or an intensifying adverb, they felt that translations did not convey the full meaning of the text they were reading.
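The kind of check the students performed by eye can be illustrated with a short sketch that lists the source tokens left without any counterpart in a translation. The representation of pairs as index tuples, the function name `unaligned_tokens`, and the made-up Latin sentence are assumptions for illustration only; they are not taken from Ugarit or from the students' work.

```python
def unaligned_tokens(tokens, aligned_positions):
    """Return (position, token) for every token not covered by any aligned pair."""
    covered = set(aligned_positions)
    return [(i, tok) for i, tok in enumerate(tokens) if i not in covered]


# Made-up sentence: the translation drops the adverb "semper" ("always").
source = "puella rosam semper amat".split()
translation = "the girl loves the rose".split()
pairs = [
    ((0,), (0, 1)),  # puella -> the girl
    ((1,), (3, 4)),  # rosam -> the rose
    ((3,), (2,)),    # amat -> loves
]

omitted = unaligned_tokens(source, [i for src, _ in pairs for i in src])
print(omitted)
```

A list of this kind only flags candidates: deciding whether an unmatched word is a harmless omission or a loss of meaning remains an interpretive judgement.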

The implications of such observations are interesting: the translations in question were “bad” translations not because they were not understandable or efficient in conveying the sense of a passage in English, but because they hindered the student’s understanding of the original. Readers, even classically-trained ones, normally enjoy translations that, while taking some liberties, are more efficient in conveying the content and artistic aspects of a text in a way that is more familiar to a modern audience. Students who read a text in translation often struggle with versions that try to be close to the original language (sometimes with rather clumsy results), and they also make limited use of the printed aligned translations that used to be very popular in school commentaries of the past. However, when students became active operators of translation alignment, the focus shifted to the understanding of the original through the scaffolding provided by the translation. In other words, the focus was on how the translation served the reader of the source text: this suggests an extremely active engagement with the original, through the critical lens of systematic linguistic comparison.

With the guidance provided by the exercises, the students used translation alignment to engage with linguistic and stylistic phenomena, and the assessment of the ineffectiveness of translations in conveying such complex nuances often made them more confident in approaching the original language. In their own translations, they became extremely self-aware of their position with respect to the text, and tried to justify every perceived variation from the structure and the style of the original. Some of them opted for very literal, yet clumsy, translations, which they reflected upon and elaborated more thoroughly in a commentary to the text; others, particularly at the Upper level, built upon aspects that they liked or disliked in the translations to create better versions of them, depending on their intended audience.

We can conclude that, if appropriately embedded in reflective exercises, translation alignment did not result in a mechanical operation of word matching, but nurtured an active philological approach to the text, and an exploration of it in all its different aspects, from linguistic constructs to word meanings, to the role of wordplay in a literary context. Despite growing skepticism about the ability of translations to convey the “full” meaning of a text, the students still believed in the necessity of using them in a thoughtful manner.

In fact, the students advocated for more and more varied use of digital tools, to compensate for the deficiencies of aligned corpora. At the Upper level in particular, many students complemented their translation alignments with additional data gathered through other digital resources: for example, while creating translation alignments directed at non-expert readers, they integrated the resource with a complete morphosyntactic analysis performed with treebanking (Celano 2019), with the intention of making up for the limitations of incomplete matching of word functions in specific linguistic constructs.

In this regard, it is important to emphasize that translation alignment is just one of the tools at our disposal. In a future where learning and reading are going to be predominantly performed through digital technologies, we need to create environments where readers can meaningfully engage in a philological exploration of texts at multiple levels: translation alignment, but also detection of textual variants, geospatial mapping, social network analysis, morphosyntactic reconstruction, up to the incorporation of sound and recording that can compensate for reading and visual disabilities (Crane 2019).

Conclusions

Overall, the experiment showed that a meaningful use of translation alignment can empower a reflective and active approach to Classical sources, by means of the continuous, systematic comparison of the cultural and semantic depths embedded in the language. Of course, translation alignment should not be the only option: digital technologies offer many opportunities for enhancing the reading experience as a philological exploration, through the interaction of many different data types, allowing a sophisticated approach to information from multiple perspectives. Even though these tools have been created to empower the reading processes specific to Classical scholars, their application promises new ways of approaching digital content in a much wider context, going beyond the categories of “close reading” and “skimming.”

Translation alignment is a tool that can empower a thoughtful and meaningful approach to reading on digital platforms. But more than that, it can also stimulate a deeper respect for cultural differences. In an increasingly globalized world, translations as means of communicating across cultural contexts and languages are increasingly important: automated translations, as well as interpreters and professional translators, represent a response to a generalized need for fast and broad access to information produced in different cultural contexts. However, being able to access translated content so easily can result in oversimplification, and in the overlooking of cultural complexities. Aligned translations offer an alternative. By discouraging the idea that every word has an exact equivalent, aligned translations add value to the original, rather than subtracting it, through a continuous dialogue between cultural and linguistic systems. Engaging with a translation meaningfully means so much more than merely establishing equivalences: by emphasizing the depth of semantic differences, it can promote better attitudes to cultural diversity and acceptance.

Notes

[1] In this sense, reading an ancient text is much closer to literary criticism than to the study of a foreign language. This is the reason why Classical languages are never fully embedded in current practices of foreign language teaching and assessment. This topic was recently treated, among others, by Nicoulin (2019).
[2] This feature is currently available for Greek, Arabic, Persian, Armenian, and Georgian.
[3] Translations were continuously used to ensure communication between different cultures and communities in the ancient world (Bettini 2012; Nergaard 1993). The practice of multi-lingual aligned texts as means of cultural communication was normal, if not frequent, in antiquity, with famous examples like the inscriptions of Behistun, the edicts of Ashoka the Great, and the Rosetta Stone.
[4] A variant of this assignment was also tested on a group of students with no knowledge of Greek, enrolled in courses of literature in translation (Palladino, Foradi, and Yousef forthcoming).
[5] Interestingly, trilingual alignment was used by some students to improve their mastery of a third language, often a modern one, by leveraging their knowledge of their native tongue and the ancient language (Palladino, Foradi, and Yousef forthcoming).
[6] Incidentally, the “faithfulness” of a translation as a value judgement was introduced by the Christians: since God imprints his image on the text, every version of that text needs to be a faithful reproduction of it. Here resides the miraculous character of the translation of the Septuagint, which, according to tradition, came to be when seventy savants independently wrote an identical translation of the Bible (Nergaard 1993).

Bibliography

Almas, Bridget, and Marie-Claire Beaulieu. 2016. “The Perseids Platform: Scholarship for All!” In Digital Classics Outside the Echo-Chamber, edited by Gabriel Bodard and Matteo Romanello, 171–86. Ubiquity Press. https://doi.org/10.26530/OAPEN_608306.

Berti, Monica, ed. 2019. Digital Classical Philology. Ancient Greek and Latin in the Digital Revolution. Berlin: De Gruyter Saur.

Bettini, Maurizio. 2012. Vertere. Un’antropologia della traduzione nella cultura antica. Torino: Einaudi.

Blackwell, Christopher, and N. Smith. 2019. “DUCAT – Daughter of Ugarit Citation Alignment Tool.” Accessed October 4, 2020. https://github.com/eumaeus/ducat.

Celano, Giuseppe. 2019. “The Dependency Treebanks for Ancient Greek and Latin.” In Digital Classical Philology. Ancient Greek and Latin in the Digital Revolution, edited by Monica Berti, 279–298. Berlin: De Gruyter Saur.

Cook, Guy. 2009. “Foreign Language Teaching.” In Routledge Encyclopedia of Translation Studies, edited by Mona Baker and Gabriela Saldanha, Second Edition, 112–15. London; New York: Routledge, Taylor & Francis Group.

Crane, Gregory. 2019. “Beyond Translation: Language Hacking and Philology.” Harvard Data Science Review 1, no. 2. https://doi.org/10.1162/99608f92.282ad764.

Crane, Gregory, Alison Babeu, Lisa Cerrato, Bridget Almas, Marie-Claire Beaulieu, and Anna Krohn. 2018. “Perseus Digital Library.” Accessed March 5, 2020. http://www.perseus.tufts.edu/hopper/.

Dagan, I., K. Church, and W. Gale. 1999. “Robust Bilingual Word Alignment for Machine Aided Translation.” In Natural Language Processing Using Very Large Corpora, edited by Susan Armstrong, Kenneth Church, Pierre Isabelle, Sandra Manzi, Evelyne Tzoukermann, and David Yarowsky, 209–24. Text, Speech and Language Technology. Dordrecht: Springer Netherlands. https://doi.org/10.1007/978-94-017-2390-9_13.

Foradi, Maryam. 2019. “Confronting Complexity of Babel in a Global and Digital Age. What Can You Produce and What Can You Learn When Aligning a Translation to a Language That You Have Not Studied?” In DH2019: Digital Humanities Conference, Utrecht University, July 9–12. Book of Abstracts. Utrecht.

Lefevere, André. 1992. Translation, Rewriting, and the Manipulation of Literary Fame. London; New York: Routledge.

Natoli, Bartolo, and Steven Hunt, eds. 2019. Teaching Classics with Technology. London; New York: Bloomsbury Academic.

Nergaard, Siri, ed. 1993. La teoria della traduzione nella storia. Milano: Bompiani.

Nicoulin, Morgan. 2019. “Methods of Teaching Latin: Theory, Practice, Application.” Arts & Sciences Electronic Theses and Dissertations, May. https://doi.org/10.7936/znvz-zd20.

Palladino, Chiara, Maryam Foradi, and Tariq Yousef. forthcoming. “Translation Alignment for Historical Language Learning: A Case Study.”

“The Alpheios Project.” 2020. Accessed June 23, 2020. https://alpheios.net/.

“TLG – Thesaurus Linguae Graecae.” 2020. Accessed June 18, 2020. http://stephanus.tlg.uci.edu/.

Wolf, Maryanne. 2018. Reader, Come Home: The Reading Brain in a Digital World. New York: Harper.

Yousef, Tariq. 2019. “Ugarit: Translation Alignment Visualization.” In LEVIA’19: Leipzig Symposium on Visualization in Applications 2019. Leipzig.

About the Author

Chiara Palladino is Assistant Professor of Classics at Furman University. She works on the application of digital technologies to the study of ancient texts. Her current main interests are in the use of semantic annotation and modelling for the analysis of ancient spatial narratives, and in the implementation of translation alignment platforms for reading and investigating historical languages.

Assignments

Introducing GIS in the History Classroom: Mapping the Legacies of the Industrial Era in Postindustrial America

Camden Burd, Eastern Illinois University

Dr. Burd planned this lesson for a 100-level course titled History of American Capitalism. Students built digital maps using ArcGIS Online and later reflected on the benefits of the technology as an educational tool.

Introduction

During the fall 2018 semester, I taught a course titled History of American Capitalism. In addition to a history of economic trends, the course examined the ways in which American capitalism has influenced a set of ideas and cultural attitudes about wealth, citizenship, identity, gender, and the use of natural resources. The course structure was mostly traditional, as the vast majority of instruction blended lecture and seminar-style discussion around several readings. Though I followed this structure for much of the semester, I intentionally designed one module to introduce students to GIS mapping, believing that the spatial tool could be an asset in instruction. My decision to choose GIS mapping grew from a wealth of scholarship demonstrating that spatial tools can improve understanding, critical thinking, and cultural empathy (Hawthorne 2011; Johanson et al. 2012; Kelley 2017; Sinha et al. 2017). By incorporating GIS technology into the course, I introduced students to new digital tools while enabling participants to engage with the course material in a unique way in order to improve historical understanding, critical thinking, and digital literacy.

Overview

During the 75-minute session, I designated 15 minutes to reviewing themes from the previous course sessions. Leading up to the course module, I used lecture and assigned readings based on the work of historians who examined the ecological and economic realities of postindustrial America (Hurley 1995; Neumann 2019). In class, we reviewed how the closures of industrial sites left many Americans unemployed, forcing laid-off employees to find employment elsewhere, often at a fraction of their previous salary. The lesson was designed to teach students that although certain businesses may go bankrupt, move, or dissolve, the firms’ legacies long outlive their corporate existence. We can track America’s postindustrial era both through the ecological footprint of industrialization and through its long-lasting economic void in countless manufacturing communities across the landscape. Tasking students with constructing a GIS map of that historical legacy gave them the opportunity to become active learners in the course content.

After reviewing the course materials, I asked students to open ArcGIS Online and provided a brief overview of the tool. Students came to learn that ArcGIS Online is a free, open-access mapping tool that allows users to upload, visualize, analyze, and share geographic-based information. I chose to use ArcGIS Online in place of similar programs such as QGIS and Carto because the platform is free and also provides users quick access to the tool upon registration. I demonstrated the basic functions of the online version of the program, including how to add features to a map and change the base layer. I then directed students to the feature that allows users to add new datasets or additional layers to maps based on information published to the web. Informed by the preceding lectures and knowledgeable about the basic functions of ArcGIS Online, students had to search for, find, and add a dataset published by the Environmental Protection Agency outlining the National Priorities List of Superfund sites. The dataset provides information on several hundred hazardous sites, each with a brief description that includes the environmental issues and the historic company responsible for the waste. The data includes Superfund sites added to the list as far back as the early 1980s (Congress passed the Comprehensive Environmental Response, Compensation, and Liability Act in 1980) as well as data compiled within months of the class meeting. I asked students to pick a site, examine the scope of the waste, and do some basic searching on the web about the history of the company. Students often chose Superfund sites close to their homes and were often surprised to find that such toxic waste had ever existed so close to where they grew up. At this point, we had a brief conversation in which students shared a specific site and its industrial history. As expected, many of the Superfund sites derived from companies that either closed down, moved, or went bankrupt during the peak deindustrialization years. The visualization created a space for discussion among students as they quickly realized that the toxic legacy of many industrial companies outlived the firms’ years of operation.

The lesson not only tracked the historical roots of modern Superfund sites, it also pressed students to think about how modern populations are still affected by a company’s actions several decades ago. After a vibrant discussion about the Superfund dataset, I asked students to add another dataset that maps poverty ratios based on recent Census statistics (the published data was based on the US Census Bureau’s American Community Survey for 2013). As the students built confidence in their ability to navigate the ArcGIS Online tool, they began to realize the potential of map building as an instrument for sharing information and demonstrating visual evidence. Upon adding the new dataset, students explored the geographic relationship between modern poverty rates and the location of toxic waste sites. We began to discuss how adding datasets as separate layers influenced the first set of data and how the correlation between the two might contribute to a larger story of deindustrialization. Students began to imagine how visualizing both sets of data in geographic terms creates visual correlations as both an argument about the information as well as a vehicle to share this information with external audiences. After self-guided exploration, the students came back together for discussion. I asked several questions using the visualization as a source of conversation and critical engagement with the history of the postindustrial era. “Did you know that the American landscape contains this many toxic landscapes? Are there any sites close to your hometown? As you explore the map, do you get a sense of any correlation between poverty rates and current Superfund sites?” The corresponding discussion was strengthened as students navigated the spatial visualization they recently created.

Figure 1. Map with Superfund site and poverty rate information displayed.

Student Reactions

Using an anonymous and voluntary questionnaire, I asked students to reflect on the use of ArcGIS Online and its effectiveness during the course session. In addition to gauging students’ reactions to the specific lesson, I also encouraged students to think more broadly about GIS technology and imagine its possibilities outside of this particular course. Below are sample items from the questionnaire:

What past experiences have you had with ArcGIS or other data-visualization technology?
When you partook in the historical data-visualization learning module, what did you think you were learning?
How could you imagine using ArcGIS and other data-visualization software in the future?

Of the twenty-one students in the course, six volunteered to participate in the survey, and only one student noted previous experience with the tool. The responses could be organized into two basic themes. First, students reflected on how ArcGIS aided in learning the specific content affiliated with the History of American Capitalism course. Second, students demonstrated an understanding of how the tool could be applied to other research and writing.

Based on student responses, it became apparent that ArcGIS Online improved student learning and comprehension of the specific lesson. One student noted the connection between the GIS learning module and the larger course themes: “We used two layers on the US map—households below the poverty line and superfund sites—to determine whether there was a particular correlation between the two.” Another student noted that the module provided the “same historical research and content as one would [get] through reading a book or paper.” The students’ comments also revealed how students welcomed the course module. The use of ArcGIS Online broke with more traditional forms of engagement such as journal articles, books, and other text-based course material providing students with different learning styles new opportunities to participate in the class.

One respondent especially appreciated the spatial focus of the exercise, noting that the lesson made them aware of “how History and Geography are linked,” and that the “data visualization allow[ed] for patterns to be observed.” The comments reveal ways in which ArcGIS Online might be harnessed as a powerful tool for student learning in the history and humanities classroom. For those lessons that involve geographic information, the careful use of mapping technology offers students an opportunity to become active learners. The process of building the map allowed students to critically engage with course material by using visualizations and geographic information as a form of historical argumentation.

The exercise also exposed students to a technology not commonly used in a history classroom. Participants expressed an enthusiasm to use ArcGIS in future assignments or courses, including one student who wrote, “With regards to writing historical research papers that [focus] a lot on specific data, ArcGIS would be an amazing source to back up particular claims within a study.” Another student echoed that message noting, “I would imagine using ArcGIS or another data visualization software as a tool for presenting research to an audience, i.e. giving them something more interesting to look at rather [than] just writing on a page and describing findings using only words.” Though their exposure to ArcGIS Online was brief, students who participated in the ArcGIS Online module and responded to the questionnaire noted an interest in the technology and noted its value in teaching the course content.

Conclusion

After reviewing the participants’ comments, I was struck by their eagerness to use ArcGIS Online more often in the classroom. It became clear throughout the classroom activity, as well as subsequent reviews, that students found the mapping software a helpful tool for learning. As the instructor, I was both affirmed by the comments and curious about the ways that I might be able to use student feedback for future course design. For example, I could plan courses with more mapmaking and data visualization as a form of active learning. In this scenario, students would design and build maps from course material that contains geographic information. By designing individual course modules in this way, I could help students become more familiar with ArcGIS Online and feel emboldened to use the technology in other classes and outside coursework. Additionally, I could imagine providing students with more sustained interaction by using ArcGIS StoryMaps (a related program that integrates images, video, long-form writing, and traditional mapmaking) to design and publish longer histories as a final assignment. Both scenarios allow students to further engage with the mapping software and increase active learning time in the course.

Overall, the responses strengthen the claims of digital humanists who advocate for the use of technology in the classroom (Bonds 2014; Clement 2012; Iantorno 2014; Jakacki 2016; Locke 2017). As many digital humanists argue, the use of new technologies offers an opportunity to diversify curriculum, expand the ways in which students engage with course content, and introduce thoughtful engagement with new digital tools of the twenty-first century. It is my hope that this course module and related student feedback provide a roadmap for educators who wish to incorporate more hands-on and active-learning activities into humanities education. Given the students’ eagerness to engage more with ArcGIS Online, as well as their abilities to envision future applications for the tool, I believe the use of digital mapping tools will enhance student engagement and learning in the humanities classroom.

Bibliography

Bonds, E. Leigh. 2014. “Listening in on the Conversations: An Overview of Digital Humanities Pedagogy.” The CEA Critic 76, no. 2 (July): 147–157.

Clement, Tanya. 2012. “Multiliteracies in the Digital Humanities Curriculum: Skills, Principles, and Habits of Mind.” In Digital Humanities Pedagogy: Practices, Principles, and Politics, edited by Brett D. Hirsch, 365–388. Cambridge, UK: Open Book Publishers.

Hawthorne, Timothy L. 2011. “Communities, Cartography and GIS: Enhancing Undergraduate Geographic Education with Service Learning.” International Journal of Applied Geospatial Data 2, no. 2: 1–16.

Hurley, Andrew. 1995. Environmental Inequalities: Class, Race, and Industrial Pollution in Gary, Indiana, 1945–1980. Chapel Hill, North Carolina: The University of North Carolina Press.

Iantorno, Luke A. 2014. “Introducing Digital Humanities Pedagogy.” The CEA Critic 76, no. 2 (July): 140–146.

Jakacki, Diane. 2016. “Doing DH in the Classroom: Transforming the Humanities Curriculum through Digital Engagement.” In Doing Digital Humanities: Practice, Training, Research, edited by Constance Crompton, Richard J. Lane, and Ray Siemens, 358–372. New York: Routledge.

Johanson, Chris, Elaine Sullivan, Janice Reiff, Diane Favro, Todd Presner, and Willeke Wendrich. 2012. “Teaching Digital Humanities through Digital Cultural Mapping.” In Digital Humanities Pedagogy: Practices, Principles, and Politics, edited by Brett D. Hirsch, 121–150. Cambridge, UK: Open Book Publishers.

Kelley, Shannon. 2017. “Getting on the Map: A Case Study in Digital Pedagogy and Undergraduate Crowdsourcing.” DHQ: Digital Humanities Quarterly 11, no. 3: http://www.digitalhumanities.org/dhq/vol/11/3/000330/000330.html.

Locke, Brandon. 2017. “Digital Humanities Pedagogy as Essential Liberal Education: A Framework for Curriculum Development.” DHQ: Digital Humanities Quarterly 11, no. 3: http://www.digitalhumanities.org/dhq/vol/11/3/000303/000303.html.

Neumann, Tracy. 2019. Remaking the Rust Belt: The Postindustrial Transformation of North America. Philadelphia, Pennsylvania: University of Pennsylvania Press.

Sinha, Gaurev, Thomas A. Smucker, Eric J. Lovell, Kgosietsile Velempini, Samuel A. Miller, Daniel Weiner, and Elizabeth Edna Wangui. 2017. “The Pedagogical Benefits of Participatory GIS for Geographic Education.” Journal of Geography 116, no. 4 (August): 165–179.

About the Author

Camden Burd holds a PhD in History from the University of Rochester. From 2016–2018 he was an Andrew W. Mellon Fellow in the Digital Humanities. In 2018, the Center for the Integration of Research, Teaching, and Learning at the University of Rochester awarded him a Teaching-as-Research Fellowship to study student reactions to digital technologies in the humanities classroom. He was also named a Humanities, Arts, Science, and Technology Alliance and Collaboratory Scholars Fellow from 2017–2019. In fall 2020, Burd will begin as an Assistant Professor of History at Eastern Illinois University.

Assignments / Short Form Pieces

Computational Thinking-Centered Pedagogy: A Collecting Data with Web Scraping Workshop

Zach Coble, New York University

Introduction: The Challenges of Library Instruction

Library-based instruction can be a tricky thing. We usually only have one chance to give a workshop or to visit a class, so there is a pressure to get it right the first time. This is hard enough when teaching first-year students the basics of information literacy, and it presents an additional set of challenges for technology-based instruction. New technical skills are rarely acquired in 60- or 90-minute sessions. They more often require longer periods of study and build on a foundation of other technical skills that one cannot assume all participants will have (Shorish 2015; Locke 2017).

NYU Libraries offers a range of technical workshops designed to provide this technical foundation, and sessions cover topics such as quantitative and qualitative software, GIS and data visualization, research data management, and digital humanities approaches. While these workshops can be taken as one-off sessions, they are designed as part of an interwoven curriculum that introduces technical skills and concepts in an incremental way. The Collecting Data with Web Scraping workshop discussed here is offered as a digital humanities course and, while there are no prerequisites and it is open to all, participants are encouraged to take the Introduction to Python and Text as Data in the Humanities workshops in advance (NYU Libraries 2019).

The workshop introduces web scraping techniques and methods using Python’s Beautiful Soup library, with a focus on developing participants’ computational thinking skills. I always emphasize that no one becomes an expert on web scraping in this 90-minute workshop, especially given that some have no previous programming experience. However, participants still learn valuable skills and concepts and through this process develop a more foundational understanding of computational logic and its affordances when applied to digital research. I call this computational thinking and it is the primary learning outcome of the workshop.

Agenda and Learning Outcomes

The workshop is divided into four sections, with the agenda as follows:

Why use web scraping?
What are the legal and ethical implications?
Technical introduction and setup
Hands-on web scraping exercises

The sections are designed to fulfill the workshop’s learning outcomes:

Strengthen computational thinking skills
Learn the concepts and basic approaches to web scraping
Understand how web scraping relates to academic research
Understand the broader legal and ethical context of web scraping

A Computational Thinking Centered Pedagogy

The primary learning objective of this workshop is to help participants strengthen their computational thinking skills. A basic working definition of computational thinking is understanding the logic of computers. It seems obvious, yet worth stating, that computers prioritize different patterns of logic than humans. There are multiple layers of complexity to understanding computational logic and then applying it in real-world research and teaching environments, and my approach is to reveal and make explicit some of these layers. For example, one of the core activities of the workshop is an in-depth look at how websites are packaged and how data, broadly defined, is structured within them using HTML and CSS. This close look at one of the building blocks of the web then allows us to identify patterns in this structured data in order to extract the useful pieces of information and build the collection. More importantly, these lessons are applicable to contexts beyond web scraping and are transferable to our other workshops or to any activity involving data work. This gives the workshop added value and empowers participants to be more confident and comfortable using technology in their research (Taylor et al. 2018).

In addition to identifying patterns in structured data, there are countless other opportunities to provide insights that give participants a deeper understanding of how technology works. For instance, when introducing the Beautiful Soup library, I describe how programming libraries are just blocks of code that allow us to write our program with 10 lines of code instead of 100. There are several web scraping programs written in other languages, but I chose Python because it has a robust developer community. That is, there are people contributing to a whole network of libraries, like Beautiful Soup, that serve to expand the functionality so that once you have extracted data from a website and are ready to analyze it, you can simply import another library, such as spaCy or the Natural Language Toolkit (NLTK) to do your next phase of work (Explosion AI 2019; NLTK Project 2019). When built into the curriculum in a thoughtful way, these parenthetical notes make it easier to learn the material at hand and also to establish a wider technical context for the work.
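To make the point about libraries saving lines of code concrete, here is a minimal sketch that counts word frequencies twice: once by hand and once with `collections.Counter` from Python's standard library. The sample sentence and the function names are invented for illustration, and `Counter` only stands in for larger libraries such as Beautiful Soup, spaCy, or NLTK.

```python
from collections import Counter

# Hypothetical sample text standing in for data scraped from a website.
text = "the quick brown fox jumps over the lazy dog the end"


def count_words_by_hand(words):
    """Count words with plain loops: the logic a library would otherwise hide."""
    counts = {}
    for word in words:
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


def count_words_with_library(words):
    """Do the same job in one line by reusing code someone else wrote."""
    return Counter(words)


words = text.split()
by_hand = count_words_by_hand(words)
with_library = count_words_with_library(words)

print(by_hand["the"])
print(with_library.most_common(1))
```

Both functions produce the same counts. The library version is shorter and also brings extra functionality, such as `most_common`, at no additional cost.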

Why Use Web Scraping?

In addition to inserting computational thinking vignettes throughout the workshop, I find it helpful to begin with a discussion of why one might use web scraping. Since the workshop’s primary audience is humanists, this discussion of when web scraping is (and is not) appropriate and how it can be used in research is particularly useful. For example, as more and more primary and secondary source materials are appearing on/as websites, it is increasingly common for scholars to need to gather this material. Within libraries, archives, and museums, initiatives such as Collections as Data underscore a shifting approach whereby library collections are conceptualized and provided as data (Always Already Computational 2019). Projects such as OPenn demonstrate how a library’s digitized special collections can be made accessible in machine-readable form, ready for large-scale analysis (University of Pennsylvania Libraries 2019). An additional example, the New York Society Library’s City Readers project, presents the Library’s early circulation records as data, allowing users to, for example, compare whether John Jay or John Jacob Astor read more books in a given year (The New York Society Library 2019). Such examples help participants envision how they could use web scraping in their work.

Figure 1. A data visualization from the New York Society Library comparing circulation statistics among patrons.

Another core concept of the workshop is that web scraping will become one of many skills in participants’ “digital toolbox,” and can connect with other technical skills used in the research lifecycle. For example, data gathered from web scraping is often messy and needs additional scrubbing in a program like OpenRefine (MetaWeb Technologies, Inc. 2019). Or, web scraping might be just one step in a text analysis project, and you might next want to use a named entity recognition (NER) package to extract names of people or places from the scraped dataset.
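As a rough illustration of what “scrubbing” can involve, the sketch below normalizes whitespace, drops empty entries, and removes duplicates that differ only in capitalization. The titles and the `clean_titles` function are hypothetical. A tool such as OpenRefine offers far richer, interactive versions of these steps.

```python
import re

# Hypothetical titles as they might come out of a scraper: stray whitespace,
# inconsistent capitalization, an empty entry, and a duplicate.
raw_titles = [
    "  Freelance   Editor Needed ",
    "freelance editor needed",
    "",
    "Translator (Spanish)\n",
]


def clean_titles(titles):
    """Normalize whitespace, drop empty entries, and remove case-insensitive duplicates."""
    seen = set()
    cleaned = []
    for title in titles:
        # Collapse runs of spaces, tabs, and newlines into single spaces.
        normalized = re.sub(r"\s+", " ", title).strip()
        key = normalized.lower()
        if normalized and key not in seen:
            seen.add(key)
            cleaned.append(normalized)
    return cleaned


cleaned_titles = clean_titles(raw_titles)
print(cleaned_titles)
```

Whether two entries that differ only in capitalization really are duplicates is a research decision, not a technical one, so a rule like this should be checked against the source before it is applied to a full dataset.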

What are the Legal and Ethical Implications?

Next is a conversation about the legal and ethical implications of web scraping. The key lesson here is that just because you can scrape a website, it doesn’t mean you should. It is important to first check a site’s terms of use policy to understand whether there are rate limitations or if scraping is outright prohibited. Collecting certain types of online data on human subjects (e.g. some types of social media data) will require IRB approval. After collecting data, scholars will also need to consider how the data will be stored or archived and whether this has the potential to put others at risk. This is a particularly pertinent concern for materials dealing with controversial subject matters or underrepresented groups. The Documenting the Now project has many great resources to help navigate these often complex issues (Documenting the Now Project 2019).
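Alongside the terms of use, many sites publish a robots.txt file that states in machine-readable form which paths automated programs may visit and how long they should wait between requests. The sketch below reads such a file with Python's standard `urllib.robotparser` module. The robots.txt content, the example.org URLs, and the user-agent name are all invented for illustration. Note too that robots.txt complements the terms of use and does not replace reading them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. In practice it would be downloaded from the
# site itself (for example with RobotFileParser.set_url() followed by read()).
robots_txt = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# example.org and the user-agent name are placeholders.
user_agent = "workshop-scraper"
allowed_url = "https://example.org/listings/index.html"
blocked_url = "https://example.org/private/records.html"

can_scrape_listings = parser.can_fetch(user_agent, allowed_url)
can_scrape_private = parser.can_fetch(user_agent, blocked_url)
delay_seconds = parser.crawl_delay(user_agent)

print(can_scrape_listings, can_scrape_private, delay_seconds)
```

With this sample file, the listings page is allowed, the private path is not, and the requested pause between requests is ten seconds. A scraper could honor that pause by calling `time.sleep(delay_seconds)` between page requests.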

In terms of research best practices, it also takes some data literacy basics to evaluate your target source. There is a lot of garbage online, so how do you know the data is what it claims to be? Is it representative and what biases does it contain? And research projects using digital sources or methods are no different from more traditional approaches in that getting the data or producing a visualization of it is often not the end of a project. In most cases, the data must then be analyzed in a theoretical framework of the scholar’s discipline in order to form a scholarly argument. The earlier cited example of the New York Society Library illustrates this well – the circulation record visualization shown above is an interesting anecdote but the image is a relatively simple data visualization and does not actually tell us anything meaningful about, say, the American Revolution or eighteenth-century reading patterns.

Using Beautiful Soup

While asking participants to bring their own laptops and setting them up with their own Python environment provides rich opportunities for moments of computational thinking, it is time intensive, demanding on the instructor, and requires a longer workshop. A simpler approach is to use an already existing environment such as JupyterHub, PythonAnywhere, or a computer lab with Jupyter Notebook installed (Project Jupyter team 2019; PythonAnywhere LLP 2019; Project Jupyter 2019).

Beautiful Soup is a Python library for extracting textual data from web pages (Richardson 2019). This data could be dates, addresses, news stories, or other such information. Beautiful Soup allows you to target specific data within a page, extract the data, and remove the HTML markup surrounding it. This is where computational thinking skills are needed. Webpages are intended to be machine readable via HTML. The goal is to write a program, in machine readable form, that extracts this data in a more human readable form. This requires that we “see” as our computers “see” in order to understand that if, for example, we want the text of an article, we need to write a program that extracts the data between the paragraph tags.

<p></p>

Once we understand the underlying rules for how pages are displayed – i.e. using HTML and CSS – we can start to see the patterns in how content creators decide to present different types of information on pages. And that is the computational thinking logic behind web scraping: identifying these patterns so that you can efficiently extract the data you need.
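The idea of extracting whatever sits between paragraph tags can be sketched with Python's built-in `html.parser` module. This is a deliberately low-level stand-in, chosen so that the example runs without installing anything and so that each step the computer takes is visible. Beautiful Soup wraps the same logic in a much shorter interface. The HTML fragment and the `ParagraphExtractor` class are invented for illustration.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: one heading, two paragraphs, and a footer.
html = """
<html><body>
<h1>Local News</h1>
<p>The library opens at <b>nine</b>.</p>
<div class="footer">Contact us</div>
<p>Admission is free.</p>
</body></html>
"""


class ParagraphExtractor(HTMLParser):
    """Collect the text found between <p> and </p>, ignoring everything else."""

    def __init__(self):
        super().__init__()
        self.inside_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.inside_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.inside_paragraph = False

    def handle_data(self, data):
        # Text inside nested tags such as <b> still belongs to the paragraph.
        if self.inside_paragraph:
            self.paragraphs[-1] += data


extractor = ParagraphExtractor()
extractor.feed(html)
print(extractor.paragraphs)
```

The heading and footer are skipped because they are not inside paragraph tags, while the bold word is kept because it is. The markup itself is stripped away, leaving only the human-readable text.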

Computational Thinking in Action

The examples used in the workshop are available online (Coble 2019), and working through the first example – collecting the titles from the Craigslist page for writing, editing, and translation – will illustrate some of these concepts. While the research value of this data is rather limited, it is a straightforward example to introduce basic techniques that are built upon in subsequent examples.

Figure 2. Screenshot of Craigslist page for writing / editing / translation.

The first step is to use the browser’s View Source feature to look at the page’s HTML code. Not only do we get a quick glimpse into how the data is structured, we can also begin to identify the parts of the code that uniquely mark the title of these posts.

Figure 3. Screenshot of Craigslist page source code for writing / editing / translation.

For example, here is the source code for the first post on our page:

<a href="https://newyork.craigslist.org/mnh/wet/d/brightwaters-college-tasks-essay-exams/6998844623.html" data-id="6998844623" class="result-title hdrlnk">🎼 🎼 College Tasks | Essay | Exams | Course Help 🎼 🎼</a>

Let’s start by breaking this into parts:

a href="https://newyork.craigslist.org/mnh/wet/d/brightwaters-college-tasks-essay-exams/6998844623.html"

The above part is the link to the full post. We don’t want this because it’s not the title.

data-id="6998844623"

This looks better, but data-id appears to be a unique identifier for a specific post. If we write a program to search for this, it will only return one title. This won’t work because we want all titles of posts on our page.

class="result-title hdrlnk"

This looks much better. But there are actually two class names here, “result-title” and “hdrlnk” (combined in a single class attribute and separated by a space), so which one is best? We can do a quick check by searching on the View Source page – using Cmd+F or Ctrl+F – for “result-title.” There are 120 posts displaying on my page, and the search for “result-title” returns 120 results. Bingo!

Figure 4. Screenshot of source code for Craigslist and the browser’s search feature.

We can repeat this process for “hdrlnk,” which, in this case, also returns 120 results. So we can comfortably use either “result-title” or “hdrlnk” for our program. To be safe, I would also do a quick manual check of other links on the page – both links for posts and other links (My Account, Save Search, etc.) – to confirm that “result-title” and “hdrlnk” are unique strings that will return the post’s title and only the post’s title.
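The manual search can also be scripted. The following is a minimal sketch, assuming the page source has been saved as a string; it counts how many links carry each class name, again using only Python's standard library. The LinkClassCounter class and the four sample links are hypothetical stand-ins, not the real Craigslist page.

```python
from collections import Counter
from html.parser import HTMLParser


class LinkClassCounter(HTMLParser):
    """Count how many <a> tags carry each class name."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        class_value = dict(attrs).get("class") or ""
        # a class attribute can hold several space-separated names
        for name in class_value.split():
            self.counts[name] += 1


# hypothetical stand-in for the saved page source
sample_html = """
<a href="/account" class="header-link">My Account</a>
<a href="/post/1" class="result-title hdrlnk">French writer and translator</a>
<a href="/post/2" class="result-title hdrlnk">Experienced Proofreader At Your Service</a>
<a href="/post/3" class="result-title hdrlnk">Writers for FrontPage.nyc</a>
"""

counter = LinkClassCounter()
counter.feed(sample_html)
counts = counter.counts
print(counts["result-title"], counts["hdrlnk"], counts["header-link"])
```

If the count for a class name matches the number of posts on the page, and no navigation link shares that name, the class is a reasonable candidate for the selector.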

And this is the computational thinking the workshop helps to build. By understanding how web pages use HTML and CSS to structure their contents, we are able to isolate patterns unique to our target data and to use these patterns to extract the target data. Once we have these pieces in place, we can write a program that looks like this:

# import the urllib library to get the HTML
import urllib.request
# import the Beautiful Soup library to parse the HTML
from bs4 import BeautifulSoup

# define a variable with our web page
start_url = 'https://newyork.craigslist.org/search/wet'
# ask urllib to get the HTML from our web page
html = urllib.request.urlopen(start_url).read()
# ask Beautiful Soup to parse the web page as HTML
soup = BeautifulSoup(html, 'html.parser')
# ask Beautiful Soup to extract the titles
titles = soup.select('.hdrlnk')

# for loop to print each title
for title in titles:
    print(title.text)

And get something back that looks like this:

🎼 🎼 College Tasks | Essay | Exams | Course Help 🎼 🎼
Writing and English tutor. NYU and Columbia graduate.
Writing/Essay Assistance, $80. NYU and Columbia graduate.
Versatile Content Writer Provides Top Notch Business-Related Material
Experienced Proofreader At Your Service
Screenplay Solutions! Writing, Edits, Formatting, Etc.
Thesis, Research, Dissertations, Publications, Presentations.  Ivy
--> A Special Speech for a Special Event? |  Hire a Professional
Need a Bio? For profiles, websites, expert collateral, exec resumes
School and college coursework & essays w r i t i n g Service
$25 resume editing & consulting for students and young professionals
Don't Just Talk! Communicate - Medical School Intervew
Grad/law/MBA/med school personal statements due?
FOR HIRE: AWARD-WINNING, IVY-EDUCATED EDITOR/SCRIPT CONSULTANT
Pay me write your essay, edit your work, take an classes fully online
FAST Affordable Dissertation and Academic EDITING-NonNative English OK
Versatile Content Writer Provides Top Notch Business-Related Material
Winning Resume, Cover Letter and LinkedIn Package For $30
French writer and translator
Writers for FrontPage.nyc
Academic Intervention & Paper Writing

Conclusion

Bringing computational thinking concepts to the forefront of the workshops has been successful and has resulted in more engaging sessions. Participant feedback has indicated that having a greater contextual understanding of web scraping and learning about its underlying principles has helped them to better understand its potential applications and to feel more confident in doing their work. Given the nature of library-offered technical workshops, focusing on a computational thinking-centered pedagogy has been successful in helping participants to meet their specific need to pick up a new skill as well as to meet a less often stated need to understand how and why a particular tool or approach is situated within larger research and technology ecosystems.

Bibliography

Always Already Computational – Collections as Data. 2019. “Always Already Computational – Collections as Data.” https://collectionsasdata.github.io/.

Coble, Zach. 2019. Code examples from Collecting Textual Data with Web Scraping workshop. https://github.com/coblezc/webscraping-workshop.

Documenting the Now Project. 2019. “Documenting the Now.” https://www.docnow.io/.

Explosion AI. 2019. “SpaCy – Industrial-Strength Natural Language Processing in Python.” https://spacy.io/.

Locke, Brandon T. 2017. “Digital Humanities Pedagogy as Essential Liberal Education: A Framework for Curriculum Development.” Digital Humanities Quarterly Volume 11, no. 3. http://www.digitalhumanities.org/dhq/vol/11/3/000303/000303.html.

Metaweb Technologies, Inc. 2019. “OpenRefine.” http://openrefine.org/.

NLTK Project. 2019. “NLTK 3.4.5 documentation.” https://www.nltk.org/.

NYU Libraries. 2019. “NYU Libraries Classes.” New York, NY: New York University. https://nyu.libcal.com/.

Project Jupyter. 2019. “Project Jupyter.” https://jupyter.org/.

Project Jupyter team. 2019. “JupyterHub.” https://jupyterhub.readthedocs.io/.

PythonAnywhere LLP. 2019. “Host, run, and code Python in the cloud: PythonAnywhere.” https://www.pythonanywhere.com/.

Richardson, Leonard. 2019. “Beautiful Soup.” https://www.crummy.com/software/BeautifulSoup/.

Shorish, Yasmeen. 2015. “Data Information Literacy and Undergraduates: A Critical Competency.” College & Undergraduate Libraries Volume 22, no. 1: 97–106. https://doi.org/10.1080/10691316.2015.1001246.

Taylor, Natalie G., J. Moore, M. Visser, and C. Drouillard. 2018. “Incorporating Computational Thinking into Library Graduate Course Goals and Objectives.” School Library Research Volume 21. http://www.ala.org/aasl/sites/ala.org.aasl/files/content/aaslpubsandjournals/slr/vol21/SLR_IncorporatingComputationalThinking_V21.pdf.

The New York Society Library. 2019. “City Readers.” https://cityreaders.nysoclib.org/.

University of Pennsylvania Libraries. 2019. “OPenn.” http://openn.library.upenn.edu/.

Data visualization of student activity (created in Gource)

Issue Sixteen

Beyond the Fear of Failure: Towards a Method for Student Experiential Autobiography Mapping (SEAM)

Andrew Roth, Brock University

Alex Christie, Brock University

Abstract

This article advances a pedagogical ethos, which we call SEAM (Student Experiential Autobiography Mapping), that deliberately interweaves the interests of students, staff and faculty. As we argue, it additionally facilitates the design of project-based assignments that foreground the instructive value of failure. Within this context, we discuss instances where specific technological failures experienced in our fourth-year practicum have prompted us to change the way we teach our first-year courses and administer our workstations and servers. Doing so creates a feedback loop that allows us to incrementally refine our curriculum over time. After outlining the theoretical context for this approach and detailing how it allows students to learn from productive failure, we discuss a case study in implementing our SEAM approach in the classroom. As part of this discussion, we share practical examples for designing digital humanities assignments that incorporate failure as a learning outcome. We then go on to advance a longitudinal methodology for visualizing student learning over the course of an entire program, incorporating student technobiographies and user story mapping. Combined, these pedagogical strategies facilitate reflective student, staff, and faculty practices that allow a digital humanities curriculum, and chosen teaching tools, to grow and adapt over time.

Our Interactive Arts & Science students were less than two weeks away from competing in the LevelUp Student Showcase with their videogame created in our fourth-year capstone course. Yet minutes before a public play testing session they had encountered a show-stopping bug. Random text and textures in their game were mysteriously replaced with glyphs: strings of cipher strewn throughout their game world, strange portents whose only underlying message appeared to be the obvious: the game was unplayable, unreadable, and no one had the slightest idea as to the cause. Such failures are common in complex projects, from renovating classrooms to building a digital game. In our respective staff and faculty roles at Brock’s Centre for Digital Humanities, we are concerned at once with building and administering digital humanities infrastructure (i.e. workstations, servers, collaborative spaces) and reflecting upon how failures within those systems impact student learning. As we collaborate across our staff and faculty roles, we increasingly find the most potentially instructive failures occur when students brush up against the limitations of a particular tool. As a result, we are developing a broadly applicable digital pedagogy, combining technobiographies (Henwood 2001; Ching and Vigdor 2005; Brushwood-Rose 2006) and user story mapping (Patton 2014), that teaches students, staff, and faculty to learn from the productive failures that occur when we encounter the unforeseen limitations of the tools we use. Such learning involves deploying tools to solve a problem but also refining learning outcomes to enhance student-led problem solving using those tools. Operating in a multi-perspectival mode that resists partitioning the interests of students, faculty, and staff, we call this approach Student Experiential Autobiography Mapping, or SEAM.

SEAM sees the experiences of teachers, learners, and support staff as multi-threaded facets of shared knowledge environments and thus endeavors to further interweave them. This approach to digital pedagogy is a result of our ongoing collaborative work on the architecture of our first-year survey courses in the Interactive Arts & Science and GAME programs. These courses prepare our students for our third and fourth-year curriculum in which they are expected to collaboratively produce digital media objects, including innovative websites, digital art, and videogames. A notable challenge is using past failures, which tend to be tool-specific, to inform program outcomes, which are high-level objectives (such as learning from one’s successes and failures). Each year, a curriculum committee meets to assess the program outcomes provided as guidance to instructors to refine existing or develop new assignments. The SEAM approach to digital pedagogy outlined below describes how our method for changing infrastructure and assignments in response to our collective past failures continues to evolve. It is intended to keep a record of diverse student experiences while also helping us learn from the inevitable future failures that inform our curriculum development discussions.

We are piloting our SEAM approach to digital pedagogy at three points in a cyclical process during a four-year degree program. First, we equip students with problem-solving and troubleshooting abilities early in their program. Second, examples of critical tool failure in the fourth-year capstone courses circulate between students and instructors in our programs as cautionary tales. Changes in infrastructure, such as the addition of version control servers on campus, are material evidence of responding to failures from yesteryear; however, the narrative of student failure motivates their use. At the third point, once these changes have been made, they are incorporated back into the design of our first-year assignments. In the case of our fourth-year capstone students using version control, it is tempting to view the deployment of a server with version control, a tool, as the solution to a problem. However, paradoxically, the version control server is only a useful tool if it has been used proactively, and consistently, by students. Instructing students to use version control in their first assignments (despite its complexity) therefore sets the expectation that they will encounter failure later in the program.

Foregrounding technological failure at the start of our curriculum, we believe, enlivens students’ sensibilities to the creative potential of the tools we teach. Indeed, as Julia Flanders affirms: “The very seamlessness of our interface with technology is precisely what insulates us and deadens our awareness of these tools’ significance” (2019, 292). Having introduced and framed failure as constructive, we intend to map student experiences of failure throughout the program (with particular emphasis on the fourth-year capstone course), and use results gathered from such mapping to continually reflect upon and refine our first-year curriculum over time. Most importantly, we are conceiving of a SEAM approach as a way to continually shape and refine the infrastructure in our digital humanities centre in response to changing student needs over time. Our final goal is a structured collection of autobiographical interviews with graduating students; this collection will serve as a knowledge database that we use to improve the learning objectives tied to future course development work. Using a design exercise called user story mapping, in which hypothetical users derive benefits from their actions, we will derive hypothetical case studies from the knowledge base and use them to inform faculty and staff decision-making related to our curriculum. We contribute our method as a working blueprint for collaboration between staff and faculty in the field of digital pedagogy.

Our method aligns itself with the seamful design of networked knowledge outlined by Aaron Mauro, Daniel Powell, and co-authors, who “wish to expose the seams that knit technological infrastructure and academic assessment for both faculty and students working on DH projects” (2017). While our approach concerns itself specifically with the classroom, rather than the context of student research on digital humanities projects discussed by Mauro et al., we equally believe that exposing students to seams (be they the ruptures and fissures that exist when tools break down or the threads that bind their own learning together with that of faculty and staff) empowers them to take an active role in their education as critical users and creators of technology. As Mauro et al. put it, “When we elide the seams between teaching and research, our students become passive agents and mere consumers of education” (2017). By teaching our students object-lessons in instructive failure, we aim to empower them to see digital environments not as spaces that demand rote repetition of established workflows but as creative problem-solving environments in which limitations and constraints can serve a liberating potential.

As the digital humanities continues to establish itself within disciplinary and institutional frameworks, discussions about the state of the field are increasingly turning from small-scale and ad-hoc stories of how different spaces operate to longer narratives about how these spaces continue to change and evolve over extended durations of time. Within this context, our SEAM approach is meant to offer a framework within which digital humanities, broadly, can draw from digital pedagogy, specifically, in order to reflect upon its diverse narratives of institutional establishment, adaptation, and maturation. In what follows, we discuss how we are implementing such an approach in our curriculum. First, we outline our experiences of instructive failure in the context of digital humanities infrastructure. We go on to discuss the design of project-based digital humanities assignments that incorporate instructive failure as a learning outcome. Finally, we conclude by outlining a method for collecting and reflecting upon student experiences of failure over time.

Beyond the Fear of Failure

The instructive value of failure is hardly new to the digital humanities. As John Unsworth reminds us, “Our failures are likely to be far more difficult to recover in the future, and far more valuable for future scholarship and research, than those successes” (1997). More recently, Bethany Nowviskie has renewed the value of failure in an age where ruptures in physical research materials prompt reflection upon ongoing institutional reformulations of humanities work; as she writes, “It’s worth reflecting that tensions and fractures and glitches of all sorts reveal opportunity” (2013). In the case of students in our Team-based Practicum in Interactive Media Design and Production, graphical failures were the symptom of an underlying constraint of the tools in hand. Textures in the game had exceeded the memory restrictions in the operating system (the NTFS filesystem defaults to a block size of 4096 bytes), causing a memory overflow that transformed their videogame into a piece of glitch art. A workaround was implemented, and their game debuted shortly thereafter on the packed floor of Toronto’s Design Exchange. How do the lessons learned by these students aggregate into best practices for future students?

Such glitches, ruptures, and failures often reveal infrastructural constraints in the digital humanities spaces we manage. In the instance of our 2018–2019 fourth-year practicum, the filesystem failure encountered by our students has prompted us to be more aware of the tool constraints for publishing executable games. Furthermore, the public play test was salvageable because of a best practice derived from previous years’ projects: reverting to a stable build identified in their revision management system. Prior to that, in 2014, failures encountered by students prompted us to rethink how we scaffold instruction of specific tools, including revision management tools, across an entire curriculum. That year’s students signed up for an off-campus collaborative software development system with integrated version control. Project management services that include Git or Subversion repositories allow teams to make incremental changes to files in the cloud, syncing updates across all team members as they are made. But our students had encountered a problem: the service, provided under an educational license, did not recognize many of the emails they used as valid institutional addresses and locked them all out of the server. While the problem was resolved, it prompted us to fundamentally rethink how we teach a digital humanities curriculum. The student experiences with version control can also be gleaned from interviews with graduates of the IASC program dating back to 2012. In a similar experience to our 2019 students, graduate Isaac (anonymized) recounts:

About 24 hours before our team was heading to LevelUp to present our game, we encountered a problem where our most up-to-date build of the game was overwritten with an older build, so we lost more than five hours of work. We had to crunch to get our game back to where it needed to be for us to present at LevelUp. This is mainly because of the four lab computers we had access to use for our development, only one of those computers had the [game engine] installed. … We didn’t have a file server. We were using our 2GB free [file hosting service] accounts to share files. We should have had a file back-up system so we could’ve not lost all of that work.

Taking a cue from Miriam Posner (2016), we now administer revision management systems on file servers of our own and deploy assignments that teach students to use them in every year of the program. Like the filesystem failure our students were to encounter in 2019, the version control failure in 2014 prompted us to rethink the operating principles of our digital humanities space. We are continually motivated to formally refine and adapt the student experience in response to failures such as these.

The inevitable failures encountered by our students reveal a problematic underlying much digital humanities work, one that is as wicked as it is productive. In our university-driven work with digital tools and resources, we continually encounter instances in which digital tools developed for industry use don’t neatly align with our academic context. In other words, digital humanities scholars and students frequently work with what Susan Leigh Star and James Griesemer call boundary objects, those ubiquitous infrastructural resources which cross between different localized implementations and diverse communities of practice. Working with such objects causes productive failures of all sorts, such as a company’s server not recognizing our students’ institutional email addresses. Elsewhere, we have found that many educational licenses for industry-grade software restrict the contexts in which student work can be exhibited to public audiences. While using such licenses allows students to learn industry-grade tools, it also forces them (and us) to learn about licensing restrictions by diligently avoiding instances in which industry and academic uses for the tool may conflict. Conflicts such as these may tacitly inform many digital approaches to teaching rhetoric and composition that bring industry or for-profit tools into the classroom. To use more ubiquitous examples, using social media platforms such as Twitter or Medium as a venue for publicly disseminating scholarship brushes up against these platforms’ use of text as a vehicle for monetization. What can we learn about the mechanisms of clickbait, bot traffic, or sponsored posts when the tools we use to teach writing are designed to leverage these phenomena? What productive conflicts arise when using YouTube to access Open Educational Resources in the classroom also means students must watch advertisements during a lecture or other class-based exercise? As a variety of digital tools are increasingly incorporated into the classroom, their status as boundary objects that sit across diverse (and at times contradictory) contexts is evident in ways both small and large.

Situating boundary objects such as these in the field of critical infrastructure studies, Alan Liu advocates that digital humanities work “assist in shaping smart, ethical academic infrastructures that not only further normative academic work … but also intelligently transfer some, but not all, values and practices in both directions between higher education and today’s other powerful institutions” (2016). We agree emphatically, and we further believe that such an understanding of infrastructural boundaries forms an approach to digital pedagogy grounded in the instructive value of failure. We continue to learn much from infrastructural failures in which the tool at hand carries an underlying set of constraints that, sooner or later, conflict with the context in which it is being implemented. We further believe such conflicts may be repurposed to suit learning outcomes contingent upon productive failure. For instance, while the research tool Zotero is designed to store bibliographic citations, it can also be used to store other types of information (thus transforming it into a boundary object). Asking students to create a bibliographic record of their classmates’ discussion contributions in Zotero invites failure cases where the metadata students wish to record doesn’t neatly align with the fields dictated by Zotero (and various citation styles); these failure cases prompt students to learn about citation styles and bibliographic records by exploring their limitations and edge-cases. Similarly, much could be learned by asking students to compose a piece of academic writing using a text-based tool that is not designed for outputting print documents. Twine, for example, is designed to create text-based adventure games and interactive narratives; what might students learn about the conventions of academic writing by using Twine to write a short research paper? In our work as digital humanists, we frequently find that the tools we work with aren’t perfectly suited to the task at hand; as such, we have begun to design project-based assignments in which students are deliberately exposed to failures of this sort and taught to learn from them. Whereas digital pedagogy often formulates technological literacy as the ability to use a tool properly, we find technological literacy also encompasses creatively rethinking such practices in inevitable instances when the tool is only moderately suited to the present context. Echoing Mauro et al. and Flanders, this SEAM approach exposes students to the ruptures and fissures inherent in working with digital tools (which we see as boundary objects), rather than suggesting effective digital humanities work involves the seamless operation of technology.

Learning to Fail: Designing Experiential DH Assignments

The idea of a digital pedagogy based in productive failure first emerged through a conversation between Alex Christie and CDH Project Coordinator and Technical Assistant, Justin Howe. Undertaking a rapid prototyping process of our digital prototyping assignments, they considered assigning Axure RP (a digital prototyping tool) as an environment for developing small-scale persuasive games (Bogost 2010). They agreed that the fact Axure is not a game development environment was precisely why this assignment would be so valuable to our students: the lesson to be learned was that success always means success within a set of allotted constraints. In this way, the Axure tool was being deliberately used in a context for which it was not intended (creating videogame prototypes) and therefore explicitly deployed as a boundary object. The assignment therefore forced students to figure out what creative ideas could be successfully implemented within the constraints of the Axure RP prototyping environment and other assignment parameters. In this way, it sought to expose students early on to the pragmatic value of digital prototyping (and digital humanities work broadly), not solely as an exercise in dreaming up blue sky potential, but also, more unforgivingly, as a process of forging the realistic out of the fantastic. They were bound to encounter productive failure.

If the chief learning outcome of the assignment is for students to understand that concept cannot feasibly exist apart from execution, it also codifies the underlying pedagogical values within which we situate our pedagogy. The prototyping work asked of students requires them to approach Axure as a creative problem-solving environment. This means students frequently encounter instances when the tool does not allow them to achieve an important part of their intended game. In order to move forward, students must fundamentally rethink how the tool can be used in order to achieve their stated outcome. For instance, one team created their own method for causing screen brightness to dim by overlaying a black square on the window and tying its opacity to a variable whose value was influenced by player actions. Another team failed at creating a collision-detection system that would stop the player from going through the walls of a maze; instead, they used Axure’s condition builder to ensure the two objects could never overlap. By asking students to create a videogame with a tool moderately suited to the task at hand, we build an environment where students quickly reach the constraints of the technology they use. This creates an experiential learning opportunity in which students are forced to encounter and learn from moments when technologies do not work as intended, learning to create new solutions to problems when a previous approach has failed. A key learning outcome of the assignment, then, is not so much learning how to use the assigned tool correctly as much as it is learning to continue using the tool to productive ends when it fails and breaks down.

Such a learning outcome requires students to learn to see the software environment used not as a space where outcomes are met by replicating established workflows (or a sort of digital reimagining of Paulo Freire’s banking model of education) but instead as a system that can be creatively rethought and repurposed. Central to this view is an emphasis on project management and collaboration fundamentals, which are built right into the architecture of the assignment. Following the CDH’s decision to host its own server infrastructure in 2014, we decided to build Subversion into the architecture of the assignment as well. Each team is allocated its own SVN repository, and each repository is then used for students to collaboratively work on their version-controlled Axure project. Teams are also asked to communicate using Discord, and Andrew Roth uses webhooks to push changes made to the Subversion repository directly to each team’s corresponding Discord. Asking teams to construct their prototype using a version-controlled workflow teaches practical lessons in project management, such as using a centralized repository rather than emailing files and letting team members know when new deliverables are added. These are key lessons learned from previous instantiations of our fourth-year practicum, which we have now rolled forward into the design of our first-year assignments.
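To make the webhook step concrete, the following is a minimal sketch of how a commit notification could be formatted and sent to Discord. It is not the authors' implementation, which the article does not describe: the function names, the message layout, the User-Agent string, and the sample values are all assumptions made for illustration. The only Discord-specific behaviour relied on is that a webhook URL accepts a JSON body with a "content" field of at most 2000 characters.

```python
import json
import urllib.request


def build_commit_message(team, revision, author, log_message, changed_paths):
    """Format one Subversion commit as a Discord webhook payload."""
    lines = [f"[{team}] r{revision} by {author}: {log_message}"]
    lines += [f"  {path}" for path in changed_paths]
    # Discord limits message content to 2000 characters
    content = "\n".join(lines)[:2000]
    return {"content": content}


def post_to_discord(webhook_url, payload):
    """Send the payload to a Discord webhook URL as JSON (needs network access)."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # assumption: an explicit User-Agent, chosen here for illustration
            "User-Agent": "svn-discord-hook-sketch",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# hypothetical commit details, invented for illustration
payload = build_commit_message(
    "team-3", 42, "student_a", "Add maze walls", ["M /trunk/maze.rp"]
)
print(payload["content"])
```

In a Subversion post-commit hook, the author, log message, and changed paths could be read with the svnlook command before calling post_to_discord with the team's webhook URL. Sending is kept in a separate function so that the formatting can be checked without contacting Discord.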

Most importantly, asking students to adopt version-control and team communication solutions as part of their assignment workflow means designing a particular lesson into the assignment: that collaboration is about accountability. Before beginning their prototyping work in earnest, teams are required to submit a Developer Document that divides prototyping work into five roles (Visual Designer, Data Modeler, UX Designer, UI Designer, and Creative Director) and asks teams to outline how the deliverables for one role require assets produced by another. This division of assignment duties foreshadows the communication challenges of the fourth-year teams; Victor (anonymized), class of 2016, said his experience of failure manifested “by either conveying too little information, outdated information, or undecided information across team before it [was] vetted.” Teams quickly learn that certain parts of the project cannot be completed until their dependencies are ready, which means that various teams encounter workflow and communication failures that expose gaps in their existing conception of how collaborative work gets done. In their final presentations to the class, numerous first-year teams reflected upon the importance of coming together to work as a team, whether such reflection included successful team workflows or admitting that a siloed approach had not delivered the expected results. We find using formalized systems, such as Discord and SVN, for team-based work helps students identify and visualize interpersonal and communication errors because team progress becomes directly contingent upon students using the system to send updates to fellow teammates. Giving students low-stakes environments to learn from such failures early in the program prepares them to address, or even obviate, high-stakes failures of this sort in their upper-year team-based practicum.

The lesson that workflow is as much about accountability as it is about cultivating a positive interpersonal environment is one that can only be learned experientially, which means designing a pedagogical framework within which teams can safely encounter workflow failures and move forward based on insights discovered therein. This framework prepares students to learn from team-based failure in two ways. First, in the weeks leading up to the final assignment, the instructor delivers lectures on topics including digital prototyping fundamentals and team management, which explicitly outline the different stages of team formation and best practices as teams move from one stage to the next. Second, the incorporation of technologies such as SVN and Discord creates a collaborative environment in which output and accountability are directly fused: each time a student works with a new version of the project, they cannot begin their work until encountering the latest revision made by another team member. Similarly, if the team hits a roadblock in their prototype because a certain asset or dependency is missing, the entire team can immediately identify the source of accountability. Both conceptually and pragmatically, then, the assignment is framed as an exercise in developing competencies in collaborative prototyping, defined as an iterative process where progress comes from finding out what doesn’t work and then moving forward. In this way, collaboration failures experienced by teams serve as object lessons in scope management, in which students are forced to consistently ask which practices best suit their goals and which do not. These project-based assignments therefore function as experiential learning opportunities in which students learn from technological and collaboration failures by directly encountering and overcoming them. So far, results have exceeded expectations. One team made a game in which navigating the maze of Brock’s Mackenzie Chown complex served as a functional metaphor for navigating depression. Another made a game about surveillance and counterinsurgency, while still others tackled topics including personality disorders and cultivating gratitude.

The first stage of our SEAM approach to digital pedagogy thus involves designing project-based assignments where students reach their own insights into doing digital humanities work by learning from instructive failure. Such failures are built into the assignment by treating the tools being taught as boundary objects, or technologies that are not perfectly suited to the given task. These assignments prompt students to reach the limitations of the tool and creatively overcome them. In the context of videogame design, this may include using a non-Game Development Environment (such as Axure) to create a videogame; in still other educational contexts, this may include using a Game Development Environment (such as Twine or Game Maker) to write a research paper or using a monetized platform (like YouTube or Facebook) to disseminate Open Educational Resources. In this way, a SEAM approach to designing digital humanities assignments focuses more on the assembly of conceptual and technical systems within which we ask students to explore and create, rather than handing down prescribed workflows by rote (again, with a nod to Freire). In turn, we ourselves refine such systems in response to student experiences later in the program, incorporating tools such as SVN and encouraging students to encounter the places where their work using such tools may begin to show at the seams.

Learning from Failure: Student Reflection through Data Visualization

In order to prompt student reflection upon failures encountered in their project-based work, we visualize student data generated throughout the course of these projects to build models of student knowledge. Andrew Roth creates such visualizations by taking the Subversion history from each team and visualizing it with Gource, an open-source tool created by Andrew Caudwell that displays file systems as an animated tree evolving over time. Visualizing the complexity of the shared file system under version control at once makes the metadata of the process more legible and the task of growing that system more daunting. For example, by visualizing and comparing each repository of a single class, we can see at a glance which teams closely emulate the instructor’s example project and which grew beyond it in the allotted time. While the rules of collaboration require students to diligently maintain the up-to-date version of their project, or head, by checking in functioning code, the metadata captured in the history shows a record of every failure, including malfunctioning ignore files, desktop shortcuts mistakenly checked in as assets, and abandoned plugin folders. In sum, the Gource visualization for each team shows how that team’s version-controlled files and folders changed throughout the course of the project, providing a visual rendering of student activity in Axure. The visualizations open a space for reflecting on both the metadata borne of the technological infrastructure required for collaborative project work and the narrative that emerges from managing the project’s complexity over time.
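As a rough illustration of how a Subversion history can become input for this kind of visualization, the following sketch converts the XML produced by `svn log --xml --verbose` into the pipe-delimited custom log format that Gource documents (timestamp, user, action, path). The sample log, author names, and file paths are hypothetical, and this is not a description of the exact workflow used in our courses.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hypothetical excerpt of `svn log --xml --verbose` output (not real student data).
SAMPLE_SVN_LOG = """<log>
  <logentry revision="1">
    <author>instructor</author>
    <date>2018-09-10T14:05:00.000000Z</date>
    <paths>
      <path action="A" kind="file">/sample/maze.rp</path>
    </paths>
    <msg>Sample project</msg>
  </logentry>
  <logentry revision="2">
    <author>student_a</author>
    <date>2018-09-12T15:30:00.000000Z</date>
    <paths>
      <path action="A" kind="file">/team1/prototype.rp</path>
      <path action="M" kind="file">/sample/maze.rp</path>
    </paths>
    <msg>First check-in</msg>
  </logentry>
</log>
"""


def parse_svn_date(text):
    """Parse an SVN UTC timestamp such as 2018-09-10T14:05:00.000000Z."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def svn_xml_to_gource(xml_text):
    """Return lines of the form 'timestamp|user|action|path', oldest first."""
    root = ET.fromstring(xml_text)
    lines = []
    for entry in root.iter("logentry"):
        author = entry.findtext("author", default="unknown")
        timestamp = int(parse_svn_date(entry.findtext("date")).timestamp())
        for path in entry.iter("path"):
            action = path.get("action", "M")
            # The custom format only knows Added, Modified, Deleted;
            # treat anything else (e.g. SVN's R for "replaced") as Modified.
            if action not in ("A", "M", "D"):
                action = "M"
            lines.append(f"{timestamp}|{author}|{action}|{path.text}")
    # Stable sort by timestamp keeps the per-commit path order intact.
    lines.sort(key=lambda line: int(line.split("|", 1)[0]))
    return lines


if __name__ == "__main__":
    for line in svn_xml_to_gource(SAMPLE_SVN_LOG):
        print(line)
```

The resulting lines could be written to a text file and handed to Gource as a custom log; every check-in, including the mistaken ones, then appears as a node in the animated tree.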

Figure 1. Data visualization of student activity in first-year GAME course (created in Gource).

Figure 2. Data visualization of student activity in first-year Interactive Arts & Science course (created in Gource).

For example, in both visualizations the sample project created by the instructor appears first, followed by each group project. At a glance, we can see sprints of productivity during lab times and very few team members committing to projects on the weekends. Using the instructor sample as a measuring stick, we can see that there are few projects in the 1F01 class that emulate the sample project’s complexity, whereas the 1P04 course has a smaller sample project and larger, more complex group projects.
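The same commit metadata that drives the animation can also be tallied directly. The sketch below counts commits per weekday and computes the share made on weekends; the commit records are invented placeholders, not data from our courses, and the function names are our own for illustration.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical commit records (author, ISO 8601 timestamp); not real student data.
COMMITS = [
    ("student_a", "2018-09-12T15:30:00"),  # Wednesday
    ("student_b", "2018-09-12T15:45:00"),  # Wednesday
    ("student_a", "2018-09-14T10:10:00"),  # Friday
    ("student_c", "2018-09-15T21:00:00"),  # Saturday
    ("student_b", "2018-09-19T15:20:00"),  # Wednesday
]


def commits_by_weekday(commits):
    """Count commits per weekday name."""
    counts = Counter()
    for _author, stamp in commits:
        weekday_index = datetime.fromisoformat(stamp).weekday()  # Monday == 0
        counts[DAY_NAMES[weekday_index]] += 1
    return counts


def weekend_share(commits):
    """Fraction of commits made on Saturday or Sunday."""
    if not commits:
        return 0.0
    weekend = sum(
        1 for _author, stamp in commits
        if datetime.fromisoformat(stamp).weekday() >= 5
    )
    return weekend / len(commits)


if __name__ == "__main__":
    print(commits_by_weekday(COMMITS))
    print(f"Weekend share: {weekend_share(COMMITS):.0%}")
```

A tally of this kind only summarizes what the visualization already shows; it does not explain why a team worked when it did, which is where the reflective assignments come in.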

Figure 3. Data visualization of student activity in fourth-year capstone course (created in Gource).

We have also used Gource to visualize the videogames created by our fourth-year students. Using data from each SVN repository used over the past four years, we are able to see differences between each of our past four student teams. For instance, the first group using version control (before hosting a server on premises) demonstrates a tightly controlled structure managed by only one or two users. In subsequent years, the number of total simultaneous users increases. This suggests the repository is used by more individuals across their respective teams, which is supported by the push by faculty to use version control across all years of the program. The number of large-scale changes over time (such as branches or deletions) also increases in frequency, which indicates that mistakes are made, large-scale changes are applied (such as telling Subversion to ignore certain file types), and these mistakes are corrected as time passes. It is also clear how the scope of the single 4L00 project dwarfs the first-year projects in size and complexity.

After presenting these findings from our first round of visualizations at the 2018 Digital Pedagogy Institute, we began integrating these visualizations back into the pedagogical structure of our first-year classes. Once teams have completed their prototypes, we provide them with the Gource visualizations of their work as an .mp4 video and use these videos as prompts for their final reflective assignments. In their reflective essays, students frequently noticed that work was conducted ad hoc by different team members, rather than following a pre-established working schedule. Gource videos frequently showed irregular bursts of activity from different team members, rather than steady and predictable output that followed a coordinated project schedule. This was also one of the key ways in which Gource visualizations of work done in our first-year courses differed from those of our fourth-year courses. As such, students frequently remarked that a key failure was not coordinating their schedules and efforts more closely, and that such failure was not apparent to them until they saw the timeline of their Axure work rendered visually through Gource. Using formalized systems for student collaboration lets instructors visualize student activity and provide such visualizations as tools for student reflection; we find SVN and Gource to be an effective combination of tools for designing these reflective exercises.

While the principal outcomes of the assignment are for students to assess their evolving abilities in collaborative environments, the incorporation of the Gource visualizations further demonstrates for students that soft skills including communication, organization, and team dynamics cannot and should not be neatly parsed from technical considerations such as scheduling deliverables, maintaining project dependencies, and designing data and folder structures. The assignment furthermore reframes data visualization techniques not simply as tools for revealing objective facts but additionally as environments for metacognitive reflection and personal growth. How might digital tools reveal the seams between a student’s own approaches to collaboration and those of their teammates? As we prompt students to derive reflective insights from data visualizations of their work, we also encourage both technically minded and tech-averse students to understand that technical implementation and interpersonal interaction co-construct the latticework upon which their knowledge matures and thrives.

Stitching Our Work Together: Faculty and Staff Reflection through Autobiography Mapping

Together, our use of digital prototyping assignments and reflective exercises involves stitching together disparate strands of student failure and digital tools, using such threads as opportunities for both student and instructor learning. Thus far, we have reached a series of findings for designing project-based digital humanities assignments and using them as a vehicle for faculty and staff reflection. First, it is essential to deliver lectures on team formation fundamentals as part of the introduction to project-based assignments; doing so both introduces students to collaboration best practices (a core element of doing digital humanities work) and teaches them how to move forward from inevitable stumbling blocks. Instructors can further encourage students to learn from failure by discussing the fundamentals of scope management, time management, and rapid prototyping, all of which assume that ideas are developed by encountering errors in planning and then retooling that plan in order to move ahead. Doing this over and over, or learning through iteration, dispels the common myth that excellent ideas and strong skill sets emerge from a vacuum. As part of this approach, instructors can introduce the assignment by giving students a template and encouraging them to tweak it; for instance, our GAME students are given a short game prototype made in Axure RP and asked to fix a series of bugs (thereby preparing them to fix the eventual errors in their own game prototypes). Most of all, faculty and staff can and should work together to design the suite of technical dependencies for the assignment, architecting an environment that encourages students to safely explore and experiment instead of copying prescribed workflows by rote.
While staff provide insight into the technologies available for classroom use (in our instance, Andrew facilitates the integration of Axure with SVN and Gource), instructors design activities and assignments where these technologies are used to create materials they were not primarily designed to output (and share the results with staff administering the tools). Such collaboration allows for staff and faculty to approach the classroom as an environment for low-stakes failure, while continuing to prioritize student learning as the setting’s principal outcome.

As we continue to move forward based on these insights, we are considering how this form of faculty-staff collaboration can scale up from the level of the individual course. The final stage of our SEAM approach does just this, examining student progress longitudinally throughout the whole of the program and over the course of multiple years. Inspired by Donna Haraway’s formulation of cyborg subjectivities, this next stage of our work sees student autobiographies as reflexive records of where intersectional identities evolve alongside, and are imbricated with, the technologies with which they work. This research will analyze longitudinal student experiences through user story mapping, a technique commonly used to define priorities within agile software development. Software developers lead interviews and focus groups to understand how users’ expectations map to the offerings of their software. The scope of user story mapping in software development is deliberately broad and shallow, narrowing the widest possible range of use cases into minimum viable product releases. In order to catch the broadest perspective on student experience, we have chosen biographical information that demonstrates the student’s relationship to technology: their technobiography. The technobiographical method, originally loosely outlined by Kennedy in 2003, has previously been applied to stories of learning by youth (Brushwood-Rose 2006) and educators (Ching and Vigdor 2005). By collecting, transcribing, and tagging biographical interviews, we intend to create a repository of user stories that can be drawn upon to help address infrastructure challenges holistically. The result will be a dynamic and searchable repository of student reflections on their learning experience that faculty and staff can consult in order to inform various levels of decision-making.
As the repository grows over time, it will allow additional insight into how student learning in our digital humanities curriculum changes longitudinally. While the idea of a “minimum viable product” seems inherently reductionist, the goal is not to produce static or artificial boundaries around the learning experience, but rather to set priorities and outline critical paths to completion relative to external factors (e.g., time, money, space, goodwill). Our students’ narratives tell us as much about the subjectivities that move through our learning systems as they reframe the systems-level formulations to which infrastructure, by necessity, reduces human experience.
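One way to picture a tagged, searchable repository of interview excerpts is as a small inverted index from tags to stories. The sketch below is only an assumption about how such a store might be organized: the class names, fields, tags, and placeholder excerpts are all hypothetical, and no real interview data is shown.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Excerpt:
    """One tagged passage from a transcribed interview (hypothetical schema)."""
    student_id: str
    year: int
    text: str
    tags: set = field(default_factory=set)


class StoryRepository:
    """Stores excerpts and indexes them by lower-cased tag."""

    def __init__(self):
        self._excerpts = []
        self._by_tag = defaultdict(list)

    def add(self, excerpt):
        index = len(self._excerpts)
        self._excerpts.append(excerpt)
        for tag in excerpt.tags:
            self._by_tag[tag.lower()].append(index)

    def search(self, *tags):
        """Return excerpts that carry every requested tag, in insertion order."""
        if not tags:
            return []
        wanted = [set(self._by_tag.get(tag.lower(), [])) for tag in tags]
        hits = set.intersection(*wanted)
        return [self._excerpts[i] for i in sorted(hits)]


# Placeholder entries standing in for transcribed, tagged interviews.
repo = StoryRepository()
repo.add(Excerpt("s01", 2014, "placeholder excerpt about a lost check-in",
                 {"version-control", "failure"}))
repo.add(Excerpt("s02", 2019, "placeholder excerpt about a drive format",
                 {"filesystem", "failure"}))
repo.add(Excerpt("s03", 2019, "placeholder excerpt about team scheduling",
                 {"scheduling"}))

if __name__ == "__main__":
    for hit in repo.search("failure", "version-control"):
        print(hit.year, hit.student_id, hit.text)
```

Requiring every tag to match keeps queries narrow; a faculty or staff member asking about, say, failures tied to version control would see only the stories tagged with both.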

Scaffolding upon the reflective assignments introduced alongside Gource visualizations of student work, we intend to collect student autobiographies as they move throughout the program and across multiple years. This will result in a searchable database of key challenges and successes encountered by student teams over time, revealing key inflection points in the development of our infrastructure and our curriculum (such as our 2014 failures associated with version control and our 2019 failures with the NTFS filesystem). As we continue with this work and gather findings over multiple years, we envision our method and the data it generates as an autobiography of long-term growth and adaptation in Brock University’s Centre for Digital Humanities. While digital humanities spaces continue to disseminate news of progress and successes, we believe they can also share key failures as part of a productive and forward-looking institutional narrative. What are the stories behind the technologies and best practices incorporated into our labs and our curriculum? How might student experiences of technological failure inform decision-making processes when it comes time to purchase new workstations, format hard drives, and set up server space for student work? Through their own stories about themselves and how they change over time, our students and their experiences of failure may reveal much of ourselves—our intellectual values, our operating principles, and what we may still become.

Bibliography

Bogost, Ian. 2010. Persuasive Games: The Expressive Power of Videogames. Cambridge, MA: The MIT Press.

Brushwood-Rose, Chloë. 2006. “Technobiographies as Stories of Learning.” Public 34 (Fall): 88–95.

Ching, Cynthia and Linda Vigdor. 2005. “Technobiographies: Perspectives from Education and the Arts.” First International Congress of Qualitative Inquiry (May): 1–22.

Flanders, Julia. 2019. “Building Otherwise.” In Elizabeth Losh and Jacqueline Wernimont, eds. Bodies of Information: Intersectional Feminism and the Digital Humanities (289–304). Minneapolis: University of Minnesota Press.

Freire, Paulo. 2017. Pedagogy of the Oppressed. New York: Continuum.

Henwood, Flis, Helen Kennedy, and Nod Miller. 2001. Cyborg Lives: Women’s Technobiographies. York, UK: Raw Nerve.

Juul, Jesper. 2009. “Fear of Failing? The Many Meanings of Difficulty in Video Games.” In The Video Game Theory Reader 2, edited by Mark J.P. Wolf and Bernard Perron, 237–252. New York: Routledge. https://www.jesperjuul.net/text/fearoffailing/.

Liu, Alan. 2017. “Drafts for Against the Cultural Singularity (book in progress).” Lit.english.ucsb.org, February 20, 2017. http://liu.english.ucsb.edu/drafts-for-against-the-cultural-singularity.

Mauro, Aaron, et al. 2017. “Towards a Seamful Design of Networked Knowledge: Practical Pedagogies in Collaborative Teams.” Digital Humanities Quarterly. http://www.digitalhumanities.org/dhq/vol/11/3/000322/000322.html.

Nowviskie, Bethany. 2013. “Resistance in the materials.” Nowviskie.org, January 4, 2013. http://nowviskie.org/2013/resistance-in-the-materials/.

Patton, Jeff, Peter Economy, Martin Fowler, Alan Cooper, and Marty Cagan. 2014. User Story Mapping: Discover the Whole Story, Build the Right Product. Beijing: O’Reilly.

Posner, Miriam. 2016. “Here and There: Creating DH Community.” In Matthew Gold, ed. Debates in the Digital Humanities. Minneapolis: University of Minnesota Press. http://dhdebates.gc.cuny.edu/debates/text/73.

Unsworth, John. 1997. “Documenting the Reinvention of the Text: The Importance of Failure.” The Journal of Electronic Publishing 3.2 (December). https://doi.org/10.3998/3336451.0003.201.

About the Authors

Andrew Roth is the Technical Associate: Research and Learning Support in the Centre for Digital Humanities, Brock University. An exhibited artist and published interdisciplinary scholar, he has led and collaborated in augmented reality experiences, the development of published mobile apps, and the creation of tools for digital media artists.

Alex Christie is Assistant Professor of Digital Prototyping at Brock University’s Centre for Digital Humanities. In 2017, he completed the Pedagogy Toolkit project, which received grant support from the Association for Computers and the Humanities. In 2018, he served on the organizing committee for the Digital Pedagogy Institute.

Cover of ROBERTO BUSA, S.J., AND THE EMERGENCE OF HUMANITIES COMPUTING, featuring a black-and-white photo of a priest looking up at a punchcard.

Demythologizing the Priest and his Punched Cards

Andrew C. Stout, Covenant Theological Seminary

Review of Steven E. Jones, Roberto Busa, S.J., and the Emergence of Humanities Computing: The Priest and the Punched Cards (New York: Routledge, 2016). $150.00 hardback, $38.47 ebook.

Digital humanities (DH) is a contested field: contested in terms of its definition, its scope, and its long-term relevance. One issue on which there has been much consensus is the central place given to the work of Roberto Busa when charting the origin of DH. It is standard fare in DH publications to cite Busa’s Index Thomisticus as the first major project in humanities computing, the classic example of this being Susan Hockey’s history of humanities computing in the foundational volume A Companion to Digital Humanities (2004). In Roberto Busa, S.J., and the Emergence of Humanities Computing, Steven E. Jones complicates even this supposedly straightforward narrative, and he manages to do so even while indicating fruitful ways of understanding and advancing work in the field.

Roberto Busa (1913–2011) was an Italian Jesuit priest and scholar trained in philosophy at the Pontifical Gregorian University in Rome. A philological approach to studying the metaphysics of Thomas Aquinas eventually led Busa to travel to North America in 1949 in search of machine-assisted methods for compiling a lemmatized concordance that included every word in Aquinas’s writings. Through correspondence and archival records, Jones tells the story of Busa’s collaboration with IBM to develop the Index Thomisticus using IBM’s iconic punched cards, a form of data storage for early computing. It is not difficult to see how contemporary scholars working in DH could point to this early relationship between a humanities scholar and the corporate technology leader as a point of origin for the field. Jones explains his demythologizing project: “It’s not my aim to debunk [the myth], but only to provide a more complicated picture of its history, to fill in some of the rich contexts out of which the myth arose in the first place” (9–10). While Busa is unquestionably an important figure in charting the development of humanities computing, and eventually DH, Jones is interested in situating Busa within a complex historical web of developments in media and technology.
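For readers unfamiliar with the term, a lemmatized concordance groups every occurrence of a word under its dictionary form and records the surrounding context. The toy sketch below illustrates the idea in modern code; it is in no way a reconstruction of Busa’s punched-card procedure, and the lemma table, function name, and sample phrase (a short Latin sentence often quoted from Aquinas) are assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Toy lemma table (an assumption for illustration); a real project would need
# a full morphological lexicon for Latin.
LEMMAS = {"est": "sum", "sunt": "sum", "rei": "res", "res": "res"}

TEXT = "Veritas est adaequatio rei et intellectus"


def lemmatized_concordance(text, lemmas, window=2):
    """Map each lemma to (position, left context, word form, right context)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    index = defaultdict(list)
    for i, token in enumerate(tokens):
        # Forms missing from the table are filed under themselves.
        lemma = lemmas.get(token, token)
        left = " ".join(tokens[max(0, i - window):i])
        right = " ".join(tokens[i + 1:i + 1 + window])
        index[lemma].append((i, left, token, right))
    return dict(index)


if __name__ == "__main__":
    concordance = lemmatized_concordance(TEXT, LEMMAS)
    for lemma in sorted(concordance):
        for position, left, form, right in concordance[lemma]:
            print(f"{lemma:12} {position:2}  {left} [{form}] {right}")
```

Even at this scale, the inflected form “est” is filed under the lemma “sum” and “rei” under “res”, which is the kind of regrouping that had to be carried out for every word in Aquinas’s writings.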

This goal leads Jones to dismiss another, more elusive myth: the myth of “progress.” Nowhere is the temptation to view historical developments as an inevitable upward progression more evident than in studies of technology. Jones counters this assumption by attending to historical contingencies and “adjacent possibilities” as he traces the working relationship between Busa and IBM. Busa’s use of punched-card processing systems allowed him to process an impressive amount of data, but the system was hardly cutting edge. While these systems used technology that had been in place since the nineteenth century, IBM’s Selective Sequence Electronic Calculator (SSEC) was also on display in the company’s Manhattan showroom. This iconic early computer was not made available to Busa, but it played a large part in IBM’s public image. Jones discusses the computing culture surrounding the SSEC at length as a way of setting Busa’s work in historical context. One major aspect of this context is the neglected role of women in the history of computing. Jones emphasizes that “there were women everywhere in early computing, in one capacity or another, from wartime ‘computers’ using calculating machines, to plugboard ‘programmers,’ to keypunch operators, to system operators and some software programmers, once that became a possibility” (64). There were many women who worked as operators with Busa on the production of the Index Thomisticus, and Busa trained many young women to be operators at his Center for Automation and Literary Analysis in Gallarate, Italy. However, the substantive work done by these women was often categorized as simply “clerical,” and the twenty-first century has seen a major decline in the number of women in professional programming. The myth of progress is further complicated as Jones discusses tensions between academic, political, and corporate interests, as well as issues of advertising and the public perception of computing.
He also highlights the array of alternative technologies that were on offer at the time that Busa was processing his punched cards.

Jones constructs a layered and detailed historical narrative that takes account of the personalities, spaces, and material objects involved. This serves to flesh out Busa’s own accounts of his work (which Jones cites frequently) and IBM’s corporate records. Jones’s ability as a storyteller is particularly evidenced by the “exploded view” that he offers of the initial 1949 meeting between Busa and Thomas J. Watson, Sr., CEO of IBM. He focuses on various details of the image of the two men meeting in IBM’s offices to explain the political, social, and technological issues that set the stage for the initial conversation and conditioned the ongoing relationship.

Several aspects of Busa’s projects draw out the relationship between developing technology and the humanities as they relate to research and teaching. First, the public demonstrations that Busa gave of his work (particularly one held at IBM World Headquarters on Madison Avenue in June of 1952) show how the interactions between computing and humanities research influence pedagogy. His demos before interdisciplinary groups of scholarly, corporate, and ecclesiastical figures were impressive displays of automated technology, but “it turned out to be the general concepts of how to use punched-card machinery to treat language as data to be processed … that influenced practice over the long term” (82). By introducing punched-card technology into philological analysis, Busa began to demonstrate how vast amounts of data generated from a text could also generate new perspectives on that text. His teaching demonstrations alerted scholarly and corporate communities to the way that the iterative process facilitated by computing allowed for a deconstruction, fragmentation, and reconstitution of texts, making technology a real collaborator in the process of creating meaning. Through these technology demonstrations, Busa sought to convince a varied audience of influential people of the usefulness of his adaptation of technology in the service of the hermeneutical process. With the growth of Silicon Valley, we have all come to be aware of the formative capacity of this sort of technology demo. Jones cites Steve Jobs’s Apple Computer events and TED Talks as formats comparable to the demos of Busa and other early tech innovators. As with these more recent examples, the intent of Busa’s demos was not simply to transfer information about a new technology.
Just as the public unveiling of a new Apple product creates expectation and emphasizes the way that a new iPhone, iPad, or other device can have a transformative effect on our lives, so Busa’s demos were designed to inform, impress, and indicate the potential his punched-card technology had to transform humanities research.

Second, Busa’s Center for Automation and Literary Analysis (CAAL) featured a pedagogical method that was transformed by the interaction of humanities scholarship and computing. Jones describes the CAAL, founded by Busa in 1956, as “a vocational ‘training school for key-punch operators,’ technicians of automated accounting machinery.” The CAAL’s “first reason for being was research … with students working in effect as interns, learning on the job for two-year stints” (119). The students at the CAAL (predominantly young women) found themselves in an environment that combined elements of the laboratory, factory, and religious community, gaining experience operating IBM machines while they worked to create, among other projects, an index for the recently discovered Dead Sea Scrolls. The CAAL was supported by a network of corporate and governmental patrons, and it benefited from Busa’s continuing relationship with IBM. The operators didn’t receive diplomas for their training, but they gained experience in marketable job skills. This combination of scholarly research with vocational training in computing technology leads Jones to describe CAAL as “arguably the first humanities computing center” (134). Many of the same critiques that have recently been leveled at DH initiatives in higher education (see Allington, Brouillette, and Golumbia 2016) can be seen in Jones’s description of the CAAL. Corporate, government, and scholarly interests were intertwined in Busa’s projects. New technical skills were learned in a context that served humanities research.

Jones demonstrates the two-way street of the pedagogical process—available technologies shape the processes of research and learning, and educational traditions influence the adoption and use of new equipment and methods. This can be seen in the contrast between the industrial model on which CAAL was based and the humanistic impulse behind its work. The training school was “both internship and production line for data processing of texts” (125). The CAAL was housed in an old industrial building, and IBM’s machines were arranged on the old factory floor. The image invoked by the setup is some combination of a factory line and a scriptorium. Technical skills were being learned not simply to manufacture a product but for the deeper understanding of a literary text. One important way Jones illustrates this dynamic is by highlighting Busa’s Jesuit identity. Jones suggests that the Jesuit focus on higher education and on broad engagement with secular culture contributed to Busa’s willingness to build networks through unlikely partnerships and to view the creative implementation of technology as an extension of humanities research.

Jones, like many practitioners, delights in the indeterminacy and indefinability of the broad field of DH. By inspecting, atomizing, and reconstituting the historical context and technical details of Busa’s career, he succeeds in demonstrating that this indeterminacy was present from the outset. Jones pushes back against definitions of DH that tie the field too rigidly to Busa’s project. He wants a definition with broad enough boundaries to include work in areas such as media studies, video games, social media, and more. Elsewhere, he defines DH broadly “as an umbrella term for a diverse set of practices and concerns, all of which combine computing and digital media with humanities research and teaching” (5). Paradoxically, Jones demonstrates that Busa’s project was only one of any number of contributing factors to the development of humanities computing and (later) DH while also uncovering new and unexpected ways that the details of that project are relevant to the expansion of the field. Though Busa initially presented his work in humanities computing terms—with machines filling a utilitarian role in otherwise traditional humanities research—digging into the material specifics of his projects reveals that technology and its layered applications were making theoretical contributions at every stage of his research. With this substantive and engaging book (as well as the Tumblr site that augments the book), Jones has made an essential contribution to the field of DH.

Bibliography

Allington, Daniel, Sarah Brouillette, and David Golumbia. 2016. “Neoliberal Tools (and Archives): A Political History of Digital Humanities.” Los Angeles Review of Books. May 1, 2016. https://lareviewofbooks.org/article/neoliberal-tools-archives-political-history-digital-humanities/.

Hockey, Susan. 2004. “The History of Humanities Computing.” In A Companion to Digital Humanities, edited by Susan Schreibman, Ray Siemens, and John Unsworth. Oxford: Blackwell. http://www.digitalhumanities.org/companion/view?docId=blackwell/9781405103213/9781405103213.xml&chunk.id=ss1-2-1.

Jones, Steven E. 2014. The Emergence of the Digital Humanities. New York and London: Routledge. http://www.oapen.org/search?identifier=1004200.

About the Author

Andrew C. Stout is the Access Services Librarian at the J. Oliver Buswell Jr. Library at Covenant Theological Seminary in St. Louis, MO.