Jul 21

Sounding it out: modeling orality for large-scale text collection analysis

Categories:

Abstracts: Papers, Representing Knowledge Conference

Representing Knowledge in the Digital Humanities (Saturday, September 24, 2011)
Clement, Tanya. Assistant Professor, School of Information, University of Texas

Title: Sounding it out: modeling orality for large-scale text collection analysis

Abstract: Many scholars and poets have written about the remarkable experience of hearing Gertrude Stein’s texts read aloud. “Language poets” who emerged in the 1960s and 1970s and who form important scholarly communities today have adopted Stein as an early influence and a model. In part, the nature of this relationship has been ascribed to the indeterminacy and the manner of language play that Marjorie Perloff and others see evinced in Stein’s writing, but the extent to which prosody and rhythm have also influenced these artists goes undocumented.

Further, very few scholars have had the means to investigate the speech patterns (whether African American or German or French) that may have influenced Stein. This paper will discuss a use-case study in which I am using data mining to examine clusters of patterns in Stein’s poetry and prose compared with those in non-fiction narratives and oral histories as well as those present in contemporary poetry. Taking advantage of pre-existing research and development with the Mellon-funded SEASR (Software Environment for the Advancement of Scholarly Research) application, this work has included identifying the XML output of OpenMary (a text-to-speech system that uses an internal XML-based representation language called MaryXML) as a base analytic; producing a tabular representation of the data, including phonemic and syntactic elements, for clustering and predictive modeling; creating a routine in MEANDRE (a semantic-web-driven, data-intensive flow execution environment) that produces this data and allows future users to produce similar results; and developing a user interface for viewing these comparisons across collections of texts.

Access to large-scale repositories of text opens larger questions about how literary scholars can use such repositories in their research. John F. Sowa writes, in his seminal book on computational foundations, that theories of knowledge representation are particularly useful “for anyone whose job is to analyze knowledge about the real world and map it to a computable form” (xi). Similarly, Sowa notes that knowledge representation is unproductive if the logic and ontology that shape its application in a certain domain are unclear: “without logic, knowledge representation is vague,” Sowa writes, “with no criteria for determining whether statements are redundant or contradictory,” and “without ontology, the terms and symbols are ill-defined, confused, and confusing” (xii).

Knowledge representation is the work of all scholars in digital humanities, and these scholars must help determine the logics and ontologies that shape how we access this data. Charles Bernstein has written that “[t]he relation of sound to meaning is something like the relation of the soul (or mind) to the body. They are aspects of each other, neither prior, neither independent” (17). Scholars have not had the ability to analyze the features of text that correspond to orality (their phonemes and prosodic elements), much less compare these features with similar features across collections. To incorporate this kind of study in digital humanities, it is time we considered the logics and ontologies of orality in the computational environment.
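
As a rough illustration of the tabular step described in the abstract (turning a text-to-speech system's XML output into rows of phonemic and syntactic features), the following minimal sketch may help. It is not the SEASR or MEANDRE routine itself. The XML fragment is a hand-written, simplified stand-in in the style of MaryXML, and the element name `t`, the attributes `pos` and `ph`, the stress marker `'`, and the function `tokens_to_rows` are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment in the style of MaryXML output.
# Element and attribute names are assumptions for illustration only;
# real output carries namespaces and many more attributes.
SAMPLE = """
<maryxml>
  <s>
    <t pos="NN" ph="' r @U z">rose</t>
    <t pos="VBZ" ph="' I z">is</t>
    <t pos="DT" ph="@">a</t>
    <t pos="NN" ph="' r @U z">rose</t>
  </s>
</maryxml>
"""

# Symbols treated as stress or syllable markers rather than phonemes (assumed).
MARKERS = {"'", ",", "-"}


def tokens_to_rows(xml_text):
    """Return one dict per token with syntactic (pos) and phonemic features."""
    root = ET.fromstring(xml_text)
    rows = []
    for sent_idx, sentence in enumerate(root.iter("s")):
        for tok_idx, token in enumerate(sentence.iter("t")):
            symbols = token.get("ph", "").split()
            rows.append({
                "sentence": sent_idx,
                "position": tok_idx,
                "word": (token.text or "").strip(),
                "pos": token.get("pos", ""),
                "phonemes": [s for s in symbols if s not in MARKERS],
                "stressed": "'" in symbols,
            })
    return rows


if __name__ == "__main__":
    for row in tokens_to_rows(SAMPLE):
        print(row)
```

Rows of this shape could then be passed to whatever clustering or predictive modeling tool a project uses. The choice of one row per token is only one possible granularity; syllable-level or phrase-level rows would serve other questions about prosody.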

Bernstein, Charles. Close Listening: Poetry and the Performed Word. Oxford University Press, 1998. Print.

Perloff, Marjorie. The Poetics of Indeterminacy: Rimbaud to Cage. Princeton, N.J: Princeton University Press, 1981. Print.

Sowa, John F. Knowledge Representation: Logical, Philosophical, and Computational Foundations. Pacific Grove, CA: Brooks Cole Publishing Co., 2000. Print.

All original text, images, and code on THATCamp Kansas 2011 are freely available for you to use, copy, adapt and distribute under a Creative Commons Attribution 3.0 Unported License as long as you mention THATCamp and (if possible) link to THATCamp.org and the Center for History and New Media. The name "THATCamp" and the THATCamp logo are trademarks of the Center for History and New Media at George Mason University. The THATCamp Kansas 2011 theme is based on the Graphene theme by Syahir Hakim.

This is the website for THATCamp Kansas 2011 (not 2012)
