Many thanks to the University of Kansas and all of the organizers of the Digital Humanities Forum! In the spirit of THATCamp, I wanted to share my notes from all of the sessions—Thursday-Saturday—that I was able to attend. Please feel free to do with these what you want: reuse, remix, add, etc. Apologies for any misspellings or inaccurate quotations; please leave any edits in the comments section and I’ll update accordingly. Notes after the break . . .
Voyant for Text Analysis and Visualization
- Geoffrey Rockwell, University of Alberta, Philosophy
- Voyant Tools: voyant-tools.org/
- Presentation script: hermeneuti.ca/node/242
- Draft chapter on intro to text analysis: The Measured Words.pdf
- Cirrus
- Word cloud
- Stop words: list of high-frequency function words (the, and, etc.)
- Add and edit stop word lists by clicking gear icon
- Color, shape, and position are random
- Clicking on specific words brings up concordance/KWIC tool
- KWIC
- Concordances
- Keyword highlights, contextual text to left and right
- Can change amount of context
- KWIC: Luhn at IBM in the 1960s
- History of digital humanities built on concordances (40s-60s)
- Does not allow for wildcards (e.g., theor*) at this time
- Skin
- Particular combination of tools
- Add new columns in Words in Entire Corpus Trend > Columns
- Heart icon adds favorites
- If you want a custom skin, you can ask for a custom name and URL
- Other tools
- TACT for DOS: projects.chass.utoronto.ca/tact/
- CATMA/CLEA: www.catma.de/
- Tapor: taporware.ualberta.ca/~taporware/htmlTools/extract.shtml
- More on the hermeneuti.ca site
- Using Voyant with Large Corpora
- Adds new panel called Corpus
- Words in documents: breakdown by particular texts within corpus
- Name corpora properly to have them appear in diachronic order
- Export
- Bib citation
- HTML
- URL
- iframe of specific tools
- XML, tabular data, etc.
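
The stop-word filtering and KWIC display described above can be sketched in a few lines of Python. This is only a minimal illustration of the two ideas, not how Voyant itself is implemented; the sample text, the stop-word list, and the function names are all invented for the example.

```python
import re
from collections import Counter

# Hypothetical sample text and stop-word list, invented for illustration.
TEXT = (
    "The theory of the text is a theory of reading. "
    "A concordance shows the text around each word, and the word in its context."
)
STOP_WORDS = {"the", "of", "is", "a", "and", "in", "its", "each"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens, stop_words):
    """Count tokens that are not in the stop-word list (word cloud input)."""
    return Counter(t for t in tokens if t not in stop_words)


def kwic(tokens, keyword, context=3):
    """Return (left, keyword, right) rows with `context` tokens on each side."""
    rows = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - context):i])
            right = " ".join(tokens[i + 1:i + 1 + context])
            rows.append((left, token, right))
    return rows


tokens = tokenize(TEXT)
frequencies = word_frequencies(tokens, STOP_WORDS)
concordance = kwic(tokens, "theory", context=2)

print(frequencies.most_common(3))
for left, key, right in concordance:
    print(f"{left:>20} [{key}] {right}")
```

Changing the `context` argument plays the same role as changing the amount of context in the KWIC panel.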
Visualizing Humanities Data Sets in Improvise
- Chris Weaver, University of Oklahoma, Computer Science
- Improvise: www.cs.ou.edu/~weaver/improvise/index.html
- Tutorial + download: www.cs.ou.edu/~weaver/improvise/tutorial-ku/
- Visual analytics
- Re-inject scholarly interpretations back into the data set
- “Digital” about collecting and managing and browsing data
- Visualization methods help complete digital workspace for scholarship
- Visualization as bridge between human mind and computation
- Improvise
- Full blown development environment (requires programming skills)
- “Excel” for visualization
- Supports live design of visual tools (build as you work)
- Each visualization is a document (.viz)
- Facilitation tool to explore data more broadly (still requires close reading & investigation)
- Strong for computation, quantitative analysis, mostly social sciences
- Needs advanced training to do, would need to find programmer to work with
Keynote: Humanities in a Digital Age
- Gregory Crane, Tufts University, Classics
- Perseus Digital Library: www.perseus.tufts.edu/hopper/
- Importance of undergraduate research experience (happening in STEM, not in Humanities?)
- Digital Humanities?
- Separate niche field—safely sequestered
- “The instructor is not there to serve the student. Rather, both student and instructor are there to serve Wissenschaft.” —Wilhelm von Humboldt (1809)
- Pericles’s funeral oration: www.fordham.edu/halsall/ancient/pericles-funeralspeech.asp
- Goals:
- How do we advance the intellectual life of humanity?
- What are the metaphors we might use?
- A global republic of letters?
- A dialogue among civilizations?
- Not books or artifacts but data . . . (slide: DataConservancy, dataconservancy.org/)
- Language is the great barrier, not space.
- The Digital Walters: www.thedigitalwalters.org/
- TEI production and soldiers: Humanists need to treat students as adults
- The past is hyperlingual
- Discovering a new world, no longer acting/studying in isolation
- The Mission of the Library is essential in a culture of fragmented departments
- Only organizations that capture totality of cultures, languages, etc.
- Do we want to reach 1,000 research libraries and their subscribers? Or billions of humans?
- Movie > Wikipedia > Primary Sources
- Note: what’s missing from that chain? SCHOLARLY PUBLICATIONS. (because we remove them)
- Where is your labor?
- Clever systems (translation tools)
- Advanced researchers
- Library professionals
- Citizen scholars
- Student research and a lab culture
- Only 11 historical languages with enrollments of more than 50 (96% of those are Latin, Greek, Hebrew)
- Virtuous Cycle of Learning and Contribution (WANT TO SEE THAT SLIDE)
- e-portfolios (instead of branded transcript)
- What have you contributed?
- What skills have you developed?
- Note (his): grades are inadequate
- To create a radically new and deeply traditional form of education (back to Humboldt)
- Dialogue is a process that lowers the probability of evil and violence
Workshop: XSLT Basics & Visualizing Structural Similarity with Plectograms and XSLT
- David Birnbaum, University of Pittsburgh
- obdurodon.org/ku/2012-09-21_ku-plectogram.html
- XSLT
- stylesheet for transforming XML
- Declarative programming language
- XPath is the part of XSLT that tells you what things are
- / = whole document
- Core of XSLT is the template rule: xsl:template match=
- xsl:apply-templates is code for "look inside an element and do stuff"
- XPath predicate filters results
- pretty print = indent button
- refer to variable with $
- { } turns from literal to calculation
- XPath by default only looks within
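
To make the XPath notes concrete, here is a minimal sketch of path expressions and predicates using Python's standard-library `xml.etree.ElementTree`. It is not XSLT and not the workshop's own material: ElementTree supports only a subset of XPath, and the sample XML is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document, invented for illustration.
XML = """
<poem>
  <stanza n="1">
    <line rhyme="a">First line</line>
    <line rhyme="b">Second line</line>
  </stanza>
  <stanza n="2">
    <line rhyme="a">Third line</line>
  </stanza>
</poem>
"""

root = ET.fromstring(XML)

# A path expression walks down from the current element: every <line> at any depth.
all_lines = [line.text for line in root.findall(".//line")]

# A predicate in square brackets filters the results: only lines with rhyme="a".
a_lines = [line.text for line in root.findall(".//line[@rhyme='a']")]

# A predicate can sit on any step: lines inside the stanza numbered 2.
stanza_two = [line.text for line in root.findall("./stanza[@n='2']/line")]

print(all_lines)
print(a_lines)
print(stanza_two)
```

In an XSLT stylesheet the same expressions would appear in the `match` and `select` attributes of template rules.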
Workshop: Quantitative Analysis of Literary Texts with R
- Jeff Rydberg-Cox, University of Missouri-Kansas City
- daedalus.umkc.edu/?page_id=29
- R: www.r-project.org/
- Interactive “calculator” with lots of functions you can build on
- Variable = single value
- Vector = list of values
- Data frame = table of values
- Not so good at aggregating, good at graphing (Perl and Python good at aggregating)
- Most of the work with data is prepping, pre-processing
- Could need stemmers/lemmatizers (for English, Stanford CoreNLP: nlp.stanford.edu/software/corenlp.shtml)
- wow!
- nameofcommand( )
- Book: R in Action www.manning.com/kabacoff/
- Stefan Th. Gries: Quantitative Corpus Linguistics with R / Statistics for Linguistics with R
- Are they including these in publications?
- Have included data as supplementary, put on website
- Wants: central repository, data set, documentation
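
Since most of the work is pre-processing, and Python was named as good at aggregating, here is a hedged sketch of that step: turning raw texts into a table with one row per text and one column per word. The two-text corpus and the target words are invented, and the tokenizer is deliberately crude (no stemming or lemmatizing).

```python
import csv
import io
import re
from collections import Counter

# Hypothetical two-text corpus, invented for illustration.
CORPUS = {
    "text_a": "The sea, the sea! The open sea.",
    "text_b": "A road is a road; the sea is far.",
}
TARGET_WORDS = ["sea", "road"]


def relative_frequencies(text, targets):
    """Return each target word's share of all tokens in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return {word: counts[word] / len(tokens) for word in targets}


# One row per text, one column per word: the shape of a data frame.
rows = [
    {"text": name, **relative_frequencies(text, TARGET_WORDS)}
    for name, text in CORPUS.items()
]

# Write the table as CSV text, which most statistics tools can import.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["text"] + TARGET_WORDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

print(csv_text)
```

Saved to a file, a table like this could be loaded into R with `read.csv` for graphing.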
Workshop: Advanced Omeka
- Can map metadata to existing Dublin Core set OR add new fields
- Items
- Collections
- Exhibits = interpretations
Keynote: False Positives: Opportunities and Dangers in Big Text Analysis
- Geoffrey Rockwell, University of Alberta, Philosophy
- 2 types of data
- Information at Rest (large databases)
- Information in Motion
- DataSift: datasift.com/
- Brunet and the Grand Corpus
- Opportunities
- Filtering and subsetting (Cornell Web Lab)
- Enrichment
- Sequence alignment (Horton, Olsen, Roe – Digital Studies 2:1, turnitin.com/)
- Follow the expression of texts over time
- Diachronic analysis (Google Ngrams)
- Classification and clustering (Voyant)
- Social network analysis (Voyant)
- Life-tracking (Wolfram: Quantified Self)
- Ian Lancashire, Forgetful Muses
- Predictive Data Mining: doesn’t work
- lots of data leads to lots of false hits
- never believe data, always investigate and verify
- Conjecturator
- Dreyfus, intelligence is embodied (against artificial intelligence)
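
The point that lots of data leads to lots of false hits can be shown with simple arithmetic. The sketch below is my own illustration, not something from the talk, and every number in it is hypothetical: a detector that is right 99% of the time, searching a million items for 100 real cases.

```python
def expected_hits(total_items, real_items, sensitivity, false_positive_rate):
    """Return (true_hits, false_hits, precision) for a simple detector.

    sensitivity: share of real cases the detector flags.
    false_positive_rate: share of non-cases the detector flags by mistake.
    """
    not_real = total_items - real_items
    true_hits = real_items * sensitivity
    false_hits = not_real * false_positive_rate
    precision = true_hits / (true_hits + false_hits)
    return true_hits, false_hits, precision


# Hypothetical numbers, chosen only to show the effect of a rare target.
true_hits, false_hits, precision = expected_hits(
    total_items=1_000_000,
    real_items=100,
    sensitivity=0.99,
    false_positive_rate=0.01,
)

print(f"true hits:  {true_hits:.0f}")
print(f"false hits: {false_hits:.0f}")
print(f"share of flagged items that are real: {precision:.2%}")
```

With these made-up figures, false hits outnumber true hits by about a hundred to one, which is why each hit still has to be investigated and verified.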
Reading Genres
- Benjamin MacDonald Schmidt, Princeton University, PhD Candidate History
- sappingattention.blogspot.com/
- Harvard Cultural Observatory: www.culturomics.org/cultural-observatory-at-harvard
- Digital sources contribute knowledge beyond individuals
- Metadata lets us look at social structures
- Humanists need to be more involved in designing algorithms (not just social scientists)
- OCR is not important: scientists have always been able to work against biases
- Working on something like Google Ngrams but users can set own categories
- Bookworm: bookworm.culturomics.org/
- API lets you present data in all shapes
- His research is on “the history of attention”
- Using concordances to search words preceding “attention”
- Looking at geographical patterns of language (via newspapers)
- “All digital history ends in 1922.”
- Integrate big data with other analyses to gain confidence in the results before trusting these types of results entirely on their own.
- Evidence may not be novel, but it certainly is new.
- Humanists need to be patient with data results; not everything is instantaneous
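
The approach of searching for words preceding "attention" can be approximated with a simple count of left-hand neighbors. This is a rough sketch under my own assumptions, not Schmidt's code or the Bookworm API; the sample sentences are invented, and the tokenizer ignores sentence boundaries.

```python
import re
from collections import Counter

# Hypothetical sample text, invented for illustration.
SAMPLE = (
    "She paid attention to the speech. He gave close attention to detail. "
    "They paid attention, and attention was what the work required."
)


def preceding_words(text, target):
    """Count the word that appears immediately before each use of `target`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pairs = zip(tokens, tokens[1:])
    return Counter(prev for prev, word in pairs if word == target)


before_attention = preceding_words(SAMPLE, "attention")

for word, count in before_attention.most_common():
    print(word, count)
```

Run over dated sources, counts like these could be compared across periods to see how the verbs attached to "attention" shift.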
A World in a Grain of Sand: Uncertainty and Poetry Corpora Visualization
- Katherine Coles & Julie Lien, University of Utah, English
- Poems as multidimensional living things (a single poem as large data set)
- “Complex capta”
- Johanna Drucker, Humanities Approaches to Graphical Display, DHQ 2011 5.1
- Myopia at DH2012
Phylogenetic Futures: Big Data and Design Fiction
- Kari Kraus, University of Maryland, Information Science & English
- www.karikraus.com/
- Moving humanities to be future-oriented discipline
- CP Snow, Two Cultures
- Phylogenetics: en.wikipedia.org/wiki/Phylogenetics
- collate texts before charting evolution
- D’Arcy Thompson: Fish Deformations (1917)
- Design Fiction
- The practice of mocking up or prototyping objects that embody our ideas about the future
- Shared Horizons: Data, Biomedicine, and the Digital Humanities
- mith.umd.edu/sharedhorizons/
- Conlangs & Alternate Evolution: conlang.org/ & After Man by Dixon
- Frederic Bartlett, Remembering (1932)
- Nonobject, Branko Lukić and Barry Katz: nonobjectbook.com/read
- Jane McGonigal: alternate reality games
- Cathy’s Book: http://en.wikipedia.org/wiki/Cathy’s_Book
- Warning Systems for Posterity
- Material cultural evolution: what do we know?
- component analysis
- tree topology
- variation
- A lot of design fiction about provocation and not futurity
- Julian Bleecker’s slow movement
- Franco Moretti: tree of life vs. tree of culture (Graphs, Maps, and Trees)
- Guided Variation: how biological evolution differs from cultural evolution
- nvivo: www.qsrinternational.com/products_nvivo.aspx
Lightning Talks
- Amanda French – Zotero library
- Edna St. Vincent Millay Society: millay.org & zotero library
- Arienne Dwyer
- Uyghur Light Verbs Project
- Interactive Inner Asia
- Transcriber 1.5.1: trans.sourceforge.net/en/presentation.php / transag.sourceforge.net/
- Audio transcription tool
Grounds more relative than this: Critically Harnessing Uncertainty in Digital Literary Studies in Big Data
- Patrick Flor, University of Kansas, Graduate Student in English
- Gulliver in Laputa — important for DH — frame of words
- Humanities Computing
- Computation of/from/over humanities artifacts
- of = generative art
- from = statistics from a text
- over = taking texts and making texts about texts
- Critical Humanities Computing
- Create and choose algorithms, artifacts, results, etc.
- Natural language processing
- NLTK (Python framework): nltk.org/
- Pervert the tools accordingly
- POS tagging
- Good hook for processing text, but not enough
- Immediately reentering into text, not using computing to abstract from the text (i.e. as visualization does)
- Computer substituting for failure as reader
- Close reading circling out into broader interpretation
What Are You Going to Do with that Data? Results of Needs Assessment of Humanities Scholars for Digital Collections
- Harriett Green, University of Illinois, Digital Humanities & English Librarian
- Woodchipper: mith.umd.edu/corporacamp/tool.php
- Unsworth and “scholarly primitives”
Museum Collecting in the Age of Big Data: Opportunities for Collaboration
- Peter Welsh, University of Kansas, Museum Studies