My Notes

Many thanks to the University of Kansas and all of the organizers of the Digital Humanities Forum! In the spirit of THATCamp, I wanted to share my notes from all of the sessions—Thursday-Saturday—that I was able to attend. Please feel free to do with these what you want: reuse, remix, add, etc. Apologies for any misspellings or inaccurate quotations; please leave any edits in the comments section and I’ll update accordingly. Notes after the break . . .

Voyant for Text Analysis and Visualization

Geoffrey Rockwell, University of Alberta, Philosophy
Voyant Tools: voyant-tools.org/
Presentation script: hermeneuti.ca/node/242
Draft chapter on intro to text analysis: The Measured Words.pdf
Cirrus
- Word cloud
- Stop words: list of high-frequency function words (the, and, etc.)
- Add and edit stop word lists by clicking gear icon
- Color, shape, and position are random
- Clicking on specific words brings up concordance/KWIC tool
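A minimal sketch, in Python, of the counting behind a word cloud: tally the words and drop the stop words first. This is not Voyant's own code; the stop word list and sample sentence are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, tiny stop word list; real lists are much longer.
STOP_WORDS = {"the", "and", "of", "a", "to", "in", "is"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count lowercase word tokens, skipping anything in stop_words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)

# Invented sample text.
sample = "The cat and the hat. The cat sat in the hat."
freqs = word_frequencies(sample)
print(freqs.most_common(2))
```

A cloud would then size each word by its count; editing the stop word list changes which words survive to be drawn.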
KWIC
- Concordances
- Keyword highlights, contextual text to left and right
- Can change amount of context
- KWIC: Luhn from IBM in 1960s
- History of digital humanities built on concordances (40s-60s)
- Does not allow for wildcards (ex. theor*) at this time
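To make the KWIC idea concrete, here is a rough Python sketch of a keyword-in-context listing with an adjustable amount of context. It is an illustration only, not how Voyant does it; the function name kwic, the width parameter, and the sample sentence are my own.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left, match, right) tuples with `width` words of context per side."""
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows

# Invented sample text.
sample = "In theory the theory of text analysis is older than the computer"
for left, match, right in kwic(sample, "theory", width=2):
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>20} | {match} | {right}")
```

Changing width is the equivalent of changing the amount of context shown on either side of the keyword.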
Skin
- Particular combination of tools
- Add new columns in Words in Entire Corpus Trend > Columns
- Heart icon adds favorites
- If you want a custom skin, you can ask for a custom name and URL
Other tools
- TACT for DOS: projects.chass.utoronto.ca/tact/
- CATMA/CLEA: www.catma.de/
- Tapor: taporware.ualberta.ca/~taporware/htmlTools/extract.shtml
- More on the hermeneuti.ca site
Using Voyant with Large Corpora
- Adds new panel called Corpus
- Words in documents: breakdown by particular texts within corpus
- Name corpora properly to have them appear in diachronic order
Export
- Bib citation
- HTML
- URL
- iframe of specific tools
- XML, tabular data, etc.

Visualizing Humanities Data Sets in Improvise

Chris Weaver, University of Oklahoma, Computer Science
Improvise: www.cs.ou.edu/~weaver/improvise/index.html
Tutorial + download: www.cs.ou.edu/~weaver/improvise/tutorial-ku/
Visual analytics
Re-inject scholarly interpretations back into the data set
“Digital” about collecting and managing and browsing data
Visualization methods help complete digital workspace for scholarship
Visualization as bridge between human mind and computation
Improvise
- Full blown development environment (requires programming skills)
- “Excel” for visualization
- Supports live design of visual tools (build as you work)
- Each visualization is a document (.viz)
- Facilitation tool to explore data more broadly (still requires close reading & investigation)
- Strong for computation, quantitative analysis, mostly social sciences
- Needs advanced training to do, would need to find programmer to work with

Keynote: Humanities in a Digital Age

Gregory Crane, Tufts University, Classics
Perseus Digital Library: www.perseus.tufts.edu/hopper/
Importance of undergraduate research experience (happening in STEM, not in Humanities?)
Digital Humanities?
- Separate niche field—safely sequestered
“The instructor is not there to serve the student. Rather, both student and instructor are there to serve Wissenschaft.” —Wilhelm von Humboldt (1809)
- Pericles’s funeral oration: www.fordham.edu/halsall/ancient/pericles-funeralspeech.asp
Goals:
- How do we advance the intellectual life of humanity?
- What are the metaphors we might use?
  - A global republic of letters?
  - A dialogue among civilizations?
Not books or artifacts but data . . . (slide: DataConservancy, dataconservancy.org/)
Language is the great barrier, not space.
The Digital Walters: www.thedigitalwalters.org/
TEI production and soldiers: Humanists need to treat students as adults
The past is hyperlingual
Discovering a new world, no longer acting/studying in isolation
The Mission of the Library is essential in a culture of fragmented departments
- Only organizations that capture totality of cultures, languages, etc.
Do we want to reach 1,000 research libraries and their subscribers? Or billions of humans?
Movie > Wikipedia > Primary Sources
- Note: what’s missing from that chain? SCHOLARLY PUBLICATIONS. (because we remove them)
Where is your labor?
- Clever systems (translation tools)
- Advanced researchers
- Library professionals
- Citizen scholars
- Student research and a lab culture
Only 11 historical languages with enrollments of more than 50 (96% of those are Latin, Greek, Hebrew)
Virtuous Cycle of Learning and Contribution (WANT TO SEE THAT SLIDE)
e-portfolios (instead of branded transcript)
- What have you contributed?
- What skills have you developed?
- Note (his): grades are inadequate
To create a radically new and deeply traditional form of education (back to Humboldt)
Dialogue is a process that lowers the probability of evil and violence

Workshop: XSLT Basics & Visualizing Structural Similarity with Plectograms and XSLT

David Birnbaum, University of Pittsburgh
obdurodon.org/ku/2012-09-21_ku-plectogram.html
XSLT
- stylesheet for transforming XML
- Declarative programming language
XPath is the part of XSLT that tells you what things are
- / = whole document
Core of XSLT is the template rule: xsl:template match=
apply-templates is code for “look inside an element and do stuff”
XPath predicate filters results
pretty print = indent button
refer to variable with $
{ } turns from literal to calculation
XPath by default only looks within
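A small sketch of the path-plus-predicate idea, written in Python rather than XSLT so it runs with nothing but the standard library. Note the assumptions: xml.etree.ElementTree supports only a limited subset of XPath, it is not an XSLT processor, and the sample poem markup is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
xml_source = """
<poem>
  <stanza n="1"><line>First line</line><line>Second line</line></stanza>
  <stanza n="2"><line>Third line</line></stanza>
</poem>
""".strip()

root = ET.fromstring(xml_source)

# Path step: every <line> element anywhere below the root element.
all_lines = [el.text for el in root.findall(".//line")]

# Predicate in [ ]: keep only the <stanza> whose n attribute equals "2".
second = [el.text for el in root.findall("./stanza[@n='2']/line")]

print(all_lines)
print(second)
```

The paths here are relative to the root element; in XSLT the same expressions would typically sit in a match= or select= attribute.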

Workshop: Quantitative Analysis of Literary Texts with R

Jeff Rydberg-Cox, University of Missouri-Kansas City
daedalus.umkc.edu/?page_id=29
R: www.r-project.org/
Interactive “calculator” with lots of functions you can build on
Variable = single value
Vector = list of values
Frame = table of values
Not so good at aggregating, good at graphing (Perl and Python good at aggregating)
Most of the work with data is prepping, pre-processing
Could need stemmers/lemmatizers (for English, Stanford CoreNLP: nlp.stanford.edu/software/corenlp.shtml)
- wow!
nameofcommand( )
Book: R in Action www.manning.com/kabacoff/
Stefan Gries, Quantitative Corpus Linguistics with R / Statistics for Linguistics with R
Are they including these in publications?
- Have included data as supplementary, put on website
- Wants: central repository, data set, documentation

Workshop: Advanced Omeka

Can map metadata to existing Dublin Core set OR add new fields
Items
Collections
Exhibits = interpretations

Keynote: False Positives: Opportunities and Dangers in Big Text Analysis

Geoffrey Rockwell, University of Alberta, Philosophy
2 types of data
- Information at Rest (large databases)
- Information in Motion
DataSift: datasift.com/
Brunet and the Grand Corpus
Opportunities
- Filtering and subsetting (Cornell Web Lab)
- Enrichment
- Sequence alignment (Horton, Olsen, Roe – Digital Studies 2:1, turnitin.com/)
  - Follow the expression of texts over time
- Diachronic analysis (Google Ngrams)
- Classification and clustering (Voyant)
- Social network analysis (Voyant)
- Life-tracking (Wolfram: Quantified Self)
Ian Lancashire, Forgetful Muses
Predictive Data Mining: doesn’t work
- lots of data leads to lots of false hits
- never believe data, always investigate and verify
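A back-of-the-envelope sketch of why lots of data leads to lots of false hits. Every number below is a hypothetical assumption of mine, not a figure from the talk: a million documents, one in a thousand of real interest, and a detector that catches 99% of the real ones while wrongly flagging 1% of the rest.

```python
def false_hit_share(total, true_rate, sensitivity, false_positive_rate):
    """Fraction of all flagged items that are false hits."""
    true_items = total * true_rate
    other_items = total - true_items
    true_hits = true_items * sensitivity            # real items correctly flagged
    false_hits = other_items * false_positive_rate  # other items wrongly flagged
    return false_hits / (true_hits + false_hits)

# All four inputs are hypothetical, chosen only to illustrate the arithmetic.
share = false_hit_share(
    total=1_000_000,
    true_rate=0.001,
    sensitivity=0.99,
    false_positive_rate=0.01,
)
print(f"{share:.1%} of hits are false")
```

Under these assumptions the false hits outnumber the true ones by about ten to one, which is the practical reason to investigate and verify every result.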
Conjecturator
Dreyfus, intelligence is embodied (against artificial intelligence)

Reading Genres

Benjamin MacDonald Schmidt, Princeton University, PhD Candidate History
sappingattention.blogspot.com/
Harvard Cultural Observatory: www.culturomics.org/cultural-observatory-at-harvard
Digital sources contribute knowledge beyond individuals
Metadata lets us look at social structures
Humanists need to be more involved in designing algorithms (not just social scientists)
OCR is not important: scientists have always been able to work against biases
Working on something like Google Ngrams but users can set own categories
- Bookworm: bookworm.culturomics.org/
- API lets you present data in all shapes
His research is on “the history of attention”
- Using concordances to search words preceding “attention”
- Looking at geographical patterns of language (via newspapers)
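One way to picture the search for words preceding “attention” is a simple count of left-hand neighbors. The Python sketch below is my own illustration, not Schmidt's method or the Bookworm API; the function name and sample sentence are invented.

```python
import re
from collections import Counter

def preceding_words(text, target):
    """Count the word that appears immediately before each occurrence of target."""
    tokens = re.findall(r"[a-z']+", text.lower())
    # Pair each token with its successor and keep the pairs ending in target.
    return Counter(prev for prev, cur in zip(tokens, tokens[1:]) if cur == target)

# Invented sample text.
sample = "Pay attention. She gave close attention to detail; pay attention again."
counts = preceding_words(sample, "attention")
print(counts.most_common())
```

This naive version ignores sentence boundaries and punctuation, so a real study would need more careful tokenizing.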
“All digital history ends in 1922.”
Integrate big data with other analyses to get greater comfort with results before we trust these types of results entirely on their own.
Evidence may not be novel, but it certainly is new.
Humanists need to be patient with data results; not everything is instantaneous

A World in a Grain of Sand: Uncertainty and Poetry Corpora Visualization

Katherine Coles & Julie Lien, University of Utah, English
Poems as multidimensional living things (a single poem as large data set)
“Complex capta”
Johanna Drucker, “Humanities Approaches to Graphical Display,” DHQ 2011 5.1
Myopia at DH2012

Phylogenetic Futures: Big Data and Design Fiction

Kari Kraus, University of Maryland, Information Science & English
www.karikraus.com/
Moving humanities to be future-oriented discipline
- CP Snow, Two Cultures
Phylogenetics: en.wikipedia.org/wiki/Phylogenetics
- collate texts before charting evolution
- D’Arcy Thompson: Fish Deformations (1917)
Design Fiction
- The practice of mocking up or prototyping objects that embody our ideas about the future
- Shared Horizons: Data, Biomedicine, and the Digital Humanities
- mith.umd.edu/sharedhorizons/
- Conlangs & Alternate Evolution: conlang.org/ & After Man by Dixon
- Frederic Bartlett, Remembering (1932)
- Nonobject, Branko Lukić & Barry Katz: nonobjectbook.com/read
- Jane McGonigal, alternate reality games
- Cathy’s Book: http://en.wikipedia.org/wiki/Cathy’s_Book
- Warning Systems for Posterity
Material cultural evolution: what do we know?
- component analysis
- tree topology
- variation
A lot of design fiction about provocation and not futurity
- Julian Bleecker’s slow movement
Franco Moretti: tree of life vs. tree of culture (Graphs, Maps, Trees)
Guided Variation: how biological evolution differs from cultural evolution
NVivo: www.qsrinternational.com/products_nvivo.aspx

Lightning Talks

Amanda French – Zotero library
- Edna St. Vincent Millay Society: millay.org & zotero library
Arienne Dwyer
- Uyghur Light Verbs Project
- Interactive Inner Asia
- Transcriber 1.5.1: trans.sourceforge.net/en/presentation.php / transag.sourceforge.net/
  - Audio transcription tool

Grounds more relative than this: Critically Harnessing Uncertainty in Digital Literary Studies in Big Data

Patrick Flor, University of Kansas, Graduate Student in English
Gulliver in Laputa — important for DH — frame of words
Humanities Computing
- Computation of/from/over humanities artifacts
- of = generative art
- from = statistics from a text
- over = taking texts and making texts about texts
Critical Humanities Computing
- Create and choose algorithms, artifacts, results, etc.
- Natural language processing
  - NLTK (Python framework): nltk.org/
  - Pervert the tools accordingly
POS tagging
- Good hook for processing text, but not enough
Immediately reentering into text, not using computing to abstract from the text (i.e. as visualization does)
- Computer substituting for failure as reader
Close reading circling out into broader interpretation

What Are You Going to Do with that Data? Results of Needs Assessment of Humanities Scholars for Digital Collections

Harriett Green, University of Illinois, Digital Humanities & English Librarian
Woodchipper: mith.umd.edu/corporacamp/tool.php
Unsworth and “scholarly primitives”

Museum Collecting in the Age of Big Data: Opportunities for Collaboration

Peter Welsh, University of Kansas, Museum Studies
