
Sep 24

My Notes

Many thanks to the University of Kansas and all of the organizers of the Digital Humanities Forum! In the spirit of THATCamp, I wanted to share my notes from all of the sessions—Thursday-Saturday—that I was able to attend. Please feel free to do with these what you want: reuse, remix, add, etc. Apologies for any misspellings or inaccurate quotations; please leave any edits in the comments section and I’ll update accordingly. Notes after the break . . .

 

Voyant for Text Analysis and Visualization 

  • Geoffrey Rockwell, University of Alberta, Philosophy
  • Voyant Tools: voyant-tools.org/
  • Presentation script: hermeneuti.ca/node/242
  • Draft chapter on intro to text analysis: The Measured Words.pdf
  • Cirrus
    • Word cloud
    • Stop words: list of high-frequency function words (the, and, etc.)
    • Add and edit stop word lists by clicking gear icon
    • Color, shape and position are random
    • Clicking on specific words brings up concordance/KWIC tool
  • KWIC
    • Concordances
    • Keyword highlights, contextual text to left and right
    • Can change amount of context
    • KWIC: Luhn at IBM in the 1960s
    • History of digital humanities built on concordances (40s-60s)
    • Does not allow for wildcards (ex. theor*) at this time
  • Skin
    • Particular combination of tools
    • Add new columns in Words in Entire Corpus Trend > Columns
    • Heart icon adds favorites
    • If you want a custom skin, you can ask for a custom name and URL
  • Other tools
  • Using Voyant with Large Corpora
    • Adds new panel called Corpus
    • Words in documents: breakdown by particular texts within corpus
    • Name corpora properly to have them appear in diachronic order
  • Export
    • Bib citation
    • HTML
    • URL
    • iframe of specific tools
    • XML, tabular data, etc.
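
To make the Cirrus and KWIC ideas above concrete, here is a minimal sketch of stop-word filtering and a keyword-in-context listing. It is not Voyant's code: the stop word list, the tokenizer, the function names, and the sample sentence are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list; real lists are much longer.
STOP_WORDS = {"the", "and", "of", "a", "to", "in", "is"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count tokens, skipping stop words (the counts behind a word cloud)."""
    return Counter(t for t in tokenize(text) if t not in stop_words)

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit."""
    tokens = tokenize(text)
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits

# Invented sample text.
sample = "The whale is the theme of the book, and the whale haunts the captain."
print(word_frequencies(sample).most_common(3))
for left, kw, right in kwic(sample, "whale"):
    print(f"{left:>25} | {kw} | {right}")
```

Changing `width` corresponds to the "amount of context" setting mentioned in the KWIC notes.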

 

Visualizing Humanities Data Sets in Improvise 

  • Chris Weaver, University of Oklahoma, Computer Science
  • Improvise: www.cs.ou.edu/~weaver/improvise/index.html
  • Tutorial + download: www.cs.ou.edu/~weaver/improvise/tutorial-ku/
  • Visual analytics
  • Re-inject scholarly interpretations back into the data set
  • “Digital” about collecting and managing and browsing data
  • Visualization methods help complete digital workspace for scholarship
  • Visualization as bridge between human mind and computation
  • Improvise
    • Full blown development environment (requires programming skills)
    • “Excel” for visualization
    • Supports live design of visual tools (build as you work)
    • Each visualization is a document (.viz)
    • Facilitation tool to explore data more broadly (still requires close reading & investigation)
    • Strong for computation, quantitative analysis, mostly social sciences
    • Needs advanced training to do, would need to find programmer to work with

 

Keynote: Humanities in a Digital Age

  • Gregory Crane, Tufts University, Classics
  • Perseus Digital Library: www.perseus.tufts.edu/hopper/
  • Importance of undergraduate research experience (happening in STEM, not in Humanities?)
  • Digital Humanities?
    • Separate niche field—safely sequestered
  • “The instructor is not there to serve the student. Rather, both student and instructor are there to serve Wissenschaft.” —Wilhelm von Humboldt (1809)
  • Goals:
    • How do we advance the intellectual life of humanity?
    • What are the metaphors we might use?
      • A global republic of letters?
      • A dialogue among civilizations?
  • Not books or artifacts but data . . . (slide: DataConservancy, dataconservancy.org/)
  • Language is the great barrier, not space.
  • The Digital Walters: www.thedigitalwalters.org/
  • TEI production and soldiers: Humanists need to treat students as adults
  • The past is hyperlingual
  • Discovering a new world, no longer acting/studying in isolation
  • The Mission of the Library is essential in a culture of fragmented departments
    • Only organizations that capture totality of cultures, languages, etc.
  • Do we want to reach 1,000 research libraries and their subscribers? Or billions of humans?
  • Movie > Wikipedia > Primary Sources
    • Note: what’s missing from that chain? SCHOLARLY PUBLICATIONS. (because we remove them)
  • Where is your labor?
    • Clever systems (translation tools)
    • Advanced researchers
    • Library professionals
    • Citizen scholars
    • Student research and a lab culture
  • Only 11 historical languages with enrollments of more than 50 (96% of those are Latin, Greek, Hebrew)
  • Virtuous Cycle of Learning and Contribution (WANT TO SEE THAT SLIDE)
  • e-portfolios (instead of branded transcript)
    • What have you contributed?
    • What skills have you developed?
    • Note (his): grades are inadequate
  • To create a radically new and deeply traditional form of education (back to Humboldt)
  • Dialogue is a process that lowers the probability of evil and violence

 

Workshop: XSLT Basics & Visualizing Structural Similarity with Plectograms and XSLT

  • David Birnbaum, University of Pittsburgh
  • obdurodon.org/ku/2012-09-21_ku-plectogram.html
  • XSLT
    • stylesheet for transforming XML
    • Declarative programming language
  • XPath is the part of XSLT that tells you what things are
    • / = whole document
  • Core of XSLT is the template rule: xsl:template match=
  • xsl:apply-templates is code for "look inside an element and process what is there"
  • XPath predicate filters results
  • pretty print = indent button
  • refer to variable with $
  • { } in an attribute value switches from literal text to a calculated XPath expression
  • XPath by default only looks within the current context
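
The points about predicates and context can be tried outside an XSLT editor. The sketch below uses Python's standard-library ElementTree, which supports only a small subset of XPath and is not an XSLT processor; the sample document and its element names are invented, not taken from the workshop materials.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are invented.
XML = """
<poem>
  <stanza n="1">
    <line>First line</line>
    <line>Second line</line>
  </stanza>
  <stanza n="2">
    <line>Third line</line>
  </stanza>
</poem>
"""

root = ET.fromstring(XML)

# A bare relative path only looks at children of the context element,
# and no <line> is a direct child of <poem>.
direct_lines = root.findall("line")

# ".//" widens the search to all descendants of the context element.
all_lines = root.findall(".//line")

# A predicate in square brackets filters the results.
second_stanza = root.findall("stanza[@n='2']/line")

print(len(direct_lines), len(all_lines))
print([line.text for line in second_stanza])
```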

 

Workshop: Quantitative Analysis of Literary Texts with R

  • Jeff Rydberg-Cox, University of Missouri-Kansas City
  • daedalus.umkc.edu/?page_id=29
  • R: www.r-project.org/
  • Interactive “calculator” with lots of functions you can build on
  • Variable = single value
  • Vector = list of values
  • Data frame = table of values
  • Not so good at aggregating, good at graphing (Perl and Python good at aggregating)
  • Most of the work with data is prepping, pre-processing
  • Could need stemmers/lemmatizers (for English, Stanford CoreNLP: nlp.stanford.edu/software/corenlp.shtml)
    • wow!
  • nameofcommand( )
  • Book: R in Action www.manning.com/kabacoff/
  • Stefan Gries: Quantitative Corpus Linguistics with R / Statistics for Linguistics with R
  • Are they including these in publications?
    • Have included data as supplementary, put on website
    • Wants: central repository, data set, documentation

 

Workshop: Advanced Omeka

  • Can map metadata to existing Dublin Core set OR add new fields
  • Items
  • Collections
  • Exhibits = interpretations

 

Keynote: False Positives: Opportunities and Dangers in Big Text Analysis 

  • Geoffrey Rockwell, University of Alberta, Philosophy
  • 2 types of data
    • Information at Rest (large databases)
    • Information in Motion
  • DataSift: datasift.com/
  • Brunet and the Grand Corpus
  • Opportunities
    • Filtering and subsetting (Cornell Web Lab)
    • Enrichment
    • Sequence alignment (Horton, Olsen, Roe – Digital Studies 2:1, turnitin.com/)
      • Follow the expression of texts over time
    • Diachronic analysis (Google Ngrams)
    • Classification and clustering (Voyant)
    • Social network analysis (Voyant)
    • Life-tracking (Wolfram: Quantified Self)
  • Ian Lancashire, Forgetful Muses
  • Predictive Data Mining: doesn’t work
    • lots of data leads to lots of false hits
    • never believe data, always investigate and verify
  • Conjecturator
  • Dreyfus: intelligence is embodied (against artificial intelligence)
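
One way to see why "lots of data leads to lots of false hits" is simple arithmetic on chance matches. The sketch below is my own illustration, not something shown in the keynote: it assumes every tested pattern is pure noise, that tests are independent, and that each has a 5% chance of firing by accident (the `alpha` value is an assumption).

```python
def expected_false_positives(num_tests, alpha=0.05):
    """Expected number of chance hits when every tested pattern is actually noise."""
    return num_tests * alpha

def chance_of_any_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests fires by chance."""
    return 1 - (1 - alpha) ** num_tests

# The more patterns we test, the more certain a spurious hit becomes.
for n in (1, 20, 1000):
    print(n, expected_false_positives(n), round(chance_of_any_false_positive(n), 3))
```

Under these assumptions a single test rarely misleads, but a thousand tests are expected to produce dozens of spurious hits, which is the case for always investigating and verifying.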

 

Reading Genres

  • Benjamin MacDonald Schmidt, Princeton University, PhD Candidate History
  • sappingattention.blogspot.com/
  • Harvard Cultural Observatory: www.culturomics.org/cultural-observatory-at-harvard
  • Digital sources contribute knowledge beyond individuals
  • Metadata lets us look at social structures
  • Humanists need to be more involved in designing algorithms (not just social scientists)
  • OCR is not important: scientists have always been able to work against biases
  • Working on something like Google Ngrams but users can set own categories
  • His research is on “the history of attention”
    • Using concordances to search words preceding “attention”
    • Looking at geographical patterns of language (via newspapers)
  • “All digital history ends in 1922.”
  • Integrate big data with other analyses to gain greater comfort with the results before we trust these types of results entirely on their own.
  • Evidence may not be novel, but it certainly is new.
  • Humanists need to be patient with data results; not everything is instantaneous
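
The "words preceding attention" search mentioned above can be approximated in a few lines. This is a rough sketch under my own assumptions, not the speaker's method: the tokenizer is naive, sentence boundaries are ignored, and the sample sentences are invented.

```python
import re
from collections import Counter

def words_before(text, target):
    """Count the word that immediately precedes each occurrence of target."""
    # Naive tokenizer: lowercase letters only, so punctuation and
    # sentence boundaries are ignored.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(prev for prev, word in zip(tokens, tokens[1:]) if word == target)

# Invented sample sentences, not drawn from any real corpus.
sample = (
    "She paid attention to the lecture. He paid attention too. "
    "The teacher called attention to the map."
)
print(words_before(sample, "attention").most_common())
```

Run over dated newspaper text, counts like these could be grouped by year or place to look for the patterns the talk describes.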

 

A World in a Grain of Sand: Uncertainty and Poetry Corpora Visualization 

  • Katherine Coles & Julie Lien, University of Utah, English
  • Poems as multidimensional living things (a single poem as large data set)
  • “Complex capta”
  • Johanna Drucker, Humanities Approaches to Graphical Display, DHQ 2011 5.1
  • Myopia at DH2012

 

Phylogenetic Futures: Big Data and Design Fiction 

  • Kari Kraus, University of Maryland, Information Science & English
  • www.karikraus.com/
  • Moving humanities to be future-oriented discipline
    • CP Snow, Two Cultures
  • Phylogenetics: en.wikipedia.org/wiki/Phylogenetics
    • collate texts before charting evolution
    • D’Arcy Thompson: Fish Deformations (1917)
  • Design Fiction
    • The practice of mocking up or prototyping objects that embody our ideas about the future
    • Shared Horizons: Data, Biomedicine, and the Digital Humanities
    • mith.umd.edu/sharedhorizons/
    • Conlangs & Alternate Evolution: conlang.org/ & After Man by Dixon
    • Frederic Bartlett, Remembering (1932)
    • Nonobject, Branko Lukić and Barry Katz: nonobjectbook.com/read
    • Jane McGonigal, alternate reality games
    • Cathy’s Book: http://en.wikipedia.org/wiki/Cathy’s_Book
    • Warning Systems for Posterity
  • Material cultural evolution: what do we know?
    • component analysis
    • tree topology
    • variation
  • A lot of design fiction about provocation and not futurity
    • Julian Bleecker’s slow movement
  • Franco Moretti: tree of life vs. tree of culture (Graphs, Maps, Trees)
  • Guided Variation: how biological evolution differs from cultural evolution
  • nvivo: www.qsrinternational.com/products_nvivo.aspx

 

Lightning Talks

Grounds more relative than this: Critically Harnessing Uncertainty in Digital Literary Studies in Big Data 

  • Patrick Flor, University of Kansas, Graduate Student in English
  • Gulliver in Laputa — important for DH — frame of words
  • Humanities Computing
    • Computation of/from/over humanities artifacts
    • of = generative art
    • from = statistics from a text
    • over = taking texts and making texts about texts
  • Critical Humanities Computing
    • Create and choose algorithms, artifacts, results, etc.
    • Natural language processing
      • NLTK (Python framework): nltk.org/
      • Pervert the tools accordingly
  • POS tagging
    • Good hook for processing text, but not enough
  • Immediately reentering into text, not using computing to abstract from the text (i.e. as visualization does)
    • Computer substituting for failure as reader
  • Close reading circling out into broader interpretation

 

What Are You Going to Do with that Data? Results of Needs Assessment of Humanities Scholars for Digital Collections 

  • Harriett Green, University of Illinois, Digital Humanities & English Librarian
  • Woodchipper: mith.umd.edu/corporacamp/tool.php
  • Unsworth and “scholarly primitives”

 

Museum Collecting in the Age of Big Data: Opportunities for Collaboration 

  • Peter Welsh, University of Kansas, Museum Studies
