Josh Honn – THATCamp Kansas 2012 (The Humanities and Technology Camp), http://kansas2012.thatcamp.org

My Notes
Mon, 24 Sep 2012
http://kansas2012.thatcamp.org/09/24/my-notes/


Many thanks to the University of Kansas and all of the organizers of the Digital Humanities Forum! In the spirit of THATCamp, I wanted to share my notes from all of the sessions—Thursday-Saturday—that I was able to attend. Please feel free to do with these what you want: reuse, remix, add, etc. Apologies for any misspellings or inaccurate quotations; please leave any edits in the comments section and I’ll update accordingly. Notes after the break . . .

 

Voyant for Text Analysis and Visualization 

  • Geoffrey Rockwell, University of Alberta, Philosophy
  • Voyant Tools: voyant-tools.org/
  • Presentation script: hermeneuti.ca/node/242
  • Draft chapter on intro to text analysis: The Measured Words.pdf
  • Cirrus
    • Word cloud
    • Stop words: list of high-frequency function words (the, and, etc.)
    • Add and edit stop word lists by clicking gear icon
    • Color, shape and position are random
    • Clicking on specific words brings up concordance/KWIC tool
  • KWIC
    • Concordances
    • Keyword highlights, contextual text to left and right
    • Can change amount of context
    • KWIC: Luhn at IBM in the 1960s
    • History of digital humanities built on concordances (40s-60s)
    • Does not allow for wildcards (ex. theor*) at this time
  • Skin
    • Particular combination of tools
    • Add new columns in Words in Entire Corpus Trend > Columns
    • Heart icon adds favorites
    • If you want a custom skin, you can ask for a custom name and URL
  • Other tools
  • Using Voyant with Large Corpora
    • Adds new panel called Corpus
    • Words in documents: breakdown by particular texts within corpus
    • Name corpora properly to have them appear in diachronic order
  • Export
    • Bib citation
    • HTML
    • URL
    • iframe of specific tools
    • XML, tabular data, etc.
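
The Cirrus and KWIC ideas above can be sketched in a few lines of Python. This is only a minimal illustration of the concepts, not how Voyant itself works: the stop word list, the sample sentence, and the function names are all invented for the example.

```python
import re
from collections import Counter

# Hypothetical, tiny stop word list; real lists are much longer.
STOP_WORDS = {"the", "and", "of", "a", "to", "in", "is"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count words after dropping stop words, as a word cloud would."""
    return Counter(t for t in tokenize(text) if t not in stop_words)

def kwic(text, keyword, context=3):
    """Return (left, keyword, right) tuples with `context` words per side."""
    tokens = tokenize(text)
    rows = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - context):i])
            right = " ".join(tokens[i + 1:i + 1 + context])
            rows.append((left, token, right))
    return rows

# Invented sample text for illustration.
sample = "The theory of the text is a theory of reading, and the reader is in the text."
print(word_frequencies(sample).most_common(3))
for left, key, right in kwic(sample, "theory", context=2):
    print(f"{left:>20} [{key}] {right}")
```

Changing the `context` argument mirrors the "can change amount of context" option noted above.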

 

Visualizing Humanities Data Sets in Improvise 

  • Chris Weaver, University of Oklahoma, Computer Science
  • Improvise: www.cs.ou.edu/~weaver/improvise/index.html
  • Tutorial + download: www.cs.ou.edu/~weaver/improvise/tutorial-ku/
  • Visual analytics
  • Re-inject scholarly interpretations back into the data set
  • “Digital” about collecting and managing and browsing data
  • Visualization methods help complete digital workspace for scholarship
  • Visualization as bridge between human mind and computation
  • Improvise
    • Full blown development environment (requires programming skills)
    • “Excel” for visualization
    • Supports live design of visual tools (build as you work)
    • Each visualization is a document (.viz)
    • Facilitation tool to explore data more broadly (still requires close reading & investigation)
    • Strong for computation, quantitative analysis, mostly social sciences
    • Needs advanced training to do, would need to find programmer to work with

 

Keynote: Humanities in a Digital Age

  • Gregory Crane, Tufts University, Classics
  • Perseus Digital Library: www.perseus.tufts.edu/hopper/
  • Importance of undergraduate research experience (happening in STEM, not in Humanities?)
  • Digital Humanities?
    • Separate niche field—safely sequestered
  • “The instructor is not there to serve the student. Rather, both student and instructor are there to serve Wissenschaft.” —Wilhelm von Humboldt (1809)
  • Goals:
    • How do we advance the intellectual life of humanity?
    • What are the metaphors we might use?
      • A global republic of letters?
      • A dialogue among civilizations?
  • Not books or artifacts but data . . . (slide: DataConservancy, dataconservancy.org/)
  • Language is the great barrier, not space.
  • The Digital Walters: www.thedigitalwalters.org/
  • TEI production and soldiers: Humanists need to treat students as adults
  • The past is hyperlingual
  • Discovering a new world, no longer acting/studying in isolation
  • The Mission of the Library is essential in a culture of fragmented departments
    • Only organizations that capture totality of cultures, languages, etc.
  • Do we want to reach 1,000 research libraries and their subscribers? Or billions of humans?
  • Movie > Wikipedia > Primary Sources
    • Note: what’s missing from that chain? SCHOLARLY PUBLICATIONS. (because we remove them)
  • Where is your labor?
    • Clever systems (translation tools)
    • Advanced researchers
    • Library professionals
    • Citizen scholars
    • Student research and a lab culture
  • Only 11 historical languages with enrollments of more than 50 (96% of those are Latin, Greek, Hebrew)
  • Virtuous Cycle of Learning and Contribution (WANT TO SEE THAT SLIDE)
  • e-portfolios (instead of branded transcript)
    • What have you contributed?
    • What skills have you developed?
    • Note (his): grades are inadequate
  • To create a radically new and deeply traditional form of education (back to Humboldt)
  • Dialogue is a process that lowers the probability of evil and violence

 

Workshop: XSLT Basics & Visualizing Structural Similarity with Plectograms and XSLT

  • David Birnbaum, University of Pittsburgh
  • obdurodon.org/ku/2012-09-21_ku-plectogram.html
  • XSLT
    • stylesheet for transforming XML
    • Declarative programming language
  • XPath is the part of XSLT that tells you what things are
    • / = whole document
  • Core of XSLT is the template rule: xsl:template match=
  • xsl:apply-templates is code for “look inside an element and do stuff”
  • XPath predicate filters results
  • pretty print = indent button
  • refer to variable with $
  • { } turns from literal to calculation
  • XPath by default only looks within the current context (children of the current element)
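
To make the XPath points concrete, here is a small sketch using Python's standard xml.etree.ElementTree module. It is not XSLT, and ElementTree supports only a limited subset of XPath, but it shows a predicate filtering results and a path that by default looks only at children. The sample XML is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
XML = """
<poem>
  <stanza n="1">
    <line speaker="a">First line</line>
    <line speaker="b">Second line</line>
  </stanza>
  <stanza n="2">
    <line speaker="a">Third line</line>
  </stanza>
</poem>
"""

root = ET.fromstring(XML)  # root is the <poem> element

# A bare name looks only at children of the context element,
# so <line> elements (grandchildren of <poem>) are not found.
direct_lines = root.findall("line")

# ".//" searches all descendants of the context element.
all_lines = root.findall(".//line")

# A predicate in square brackets filters the matched elements.
speaker_a = root.findall(".//line[@speaker='a']")

print(len(direct_lines), len(all_lines), len(speaker_a))
print([line.text for line in speaker_a])
```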

 

Workshop: Quantitative Analysis of Literary Texts with R

  • Jeff Rydberg-Cox, University of Missouri-Kansas City
  • daedalus.umkc.edu/?page_id=29
  • R: www.r-project.org/
  • Interactive “calculator” with lots of functions you can build on
  • Variable = single value
  • Vector = list of values
  • Data frame = table of values
  • Not so good at aggregating, good at graphing (Perl and Python good at aggregating)
  • Most of the work with data is prepping, pre-processing
  • Could need stemmers/lemmatizers (for English, Stanford CoreNLP: nlp.stanford.edu/software/corenlp.shtml)
    • wow!
  • nameofcommand( )
  • Book: R in Action www.manning.com/kabacoff/
  • Stefan Gries: Quantitative Corpus Linguistics with R / Statistics for Linguistics with R
  • Are they including these in publications?
    • Have included data as supplementary, put on website
    • Wants: central repository, data set, documentation
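
As a rough sketch of the division of labor mentioned above (aggregate in Python, then graph in R), the following builds a word-by-text count table and writes it as CSV. The mini-corpus and all names are invented for illustration; this is not the workshop's own code.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical mini-corpus, invented for illustration.
texts = {
    "text_a": "The sea, the sea, the open sea.",
    "text_b": "The open road and the open sky.",
}

def count_words(text):
    """Lowercase, tokenize, and count the words of one text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

counts = {name: count_words(body) for name, body in texts.items()}
vocabulary = sorted(set().union(*counts.values()))

# One row per word, one column per text: the shape of an R data frame.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["word"] + list(counts))
for word in vocabulary:
    writer.writerow([word] + [counts[name][word] for name in counts])

table = buffer.getvalue()
print(table)
```

Saved to a file, a table like this could be loaded in R with read.csv() and graphed from there.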

 

Workshop: Advanced Omeka

  • Can map metadata to existing Dublin Core set OR add new fields
  • Items
  • Collections
  • Exhibits = interpretations

 

Keynote: False Positives: Opportunities and Dangers in Big Text Analysis 

  • Geoffrey Rockwell, University of Alberta, Philosophy
  • 2 types of data
    • Information at Rest (large databases)
    • Information in Motion
  • DataSift: datasift.com/
  • Brunet and the Grand Corpus
  • Opportunities
    • Filtering and subsetting (Cornell Web Lab)
    • Enrichment
    • Sequence alignment (Horton, Olsen, Roe – Digital Studies 2:1, turnitin.com/)
      • Follow the expression of texts over time
    • Diachronic analysis (Google Ngrams)
    • Classification and clustering (Voyant)
    • Social network analysis (Voyant)
    • Life-tracking (Wolfram: Quantified Self)
  • Ian Lancashire, Forgetful Muses
  • Predictive Data Mining: doesn’t work
    • lots of data leads to lots of false hits
    • never believe data, always investigate and verify
  • Conjecturator
  • Dreyfus: intelligence is embodied (against artificial intelligence)
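
The point that lots of data leads to lots of false hits can be illustrated with simple arithmetic. Every number below is hypothetical, chosen only to show the effect; none comes from the talk.

```python
# All numbers below are hypothetical, chosen only to show the arithmetic.
corpus_size = 10_000_000       # documents scanned
true_matches = 100             # documents that really match
hit_rate = 0.99                # share of real matches the tool flags
false_alarm_rate = 0.01        # share of non-matches flagged by mistake

true_hits = true_matches * hit_rate
false_hits = (corpus_size - true_matches) * false_alarm_rate
precision = true_hits / (true_hits + false_hits)

print(f"true hits:  {true_hits:.0f}")
print(f"false hits: {false_hits:.0f}")
print(f"share of flagged documents that are real: {precision:.4%}")
```

Even with a tool that is wrong only 1% of the time, the false hits here vastly outnumber the true ones, which is why each result still needs to be investigated and verified.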

 

Reading Genres

  • Benjamin MacDonald Schmidt, Princeton University, PhD Candidate History
  • sappingattention.blogspot.com/
  • Harvard Cultural Observatory: www.culturomics.org/cultural-observatory-at-harvard
  • Digital sources contribute knowledge beyond individuals
  • Metadata lets us look at social structures
  • Humanists need to be more involved in designing algorithms (not just social scientists)
  • OCR is not important: scientists have always been able to work against biases
  • Working on something like Google Ngrams but users can set own categories
  • His research is on “the history of attention”
    • Using concordances to search words preceding “attention”
    • Looking at geographical patterns of language (via newspapers)
  • “All digital history ends in 1922.”
  • Integrate big data with other analyses in order to get greater comfort with results before we trust these types of results entirely on their own.
  • Evidence may not be novel, but it certainly is new.
  • Humanists need to be patient with data results; not everything is instantaneous
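
A toy version of searching for the words that precede “attention” might look like the sketch below. The sample sentences and the function name are invented, and this is only an assumption about the general approach, not Schmidt's actual method.

```python
import re
from collections import Counter

def preceding_words(text, target="attention"):
    """Count the word that comes immediately before each use of `target`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(
        tokens[i - 1] for i, token in enumerate(tokens)
        if token == target and i > 0
    )

# Hypothetical sample sentences, invented for illustration.
sample = (
    "She paid attention to the lecture. He called attention to the error. "
    "They paid attention at last."
)
print(preceding_words(sample).most_common())
```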

 

A World in a Grain of Sand: Uncertainty and Poetry Corpora Visualization 

  • Katherine Coles & Julie Lien, University of Utah, English
  • Poems as multidimensional living things (a single poem as large data set)
  • “Complex capta”
  • Johanna Drucker, “Humanities Approaches to Graphical Display,” DHQ 5.1 (2011)
  • Myopia at DH2012

 

Phylogenetic Futures: Big Data and Design Fiction 

  • Kari Kraus, University of Maryland, Information Science & English
  • www.karikraus.com/
  • Moving humanities to be future-oriented discipline
    • CP Snow, Two Cultures
  • Phylogenetics: en.wikipedia.org/wiki/Phylogenetics
    • collate texts before charting evolution
    • D’Arcy Thompson: Fish Deformations (1917)
  • Design Fiction
    • The practice of mocking up or prototyping objects that embody our ideas about the future
    • Shared Horizons: Data, Biomedicine, and the Digital Humanities
    • mith.umd.edu/sharedhorizons/
    • Conlangs & Alternate Evolution: conlang.org/ & After Man by Dixon
    • Frederic Bartlett, Remembering (1932)
    • Nonobject, Branko Lukić and Barry Katz: nonobjectbook.com/read
    • Jane McGonigal, alternate reality games
    • Cathy’s Book: http://en.wikipedia.org/wiki/Cathy’s_Book
    • Warning Systems for Posterity
  • Material cultural evolution: what do we know?
    • component analysis
    • tree topology
    • variation
  • A lot of design fiction about provocation and not futurity
    • Julian Bleecker’s slow movement
  • Franco Moretti: tree of life vs. tree of culture (Graphs, Maps, Trees)
  • Guided Variation: how biological evolution differs from cultural evolution
  • NVivo: www.qsrinternational.com/products_nvivo.aspx

 

Lightning Talks

Grounds more relative than this: Critically Harnessing Uncertainty in Digital Literary Studies in Big Data 

  • Patrick Flor, University of Kansas, Graduate Student in English
  • Gulliver in Laputa — important for DH — frame of words
  • Humanities Computing
    • Computation of/from/over humanities artifacts
    • of = generative art
    • from = statistics from a text
    • over = taking texts and making texts about texts
  • Critical Humanities Computing
    • Create and choose algorithms, artifacts, results, etc.
    • Natural language processing
      • NLTK (Python framework): nltk.org/
      • Pervert the tools accordingly
  • POS tagging
    • Good hook for processing text, but not enough
  • Immediately reentering into text, not using computing to abstract from the text (i.e. as visualization does)
    • Computer substituting for failure as reader
  • Close reading circling out into broader interpretation

 

What Are You Going to Do with that Data? Results of Needs Assessment of Humanities Scholars for Digital Collections 

  • Harriett Green, University of Illinois, Digital Humanities & English Librarian
  • Woodchipper: mith.umd.edu/corporacamp/tool.php
  • Unsworth and “scholarly primitives”

 

Museum Collecting in the Age of Big Data: Opportunities for Collaboration 

  • Peter Welsh, University of Kansas, Museum Studies
Session Proposal: Undergraduates & DH Research
Fri, 21 Sep 2012
http://kansas2012.thatcamp.org/09/21/session-proposal-undergraduates-dh-research/


Inspired by Gregory Crane’s wonderful keynote earlier tonight, I would love to see a session on engaging undergraduates in digital humanities research projects. In my brief experience at Northwestern University’s CSCDC, I’ve worked on a few digital humanities research projects and courses in which undergraduates have played key roles doing online manuscript transcription, web archiving, and creating original digital history projects. I think Gregory is right (for a number of reasons) that it is time to bring undergraduates into the research process, and the digital humanities offer so many ways in which to do this. No doubt, this is a broad topic, but I’m assuming other people have similar experiences to share or are thinking about how best to begin down this path.
