In a previous blogpost, I introduced the project A Republic of Emails, where we created a dataset of the 30k Hillary Clinton Emails by scraping Wikileaks. Now that we have the data, we can start exploring with what I like to call the W-questions: What is the collection about? Where do described events take place? When did these events occur? Who are the actors involved? In this second blogpost, we will look at what the emails from the Hillary Clinton corpus are about. I will describe how we prepared the data to analyse a) the raw text, b) normalised text, and c) entities in the text (named entity recognition). Finally, we will look at a small subset of the emails using Voyant Tools. For all the steps I will point to the respective scripts on our GitHub so you can reproduce the project.
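The actual preparation scripts are on our GitHub; purely as an illustration of the normalisation step (the function name and the exact rules here are my own sketch, not the project's code), text normalisation can be as simple as lowercasing, stripping punctuation, and collapsing whitespace:

```python
import re
import string

def normalise(text):
    """Lowercase, strip punctuation, and collapse runs of whitespace."""
    text = text.lower()
    # remove all ASCII punctuation characters
    text = text.translate(str.maketrans("", "", string.punctuation))
    # collapse whitespace runs into single spaces
    return re.sub(r"\s+", " ", text).strip()

print(normalise("RE: Meeting at 10 AM -- please   confirm!"))
# re meeting at 10 am please confirm
```

A normalised text like this feeds more cleanly into frequency counts in a tool such as Voyant, at the cost of losing case and punctuation cues that named entity recognition relies on, which is why the raw text is kept alongside it.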
This year I will teach for the second time the Doing Digital History course for the History master at the University of Luxembourg. Just like last year, students will ask several W-questions. What is the collection about? Where do described events take place? When did these events occur? Who are the actors involved? In contrast with last year, where we had different collections per week, this year students will work with a single collection to experiment with throughout the course. In a series of blogposts I will describe the collection that the students will be exploring and the methods/tools that will be used to conduct close and distant reading. If you have feedback to further improve our ideas, please comment. If you wish to reproduce the project for your own courses, the blogposts should allow just that. As a reference to the historical Republic of Letters, I like to call this project A Republic of Emails.
The past six months I have been on parental leave to enjoy our son Felix (born 13 December 2015), and today I am finally back at the university. In these months I have seen a baby grow from not being able to do anything except for reflexes, to understanding objects around him, interacting with them, and manipulating them to do what he wants (although not yet always successfully). Watching him go through these stages of learning actually reminded me of the above gif captioned as how software developers see end users. When I saw that gif a while ago it gave me a laugh, but then I saw that my son had taken my bottle of water, and what he was doing was actually quite similar: licking the bottom, sucking on the side, holding it with his feet.
At some point he figured out what the top part is, and put that in his mouth, which left me to wonder how he figured it out. I left the cap on, so it is not simple trial-and-reward learning: he still cannot drink the water. Instead, I think there are two aspects to this learning process: visual feedback (seeing which side is supposed to be up), and learning by playing.
This week I’m at DHBenelux 2016, right here at the University of Luxembourg. I am part of the local organisation of the conference, and will give a tour of the DH Lab, which launched its website www.dhlab.lu this week. Moreover, I will present my PhD research in a short paper; see the abstract for my presentation below. To learn more about DHBenelux, see my previous posts on DHBenelux 2016 submissions and DHBenelux submissions 2014-2016.
This year marks the third annual DHBenelux conference, which cycles through the Netherlands, Belgium, and Luxembourg. The third instalment will be held in Luxembourg, and as part of the local organisation and programme committee I get the chance this year to look at all the submissions. Inspired by Scott Weingart’s series on submissions to the annual ADHO DH conference (see his 2016 post on submissions here), I present a first analysis of the submissions to DHBenelux 2016. Later posts will bring comparisons with the 2014 and 2015 editions, as well as a description of the steps taken to arrive at the figures below.
“Standing on the shoulders of giants” has long been the metaphor of choice to describe the scholarly workflow of discovering, reading, and citing literature. However, for the past decade this workflow has been influenced significantly by the availability of academic search engines. In this field, the search giant Google has come out as the discovery mechanism of choice. How does “standing on the shoulders of the Google giant” impact the scholarly workflow? This is a question I look into in a post on the LSE Impact of Social Science blog. Read the entire post here.
Methodological Intersections, the Digital Humanities Autumn School organised by Trier University and the University of Luxembourg, was held this year from 28 September to 3 October. With four days of theoretical reflection in Trier, and two days of hands-on courses in Belval, this autumn school provided a great introduction to the Digital Humanities for PhD students.
This blogpost is not intended to provide a complete overview of the autumn school, but rather to show the discussion from my perspective. The main theme I will follow is the discussion of tools, and the need for tool appraisal.
The development of tools plays an important role in the Digital Humanities. For the recent DHBenelux conference, I found that the word “tool” was used almost a hundred times across all the abstracts, not counting my own. Still, the actual adoption of all these tools by the target audience, humanities scholars, does not always reach its potential.[1] In a recently published paper by Martijn Kleppe and me, titled User Required? On the Value of User Research in the Digital Humanities, we look into how Digital Humanities scholars might address this problem.[2]
References
1. Claire Warwick, M. Terras, Paul Huntington, & N. Pappa. (2007). If You Build It Will They Come? The LAIRAH Study: Quantifying the Use of Online Resources in the Arts and Humanities through Statistical Analysis of User Log Data. Literary and Linguistic Computing, 23(1), 85–102. http://doi.org/10.1093/llc/fqm045
2. Max Kemman, & Martijn Kleppe. (2015). User Required? On the Value of User Research in the Digital Humanities. In Jan Odijk (Ed.), Selected Papers from the CLARIN 2014 Conference, October 24-25, 2014, Soesterberg, The Netherlands (pp. 63–74). Linköping University Electronic Press.
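The rough count of “tool” mentions across the conference abstracts can be reproduced with a simple frequency tally. As a hedged sketch (the folder layout, file naming, and whole-word matching rule are my assumptions here, not a description of how the count was actually done):

```python
import re
from pathlib import Path

def count_term(term, folder):
    """Count case-insensitive whole-word occurrences of a term
    (including its plural) across all .txt files in a folder."""
    pattern = re.compile(r"\b" + re.escape(term) + r"s?\b", re.IGNORECASE)
    total = 0
    for path in Path(folder).glob("*.txt"):
        total += len(pattern.findall(path.read_text(encoding="utf-8")))
    return total
```

Matching on word boundaries avoids inflating the count with words like “toolkit”, which is one reason such tallies should be treated as approximations rather than exact figures.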
In the first week of June, my supervisor Andreas Fickers and I went to the US to visit several Digital Humanities centres, specifically ones working on Digital History, in Boston (MA), Lincoln (NE), and Fairfax (VA). Since the University of Luxembourg will get its own DH centre soon, we went with the goal of learning how others set up their centre, how DH is incorporated into the curriculum, and how collaboration takes place.
This blogpost is an attempt to summarise what we learned during our visit to the US. The structure I will follow is not chronological, but based on the title of John le Carré’s novel: Tinker (building and making), Tailor (specific versus generic tools), Soldier (collaborations of people), Spy (digital literacy regarding online tracking and other subjects). At the bottom of the blogpost is a numbered list of the people we met; I will refer to sources of information using these numbers.