Where? Investigating the spatial entities in a corpus

Max Kemman
University of Luxembourg
November 29, 2016


Doing Digital History: Introduction to Tools and Technology


  • Space: Our Next Frontier
  • StoryMap
  • CartoDB
  • Selecting data with SQL
  • Next time
    • Assignment

Space: Our Next Frontier

Gregory - Exploiting Time and Space

Separation of time (historians) and space (geographers)

Cronon: two theories of growth of Chicago

  • Wilderness, then traders arrive, then cattle ranching, then extensive agriculture, then more intensive agriculture, finally industry
    Settlements become denser leading to cities
  • A city, a ring of intensive agriculture, more extensive agriculture, cattle ranching, finally wilderness
    The further from the city, the less viable to transport goods, the less dense the economic activity


Taking only one perspective is a serious limitation

Gregory - GIS does not answer the why question: this is where more traditional approaches are of use

Bodenhamer - The Potential of Spatial Humanities

Critique of GIS as being positivist or realist:

  • Reality depends on the perspective of the observer, which maps do not reflect
  • Maps can imply a certainty that the underlying evidence does not permit


Bodenhamer - How can we make GIS do what it was not intended to do, namely, represent the world as culture and not simply mapped locations?

Rather than using GIS in a quantitative sense, we will try to tell a story with a map

StoryMap Hands-on

Two options:


StoryMap creation

Before you start: StoryMap is not a data tool where you upload data to work on

Instead, you write texts and manually put the texts on the map

When working with the emails: look up selected emails by following the WikiLeaks URL, read them, and summarise them per location

Getting started

Start by clicking "Make a StoryMap now"

Log in with your Google account

Create a new StoryMap

Choose a title for now & create a first slide


The system works with slides similar to PowerPoint

Create a new slide per new location, drag to reorder

Locations - pin

With the pin you can select the location you want to describe

Drag to move the pin to where you want, zoom using the + and - symbols in the upper-left corner

Locations - search

Alternatively, you can search a location by its name

The Story

Tell the story: give the slide a title/headline and write the story. You can use HTML here


To make the story more visual, include relevant photos


Change the look of the map with different fonts and map types


Save your work regularly, and save again when you're done


When you're done: click the Share button to get a link or HTML code for embedding in your report

Embedding: You might want to play a bit with the width and height settings
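
To illustrate what the width and height settings change, here is a minimal sketch that assembles an embed snippet of the general iframe form. The helper name embed_code, the example URL, and the default sizes are assumptions for illustration; the code StoryMap actually gives you under Share may look different, so treat this only as a guide to which numbers to adjust.

```python
def embed_code(url: str, width: str = "100%", height: str = "600") -> str:
    """Build an iframe snippet; width and height are the values to experiment with."""
    # Hypothetical helper: the real embed code comes from the Share dialog.
    return (
        f'<iframe src="{url}" width="{width}" height="{height}" '
        f'frameborder="0"></iframe>'
    )


# Placeholder URL, not a real StoryMap.
print(embed_code("https://example.org/my-storymap", width="100%", height="800"))
```

A larger height gives the map more room in your report; a width of 100% lets it follow the width of the page.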


CartoDB

As showcased last week by Kate Jones


You can log in with your Google account

New dataset

Upload CSV

Either drag and drop the file onto the browser window, or select the "Browse" button

CartoDB does not work with the ODS file; upload the CSV instead

However, if you have been working with the ODS in Google Spreadsheets (because your laptop is French/German), import the spreadsheet by connecting to Google Drive

Connect dataset

Connecting dataset

Create map

Your first result will be a table

Click "Create map" to start working on the map

Map - select layer

Your first map will show pins of all the locations

To edit the map click the title of the layer with the same name as your dataset


Select "Pop-up" and play with the settings to get information when clicking a pin on the map


To get other types of maps, select "Style"

Here I've chosen the heatmap


When you have multiple colours, maybe add a legend to explain the colours

Selecting data with SQL

If you haven't made a selection with Google Spreadsheets before, this can be done in CartoDB as well

To make a selection of emails, change "Values" to "SQL". You will get this screen


Type a query, for example to select emails where the subject contains "bomb"


Click "apply" to apply the SQL. With the lower right buttons you can switch between table view and map view of your selection


The SQL query should have the following form:

SELECT * FROM name-of-dataset where name-of-column ilike '%word-of-interest%'

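Leave the first part (SELECT * FROM) as it is; for name-of-dataset, fill in the name of your dataset, for example f1_geocoded

For name-of-column, you can select subject, _from, _to, or date (note the leading _ in the _from and _to columns!)

For word-of-interest you can fill in anything you find of interest

Do not leave out the '% %'! The percent signs are wildcards that match any text before and after the word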
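
To make the three parts of the query concrete, here is a minimal Python sketch that fills in the template above. The function name build_query is hypothetical and only meant for illustration; in CartoDB you simply type the resulting query into the SQL box. The sketch does no escaping, so it is only suitable for words you type yourself.

```python
def build_query(dataset: str, column: str, word: str) -> str:
    """Fill in the query template: dataset name, column name, word of interest."""
    # The % signs on both sides are wildcards: any text may come before or after the word.
    return f"SELECT * FROM {dataset} where {column} ilike '%{word}%'"


print(build_query("f1_geocoded", "subject", "bomb"))
```

The printed line is the query you would paste into CartoDB; ilike makes the match case-insensitive, so "bomb" also finds "Bomb".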


Some examples

  • Select emails from Hillary Clinton
    SELECT * FROM f1_geocoded where _from ilike '%clinton%'
  • Select emails to Hillary Clinton
    SELECT * FROM f1_geocoded where _to ilike '%clinton%'
  • Select emails where subject contains "bomb"
    SELECT * FROM f1_geocoded where subject ilike '%bomb%'
  • Finally, we can combine multiple selections
    SELECT * FROM f1_geocoded where _from ilike '%abedin%' and _to ilike '%clinton%'
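
If you want to see how a combined selection behaves before trying it in CartoDB, the following sketch runs the same logic on a tiny in-memory table using Python's built-in sqlite3 module. Everything here is an assumption for illustration: the three rows are invented placeholders, not real emails, and SQLite has no ilike, so the sketch uses LIKE, which in SQLite ignores case for plain ASCII letters.

```python
import sqlite3

# In-memory stand-in for the CartoDB table; the rows below are invented placeholders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE f1_geocoded (_from TEXT, _to TEXT, subject TEXT)")
rows = [
    ("Abedin, Huma", "Clinton, Hillary", "placeholder subject 1"),
    ("Mills, Cheryl", "Clinton, Hillary", "placeholder subject 2"),
    ("Clinton, Hillary", "Abedin, Huma", "placeholder subject 3"),
]
conn.executemany("INSERT INTO f1_geocoded VALUES (?, ?, ?)", rows)

# Both conditions must hold: sender contains "abedin" AND recipient contains "clinton".
query = (
    "SELECT * FROM f1_geocoded "
    "WHERE _from LIKE '%abedin%' AND _to LIKE '%clinton%'"
)
matches = conn.execute(query).fetchall()
print(matches)
```

Only the first placeholder row satisfies both conditions; the other two each fail one of them, which is what "and" means in the combined query.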

SQL date

Selecting emails from a specific time period works a bit differently:

SELECT * FROM f1_geocoded where date >= '2010-01-01' and date < '2011-01-01'

Read: select emails where the date is January 1st, 2010 or later, and before January 1st, 2011

Can also be used to choose a specific month, or a period spanning multiple months or years
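
As a sketch of how the same pattern selects a single month, the Python snippet below computes the first day of the month and the first day of the following month, then fills them into the date query. The function name month_query is hypothetical; the only thing you need in CartoDB is the query it prints.

```python
from datetime import date


def month_query(dataset: str, year: int, month: int) -> str:
    """Build a query selecting all emails dated within one calendar month."""
    start = date(year, month, 1)
    # The end bound is the first day of the next month; December rolls over to January.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return (
        f"SELECT * FROM {dataset} "
        f"where date >= '{start.isoformat()}' and date < '{end.isoformat()}'"
    )


print(month_query("f1_geocoded", 2010, 12))
```

Using "before the first day of the next month" as the upper bound avoids having to know how many days the month has.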

Comparing layers in a single map

If you want to compare different information, you can create layers in a map

For example, suppose we want to see the difference between emails from Cheryl Mills and from Huma Abedin

Creating layer 1

First, in the first layer we will select emails from Cheryl Mills and apply

To know which layer shows what, we will rename the layer to describe what it shows. Next, click the blue Add button

Creating layer 2

To create a 2nd layer, select the same dataset and click "Add layer"

In the 2nd layer, we now select emails from Huma Abedin and apply

Showing the difference

Finally, to show the difference, in one of the 2 layers change the style of the pins and choose a different colour


To share your map, select the icon in the blue bar and follow the options. Finally click "Share"


Click "Publish"


Copy either the link or the HTML code to embed the map in your report

For next time

6 December

Who? Investigating the social entities in a corpus


  • From the Hillary Clinton emails, make a selection of emails based on time (e.g. a week) or subject/content
  • Map the locations, and describe per location what is discussed about these locations
  • Aim: a visual overview of how Hillary Clinton as Secretary of State covers different locations
  • Use either StoryMap or CartoDB
  • Data: download f1-geocoded (emails 1000-1999) or 10k-geocoded (emails 1-9999) either as ODS or CSV file (CartoDB requires CSV!)


  • Work in groups of two or three
  • Document your steps and choices and discuss how this map tells a story and how it is different from just writing the story: include both good and bad!
  • Hand in the assignment in HTML, include your name and a decent profile photo


800-1500 words, in English (not including the text in the StoryMap)


  • 1pt for free
  • 1pt for HTML & CSS
  • 3pts for a good diary and map
  • 2pts for documentation of your process (choice of tool, choice of data selection)
  • 3pts for critical reflection on your map (what does it show that you don't see from just the emails?)

Email to max.kemman@uni.lu before the start of the lecture of 13 December