With all the digitization efforts of the last decade, researchers are no longer concerned only with libraries, but with digital libraries. How does this transform the distribution of information? At the conference Theory and Practice of Digital Libraries (22-26 September 2013, Valletta, Malta), researchers and librarians discussed the current state of the art as well as future directions for digital libraries.
This blog post is not intended to provide a complete overview of the conference, but rather to present the discussion of digital libraries from my perspective.
From books to data
The opening keynote by Christine Borgman gave an overview of how libraries and research are transforming from books to data [1]. Borgman envisions the future of science as open science through open access publications and reusable research data. Not only does this change what academic communication is (e.g., is a tweet a citation?), but it also introduces difficulties in defining what research data is and how to ensure that it is reusable. To make data reusable, we need to anticipate the needs of researchers who have not been born yet. This concerns not only durability, but also the ability to interpret methods and data later on.
This change from books to data also introduces interesting new sources of data. Herbert van de Sompel presented the Memento project, which aims to make the history of the web browsable. As different archives cover different parts of the web, Memento indexes several archives to gain a broad overview of the web [2]. How to treat such snapshots is not trivial, as Michael Nelson showed: because websites are dynamic, a single snapshot of a webpage can contain information from multiple periods [3]. How to treat dynamic data is, in general, an interesting challenge for libraries and archives.
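To make the idea of combining several archives a bit more concrete, here is a minimal sketch in Python. It is not the Memento protocol or its API, only an illustration of the aggregation idea under my own assumptions: the archive names, the capture dates in `HOLDINGS`, and the function `closest_snapshot` are all hypothetical. Given the capture dates that each archive holds for one URL, it picks the snapshot closest in time to the date a user asks for.

```python
from datetime import datetime, timezone

# Hypothetical holdings for a single URL: archive name -> capture datetimes.
# Real archives would be queried over the network; these values are made up.
HOLDINGS = {
    "archive-a": [
        datetime(2009, 3, 1, tzinfo=timezone.utc),
        datetime(2013, 1, 15, tzinfo=timezone.utc),
    ],
    "archive-b": [datetime(2011, 6, 30, tzinfo=timezone.utc)],
    "archive-c": [],  # this archive has no capture of the URL
}


def closest_snapshot(holdings, target):
    """Return (archive, capture_datetime) nearest to target, or None."""
    # Flatten the per-archive lists into (archive, datetime) pairs.
    candidates = [
        (archive, captured)
        for archive, captures in holdings.items()
        for captured in captures
    ]
    if not candidates:
        return None
    # Pick the pair with the smallest absolute time distance to the target.
    return min(candidates, key=lambda pair: abs(pair[1] - target))


if __name__ == "__main__":
    wanted = datetime(2011, 1, 1, tzinfo=timezone.utc)
    print(closest_snapshot(HOLDINGS, wanted))
```

Because each archive covers only part of the web, looking across all of them gives a better chance of finding a snapshot near the requested date than asking any single archive would.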
A network of digital libraries
What became apparent from the TPDL2013 programme is that a lot of research is ongoing into how to employ the Semantic Web for digital libraries. Ridho Reinanda gave a very interesting presentation on how to use Named Entity Recognition to create a web of historical documents that links persons and events [4], an approach similar to that of the PoliMedia project, which I presented [5]. Moreover, RDF can be used to make research data reusable: Sören Auer showed a system to annotate papers in RDF in order to make methods and data discoverable and reusable [6]. Unfortunately, they chose to implement this as manual annotation, which seems to me a laborious and easily misused approach.
At the SUEDL2013 workshop, Theo van Veen presented how the Dutch National Library uses linked data to enrich documents. Instead of creating a browsable web of data, the aim is to use linked data to enhance existing documents, which is an interesting perspective [7]. Even more fascinating was his vision that with linked data, libraries might become outdated; users will want to interact with a web of linked collections. Instead of library interfaces, a multitude of tools can be developed on top of this web of linked data. This was already demonstrated by several projects built on Europeana data, such as map-based visualizations [8] and similarity search engines [9].
Standards for digitization?
The tour of the National Library of Malta reminded us of the groundwork that needs to be done to make the above possible. Before computer scientists can link collections, and before scholars can interpret this web of data, librarians need to digitize the not-born-digital material in their libraries. However, it became apparent that this process of digitization is not fully standardized yet. Having digitized about 1% of the collection, the library staff explained that they scan in MOS and JPG, which they publish online as PDF with watermarks. At this point, no OCR had been applied. Whether and how this is suitable for reuse by researchers is debatable: Lambert Schomaker stated at the eHumanities Workshop Soeterbeeck that he wished digitization would be done in TIFF. Before researchers can build a true semantic web of digital libraries, the documents need to be digitized in a standardized way. This appears to be close to impossible at a larger (European) scale.
The future of digital libraries
Just as general web search is shifting from searching for web pages to searching for answers through semantic search, so too will digital libraries shift from institutionalized collections to linked data. It appears that computer scientists will become more important in how users access cultural heritage. Instead of going through portals into digital libraries, users will be able to explore and interact with the data itself by means of a multitude of visualizations and tools. To truly understand what users and scholars will need from such tools, collaboration between computer scientists, scholars and librarians is now more important than ever [10].
[1] Borgman, C. L. (2013). Digital Scholarship and Digital Libraries: Past, Present, and Future. 17th International Conference on Theory and Practice of Digital Libraries. Valletta, Malta. September 2013. Available at: http://works.bepress.com/borgman/273
[2] Alsum, A., Weigle, M. C., Nelson, M. L., & Van de Sompel, H. (2013). Profiling Web Archive Coverage for Top-Level Domain and Content Language. In Research and Advanced Technology for Digital Libraries (pp. 60-71). Springer Berlin Heidelberg.
[3] Kelly, M., Brunelle, J. F., Weigle, M. C., & Nelson, M. L. (2013). On the Change in Archivability of Websites Over Time. In Research and Advanced Technology for Digital Libraries (pp. 35-47). Springer Berlin Heidelberg.
[4] Reinanda, R., Utama, M., Steijlen, F., & de Rijke, M. (2013). Entity Network Extraction based on Association Finding and Relation Extraction. In Research and Advanced Technology for Digital Libraries (pp. 156-167). Springer Berlin Heidelberg.
[5] Kemman, M., & Kleppe, M. (2013). PoliMedia – Improving Analyses of Radio, TV & Newspaper Coverage of Political Debates. In Research and Advanced Technology for Digital Libraries (pp. 401-404). Springer Berlin Heidelberg.
[6] Auer, S. (2013). What can Linked Data do for Digital Libraries? 17th International Conference on Theory and Practice of Digital Libraries. Valletta, Malta. September 2013.
[7] Van Veen, T., & Koppelaar, M. (2013). Doing More With Named Entities; Turning Text Into a Linked Data Hub. The 2nd International Workshop on Supporting Users Exploration of Digital Libraries. Valletta, Malta. September 2013.
[8] Hall, M., & Clough, P. (2013). Exploring Large Digital Library Collections Using a Map-Based Visualisation. In Research and Advanced Technology for Digital Libraries (pp. 216-227). Springer Berlin Heidelberg.
[9] Gordea, S. (2013). An Image Similarity Search for European Digital Library and Beyond. The 2nd International Workshop on Supporting Users Exploration of Digital Libraries. Valletta, Malta. September 2013.
[10] Kemman, M., Scagliola, S., De Jong, F., & Ordelman, R. (2013). Talking With Scholars – Developing a Research Environment for Oral History Collections. The 2nd International Workshop on Supporting Users Exploration of Digital Libraries. Valletta, Malta. September 2013.