The past few years have seen increasing interest in Natural Language Processing (NLP) and other text-mining techniques in the humanities. This trend was sparked by distant reading approaches in literary theory, which aim to draw general conclusions from large amounts of text using computational techniques. This way of tackling data has gained momentum, and many (large-scale) projects have been set up to meet the expectations of humanities scholars wanting to make sense of the vast amount of digitised data that has become available in the public domain: primarily 18th- and 19th-century, but also 20th-century, publications. For some time NLP has mainly been used to confirm existing historical knowledge, but now that many techniques and tools have matured, it is time to take interim stock of NLP and text mining in general, and of the mining of serial publications in particular. Moreover, it is also time to question the phenomenon of ‘scientific serendipity’ and the role of NLP as a set of technologies enabling such serendipity.
We intend to bring together humanists, social scientists and computational scientists who have been working with historical serial texts and/or periodicals, including newspapers, journals, book series, congress series, etc. We are mainly interested in reflections upon completed case studies: the research results compared to the initial expectations, examples of the phenomenon of ‘scientific serendipity’, the theoretical and heuristic implications of the choices made during the research process, and comparisons between NLP and other methods such as Semantic Network Analysis. Last but not least, we welcome reflections on graphical approaches to text corpora, such as the visualisation methods that are en vogue for exploratory searching, analysis and the communication of textual analyses. From n-grams through tree diagrams to word clouds: what are the epistemological implications of using graphic displays developed outside the humanities?
We welcome contributions dealing with current issues in fields ranging from socio-cultural history and the history of ideas to literary studies. Librarians, archivists and policy makers will be invited to a roundtable to discuss their policies on opening up digital collections.
Contributors to the workshop will be asked to submit a working paper one week before the workshop, in order to foster discussion among the participants and to prepare the work on an edited volume to be published by an international publisher.
Confirmed participants: Michael Piotrowski (Université de Lausanne), Eva Pettersson (Uppsala University) and Mikko Tolonen (University of Helsinki).
Scientific committee: Sally Chambers (Ghent University), Steven Claeyssens (Royal Library Netherlands), Simon Hengchen (Université libre de Bruxelles), Mike Kestemont (Antwerp University), Frédéric Lemmers (Royal Library of Belgium), Seth van Hooland (Université libre de Bruxelles), Marianne Van Remoortel (Ghent University), Joris Van Eijnatten (Utrecht University), Charles Van den Heuvel (University of Amsterdam), Christophe Verbruggen (Ghent University).
Sponsored by: TIC Collaborative (Belspo-Brain), Research Community Digital Humanities Flanders (FWO), DARIAH-VL (FWO), ReSIC ULB and Ghent Centre for Digital Humanities.