NLP4CMC 2016: 3rd Workshop on Natural Language Processing for Computer-Mediated Communication / Social Media

Call for Papers

27 Apr 2016

NLP4CMC 2016: 3rd Workshop on Natural Language Processing for Computer-Mediated Communication / Social Media - Workshop at KONVENS 2016, Bochum, Germany, September 22, 2016.

TOPIC AND SCOPE:

Over the past decade, there has been growing interest in collecting, processing and analyzing data from genres of social media and computer-mediated communication (CMC). As part of large corpora that have been automatically crawled from the web, CMC data are often regarded as an unloved “bycatch” that is difficult to handle with NLP tools optimized for processing edited text. On the other hand, the existence of CMC data in web corpora is relevant for all research and application contexts that require data sets representing the full diversity of genres and linguistic variation on the web. For corpus-based variational linguistics, CMC corpora are an important resource for closing the “CMC gap” both in corpora of contemporary written language and in corpora of spoken language: since CMC and social media make up an important part of contemporary everyday communication, investigations into language change and linguistic variation need to be able to include CMC and social media data in their empirical analyses. Nevertheless, the development of approaches and tools for processing the linguistic and structural peculiarities of CMC genres and for building CMC corpora is lagging behind the interest in working with these types of data in the fields of language technology, corpus-based linguistics and web mining.

The goal of the NLP4CMC workshops, which are organized by the GSCL special interest group “Social Media / Computer-Mediated Communication”, is to provide a platform for the presentation of results and the discussion of ongoing work in adapting NLP tools for processing CMC data and in using NLP solutions for building and annotating social media corpora. The main focus of the workshops is on German data, but submissions on NLP approaches, annotation experiments and CMC corpus projects for data in other European languages are also welcome. The 1st NLP4CMC workshop was held in September 2014 at KONVENS at the University of Hildesheim. The 2nd NLP4CMC workshop was held in September 2015 at the international conference of the German Society for Computational Linguistics & Language Technology (GSCL) at the University of Duisburg-Essen. The papers from both workshops have been published online.

TOPICS OF INTEREST:

We encourage the submission of research papers on best practices in building, annotating and processing corpora and lexical semantic resources for the analysis of social media / computer-mediated communication (CMC), including, but not restricted to, the following topics:

Collection, representation, maintenance and computer-assisted/automatic analysis of CMC and social media resources
Normalization (spelling correction, …)
Automatic preprocessing (tokenization, POS tagging, lemmatization, parsing, word sense disambiguation)
Annotation of linguistic and structural features in social media / CMC data (annotation schemas, annotation experiments, metadata …)
Domain adaptation
Automatic methods in corpus-based CMC / social media analysis (sentiment analysis, summarization, topic detection, trend detection, …)
Big-data social media analysis

Besides individual papers, the workshop program will include a round-table discussion with participants from the GSCL Shared Task on Automatic Linguistic Annotation of CMC / Social Media Corpora (EmpiriST2015), who will present and discuss results from the project and future perspectives for adapting NLP systems to CMC and social media data.

IMPORTANT DATES:

Submissions due: 30 June 2016
Notification (reviews due): 31 July 2016
Camera-ready papers (revised versions) due: 22 August 2016
Workshop: 22 September 2016

SUBMISSIONS:

Submissions should include the names and addresses of all authors and meet the following requirements:

Language: German or English
Length: 2-4 pages (incl. references)
Format: PDF
Style sheet: Please use the official KONVENS style sheet: https://www.linguistics.rub.de/konvens16/call/index.html#formatting-guidelines
Submissions will be accepted via EASYCHAIR: https://easychair.org/conferences/?conf=nlp4cmc2016

PROGRAM COMMITTEE:

Sabine Bartsch, TU Darmstadt
Stefanie Dipper, Ruhr University Bochum
Stefan Evert, University of Erlangen-Nürnberg
Iris Hendrickx, Radboud University Nijmegen
Verena Henrich, University of Tübingen
Axel Herold, Berlin-Brandenburg Academy of Sciences (BBAW), Berlin
Andrea Horbach, University of Saarbrücken
Tobias Horsmann, University of Duisburg-Essen
Anke Lüdeling, Humboldt University Berlin
Harald Lüngen, Institute for the German Language (IDS), Mannheim
Preslav Nakov, Qatar QCRI
Ines Rehbein, University of Potsdam
Roman Schneider, Institute for the German Language (IDS), Mannheim
Egon W. Stemle, Eurac Research, Bozen
Angelika Storrer, University of Mannheim
Simone Ueberwasser, University of Zürich
Kay-Michael Würzner, Berlin-Brandenburg Academy of Sciences (BBAW), Berlin

(more to be announced)

WORKSHOP ORGANIZERS:

Michael Beißwenger (University of Duisburg-Essen, German Linguistics)
Michael Wojatzki (University of Duisburg-Essen, Language Technology Lab)
Torsten Zesch (University of Duisburg-Essen, Language Technology Lab)

The workshop is organized by the special interest group “Social Media / Computer-Mediated Communication” of the German Society for Computational Linguistics & Language Technology (GSCL) (http://gscl.org/ak-ibk.html).