Keynote speakers

Robert Dale (Language Technology Group)

"The Commercial NLP Landscape"

Summary: The last few years have seen a tremendous surge in commercial interest in Artificial Intelligence, and with it, a widespread recognition that technologies based on Natural Language Processing can support valuable commercial applications. In this talk, I'll aim to give a comprehensive picture of the commercial NLP landscape, focussing on what I see as the key categories of activity: [1] virtual assistants, including chatbots; [2] text analytics and text mining technologies; [3] machine translation; [4] natural language generation; and [5] text correction technologies. In each case my goal is to sketch the history of work in the area, to identify the major players, and to give a realistic appraisal of the state of the art.

Biography: Robert Dale runs the Language Technology Group, an independent consultancy providing unbiased advice to companies on the selection and deployment of NLP technologies. Until recently, he was Chief Technology Officer of Arria NLG, where he led the development of a cloud-based natural language generation tool; prior to joining Arria in 2012, he held a chair in the Department of Computing at Macquarie University in Sydney, where he was Director of the university's Centre for Language Technology. After receiving his PhD from the University of Edinburgh in 1989, he taught there for several years before moving to Sydney in 1994. He played a foundational role in building up the NLP community in Australia, and was editor-in-chief of the Computational Linguistics journal from 2003 to 2012. He writes a semi-regular column, 'Industry Watch', for the Journal of Natural Language Engineering.

Josef van Genabith (DFKI, Saarbrücken)

"Neural Machine Translation"

Summary: Deep Neural Nets (DNNs) are having a substantial impact on Language Technologies (LTs). In this talk, I will concentrate on neural approaches to Machine Translation (NMT), in particular for morphologically complex languages with less constrained word order, and contrast neural with previous approaches to MT, drawing on research carried out in the QT21 H2020 research and innovation project (http://www.qt21.eu/) and on the QT21 systems for the WMT-2015, -2016 and -2017 shared tasks. I will also briefly consider the general impact of DNNs on processing pipelines, on interoperability (in the systems-engineering sense) and on end-to-end training for complex LT systems. I will outline potential benefits and end with some of the currently open research questions.
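
As background, one standard way to state the contrast the talk draws: a phrase-based system scores translations with a log-linear combination of separately trained component models, whereas an NMT system models the translation probability directly with a single network trained end to end:

    p_{\mathrm{PBMT}}(y \mid x) \propto \exp\Big(\sum_{k} \lambda_k \, h_k(x, y)\Big), \qquad
    p_{\mathrm{NMT}}(y \mid x) = \prod_{t=1}^{|y|} p(y_t \mid y_{<t}, x; \theta)

Here the h_k are separately trained feature functions (translation model, language model, reordering model, ...) whose weights lambda_k are tuned in a distinct step, while theta is the parameter set of one network optimized jointly, which is what makes the pipeline and end-to-end training questions above interesting.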

Biography: Josef van Genabith is one of the Scientific Directors of DFKI, the German Research Centre for Artificial Intelligence, where he heads the Multilingual Technologies (MLT) Group and, jointly with Prof. Hans Uszkoreit, the Language Technology (LT) Lab. He is also Professor of Translation-Oriented Language Technologies at Saarland University, Germany. He was the founding Director of the Centre for Next Generation Localisation (CNGL, now ADAPT) in Dublin, Ireland, and a Professor in the School of Computing at Dublin City University (DCU). He worked as a researcher at the Institut für Maschinelle Sprachverarbeitung (IMS) at the University of Stuttgart, Germany. He received his PhD from the University of Essex, U.K., and obtained his first degree at RWTH Aachen, Germany. His research interests include machine translation, parsing, generation, computer-assisted language learning and morphology. He currently coordinates the QT21 H2020 research and innovation project on machine translation (http://www.qt21.eu/) and heads the EC SMART 2014/1074 and 2015/1091 service contracts on European Language Resource Coordination (ELRC) (http://www.lr-coordination.eu/).

Véronique Hoste (Ghent University)

"Monitoring social media for signals of suicidality"

Summary: Online platforms are increasingly being used to express suicidal thoughts, but prevention workers are faced with an information overload when monitoring for such signals of distress. I will present ongoing work, in collaboration with the Belgian suicide prevention center, on online suicidality detection. I will discuss the problem of data labeling grounded in prevention practice, and two classification tasks derived from it: the detection of suicide-related posts and the detection of severe, high-risk content. Since the datasets we were confronted with are highly skewed, I will elaborate on a range of features and on the effects of model optimization (feature selection and hyperparameter optimization) and of cascaded classification. We have demonstrated the viability of suicidality detection for the moderation of a Belgian online platform, and are currently rolling it out to more networks. I will end by discussing other potential ways in which NLP can support suicide prevention.
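
A minimal sketch of the kind of cascaded set-up described above, using scikit-learn; the feature choices, parameter grids and label scheme here are illustrative assumptions, not the actual LT3 system:

    # A toy cascade: stage 1 flags suicide-related posts; stage 2 flags
    # severe high-risk content among the posts stage 1 let through.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.feature_selection import SelectKBest, chi2
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline
    from sklearn.svm import LinearSVC

    def make_stage():
        # Word/bigram tf-idf features, chi-squared feature selection and a
        # linear SVM; feature count and C are tuned by cross-validated search.
        pipe = Pipeline([
            ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
            ("select", SelectKBest(chi2)),
            ("clf", LinearSVC(class_weight="balanced")),  # counters class skew
        ])
        grid = {"select__k": [1000, 5000], "clf__C": [0.1, 1.0, 10.0]}
        return GridSearchCV(pipe, grid, scoring="f1", cv=5)

    def predict_cascade(stage1, stage2, posts):
        # Stage 2 only ever sees posts stage 1 flagged, so the harder
        # severity decision is made on a much less skewed subset.
        related = stage1.predict(posts)
        flagged = [p for p, y in zip(posts, related) if y == 1]
        high_risk = stage2.predict(flagged) if flagged else []
        return related, high_risk

    # Usage (with real data): fit make_stage() once on posts labelled
    # related/unrelated, once on related posts labelled severe/not severe,
    # then call predict_cascade on the incoming stream.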

Biography: Véronique Hoste is Professor of Computational Linguistics at Ghent University. She is the head of the Department of Translation, Interpreting and Communication at the Faculty of Arts and Philosophy, and the director of the LT3 Language and Translation Technology Team in the same department. She holds a PhD in computational linguistics from the University of Antwerp, with a dissertation on optimization issues in machine learning of coreference resolution (2005). Based on the conviction that shallow representations built on lexical information are not sufficient to model text understanding, her team has invested heavily in research, often crosslingual, on named entity recognition, word sense disambiguation (WSD), anaphora resolution, hypernym and hyponym detection, and related tasks, as well as in applications exploiting deeper text representations (such as readability prediction). She has published on this work in leading conference proceedings and journals, and supervises several PhD students.

Roberto Navigli (Sapienza University of Rome)

"Multilinguality for free, or why you should care about linking to (BabelNet) synsets"

Summary: Multilinguality is a key feature of today’s Web and a pervasive one in an increasingly interconnected world. However, many semantic representations, such as word (and often sense) embeddings, are grounded in the language from which they are obtained. In this talk I will argue that there is a pressing need to link our meaning representations to large-scale multilingual semantic networks such as BabelNet, and will show you several tasks and applications where multilingual representations of meaning provide a big boost, including key industrial use cases from Babelscape, our Sapienza startup company.

Biography: Roberto Navigli is Full Professor in the Department of Computer Science of the Sapienza University of Rome. He was awarded the 2013 Marco Somalvico AI*IA Prize for the best young researcher in AI. He is the first Italian recipient of an ERC Starting Grant in computer science, awarded for work on multilingual word sense disambiguation (2011-2016); he recently won a second ERC grant, a Consolidator Grant on multilingual, language-independent semantic representations; and he is a co-PI of a Google Focused Research Award on Natural Language Understanding. In 2015 he received the META prize for groundbreaking work in overcoming language barriers with BabelNet, a project also highlighted in TIME magazine. This year, he received the Prominent Paper Award from the Artificial Intelligence Journal. His research lies in the field of Natural Language Processing, including multilingual word sense disambiguation and induction, multilingual entity linking, large-scale knowledge acquisition, ontology learning from scratch, gamification for NLP, open information extraction and relation extraction. He is currently an Associate Editor of the Artificial Intelligence Journal.

Joakim Nivre (Uppsala University)

"Perspectives on Universal Dependencies"

Summary: Universal Dependencies (UD) is a framework for cross-linguistically consistent treebank annotation that has so far been applied to over 50 languages. A basic design principle of UD is to give priority to grammatical relations between content words, which are more likely to be parallel across languages, and to treat function words essentially as features of content words, functionally similar to but structurally distinct from morphological inflection. This principle has been questioned on the grounds that it gives rise to representations that are suboptimal for dependency parsing, where higher accuracy has often been observed when function words are treated as syntactic heads. In this talk, I will defend this principle from three different perspectives. First, I will show how it allows us to capture linguistic universals, similarities in grammatical constructions across structurally different languages, and thereby gives us a solid basis for contrastive linguistic studies. Second, I will illustrate how it provides a natural interface to semantic interpretation, and thereby serves the needs of downstream language understanding tasks, especially in multilingual settings. Finally, I will review recent work on UD parsing, suggesting that the suboptimal nature of the representations has been greatly exaggerated.
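
To make the head-choice question concrete, here is a toy rendering of the two competing analyses in Python; the first tree follows the actual UD guidelines for English, while the function-word-head tree and its labels are only illustrative:

    # Two analyses of "The results have been published."
    # Each token: (id, form, head_id, deprel); head_id 0 marks the root.

    UD_STYLE = [                      # content verb heads the clause
        (1, "The", 2, "det"),
        (2, "results", 5, "nsubj:pass"),
        (3, "have", 5, "aux"),
        (4, "been", 5, "aux:pass"),
        (5, "published", 0, "root"),
    ]

    FUNCTION_HEAD_STYLE = [           # finite auxiliary heads the clause
        (1, "The", 2, "det"),
        (2, "results", 3, "subj"),    # labels here are illustrative only
        (3, "have", 0, "root"),
        (4, "been", 3, "vc"),
        (5, "published", 4, "vc"),
    ]

    def show(tree):
        forms = {i: f for i, f, _, _ in tree}
        for _, form, head, rel in tree:
            print(f"{form:10} <-{rel}- {forms.get(head, 'ROOT')}")

    show(UD_STYLE)             # "results" attaches to the content verb ...
    print()
    show(FUNCTION_HEAD_STYLE)  # ... but here to the auxiliary "have"

In a language that expresses the same clause with a single inflected verb and no auxiliaries, only the UD-style analysis keeps the subject relation holding between the same two content words, which is the cross-linguistic parallelism the summary appeals to.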

Biography: Joakim Nivre is Professor of Computational Linguistics at Uppsala University. He holds a PhD in General Linguistics from the University of Gothenburg and a PhD in Computer Science from Växjö University. His research focuses on data-driven methods for natural language processing, in particular for syntactic and semantic analysis. He is one of the main developers of the transition-based approach to syntactic dependency parsing, described in his 2006 book Inductive Dependency Parsing and implemented in the widely used MaltParser system, and one of the founders of the Universal Dependencies project, which aims to develop cross-linguistically consistent treebank annotation for many languages and currently involves over 100 researchers around the world. He has produced over 200 scientific publications and has more than 10,000 citations according to Google Scholar (May 2017). He is currently President of the Association for Computational Linguistics.