CSIRO Data61 at the WNUT Geo Shared Task
Abstract
In this paper, we describe CSIRO Data61’s participation in the Geolocation shared task at the Workshop for Noisy User-generated Text. Our approach was to use ensemble methods to capitalise on four component methods: heuristics based on metadata, a label propagation method, timezone text classifiers, and an information retrieval approach. The ensembles we explored examine the role of language technologies in geolocation prediction and the use of hard voting and cascading ensemble methods. Based on the accuracy of city-level predictions, our systems were the best performing submissions at this year’s shared task. Furthermore, when estimating the latitude and longitude of a user, our median error distance was within 30 kilometers.
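The two ensemble styles named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the city label strings, the tie-breaking rule, and the convention that a component returns `None` when it abstains are all assumptions made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def hard_vote(predictions: Sequence[Optional[str]]) -> Optional[str]:
    """Return the city label predicted by the most components.

    Components that abstain (None) do not vote. Ties go to the label
    that appears first in component order (an assumed tie-break rule).
    """
    votes = [p for p in predictions if p is not None]
    if not votes:
        return None
    counts = Counter(votes)
    best = max(counts.values())
    # Walk the votes in component order so ties resolve deterministically.
    return next(p for p in votes if counts[p] == best)


def cascade(predictions: Sequence[Optional[str]]) -> Optional[str]:
    """Return the first non-abstaining prediction, in priority order."""
    return next((p for p in predictions if p is not None), None)


# Hypothetical outputs of four components for one user, in priority order.
component_outputs = ["sydney-au", "melbourne-au", "sydney-au", None]
voted = hard_vote(component_outputs)
cascaded = cascade([None, "melbourne-au", "sydney-au", None])
```

Hard voting treats every component equally, whereas a cascade trusts the highest-priority component that produces an answer and only falls back to the others when it abstains.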
Similar resources
Data61-CSIRO systems at the CLPsych 2016 Shared Task
This paper describes the Data61-CSIRO text classification systems submitted as part of the CLPsych 2016 shared task. The aim of the shared task is to develop automated systems that can help mental health professionals with the process of triaging posts with ideations of depression and/or self-harm. We structured our participation in the CLPsych 2016 shared task in order to focus on different fa...
Stable Matching with Uncertain Pairwise Preferences
Haris Aziz, Data61, CSIRO and UNSW Sydney, Australia; Péter Biró, Hungarian Academy of Sciences, Budapest, Hungary; Tamás Fleiner, Loránd Eötvös University, Budapest, Hungary; Serge Gaspers, Data61, CSIRO and UNSW Sydney, Australia; Ronald de Haan, Technische Universität Wien, Vienna, Austria...
Adapting a General Purpose Social Robot for Paediatric Rehabilitation through In-situ Design
FELIP MARTÍ CARRILLO, Swinburne University of Technology, Australia and Data61, CSIRO, Australia; JOANNA BUTCHART, Royal Children’s Hospital, Australia and Murdoch Children’s Research Institute, Australia; SARAH KNIGHT, Murdoch Children’s Research Institute, Australia and Royal Children’s Hospital, Australia; ADAM SCHEINBERG, Royal Children’s Hospital, Australia and Murdoch Children’s Research Insti...
Twitter Geolocation Prediction Shared Task of the 2016 Workshop on Noisy User-generated Text
This paper describes the shared task on English Twitter geolocation prediction associated with WNUT 2016. We discuss details of the task settings, data preparation and participant systems. The derived dataset and performance figures from each system provide baselines for future research in this area.
Bidirectional LSTM for Named Entity Recognition in Twitter Messages
In this paper, we present our approach for named entity recognition in Twitter messages that we used in our participation in the Named Entity Recognition in Twitter shared task at the COLING 2016 Workshop on Noisy User-generated text (WNUT). The main challenge that we aim to tackle in our participation is the short, noisy and colloquial nature of tweets, which makes named entity recognition in ...