Text-Based Twitter User Geolocation Prediction

نویسندگان

  • Bo Han
  • Paul Cook
  • Timothy Baldwin
چکیده

Geographical location is vital to geospatial applications like local search and event detection. In this paper, we investigate and improve on the task of text-based geolocation prediction of Twitter users. Previous studies on this topic have typically assumed that geographical references (e.g., gazetteer terms, dialectal words) in a text are indicative of its author’s location. However, these references are often buried in informal, ungrammatical, and multilingual data, and are therefore non-trivial to identify and exploit. We present an integrated geolocation prediction framework and investigate what factors impact on prediction accuracy. First, we evaluate a range of feature selection methods to obtain “location indicative words”. We then evaluate the impact of nongeotagged tweets, language, and user-declared metadata on geolocation prediction. In addition, we evaluate the impact of temporal variance on model generalisation, and discuss how users differ in terms of their geolocatability. We achieve state-of-the-art results for the text-based Twitter user geolocation task, and also provide the most extensive exploration of the task to date. Our findings provide valuable insights into the design of robust, practical text-based geolocation prediction systems.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Twitter User Geolocation Using a Unified Text and Network Prediction Model

We propose a label propagation approach to geolocation prediction based on Modified Adsorption, with two enhancements: (1) the removal of “celebrity” nodes to increase location homophily and boost tractability; and (2) the incorporation of text-based geolocation priors for test users. Experiments over three Twitter benchmark datasets achieve state-of-the-art results, and demonstrate the effecti...

متن کامل

A Stacking-based Approach to Twitter User Geolocation Prediction

We implement a city-level geolocation prediction system for Twitter users. The system infers a user’s location based on both tweet text and user-declared metadata using a stacking approach. We demonstrate that the stacking method substantially outperforms benchmark methods, achieving 49% accuracy on a benchmark dataset. We further evaluate our method on a recent crawl of Twitter data to investi...

متن کامل

Twitter Geolocation Prediction Shared Task of the 2016 Workshop on Noisy User-generated Text

This paper describes the shared task for the English Twitter geolocation prediction associated with WNUT 2016. We discuss details of the task settings, data preparation and participant systems. The derived dataset and performance figures from each system provide baselines for future research in this realm.

متن کامل

pigeo: A Python Geotagging Tool

We present pigeo, a Python geolocation prediction tool that predicts a location for a given text input or Twitter user. We discuss the design, implementation and application of pigeo, and empirically evaluate it. pigeo is able to geolocate informal text and is a very useful tool for users who require a free and easy-to-use, yet accurate geolocation service based on pre-trained models. Additiona...

متن کامل

Geolocation Prediction in Twitter Using Location Indicative Words and Textual Features

Knowing the location of a social media user and their posts is important for various purposes, such as the recommendation of location-based items/services, and locality detection of crisis/disasters. This paper describes our submission to the shared task “Geolocation Prediction in Twitter” of the 2nd Workshop on Noisy User-generated Text. In this shared task, we propose an algorithm to predict ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • J. Artif. Intell. Res.

دوره 49  شماره 

صفحات  -

تاریخ انتشار 2014