In recent years, the amount of user-generated content shared on the Web has increased significantly, especially in social media environments such as Twitter, Facebook, and Google+. This large quantity of data has created the need for reactive and sophisticated systems that can capture and understand the information it contains.
TWINE is a real-time system for exploring information extracted from Twitter streams, helping users gain insights into these streams. The proposed system is based on a Named Entity Recognition and Linking pipeline, which extracts real-world entities mentioned in tweets, links these mentions to entities described in the DBpedia knowledge base, and uses these entities and additional information to provide multi-dimensional spatial analysis of the processed tweets. TWINE is supported by a scalable and flexible architecture based on Big Data technologies.
A user can define her own search keywords, which TWINE uses to retrieve a stream of tweets. Tweets are analyzed in real time and can be visualised and filtered through the TWINE interfaces.
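The flow described above (keyword-based retrieval, entity recognition, linking to a knowledge base) can be illustrated with a minimal sketch. This is not TWINE's implementation: the toy knowledge base `TOY_KB`, the function names, the capitalized-word heuristic used in place of a trained recognizer, and the `difflib` similarity with a 0.8 threshold are all assumptions made for illustration only.

```python
import re
from difflib import SequenceMatcher

# Hypothetical miniature knowledge base: label -> DBpedia-style URI.
TOY_KB = {
    "Barack Obama": "http://dbpedia.org/resource/Barack_Obama",
    "Milan": "http://dbpedia.org/resource/Milan",
    "Twitter": "http://dbpedia.org/resource/Twitter",
}


def matches_keywords(tweet, keywords):
    """Keep a tweet if it contains any user-defined keyword (case-insensitive)."""
    text = tweet.lower()
    return any(k.lower() in text for k in keywords)


def extract_mentions(tweet):
    """Naive stand-in for entity recognition: runs of capitalized words."""
    return re.findall(r"[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*", tweet)


def link_mention(mention, kb=TOY_KB, threshold=0.8):
    """Return the URI of the most similar label, or None if below threshold.

    Similarity is difflib's ratio, chosen only because it is in the
    standard library; the threshold is an arbitrary assumption.
    """
    best_label, best_score = None, 0.0
    for label in kb:
        score = SequenceMatcher(None, mention.lower(), label.lower()).ratio()
        if score > best_score:
            best_label, best_score = label, score
    return kb[best_label] if best_score >= threshold else None


def process(tweet, keywords):
    """Filter by keyword, then pair each extracted mention with its link."""
    if not matches_keywords(tweet, keywords):
        return []
    return [(m, link_mention(m)) for m in extract_mentions(tweet)]


if __name__ == "__main__":
    for pair in process("Barak Obama visits Milan today", ["obama"]):
        print(pair)
```

In this sketch the misspelled mention "Barak Obama" still resolves to the Barack Obama entry because the string similarity exceeds the threshold, while a tweet that matches none of the keywords is dropped before any extraction takes place.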
Debora Nozza, Fausto Ristagno, Matteo Palmonari, Elisabetta Fersini, Pikakshi Manchanda, and Enza Messina. 2017. TWINE: A real-time system for TWeet analysis via INformation Extraction. In Proceedings of the European Chapter of the Association for Computational Linguistics (Demo Paper).
Davide Caliano, Elisabetta Fersini, Pikakshi Manchanda, Matteo Palmonari, and Enza Messina. 2016. Unimib: Entity linking in tweets using Jaro-Winkler distance, popularity and coherence. In Proceedings of the 6th International Workshop on Making Sense of Microposts (#Microposts).