Submitted by CLARIN-LT on 5 June 2017
In May 2017, CLARIN-LT hosted the CLARIN-PLUS Workshop “Creation and Use of Social Media Resources” in Kaunas, Lithuania, the fourth and final workshop in a series of user involvement workshops within the CLARIN-PLUS project. The aim of the workshop was to gather researchers interested in social media data from computational linguistics, social sciences, psycholinguistics, corpus linguistics, language variation and other research domains, so that they could share their ideas on a number of themes and topics. This included sharing information about existing social media data resources and tools and about active research projects that analyse social media data, and exploring problems related to data collection, storage and access. The discussions led to identifying some common research directions, some joint efforts to achieve resource interoperability and comparability, as well as further suggestions for ongoing cooperation.
The CLARIN-LT community was especially happy to host the event in Kaunas, Lithuania, and hoped the participants would enjoy the stay on both a professional and a personal level. This event felt special because it brought together specialists from very different research disciplines in order to gain insight into the variation and diversity of social media as a phenomenon, as well as into research on social media and the creation of social media resources. We also wanted to encourage a dialogue between the users and creators of social media resources.
After the workshop, we asked Lithuanian participants to share their impressions of the event:
“Even nowadays, Digital Humanities in Lithuania still remain a challenge, a plan, whereas the CLARIN-PLUS workshop demonstrated how technologies actually support research in the Humanities and Social Sciences. Especially valuable to me were the discussions about the representativeness and reliability of the data from social media networks. The presentations about the possibilities and the limitations of various tools were very helpful for researchers who would like to analyse social media data.” Loic
“Thank you for the possibility to participate at the CLARIN event; it was my first time, but hopefully not the last. I’ve heard interesting presentations and examples of useful practical research; besides, new contacts and discussions with the colleagues working on similar topics could possibly grow into collaboration and shared projects in the future.” Jurgita
“I was impressed by the high academic quality of the workshop and presentations. The event was very important to us, as Lithuanian scientists are just taking their first steps in Social Media analysis. Therefore, the developed technology, resources and know-how of colleagues can help us bridge the gap in this field of research.” Darius
“Being a linguist myself, I particularly liked the presentation by Dr Dirk Hovy (the University of Copenhagen) “NLP meets Computational Social (Media) Science”. It touched upon a number of relevant questions for the society as well as showed the interdisciplinary nature of social media research. A variety of the methods, approaches to data and resources can be used by researchers of social and other sciences to analyse information and the results can be practically applied in economics, politics, medicine or other spheres.” Jolanta
“It is really difficult to mention all the thought-provoking ideas heard during the workshop. The topics discussed included the difficulties faced with when gathering the research data; Twitter data as a way to involve students in research; the problems of how one’s data inclusion criteria may affect the effectiveness of the NLP tools; research on Ukraine’s Euromaidan on Twitter; emoji coding, sarcasm and its automatic (non-)detection; critical approaches to data collection criteria and over/under-representation of different social media forms; the distinction between the different CMC cultures and sociolects; local vs. international chat cultures; the necessity to combine quantitative and qualitative research; simulation experiments; customisable search interfaces to make the data reusable. I think we all generated our own workshop “keyword lists” which may differ completely depending on our research interests.” Jurga
The discussion at the workshop was intensive and inspiring; here are some of the memorable quotes:
- “Retweet is evil in my world. It is good in yours.” Steven
- “We don’t know what we don’t know.” Rebekah
- “If you use a full stop at the end of the message, people will think you are angry. Use a smiley, not a full stop.” Reinhild
- “Aristotle would have loved Twitter.” Yin
- “I’m from a different dimension, but I might become a user sometime in the future.” Ester
The Workshop Impact
The workshop attracted considerable attention from the authorities of Vytautas Magnus University, the local academic community, and the CLARIN-LT consortium. The event was advertised on the university and CLARIN-LT websites. All local participants reported that they benefited substantially from the workshop, which offered high-quality presentations, discussions, and practical sessions. The event was very useful and timely, as Vytautas Magnus University and the CLARIN-LT consortium are about to start a large-scale project funded by the European Structural Funds, which among other things involves collecting and processing social media sources. Some important lessons were learned too, e.g. the importance of metadata for social media text corpora, Twitter harvesting solutions, the numerous ways in which social media data analysis can be interpreted and the pitfalls in interpreting these data, and the limitations of the Twitter APIs. Undoubtedly, workshops that focus on a particular area need to be organized on a regular basis, to disseminate knowledge and increase the competence of CLARIN ERIC members.
Photographs from the workshop taken by Andrius Utka & Leon Wessels.