Making Sense of Large Social Media Corpora
Keywords, Topics, Sentiment, and Hashtags in the Coronavirus Twitter Corpus
Abstract
This open access book offers a comprehensive overview of techniques and approaches for exploring large social media corpora, using the Coronavirus Twitter Corpus as an illustrative case study. First, the author describes in detail a number of methods, strategies, and tools that can be used to access, manage, and explore large Twitter/X corpora, ranging from user-friendly applications to more advanced methods that require data management skills and custom programming scripts. He goes on to show how these tools and methods are applied to explore one of the largest publicly released Twitter datasets on the COVID-19 pandemic, covering the two years when the pandemic had the strongest impact on society. Specifically, keyword extraction, topic modelling, sentiment analysis, and hashtag analysis methods are described, contrasted, and applied to extract information from the Coronavirus Twitter Corpus. The book will be of interest to students and researchers in fields that make use of big data to address societal and linguistic concerns, including corpus linguistics, sociology, psychology, and economics.
Keywords
social media; keyword extraction; corpus linguistics; natural language processing; sentiment analysis
DOI
10.1007/978-3-031-52719-7
ISBN
9783031527197, 9783031527180
Publisher
Springer Nature
Publisher website
https://www.springernature.com/gp/products/books
Publication date and place
Cham, 2024
Imprint
Palgrave Macmillan
Classification
Linguistics
Media studies: internet, digital media and society
Media studies
Communication studies