Topical Analysis of Twitter User Clusters

This project was the culmination of my Master's degree work and was influenced by the RADII project (Reticular Analysis of Discourse for Influence Indicators), which was active in my lab. It involved creating artificial dialogues from user clusters in a Twitter dataset on the 2017 French election. Retweets were used to build a directed graph of users, in which each edge represents a retweet relationship and each edge weight is the number of retweets. Clusters were created using the Leiden clustering algorithm. Many tweets were translated from French to English using the Helsinki-NLP opus-mt-fr-en model through the Hugging Face Transformers library. Topical analysis was performed on the dialogues using the SCIL toolkit to identify meso-topics (the most prevalent topics in a dialogue), and hashtag usage was analyzed for further insight. The visualization, laid out with ForceAtlas2 in Gephi, highlighted the 10 largest clusters, using a weighted-degree threshold of at least 50. The project was presented at the Fall 2022 Master's Project poster session at Rensselaer Polytechnic Institute.
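
The retweet graph and the weighted-degree threshold mentioned above can be illustrated with a minimal sketch. This is not the project's actual code: the function names, the toy user names, and the edge direction (retweeter to original author) are assumptions made for illustration. Weighted degree is taken here as the sum of incoming and outgoing edge weights, and the toy threshold is 3 rather than 50 so the small example stays readable.

```python
from collections import Counter, defaultdict


def build_retweet_edges(retweets):
    """Count retweets per (retweeter, original_author) pair.

    The direction retweeter -> original author is an assumption;
    the source only says the graph is directed.
    """
    return Counter(retweets)


def weighted_degree(edges):
    """Sum incoming and outgoing edge weights for every user."""
    degree = defaultdict(int)
    for (source, target), weight in edges.items():
        degree[source] += weight
        degree[target] += weight
    return dict(degree)


def filter_by_weighted_degree(edges, threshold):
    """Keep only edges whose two endpoints both meet the threshold."""
    degree = weighted_degree(edges)
    keep = {user for user, total in degree.items() if total >= threshold}
    return {
        pair: weight
        for pair, weight in edges.items()
        if pair[0] in keep and pair[1] in keep
    }


# Hypothetical toy data: each tuple is one retweet event.
retweets = [
    ("alice", "bob"),
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "alice"),
    ("dave", "carol"),
]

edges = build_retweet_edges(retweets)
degree = weighted_degree(edges)
filtered = filter_by_weighted_degree(edges, threshold=3)

print(dict(edges))
print(degree)
print(filtered)
```

In a fuller pipeline, a weighted edge list of this shape could be handed to a graph library for Leiden clustering and exported to Gephi for layout; those steps are left out here because they depend on third-party packages.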