Emotions in Tweets: A Sentiment Analysis Approach
Abstract
Social media sentiment analysis has become a crucial tool for analysing public opinion, brand perception, and customer sentiment in the digital age. The approach uses natural language processing (NLP), machine learning, and computational linguistics to categorize the attitudes conveyed in social media posts as positive, negative, neutral, or irrelevant. Because social media writing is unstructured and informal, the task poses distinct difficulties, such as handling slang, sarcasm, and multilingual content. Developing a sentiment analysis system typically begins with data collection from GitHub or Twitter (using a separate account). The data is then preprocessed and explored using tokenization, Porter stemming, TextBlob, stop-word removal, word clouds, and text normalization. Feature extraction approaches, such as CountVectorizer, turn the text into numerical representations, which are then used to train machine learning models. In this work, Logistic Regression and SVM models were trained and assessed using accuracy, precision, recall, and F1-score. The SVM model achieved the highest accuracy, 97.49%, outperforming Logistic Regression in sentiment categorization.
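As a rough illustration of the feature extraction and evaluation steps named in the abstract, the following minimal sketch builds bag-of-words count vectors and computes accuracy, precision, recall, and F1-score using only the Python standard library. It is not the authors' code: the stop-word list, the sample tweets, the labels, and all function names are assumptions made for the example, and the hand-written vectorizer is a simplified stand-in for a library class such as CountVectorizer.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the paper does not list its own).
STOP_WORDS = {"a", "an", "the", "is", "this", "i", "it", "to", "and", "of"}


def tokenize(text):
    """Lowercase the text and keep word tokens that are not stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def fit_vocabulary(docs):
    """Map each distinct token to a column index, in alphabetical order."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(vocab)}


def vectorize(doc, vocab):
    """Return the bag-of-words count vector for one document."""
    counts = Counter(tokenize(doc))
    # Dicts preserve insertion order, so columns follow the sorted vocabulary.
    return [counts.get(tok, 0) for tok in vocab]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, label):
    """Precision, recall, and F1-score for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical tweets, used only to show the shape of the output.
tweets = ["I love this phone", "This phone is terrible", "Love love the camera"]
vocab = fit_vocabulary(tweets)
features = [vectorize(t, vocab) for t in tweets]

# Hypothetical true and predicted labels for four test tweets.
y_true = ["pos", "neg", "pos", "neg"]
y_pred = ["pos", "pos", "pos", "neg"]
acc = accuracy(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred, "pos")
```

In a full pipeline, count vectors like `features` would be passed to a classifier such as Logistic Regression or an SVM, and the metrics would be computed on a held-out test set rather than on hand-written labels.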
Keywords
Twitter Data
Natural Language Processing (NLP)
Machine Learning
Social Media Analytics
Logistic Regression
Support Vector Machine (SVM)
Data Preprocessing
Polarity Detection
Feature Extraction