HYBRID FEATURE-BASED PREDICTION OF SUICIDE-RELATED ACTIVITY ON TWITTER
Abstract
Suicide remains a major global health concern, with fatalities rising each year worldwide. As social media platforms have become widespread, users increasingly turn to them to discuss highly personal topics, including thoughts of suicide. This study focuses on identifying and analyzing suicidal ideation expressed on Twitter. We first filter out non-contributing users and identify potentially relevant tweets, then compare these tweets against risk factors outlined by domain experts. The volume of data on such platforms makes processing costly in both time and computing resources. To address this, we use a feature extraction approach based on emoticons and synonyms, together with an n-gram model that combines unigrams, bigrams, and trigrams and scores them against a hybrid dictionary. Using these features, our model predicts the severity of posts containing suicidal ideation with machine learning algorithms. Finally, we compare support vector machines (SVM), Naive Bayes (NB), and Random Forests (RF) to evaluate the effectiveness of our approach.
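
To make the n-gram and hybrid-dictionary scoring step concrete, the following is a minimal sketch using only the Python standard library. It is not the lexicon or scoring rule used in this study: the dictionary entries, weights, synonym map, and the function names `tokenize`, `ngrams`, and `score` are all hypothetical and chosen for illustration. The sketch assumes that the score of a post is the sum of the weights of every unigram, bigram, and trigram found in the dictionary, with synonyms folded onto a canonical term and emoticons treated as tokens.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical hybrid dictionary: n-gram (tuple of tokens) -> weight.
# Entries and weights are illustrative assumptions, not the study's lexicon.
HYBRID_DICT: Dict[Tuple[str, ...], float] = {
    ("hopeless",): 1.0,          # unigram
    ("goodbye",): 0.5,           # unigram
    (":(",): 0.5,                # emoticon treated as a unigram
    ("no", "reason"): 1.5,       # bigram
    ("cant", "go", "on"): 3.0,   # trigram
}

# Hypothetical synonym map: variants are folded onto a dictionary term.
SYNONYMS: Dict[str, str] = {"despairing": "hopeless", "farewell": "goodbye"}

# Match simple two-character emoticons first, then alphabetic words.
TOKEN_RE = re.compile(r"[:;][()]|[a-z']+")


def tokenize(text: str) -> List[str]:
    """Lowercase, split into words and emoticons, drop apostrophes, map synonyms."""
    raw = TOKEN_RE.findall(text.lower())
    cleaned = [t.replace("'", "") for t in raw]
    return [SYNONYMS.get(t, t) for t in cleaned if t]


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams; empty if the post has fewer than n tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def score(text: str) -> float:
    """Sum the dictionary weights of all unigrams, bigrams, and trigrams in the post."""
    tokens = tokenize(text)
    total = 0.0
    for n in (1, 2, 3):
        for gram in ngrams(tokens, n):
            total += HYBRID_DICT.get(gram, 0.0)
    return total
```

In a pipeline like the one described above, a score of this kind would be one feature among several passed to the classifiers (SVM, NB, RF), not a severity prediction on its own.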
