Characteristics of Multi-Class Suicide Risks Tweets Through Feature Extraction and Machine Learning Techniques

Yan Qian Lim - Universiti Tunku Abdul Rahman, Selangor DE 43000, Malaysia
Yim Ling Loo - Universiti Multimedia, Cyberjaya, Selangor DE 63100, Malaysia

This paper presents a detailed analysis of the linguistic characteristics associated with specific levels of suicide risk and examines how feature extraction techniques affect the effectiveness of predictive models of suicide ideation. Much prior research has addressed the detection of suicide ideation in social media posts through feature extraction and machine learning, but little of it has addressed the multiclass classification of suicide risk or the impact of linguistic characteristics on predictability. To address this gap, this paper proposes a machine learning framework that performs multiclass classification of suicide risk from social media posts, together with an extended analysis of the linguistic characteristics that contribute to suicide risk detection. A dataset of 552 Twitter posts was manually annotated for supervised suicide risk modeling. Features were extracted by combining term frequency-inverse document frequency (TF-IDF), Part-of-Speech (PoS) tagging, and the Valence Aware Dictionary and sEntiment Reasoner (VADER). The model was trained with the Random Forest technique. Testing on 138 samples, with scenarios of detection on real-time data, yielded 86.23% accuracy, 86.71% precision, and 86.23% recall, an improvement attributed to the combination of feature extraction techniques rather than to the data modeling technique. The extended analysis of linguistic characteristics showed that a sentence's context is the main contributor to suicide risk classification accuracy, whereas grammatical tags and strong conclusive terms are not.
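The reported accuracy and recall are identical (86.23%, which on 138 test samples corresponds to 119 correct predictions), while precision is slightly higher. That pattern is what support-weighted averaging over classes produces, since weighted recall is always equal to accuracy. The abstract does not state which averaging scheme was used, so the following is only a minimal sketch under that assumption. The class names and toy labels are hypothetical and are not taken from the paper's dataset.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall) for multiclass labels."""
    n = len(y_true)
    support = Counter(y_true)    # number of true samples per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    # Per-class precision, weighted by each class's share of the true labels.
    precision = sum(
        (support[c] / n) * (correct[c] / predicted[c] if predicted[c] else 0.0)
        for c in support
    )
    # Per-class recall with the same weights; the supports cancel, leaving accuracy.
    recall = sum((support[c] / n) * (correct[c] / support[c]) for c in support)
    return accuracy, precision, recall


# Hypothetical three-level risk labels, for illustration only.
y_true = ["none", "none", "none", "low", "low", "high", "high", "high"]
y_pred = ["none", "none", "low", "low", "low", "high", "none", "high"]

acc, prec, rec = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f}")

# Arithmetic check on the reported figure: 119 of 138 correct.
reported_accuracy = round(119 / 138 * 100, 2)
print(reported_accuracy)
```

On the toy labels, accuracy and weighted recall coincide while weighted precision differs, which mirrors the shape of the reported results without reproducing them.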


Keywords: Multiclass suicide risks; suicide ideation detection; feature extraction; machine learning; sentiment analysis
