Introversion-Extraversion Prediction using Machine Learning

Brillian Fieri - Bina Nusantara University, Jakarta, 11480, Indonesia
Joshua La'la - Bina Nusantara University, Jakarta, 11480, Indonesia
Derwin Suhartono - Bina Nusantara University, Jakarta, 11480, Indonesia

Introversion and extraversion are personality traits that describe how a person interacts with others. Each trait has its own advantages and disadvantages, and people who know their personality can use them to their benefit. This study compares and evaluates several machine learning models and dataset balancing methods for predicting introversion-extraversion personality from the results of a survey conducted by the Open-Source Psychometrics Project. The dataset was balanced using three balancing methods, and fifteen questions were chosen as features based on their correlation with the respondents' personality self-identification. The data were then used to train several supervised machine learning models. For the datasets balanced with the Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic Sampling (ADASYN), and SMOTE combined with Edited Nearest Neighbors (SMOTE-ENN), the best model was Random Forest, with 10-fold cross-validation accuracies of 95.5%, 95.3%, and 71.0%, respectively. On the original dataset, the best model was the Support Vector Machine (SVM), with a 10-fold cross-validation accuracy of 73.5%. Based on these results, the oversampling methods (SMOTE and ADASYN) were the most effective at increasing model performance, whereas the hybrid oversampling-undersampling method (SMOTE-ENN) did not significantly increase it. Furthermore, the tree-based models, Random Forest and Decision Tree, improved substantially with data balancing, while the other models, excluding the SVM, did not show a significant rise in performance. These findings imply that further study is needed on hybrid balancing methods and on other classification models to improve personality classification performance.
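
The two preprocessing ideas named in the abstract, correlation-based feature selection and SMOTE-style oversampling, can be illustrated with a minimal standard-library sketch. This is not the authors' code: it assumes Pearson correlation as the correlation measure (the abstract only says "correlations"), the function names `pearson`, `top_k_features`, and `smote_like` are hypothetical, and the toy data is invented purely for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def top_k_features(rows, labels, k):
    """Indices of the k columns with the largest absolute correlation to labels."""
    n_cols = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_cols)]
    return sorted(range(n_cols), key=lambda j: scores[j], reverse=True)[:k]


def smote_like(minority, n_new, k=2, rng=None):
    """Synthesize n_new points, each lying on the segment between a random
    minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy survey: 5 respondents, 3 questions, binary self-identification label.
rows = [[1, 5, 3], [2, 5, 1], [3, 4, 4], [4, 5, 2], [5, 4, 5]]
labels = [0, 0, 0, 1, 1]
selected = top_k_features(rows, labels, k=2)

# Toy minority class with three 2-D samples; add five synthetic ones.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
synthetic = smote_like(minority, n_new=5)
```

In the study's setting, `k` in `top_k_features` would correspond to the fifteen selected questions. When evaluating with cross-validation, a common precaution is to oversample only the training folds so that synthetic points derived from test samples do not inflate the reported accuracy.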


Keywords: imbalanced dataset; introversion-extraversion; machine learning; personality prediction
