Automated UML Class Diagram Generation from Textual Requirements Using NLP Techniques

Yang Meng - Department of Software Engineering and Information Systems, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Selangor, Malaysia
Ainita Ban - Department of Software Engineering and Information Systems, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Selangor, Malaysia


DOI: http://dx.doi.org/10.62527/joiv.8.3-2.3482

Abstract


Translating textual requirements into precise Unified Modeling Language (UML) class diagrams is difficult because requirements text is unstructured and often ambiguous, which can lead to inconsistencies and misunderstandings in the early stages of software development. Existing methods often struggle with diverse and complex textual requirements and may not fully capture their nuances, which can result in incomplete or inaccurate UML diagrams. This study proposes a Natural Language Processing (NLP) model that analyzes textual requirements and extracts the information needed to generate UML class diagrams, with the goal of keeping the diagrams accurate and consistent with the requirement descriptions. The approach has four steps: preprocessing to handle text noise and redundancy, sentence classification to distinguish "class" sentences from "relationship" sentences, syntactic analysis to examine grammatical structure, and UML class diagram generation based on predefined rules. The model achieved a classification accuracy of 88.46% and an Area Under the Curve (AUC) of 0.9287, indicating robust performance in distinguishing class definitions from relationships. The study demonstrates an automated method for generating UML class diagrams from textual requirements and suggests that future research could expand datasets, optimize feature extraction, explore advanced models, and develop automated rule generation methods.
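To make the four-step approach concrete, the following is a minimal Python sketch of how such a pipeline could be wired together. It is not the authors' implementation. The cue-word list, the two regular-expression rules, the naive singularization, the PlantUML output format, and all function names are assumptions introduced for illustration. In particular, the keyword check in `classify` is only a stand-in for the trained sentence classifier the abstract describes.

```python
import re

# Hypothetical cue verbs; a stand-in for the paper's trained classifier.
RELATIONSHIP_CUES = {"places", "contains", "owns", "manages"}

# Hypothetical predefined rules; the paper's actual rule set is not given here.
CLASS_RULE = re.compile(
    r"^(?:a|an|each|every|the)\s+(\w+)\s+is described by\s+(.+)$", re.I)
REL_RULE = re.compile(
    r"^(?:a|an|each|every|the)\s+(\w+)\s+(\w+)\s+(?:a|an|one|many|several)\s+(\w+)$",
    re.I)
ATTR_SPLIT = re.compile(r"\s*,\s*(?:and\s+)?|\s+and\s+")
ARTICLE = re.compile(r"^(?:a|an|the)\s+", re.I)


def preprocess(text):
    """Step 1: split into sentences, collapse whitespace, drop duplicates."""
    sentences = []
    for raw in re.split(r"[.!?]+", text):
        sentence = " ".join(raw.split())
        if sentence and sentence not in sentences:
            sentences.append(sentence)
    return sentences


def classify(sentence):
    """Step 2: label a sentence 'relationship' or 'class' by cue words."""
    words = {w.lower() for w in sentence.split()}
    return "relationship" if words & RELATIONSHIP_CUES else "class"


def singular(word):
    """Naive singularization (assumption): strip one trailing 's'."""
    return word[:-1] if len(word) > 1 and word.lower().endswith("s") else word


def to_plantuml(classes, relations):
    """Step 4: render the extracted elements as PlantUML text."""
    lines = ["@startuml"]
    for name, attrs in classes.items():
        lines.append(f"class {name} {{")
        lines.extend(f"  {attr}" for attr in attrs)
        lines.append("}")
    for src, verb, dst in relations:
        lines.append(f"{src} --> {dst} : {verb}")
    lines.append("@enduml")
    return "\n".join(lines)


def build_diagram(text):
    """Run all four steps; step 3 (syntactic analysis) is the rule matching."""
    classes, relations = {}, []
    for sentence in preprocess(text):
        if classify(sentence) == "class":
            match = CLASS_RULE.match(sentence)
            if match:
                attrs = [ARTICLE.sub("", a)
                         for a in ATTR_SPLIT.split(match.group(2)) if a]
                classes.setdefault(match.group(1).capitalize(), []).extend(attrs)
        else:
            match = REL_RULE.match(sentence)
            if match:
                src = match.group(1).capitalize()
                dst = singular(match.group(3)).capitalize()
                classes.setdefault(src, [])
                classes.setdefault(dst, [])
                relations.append((src, match.group(2).lower(), dst))
    return to_plantuml(classes, relations)


if __name__ == "__main__":
    requirements = ("Each customer is described by a name and an email.  "
                    "A customer places many orders. "
                    "A customer places many orders.")
    print(build_diagram(requirements))
```

In this sketch the repeated sentence is dropped during preprocessing, the first sentence yields a `Customer` class with two attributes, and the second yields a single `Customer --> Order` association. A real system would replace the keyword check with a learned classifier and the regular expressions with rules over a proper syntactic parse.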

Keywords


Software engineering; UML class diagrams; Natural Language Processing (NLP); software development

