A Survey on Forms of Visualization and Tools Used in Topic Modelling

Ruhaila Maskat - Universiti Teknologi MARA Shah Alam, Selangor Malaysia
Shazlyn Shaharudin - Universiti Pendidikan Sultan Idris, Tanjong Malim, Malaysia
Deden Witarsyah - Telkom University Bandung, Indonesia
Hairulnizam Mahdin - Universiti Tun Hussein Onn Malaysia, Parit Raja, Batu Pahat, Johor, Malaysia


DOI: http://dx.doi.org/10.30630/joiv.7.2.1313


In this paper, we surveyed recent publications on topic modeling and analyzed the forms of visualization and the tools used. We expect this information to help Natural Language Processing (NLP) researchers decide which types of visualization suit their needs and which tools can produce them. It could also spur further development of existing visualizations, or the emergence of new ones where a gap exists. Topic modeling is an NLP technique used to identify topics hidden in a collection of documents. Visualizing these topics permits a faster understanding of the underlying subject matter in terms of its domain. This survey covered publications from 2017 to early 2022. The PRISMA methodology was used to review the publications. One hundred articles were collected, and 42 were found eligible for this study after filtration. Two research questions were formulated. The first asks, "What are the different forms of visualizations used to display the result of topic modeling?" and the second asks, "What visualization software or API is used?" From our results, we found that different forms of visualization serve different display purposes. We categorized them as maps, networks, evolution-based charts, and others. We also found that LDAvis is the most frequently used software/API, followed by R language packages and D3.js. The primary limitation of this survey is that it is not exhaustive; hence, some eligible publications may not be included.


Topic visualization; Topic modelling; Visualization tools; Review; Survey

