A Systematic Literature Review on English and Bangla Topic Modeling

Corresponding Author:
Md. Basim Uddin Ahmed
Department of Computer Science and Engineering, Shahjalal University of Science and Technology, Sylhet, Bangladesh
Email: basimuddintuhin@gmail.com

Abstract: Owing to the enormous growth of information technology, digitized text and data are being generated at an immense rate. Identifying the main topics in a vast collection of documents by hand is therefore nearly impossible. Topic modeling is a statistical framework that infers the latent topics underlying text documents, corpora, or electronic archives through a probabilistic approach. It is a promising field in Natural Language Processing (NLP). Although many researchers have worked in this field, little significant research has been done for Bangla. In this literature review, we follow a systematic approach to review topic modeling studies published from 2003 to 2020. We analyze topic modeling methods from different aspects and identify the research gap between topic modeling in English and in Bangla. From the reviewed papers, we identify several techniques used for topic modeling, such as Latent Dirichlet Allocation (LDA), Latent Semantic Analysis (LSA), Support Vector Machine (SVM) and Bi-term Topic Modeling (BTM). The review also highlights real-world applications of topic modeling and discusses the evaluation methods used to assess these models' performance. We conclude by outlining the large scope for future research on topic modeling in Bangla.


Introduction
Because of the rapid development of Information Technology (e.g., the Internet, social media and online databases), the amount of data generated has grown exponentially in recent years. This vast accumulation of data provides essential support for training machine learning models and for search engine queries. On the other hand, because of this massive growth of information, extracting the knowledge of interest from these data has become a matter of general concern (Xu et al., 2019). According to a study by DOMO (a cloud-based business service system), roughly 2.5 quintillion bytes of data are produced daily, and 90% of the world's data had been created in the preceding two years alone (as of 2018) (Al Helal and Mouhoub, 2018). It is therefore not feasible for any person to sift useful information from such volumes of data manually. Moreover, the National Science Foundation (NSF) identified 'large-scale scientific data management and analysis' as one of the data-intensive challenges and as an area for future study (Karami et al., 2018). It is thus crucial to estimate the numerical characteristics precisely and efficiently and to determine appropriate statistical distributions for modeling text corpora (Jiang et al., 2017).
Topic modeling is a probabilistic approach that can be seen as an instrument for measuring the hidden structures in a document (Shi et al., 2019). To infer these hidden structures, the documents first have to be pre-processed. Extraneous words and stop words are removed from the text. Punctuation is usually removed as well, although some researchers keep punctuation marks that carry certain emotions or meanings. The words are then stemmed (converted to their root form). Some models treat bi-grams (adjacent words that often appear together), tri-grams, etc. as a single word. The resulting list is transformed into a bag of words (words with their counts) for that document, and a weight is assigned to each word. To give importance to particular words, their weighting factors can be changed (e.g., using term frequency-inverse document frequency, or TF-IDF). The pre-processed data is then fed to a machine learning algorithm (LDA, LSA, BTM, etc.). These algorithms iterate through the training data several times and try to infer the latent topics of the document collection as accurately as possible. They use various parameters and hyperparameters, which are tuned during the training phase of the model. The models represent documents as distributions over topics and topics as mixtures of words. Each 'topic' is thus a set of topically related words that can be associated with the documents of the corpus (Hasan et al., 2019).
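The pre-processing steps described above can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the pipeline of any reviewed paper: the stop-word list, the crude suffix-stripping stemmer and the three example documents are hypothetical, and the TF-IDF variant shown (relative term frequency times the logarithm of inverse document frequency) is only one of several in common use.

```python
import math
import re
from collections import Counter

# Hypothetical toy stop-word list; real pipelines use a language-specific list.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "on"}

def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, drop punctuation and stop words, then stem each token."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

def tf_idf(bags):
    """Weight each word by relative term frequency times log inverse document frequency."""
    n_docs = len(bags)
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())  # count each word once per document
    weighted = []
    for bag in bags:
        total = sum(bag.values())
        weighted.append(
            {w: (c / total) * math.log(n_docs / doc_freq[w]) for w, c in bag.items()}
        )
    return weighted

docs = [
    "The cats are chasing the mice.",
    "A cat chased a mouse in the garden.",
    "Stocks and bonds are traded on markets.",
]
bags = [Counter(preprocess(d)) for d in docs]  # bag of words per document
weights = tf_idf(bags)
```

In this toy corpus, a word that occurs in only one document (such as "mice") receives a higher weight than a word shared by two documents (such as the stem "cat"), which is the effect TF-IDF weighting is meant to achieve.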
Most topic modeling algorithms and research papers are specific to the English language. From frequently updated surveys, we can see that 25.9% of internet content is in English¹, and this percentage may decrease over the coming years. Developing similar tools for other languages is therefore essential. Bangla is one such language; it has become one of the most popular languages in the world since UNESCO announced on November 17th, 1999 that February 21st would be observed annually as International Mother Language Day. With about 228 million native speakers and another 37 million second-language speakers, Bangla is the 5th most spoken native language and the 7th most spoken language by total number of speakers in the world (Wikipedia, 2020).
Although Bangla is a very popular language, there are barely any topic modeling techniques and studies for it. In this SLR, we therefore provide a comprehensive view of topic modeling according to the literature and of how algorithms and techniques differ between English and Bangla. We performed a systematic study to survey the methods, domains, datasets, etc. related to topic modeling and present them in tabular and diagram form. This process helped us understand the research community's views and observations about the methods in different domains. We also examined how each algorithm and technique is evaluated through various evaluation metrics, which are described below.
The paper is outlined as follows: We briefly explain the basics of topic modeling in section 2, detail the review process in section 3 and present the results of the SLR in section 4. We discuss the challenges and future scope in section 5. We finally conclude our work in section 7.

¹ www.statista.com, www.internetworldstats.com

Background
This section provides a brief description of the topic modeling methods used in the selected papers. The evaluation methods for topic modeling are also introduced here. It will hopefully give the reader a basic idea about the models included in the reviewed papers.

Topic Modeling Methods
Topic modeling is an emerging machine learning technology that is widely used in various fields of research (Yuan et al., 2015). The basic idea can be described simply: documents consist of various topics, which are modeled as distributions over a vocabulary (Arora et al., 2013). However, implementing an efficient working algorithm may not be so simple. Various topic modeling algorithms have been developed to cope with the many technical challenges and the diversity of text documents (Shi et al., 2019). A few of the topic modeling methods used in our reviewed papers are briefly described here.

LDA
Latent Dirichlet Allocation (LDA) is one of the most widely used topic modeling techniques. It is a generative probabilistic model for collections of discrete data such as text documents (Blei et al., 2003). LDA treats each document as a mixture of different "topics," and each topic as a mixture of different "words" (Li et al., 2013). It is a statistical model that can also be viewed as a matrix factorization technique. The input to LDA is a set of fixed-length vectors (bag-of-words) (Hasan et al., 2019). LDA is a long-established model, and many researchers have modified the basic LDA structure published in (Blei et al., 2003) and adapted the modified versions to their own purposes (Yuan et al., 2015; Ramage et al., 2009b; Gao et al., 2018; Ramage et al., 2009a; Hasan et al., 2019).
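To make the "mixture of topics, mixture of words" idea concrete, the following minimal sketch walks through the generative story of LDA for a single document: draw the document's topic proportions from a Dirichlet distribution, then, for each word position, draw a topic and then a word from that topic. The two topics, their word probabilities and the value of `alpha` are hypothetical. For brevity, the topic-word distributions are fixed here rather than drawn from their own Dirichlet prior, and no inference is performed.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topic_word, alpha, length, rng):
    """Generate one document following the LDA generative story."""
    theta = sample_dirichlet(alpha, rng)  # document-topic proportions
    topic_ids = list(range(len(topic_word)))
    words = []
    for _ in range(length):
        z = rng.choices(topic_ids, weights=theta)[0]  # topic for this position
        vocab = list(topic_word[z].keys())
        probs = list(topic_word[z].values())
        words.append(rng.choices(vocab, weights=probs)[0])  # word from topic z
    return theta, words

# Hypothetical two-topic model over a tiny vocabulary.
topic_word = [
    {"match": 0.5, "goal": 0.4, "vote": 0.1},   # a sports-like topic
    {"vote": 0.6, "party": 0.3, "match": 0.1},  # a politics-like topic
]
rng = random.Random(0)
theta, words = generate_document(topic_word, alpha=[0.5, 0.5], length=12, rng=rng)
```

Training an LDA model reverses this process: given only the observed words, it estimates the topic proportions of each document and the word distribution of each topic.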

BTM
Biterm Topic Modeling (BTM) is a useful technique for extracting topics from short texts. As growing social media platforms generate a huge amount of short text, BTM is becoming increasingly popular for short-text topic modeling. The main idea of this technique is that it converts each short text into unordered pairs of words (biterms). If two words frequently co-occur, the probability that they belong to the same topic increases (Cheng et al., 2014; Li et al., 2019a). Several researchers have modified the basic BTM and developed more specific models such as Sentiment Biterm Topic Modeling (SBTM) and Multiterm Topic Modeling (MTM).
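The conversion of a short text into unordered word pairs can be sketched as follows. This is only the biterm-extraction step, assuming the whole short text is treated as one context window; the topic inference that BTM performs over the resulting biterms is not shown, and the example texts are hypothetical.

```python
from collections import Counter
from itertools import combinations

def extract_biterms(tokens):
    """Return every unordered pair of distinct words (biterm) in one short text."""
    return [
        tuple(sorted(pair))
        for pair in combinations(tokens, 2)
        if pair[0] != pair[1]
    ]

short_texts = [
    ["apple", "iphone", "launch"],
    ["apple", "iphone", "price"],
    ["football", "match", "goal"],
]
# Corpus-level biterm counts: frequent pairs hint at words sharing a topic.
biterm_counts = Counter(b for text in short_texts for b in extract_biterms(text))
```

Here the pair ("apple", "iphone") is counted twice across the corpus, while every other pair occurs once, so these two words are the most likely to be assigned to the same topic.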

LSI
Latent Semantic Indexing (LSI) is an automatic retrieval and indexing model for topic modeling, used to identify higher-order structures and categories that associate terms with documents. It tries to uncover the hidden semantic structures in documents using word co-occurrence. It uses a linear algebra technique called Singular Value Decomposition (SVD) to identify statistical patterns between words and concepts in a text (Potha and Stamatatos, 2019). LSI tries to capture the many-to-many mapping between terms and concepts, outperforming conventional vector-based models (Bertalan and Ruiz, 2019).
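As a reminder of the standard formulation (the notation below is ours and is not taken from the reviewed papers), LSI approximates the term-document matrix X by a truncated SVD that keeps only the k largest singular values:

\[
X \approx X_k = U_k \, \Sigma_k \, V_k^{\top}
\]

Here the rows of U_k place terms, and the rows of V_k place documents, in a shared k-dimensional "concept" space, and the diagonal matrix \Sigma_k holds the k largest singular values. Assuming this standard setup, terms and documents can then be compared by their similarity in the reduced space rather than by exact word overlap.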

GPU-DMM
The GPU-DMM model is built by combining the Generalized Polya Urn (GPU) model with the Dirichlet Multinomial Mixture (DMM). DMM is a probabilistic generative model that assumes a document is generated from a single topic. This assumption enriches the word co-occurrences and makes the model better suited to short texts (Li et al., 2019c). The GPU model builds a word-pair distribution of semantically related words. For the currently sampled word w, the semantically related words of w are selected such that w has strong ties with the sampled topic. Because the sampled word w is paired with its semantically related words, sampling w in topic t increases not only the association between the topic and w itself, but also the association between the topic and the words semantically related to w. In the GPU-DMM model, the generative process is the same as in DMM, but the GPU model is applied in the inference process.
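The promotion effect described above can be illustrated with a simplified count update. This is a sketch under our own assumptions, not the inference procedure of GPU-DMM: the `related` word lists and the `promotion` weight are hypothetical, and the check on whether w has strong ties with the sampled topic is omitted.

```python
from collections import defaultdict

def gpu_update(topic_word_counts, topic, word, related, promotion=0.3):
    """Add one count for `word` and a fractional count for each related word."""
    topic_word_counts[topic][word] += 1.0
    for other in related.get(word, ()):
        topic_word_counts[topic][other] += promotion  # promote related words too

# Hypothetical list of semantically related words (e.g., from word embeddings).
related = {"iphone": ["apple", "smartphone"]}
counts = defaultdict(lambda: defaultdict(float))
gpu_update(counts, topic=0, word="iphone", related=related)
gpu_update(counts, topic=0, word="price", related=related)
```

After these two updates, topic 0 holds a full count for "iphone" and "price" and a fractional count for "apple" and "smartphone", although the latter two words were never sampled directly.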

HDP
Hierarchical Dirichlet Process (HDP), proposed by (Teh et al., 2006), is a non-parametric extension of LDA where texts are viewed as groups of observed words, topics are distributions over terms and each document exhibits its topics with different proportions (Bertalan and Ruiz, 2019). HDP infers the number of topics from the documents. This approach provides a prior distribution for the number of mixture components within each group.

SVM
Support Vector Machine (SVM) is a classifier model. It is a machine learning technique that solves problems such as matching patterns and acquiring symbolic themes that depend on syntax as well as semantic meaning (Das and Bandyopadhyay, 2010b). Given a set of training documents, each marked with a particular category/topic, an SVM training algorithm can be used for topic modeling by assigning new documents to one of the predefined categories/topics (Ahmad and Amin, 2016).

Topic Modeling Evaluation Methods
As we have seen above, there are numerous methods to apply for topic modeling. To evaluate the performances of these models, many evaluation methods are used.
The most used evaluation method is Precision, Recall and F1-Measure (PRF). Confusion Matrix and Area Under Curve (AUC) were used in some papers as well. Another widely used evaluation method is Topic Coherence, which is outlined in the following subsection.
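For reference, precision, recall and the F1-measure can be computed from a set of predicted items and a set of relevant (gold-standard) items as in the following minimal sketch. The example word lists are hypothetical; how the reviewed papers define "predicted" and "relevant" items varies with the task.

```python
def precision_recall_f1(predicted, relevant):
    """Compute precision, recall and F1 for two collections of items."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

p, r, f1 = precision_recall_f1(
    predicted=["goal", "match", "vote", "team"],
    relevant=["goal", "match", "team", "league", "coach"],
)
```

In this example, three of the four predicted words are relevant (precision 0.75) and three of the five relevant words are found (recall 0.6), giving an F1-measure of about 0.67.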

Topic Coherence (PMI and UCI/U-Mass Scores)
One evaluation method that has recently gained popularity is topic coherence, which is calculated based on the co-occurrences of words. It is considered a reliable evaluation method since it is highly consistent with human-produced results. A popular metric for topic coherence is the Pointwise Mutual Information score (PMI-Score).
Given the T most probable words of a topic k, (w_1, …, w_T), the PMI-Score measures the pairwise association between them:

\[
\text{PMI-Score}(k) = \frac{2}{T(T-1)} \sum_{1 \le i < j \le T} \log \frac{P(w_i, w_j)}{P(w_i)\,P(w_j)}
\]

where P(w_i, w_j) and P(w_i) are the probabilities of the co-occurring word pair (w_i, w_j) and of the word w_i, respectively, estimated empirically from external data sets (Cheng et al., 2014).
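Besides PMI, the UCI (Newman et al., 2010) and U-Mass (Mimno et al., 2011) scores are also used to measure topic coherence. Over the T most probable words of a topic k, the UCI-Coherence is calculated by the following formula (Jiang et al., 2017):

\[
C_{\text{UCI}}(k) = \frac{2}{T(T-1)} \sum_{i=1}^{T-1} \sum_{j=i+1}^{T} \log \frac{P(w_i, w_j)}{P(w_i)\,P(w_j)}
\]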
Both PMI and UCI rely on large-scale external sources, which makes them model-independent. That is why both are fair for all topic models (Cheng et al., 2014).
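Given the T most probable words of a topic k, (w_1, …, w_T), the U-Mass coherence is calculated by:

\[
C_{\text{UMass}}(k) = \frac{2}{T(T-1)} \sum_{i=2}^{T} \sum_{j=1}^{i-1} \log \frac{P(w_i, w_j) + \frac{1}{M}}{P(w_j)}
\]

where 1/M is the smoothing factor added to avoid the possibility of calculating the logarithm of zero (Jiang et al., 2017).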
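The PMI-based coherence described above can be sketched as follows, assuming that probabilities are estimated as document frequencies in an external reference corpus and that every top word occurs in at least one reference document. The small constant `eps` and the four reference documents are our own additions for illustration; `eps` only prevents the logarithm of zero for pairs that never co-occur.

```python
import math
from itertools import combinations

def pmi_score(top_words, reference_docs, eps=1e-12):
    """Average pairwise PMI of a topic's top words over a reference corpus."""
    doc_sets = [set(doc) for doc in reference_docs]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of reference documents containing all the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    pairs = list(combinations(top_words, 2))
    total = 0.0
    for wi, wj in pairs:
        # Assumes prob(wi) > 0 and prob(wj) > 0 (each word occurs somewhere).
        total += math.log((prob(wi, wj) + eps) / (prob(wi) * prob(wj)))
    return total / len(pairs)  # equals 2 / (T (T - 1)) times the sum over pairs

# Hypothetical external reference corpus (already tokenized).
reference_docs = [
    ["goal", "match", "team"],
    ["goal", "match"],
    ["vote", "party"],
    ["vote", "party", "match"],
]
coherent = pmi_score(["goal", "match"], reference_docs)
mixed = pmi_score(["goal", "vote"], reference_docs)
```

A topic whose top words co-occur often in the reference corpus ("goal", "match") scores higher than a topic that mixes unrelated words ("goal", "vote").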

Human Judgement
Human judgement is very reliable for assessing the topics extracted from a document. However, it is not always feasible: it is prone to bias, since no two human beings will produce the same summary, and it is also very time-consuming (Sarkar, 2012b). Nevertheless, it has been used in several studies (Das and Bandyopadhyay, 2010b; Akter and Aziz, 2016; Abujar et al., 2017; Sarkar, 2012a; Efat et al., 2013; Sarkar, 2014; Haque et al., 2015; Ahmad et al., 2018; Shi et al., 2019).

Research Methodology
Research methodology gives an overview of how this review process was conducted. This section covers what points we were looking for in the papers, how we searched and collected the papers, what sources they were gathered from, when they were published, what types of papers were collected and which criteria were chosen for paper selection, etc.

Research Questions
The purpose of this Systematic Literature Review (SLR) is to provide an overview of topic modeling and a comparison between Bangla and English topic modeling schemes. We used the questions in Table 1 to extract data from the papers and conduct the review process. The answers to these questions, found after reviewing the selected papers, are listed and discussed in section 4.

Search Strategy
For searching, we developed some criteria (shown in section 3.4) and followed them. At first:
(a) We searched with some major/key terms related to topic modeling such as 'topic modeling,' 'topic modeling in Bangla,' 'topic modeling in Bengali,' 'text summarization in Bangla,' etc.
(b) For every key term, we searched once for English and then added necessary wording to search for Bangla papers in the same context.
(c) We also tried out different synonyms and alternate names of these key terms to gather the collection.
(d) After collecting the papers, we removed the papers that were duplicates.
(e) We collected papers from the references of already included papers and removed duplicate papers again.
The search process is illustrated in Fig. 1.

Table 1: Research questions
RQ 1 What is the most used method for topic modeling?
RQ 2 What are the sources of the datasets used?
RQ 3 What evaluation methods are used to compare the models?
RQ 4 Which are the main fields of application for topic modeling?
RQ 5 What are the techniques that have been used in English topic modeling but not yet used in Bangla?

Sources
From Table 2 we can see that the main sources of our collected papers were IEEE, ACM Digital Library, ScienceDirect, Springer, Wiley Online Library and Google Scholar. The selected papers were published from January 2003 to May 2020, and their distribution is given in Fig. 2. From Fig. 2, it is apparent that research on topic modeling has increased over time, especially in the later years of the period. Figure 3 shows the paper sources (separately for Bangla and English). We collected papers published in both journals and conferences, as shown in Fig. 4.

Selection Criteria
In this section, we present our paper selection criteria. Table 3 lists the criteria for including or excluding a paper. Topic modeling techniques can be used to find the hidden structures of text documents (Al Helal and Mouhoub, 2018). Models used in topic modeling have also been used for text summarization in Bangla (Chowdhury et al., 2017; Akter and Aziz, 2016). So, as mentioned in the inclusion criteria, apart from topic modeling, we have also included some Bangla papers related to text summarization and sentiment analysis. In Table 3, 'relevant algorithm' means an algorithm related to topic modeling, text summarization, or sentiment analysis.

Results
In this section, we describe the outcome of this review process. We will go through the results by answering the questions asked in Table 1.

RQ1 -What is the most used method for topic modeling?
Latent Dirichlet Allocation (LDA) is the most used method for topic modeling. 22 of our reviewed papers used LDA for topic modeling, and 7 other papers used modified versions of the basic LDA approach. The BTM model stands out in extracting topics from short texts and was used by 6 papers. Another conventional technique, LSA (also referred to as LSI), was used by 5 papers. Many other models are being developed and used by researchers. Table 4 shows all the methods used in our reviewed papers. The most used methods are illustrated in Fig. 5 for easier comparison (separately for English and Bangla).

RQ2 -What are the sources of the datasets used?
There are two major sources of documents that are collected as datasets for topic modeling techniques. One is online newspapers and the other is the rapidly growing social media sites. Most of the papers collected their datasets from one of these two sources, and this is largely true for both English and Bangla research. Researchers working on English collected their documents from sources such as Twitter, NY Times, BaiduQA (a Chinese Q&A website), BBC, Reuters, Yahoo, NIPS and many other online resources. For Bangla documents, researchers drew on The Daily Prothom-Alo, The Daily Jugantar, Anandabazar Patrika, Twitter, Facebook, YouTube comments, etc. However, collecting a proper dataset in Bangla is challenging, since few standard datasets are available in Bangla (Alam et al., 2017). English research has another advantage: the titles and abstracts of research articles can be used as datasets. The same is not possible for Bangla, since research articles are not written in Bangla. The datasets used in each of the papers are shown in Table 5 (for English) and Table 6 (for Bangla).

[Table 7: Evaluation methods used in the reviewed papers, listed separately for English and Bangla studies; the rows include topic coherence (PMI) and topic coherence (UCI, U-Mass).]

RQ3 -What evaluation methods are used to compare the models?
Several different evaluation metrics and methods were used in the papers we reviewed. The most used are Precision, Recall and F1-measure (PRF), Topic Coherence (with PMI, UCI and U-Mass), Confusion Matrix, Probability, etc. Of these, PRF is the most used evaluation metric for topic modeling: 26 of our reviewed papers used it. Some papers measured the F1-score but not recall and precision; some measured recall only and not the other two. Topic Coherence was used in 13 papers. Other evaluation metrics, such as the Purity Metric, L1 Error and Confusion Matrix, were used by a few of the papers. Table 7 shows which evaluation methods were used by which papers.
In Fig. 6, we compare a few models that were evaluated using the F1-score. Here, LSA achieved 0.9591 (Sun and Platoš, 2019) and LDA2vec achieved 0.8566 (Hasan et al., 2019), which are the highest scores for English and Bangla, respectively. We also compare a few models using the Topic Coherence metric in Fig. 7. Here, R-BTM scored highest with 2.55 (Li et al., 2019b). As different evaluation metrics were used to evaluate different models, the available accuracy data was inadequate to represent all the models in the figures. So, only the most used models are shown in Figs. 6 and 7.
Some of the papers did not use any proper evaluation method. A few of them determined accuracy by comparison, and others just reported the topics determined by their respective systems without evaluating their quality.

RQ4 -Which are the main fields of application for topic modeling?
Finding and extracting information from vast collections of documents is a hard, laborious and time-consuming process. Topic models are therefore used as a statistical framework to find individual documents in large collections and to understand the general themes present in a collection. Table 8 shows how topic modeling can be used for research in real-world circumstances, according to the papers in our review.

Information from Social Sites
Nowadays, social media sites are significant sources of data, and many research works in topic modeling have focused on using these data. To get the required data from the websites, compatible APIs (Application Programming Interfaces) are used. The data may include web page titles, image captions, questions on Q&A websites, text advertisements, and posts, messages and tweets on social media sites. LDA and BTM models were then used as standard tools for topic modeling on those data (Hong and Davison, 2010; Tong and Zhang, 2016; Li et al., 2019b; Cheng et al., 2014). There is also a version of LDA modified to work better on Twitter datasets, called Twitter-LDA (Alkhodair et al., 2018).

Linguistic Science
Understanding the underlying meanings of texts and classifying documents accordingly is an essential task in linguistic science. Understanding the emotions or sentiments expressed in documents is also part of it. Various algorithms, e.g., LDA, SVM and Doc2Vec, are used for document classification, while SVM, Long Short-Term Memory (LSTM), Recurrent Neural Network (RNN) and Contextual Valency Analysis (CVA) are used in sentiment analysis.

Author Verification
An author verification system enables an author to check which online or offline documents have been given the right to use his/her writing. Through such a system, no organization or person can use another person's writing without authorization. LDA, Pachinko Allocation and Hierarchical LDA (Phani et al., 2017), as well as LSI (Potha and Stamatatos, 2019), are used in this type of system.

Medical/Biomedical Science
As most information is now in digitized form, the medical and biomedical fields also use all sorts of digital documents to conduct their work and research. LDA has been used to diagnose cancer and to obtain data from gene expressions or sequences (Kho et al., 2017).

Scientific Literature
Online archives are now the most common platform for research articles. While searching for necessary research articles, relevant search results are vital. So, in that case, topic modeling can be beneficial to get useful information. Topic modeling can be further extended to recommend similar articles and documents (Wang and Blei, 2011).

Recommendation System
A user's particular field of interest can be predicted by using topic modeling. This can help to recommend similar items to the user (Uteuov, 2019). Topic modeling can also be used to suggest a product by extracting information from that product's reviews. In this type of system, BTM is one of the most preferred algorithms (Li et al., 2019a).

Political Science
For politicians and political professionals, knowing the main topics debated by the general public is a valuable asset. People nowadays express their thoughts on social media and other online platforms; hence, people's political views can be extracted from those platforms. LSI, LDA and Hierarchical Dirichlet Process (HDP) were used to extract information from various political websites (Bertalan and Ruiz, 2019).

RQ5 -What are the techniques that have been used in English topic modeling but not yet used in Bangla?
There have not been many works in topic modeling in Bangla yet, and the few models that have been used are basic models such as LDA and LSA and, in some cases, classification models like SVM and Convolutional Neural Network (CNN). There are many models in English that are yet to be tried for Bangla.

[Table 9: Comparison of topic modeling research in other Indic languages (Hindi and Urdu) and in Bangla. Entries: application areas include topic extraction and news classification (Chauhan and Chauhan, 2016) and Statistical Machine Translation. Since the morphology, structure and syntax of Urdu are different, a specially modified version of LDA named ULDA (Urdu-LDA) is used (Shakeel et al., 2018); there is no specific version of LDA dedicated to the morphology, structure and syntax of the Bangla language. The amount of research in this field is minimal; the same is true for Bangla as well.]
This paper mainly analyzed the studies of topic modeling in English and Bangla language. However, we think ideas from studies in languages closely related to Bangla can also help improve the understanding of the current state of topic modeling in Bangla. Hence, we collected papers in some other Indic languages (Hindi and Urdu) and compared them with existing Bangla works. The comparison results are shown in Table 9.

Discussion
In the results section, we saw that LDA is the most used topic modeling technique for both Bangla and English. One reason for its widespread use is its flexibility: LDA can be combined with many other models to perform tasks such as classification, summarization, clustering and spam detection. LDA has even been used outside NLP, for color naming in Computer Vision (Benavente et al., 2012). LDA is time-efficient compared to many other topic modeling techniques. It is also unsupervised, which makes it a good choice when working with unlabelled data. The basic LDA is an old model, and many task-specific models were later derived from it. LDA has been tested in many circumstances, domains and datasets and has provided good results, which makes it reliable. However, the model also has some drawbacks. The most important is that LDA tends to work poorly if the input documents are very short. With the rise of social media, text mining for short texts is becoming essential. Although modifications of the basic LDA such as Twitter-LDA (Alkhodair et al., 2018) have been created, the results still need improvement. LDA also fails to provide satisfactory results if a document does not consistently discuss a single topic, and it does not model any correlation between words. These are areas where LDA still needs improvement. Although LDA is not the single best state-of-the-art model, it is still a good choice for topic modeling under most circumstances.
The discussion above covers the advantages and disadvantages of the LDA model only. Topic modeling techniques in general also face challenges that need to be addressed. Topic modeling algorithms mainly focus on frequently co-occurring words. The semantic meaning of a word may change with the context in which it is used, but topic modeling algorithms treat a word the same in every context, which adds noise to the word distributions. Another issue with some of the models is that the number of topics needs to be specified before training, although it is not possible to know beforehand how many topics will work best. This leads to iterating over the dataset with different numbers of topics, which is time-consuming. The evaluation methods for topic modeling may test whether a model is working but cannot give an absolute measurement of its overall quality. So, many topic modeling applications need manual checking or other extrinsic evaluations (if labeled data is available). The studies of topic modeling in Bangla used only the LDA model and its modifications. No other models were experimented with for topic modeling in Bangla.
In light of the challenges discussed above and many other possibilities, topic modeling offers an open field for future research. With the rise of social media, a large portion of the generated data is in the form of short texts, on which conventional topic modeling methods are found to perform poorly. Recently, some researchers have tried developing models that work on short texts, and topic modeling on short texts remains an area for future research. Evaluation methods for topic modeling are still not well established. Topic coherence is an intrinsic evaluation method that can provide only a relative measure of performance between two models. For absolute measurements, metadata of extrinsic applications (text classification, sentiment analysis, etc.) are used (Shi et al., 2019). Developing evaluation methods for topic modeling can be a potential research area. Another exciting application area for topic modeling techniques is the medical domain. Researchers have already used topic modeling to understand the genetic expressions of cancer cells (Kho et al., 2017). Topic modeling can also be useful for processing medical big data. In Bangla, the scope for future research is even broader. The scarcity of topic modeling research in Bangla leaves many potential areas untouched. Recommendation systems, document classification, sentiment analysis, detecting and tracking trending topics on social platforms, trigger word or voice command recognition and many other NLP tasks in Bangla can be performed using topic modeling.

Merits and Demerits of the Studies
During this review process, we encountered papers that have both advantages and shortcomings. In most of the papers, the proposed models were well defined and explained in detail. The overall difference between English and Bangla topic modeling is very apparent because of the lack of research in the Bangla language. Thus, it was not difficult to draw comparisons between them. Also, the purpose and motivation of the authors were properly stated in all papers.
On the other hand, different evaluation metrics were used in different papers, which makes it difficult to compare the models with one another. Very few papers shared the same evaluation metric for the same model. Moreover, attributes of the datasets (i.e., size, vocabulary and other minor parameters) were not always properly described, especially in the Bangla papers. So, it was not easy to collect and organize that information from those papers. Authors of Bangla papers should give more attention to describing their datasets properly.

Conclusion
We analyzed the current state of topic modeling and the lack of studies on topic modeling in the Bangla language. After exhaustively searching for papers, we finally selected 71 papers from an initial collection of 94 papers for review. These papers were published between 2003 and 2020. We gathered data concerning several aspects such as method types, datasets, evaluation methods and application fields. The extracted data were then used to answer the specified research questions and give proper insight into this field.
In our reviewed papers, LDA stands out as the most commonly used topic modeling technique. Furthermore, the BTM model performs best in extracting topics from short texts. A variety of evaluation methods were used to judge the performance of the models. Precision, recall and F1-score are the most used evaluation measures. Another evaluation method specific to topic models, Topic Coherence, was also used by many researchers. Besides the models and the evaluation methods, we also highlighted the fields of application of topic modeling.
We believe this paper will help researchers gain a straightforward overview of topic modeling. Compared to English, there is a wide scope for topic modeling research in the Bangla language, as described in section 5. By reading this paper, researchers can easily identify the gaps between English and Bangla topic modeling and conduct further research in this emerging field.

Author's Contributions
Md. Basim Uddin Ahmed: Collected the papers for review, analyzed the papers, organized the figures and tables, drafted the manuscript.
Ananta Akash Podder: Analyzed, extracted and organized all required information from the paper for review.
Mahruba Sharmin Chowdhury: Made considerable contributions to this research by critically reviewing the manuscript for significant intellectual content.
Mohammad Abdullah Al Mumin: Verified the works, reviewed the manuscript and supervised the whole project.