1
October 2016
CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 2, Downloads (12 Months): 103, Downloads (Overall): 103
Storyline detection aims to connect seemingly irrelevant single documents into meaningful chains, which provides opportunities for understanding how events evolve over time and what triggers such evolutions. Most previous work generated the storylines through unsupervised methods that can hardly reveal underlying factors driving the evolution process. This paper introduces a ...
Keywords:
topic modeling, storyline, twitter
CCS:
Topic modeling
References:
David M Blei and John D Lafferty. Dynamic topic models. In Proceedings of the 23rd international conference on Machine learning, pages 113--120. ACM, 2006.
Daniel Ramage, Susan T Dumais, and Daniel J Liebling. Characterizing microblogs with topic models. In Proceedings of the 4th International AAAI Conference on Weblogs and Social Media, pages 130--137, 2010.
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, and Thomas Griffiths. Probabilistic author-topic models for information discovery. In Proceedings of the 10th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 306--315. ACM, 2004.
Shuang-Hong Yang, Alek Kolcz, Andy Schlaikjer, and Pankaj Gupta. Large-scale high-precision topic modeling on twitter. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 1907--1916. ACM, 2014.
Full Text:
... data in one joint model have been proposed to improve Twitter topic modeling performance, by "transferring" knowledge learned from long articles such as those ...
... Ramage, Susan T Dumais, and Daniel J Liebling. Characterizing microblogs with topic models. In Proceedings of the 4th International AAAI Conference on Weblogs and ... 2013. [16] Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, and Thomas Griffiths. Probabilistic author-topic models for information discovery. In Proceedings of the 10th ACM SIGKDD international conference ... Shuang-Hong Yang, Alek Kolcz, Andy Schlaikjer, and Pankaj Gupta. Large-scale high-precision topic modeling on twitter. In Proceedings of the 20th ACM SIGKDD international conference on ...
2
February 2017
WSDM '17: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 30, Downloads (12 Months): 155, Downloads (Overall): 155
Latent Dirichlet Allocation (LDA) is an extremely popular probabilistic topic model used for a diverse class of applications. While highly effective, one important limitation of LDA is the high memory footprint of its inference algorithm, making it difficult to scale to a large dataset. In my thesis, I propose sdLDA, ...
Keywords:
topic model, scalability, text analysis
CCS:
Document topic models
Primary CCS:
Document topic models
References:
A. Smola and S. Narayanamurthy. An architecture for parallel topic models. Proceedings of the VLDB Endowment, 3(1--2):703--710, 2010.
L. Yao, D. Mimno, and A. McCallum. Efficient methods for topic model inference on streaming document collections. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 937--946. ACM, 2009.
Full Text:
... Angeles, CA 90024 xuezijun@cs.ucla.edu ABSTRACT Latent Dirichlet Allocation (LDA) is an extremely popular probabilistic topic model used for a diverse class of applications.[1] While highly effective, ... performance compared to main-memory-based LDAs in terms of running time. Keywords: Text Analysis, Topic model; Scalability 1. REFERENCES [1] D. M. Blei, A. Y. Ng, and ... 569--577. ACM, 2008. [3] A. Smola and S. Narayanamurthy. An architecture for parallel topic models. Proceedings of the VLDB Endowment, 3(1-2):703--710, 2010. [4] L. Yao, D. Mimno, and A. McCallum. Efficient methods for topic model inference on streaming document collections. In Proceedings of the 15th ACM SIGKDD ...
3
July 2016
SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 17, Downloads (12 Months): 264, Downloads (Overall): 264
Discovering the author's interest over time from documents has important applications in recommendation systems, authorship identification and opinion extraction. In this paper, we propose an interest drift model (IDM), which monitors the evolution of author interests in time-stamped documents. The model further uses the discovered author interest information to help ...
Keywords:
author topic model, dynamic author interests, dynamic author topic model, topic model
CCS:
Document topic models
Abstract:
... author interest information to help find better topics. Unlike traditional topic models, our model is sensitive to the ordering of words, ... show that the IDM model learns better topics than state-of-the-art topic models.
Title:
Discovering Author Interest Evolution in Topic Modeling
References:
D. M. Blei and J. D. Lafferty. Dynamic topic models. In Proceedings of the 23rd ICML, pages 113--120. ACM, 2006.
A. Daud. Using time topic modeling for semantics-based dynamic research interest finding. Knowledge-Based Systems, 26:154--163, 2012.
N. Kawamae. Author interest topic model. In Proceedings of the 33rd international ACM SIGIR, pages 887--888. ACM, 2010.
M. Rosen-Zvi, T. Griffiths, M. Steyvers, and P. Smyth. The author-topic model for authors and documents. In Proceedings of the 20th conference on Uncertainty in artificial intelligence, pages 487--494. AUAI Press, 2004.
M. Yang, T. Cui, and W. Tu. Ordering-sensitive and semantic-aware topic modeling. In Proceedings of the 28th AAAI, 2015.
M. Yang, D. Zhu, and K.-P. Chow. A topic model for building fine-grained domain-specific emotion lexicon. In ACL (2), pages 421--426, 2014.
Full Text:
... author interest information to help find better topics. Unlike traditional topic models, our model is sensitive to the ordering of words, thus ... results show that the IDM model learns better topics than state-of-the-art topic models. 1. INTRODUCTION Author interests play a crucial role in a variety ... ordering of words and the semantic meaning of sentences into topic modeling. However, [11] does not consider the authorship and temporal information ... so that the topics reflect the semantics of the context. 2. RELATED WORK The Author-Topic model [8] is the first generative model that simultaneously models the content ... documents and the interests of authors. There are also some variants of Author-Topic models such as [5]. These models are devoted to discovering static latent ... characterize topics and their evolution over time, [2] propose a dynamic topic model (DTM) which jointly models word co-occurrence and time. [9] propose ...
... the drift of the individual author's interests. All of the aforementioned topic models employ the bag-of-words assumption, which is rarely true in practice. [11] propose a generative topic model that represents each topic as a cluster of multi-dimensional vectors. Nevertheless, ...
... on two publicly available datasets. We compare our model with state-of-the-art topic models with both quantitative and qualitative evaluations. 4.1 Datasets We use the NIPS paper ...
... methods, including Latent Dirichlet Allocation (LDA) model [3], Author-Topic (AT) model [8], Dynamic Topic model (DTM) [2], Gaussian Mixture Neural Topic Model (GMNTM) [11] and Author-Topic ...
... this paper, we have proposed an interest drift model (IDM) for topic modeling. The IDM model achieves significantly lower perplexity compared to ... Journal of Machine Learning Research, 3:993--1022, 2003. [4] A. Daud. Using time topic modeling for semantics-based dynamic research interest finding. Knowledge-Based Systems, 26:154--163, 2012. [5] N. Kawamae. Author interest topic model. In Proceedings of the 33rd international ACM SIGIR, pages 887--888. ACM, ... 2013. [8] M. Rosen-Zvi, T. Griffiths, M. Steyvers, and P. Smyth. The author-topic model for authors and documents. In Proceedings of the 20th conference on Uncertainty ... 2014. [11] M. Yang, T. Cui, and W. Tu. Ordering-sensitive and semantic-aware topic modeling. In Proceedings of the 28th AAAI, 2015. [12] M. Yang, D. Zhu, and K.-P. Chow. A topic model ...
4
July 2016
SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 79, Downloads (12 Months): 1,211, Downloads (Overall): 1,211
For many applications that require semantic understanding of short texts, inferring discriminative and coherent latent topics from short texts is a critical and fundamental task. Conventional topic models largely rely on word co-occurrences to derive topics from a collection of documents. However, due to the length of each document, short ...
Keywords:
short texts, topic model, word embeddings
Title:
Topic Modeling for Short Texts with Auxiliary Word Embeddings
CCS:
Document topic models
Topic modeling
Abstract:
... from short texts is a critical and fundamental task. Conventional topic models largely rely on word co-occurrences to derive topics from a ... word co-occurrences. Data sparsity therefore becomes a bottleneck for conventional topic models to achieve good results on short texts. On the other ... a large corpus. Exploiting such auxiliary word embeddings to enrich topic modeling for short texts is the main focus of this paper. ... To this end, we propose a simple, fast, and effective topic model for short texts, named GPU-DMM. Based on the Dirichlet Multinomial ... millions of external documents can be easily exploited to improve topic modeling for short texts. Through extensive experiments on two real-world short ...
Primary CCS:
Document topic models
Topic modeling
References:
J. Chang, S. Gerrish, C. Wang, J. L. Boyd-Graber, and D. M. Blei. Reading tea leaves: How humans interpret topic models. In NIPS, 2009.
Z. Chen, A. Mukherjee, B. Liu, M. Hsu, M. Castellanos, and R. Ghosh. Leveraging multi-domain prior knowledge in topic models. In IJCAI, 2013.
X. Cheng, X. Yan, Y. Lan, and J. Guo. BTM: topic modeling over short texts. IEEE Trans. Knowl. Data Eng., 2014.
R. Das, M. Zaheer, and C. Dyer. Gaussian LDA for topic models with word embeddings. In ACL, 2015.
L. Hong and B. D. Davison. Empirical study of topic modeling in twitter. In The First Workshop on Social Media Analytics, 2010.
L. Hong, D. Yin, J. Guo, and B. D. Davison. Tracking trends: incorporating term volume into temporal topic models. In SIGKDD, 2011.
R. Mehrotra, S. Sanner, W. Buntine, and L. Xie. Improving lda topic models for microblogs via tweet pooling and automatic labeling. In SIGIR, 2013.
D. Mimno, H. M. Wallach, E. Talley, M. Leenders, and A. McCallum. Optimizing semantic coherence in topic models. In EMNLP, 2011.
D. Q. Nguyen, R. Billingsley, L. Du, and M. Johnson. Improving topic models with latent feature word representations. TACL, 2015.
X. Quan, C. Kit, Y. Ge, and S. J. Pan. Short and sparse text topic modeling via self-aggregation. In AAAI, 2015.
D. Ramage, S. T. Dumais, and D. J. Liebling. Characterizing microblogs with topic models. In ICWSM, 2010.
C. Wang and D. M. Blei. Collaborative topic modeling for recommending scientific articles. In SIGKDD, 2011.
X. Yan, J. Guo, Y. Lan, and X. Chen. A biterm topic model for short texts. In WWW, 2013.
X. Yunqing, T. Nan, H. Amir, and C. Erik. Discriminative bi-term topic model for headline-based social news clustering. In AAAI, 2015.
W. X. Zhao, J. Jiang, J. Weng, J. He, E.-P. Lim, H. Yan, and X. Li. Comparing twitter and traditional media using topic models. In ECIR, 2011.
Full Text:
Topic Modeling for Short Texts with Auxiliary Word Embeddings Chenliang Li1, Haoran Wang1, Zhiqian ... a large corpus. Exploiting such auxiliary word embeddings to enrich topic modeling for short texts is the main focus of this paper. To ... from millions of external documents can be easily exploited to improve topic modeling for short texts. Through extensive experiments on two real-world short ... [37], comment summarization [31], content characterizing [32], and classification [34]. Conventional topic modeling techniques, e.g., pLSA and LDA, are widely used to infer latent ... [14, 36, 42]. Despite their great success on many tasks, conventional topic models experience a large performance degradation over short texts because of limited ... a subset of short texts to form a longer pseudo-document. Conventional topic models are then applied over these pseudo-documents. The aggregation is often guided ...
... models [39, 42]. The third strategy is to design a brand new topic model by explicitly incorporating additional word co-occurrence information. Examples include modeling word ... documents in specific domains. In this paper, we propose a new topic model for short texts, named GPU-DMM. GPU-DMM is designed to leverage the ... summarized as follows: 1. We develop a simple, fast, and effective topic model to learn the latent topic patterns over short texts. The model ...
... uses word embeddings as external knowledge. Topic Models for Short Texts. Conventional topic models such as pLSA and LDA are designed to implicitly capture word ... and better topic inference. Because of the length of each document, conventional topic models suffer a lot from the data sparsity problem in short ... been studied by merging short texts into long pseudo-documents. Conventional topic modeling is then applied to infer the latent topics. In [38], the ... domains, e.g., search snippets and news headlines. These studies suggest that topic models ... specifically designed for general short texts are imperative. A simple and effective topic model, named Dirichlet Multinomial Mixture (DMM) model, has been employed ... short texts being modeled. Yan et al. propose a novel biterm topic model (BTM) to explicitly model the generation of word co-occurrence patterns instead of single words as is done in many topic models [39]. Their experimental results show that BTM produces discriminative topic representations as ... the aforementioned aggregation strategies, Quan et al. propose a self-aggregation based topic model (SATM) for short texts [30]. SATM assumes that each short text ... is the work by Nguyen et al. [24]. They propose a topic model with word embeddings for short texts, called LF-DMM. Built based on ...
... is computationally expensive. Similarly, Das et al. propose an LDA-based topic model by using multivariate Gaussian distributions with word embeddings [11]. Our ... of short texts. Compared with existing approaches of incorporating word embeddings in topic models, GPU reduces the computational cost significantly. To the best of ...
... in matrix M may not be the best choice for topic modeling in short texts, explained next. Word Filtering. Based on DMM, GPU-DMM samples ...
... Parameter Setting. We compare our GPU-DMM against the following four state-of-the-art topic models specific to short texts. For all the methods in comparison, we ... = 50/K and β = 0.01 unless explicitly specified elsewhere. • Biterm Topic Model (BTM) learns the topics by directly modeling the generation of word ... an unordered word pair co-occurred in a short context. • Self-Aggregation based Topic Model (SATM) assumes that each short text is sampled from a long ... by each model are evaluated by the topic coherence metric. Traditionally, topic models are evaluated using perplexity. However, as shown in [3], perplexity does ...
... might introduce some performance variations. 4.4 Evaluation by Short Text Classification With topic modeling, we can represent each document with its topic distribution ...
... 6 suggest that GPU-DMM is a desired choice for short text topic modeling, with respect to both effectiveness and efficiency. 5. CONCLUSION Unlike normal documents, ... limited context information, causing severe sparsity problems when applying conventional topic models. In this paper, we propose a new topic model, named GPU-DMM, to leverage global word co-occurrence knowledge to help ...
... and B. D. Davison. Tracking trends: incorporating term volume into temporal topic models. In SIGKDD, 2011. [15] O. Jin, N. N. Liu, K. Zhao, ... R. Mehrotra, S. Sanner, W. Buntine, and L. Xie. Improving lda topic models for microblogs via tweet pooling and automatic labeling. In SIGIR, 2013. [20] ... D. Q. Nguyen, R. Billingsley, L. Du, and M. Johnson. Improving topic models with latent feature word representations. TACL, 2015. [25] A. Niculescu-Mizil and R. ... Ramage, S. T. Dumais, and D. J. Liebling. Characterizing microblogs with topic models. In ICWSM, 2010. [33] D. E. Rumelhart, G. E. Hinton, ... words. In SIGIR, 2012. [36] C. Wang and D. M. Blei. Collaborative topic modeling for recommending scientific articles. In SIGKDD, 2011. [37] X. Wang, C. Zhai, ... X. Yunqing, T. Nan, H. Amir, and C. Erik. Discriminative bi-term topic model for headline-based social news clustering. In AAAI, 2015. [42] W. X. Zhao, ...
5
May 2016
WebSci '16: Proceedings of the 8th ACM Conference on Web Science
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 7, Downloads (12 Months): 41, Downloads (Overall): 59
Topic modeling is a powerful tool for analyzing large collections of user-generated web content, but it still suffers from problems with topic stability, which are especially important for social sciences. We evaluate stability for different topic models and propose a new model, granulated LDA, that samples short sequences of neighboring ...
Keywords:
latent dirichlet allocation, topic modeling, gibbs sampling
Title:
Stable topic modeling for web science: granulated LDA
CCS:
Topic modeling
Primary CCS:
Topic modeling
References:
R.-C. Chen, R. Swanson, and A. S. Gordon. An adaptation of topic modeling to sentences. http://rueycheng.com/paper/adaptation.pdf, 2010.
D. Mimno, H. M. Wallach, E. Talley, M. Leenders, and A. McCallum. Optimizing semantic coherence in topic models. In Proc. EMNLP 2011, pp. 262--272, 2011.
S. I. Nikolenko, O. Koltsova, and S. Koltsov. Topic modelling for qualitative studies. Journal of Information Science, 2015.
K. Vorontsov. Additive regularization for topic models of text collections. Doklady Mathematics, 89(3):301--304, 2014.
Full Text:
Stable Topic Modeling for Web Science: Granulated LDA Sergei Koltcov skoltsov@hse.ru Sergey I. Nikolenko sergey@logic.pdmi.ras.ru Olessia Koltsova ekoltsova@hse.ru Svetlana Bodrunova visual@jf.pu.ru Laboratory ... show that gLDA exhibits very stable results. CCS Concepts: Computing methodologies → Topic modeling; ... Keywords: topic modeling, latent Dirichlet allocation, Gibbs sampling 1. INTRODUCTION In social sciences, topic modeling can be used to concisely describe a large corpus of documents, ... is also a very important problem for real life applications of topic modeling, especially in social sciences. For a practical application of topic models it is highly desirable to have stable results: a social scientist ... while preserving approximately the same or better topic quality as classical topic models. 2. TOPIC MODELING Let D be a collection of documents, and ... from the vocabulary W. The basic assumption of all probabilistic topic models is that there exists a finite set of topics T, ... the distribution of topics in a document. To train a topic model, one has to find multinomial distributions φ_wt, t ∈ T ... (φ_wt)_wt and Θ = (θ_td)_td respectively. There are several approaches to topic modeling: probabilistic latent semantic analysis (pLSA) model optimizes the total log-likelihood ... [7] adds regularizers explicitly to the objective function. In any case, topic modeling basically approximates F = (F_dw) of size |D| × |W| ... which is obviously an undesirable property. Hence, regularization is important in topic models, but regularizers for improving topic stability have virtually never ...
... the basic topic quality and topic stability metrics across several baseline topic models and granulated LDA with different window sizes. We have trained 200 ...
... R.-C. Chen, R. Swanson, and A. S. Gordon. An adaptation of topic modeling to sentences. http://rueycheng.com/paper/adaptation.pdf, 2010. [4] S. Koltcov, O. Koltsova, and S. I. ...
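Note on the excerpt for result 5: the quantities it names are the per-topic word distributions and per-document topic distributions common to pLSA- and LDA-style models. A minimal restatement follows, assuming the usual reading φ_wt = p(w | t) and θ_td = p(t | d), and treating F_dw as the empirical frequency of word w in document d. These readings are assumptions, since the excerpt itself is truncated.

```latex
% Mixture decomposition assumed by pLSA/LDA-style topic models.
% Assumed notation: \phi_{wt} = p(w \mid t), \theta_{td} = p(t \mid d),
% F_{dw} = empirical frequency of word w in document d.
\[
  p(w \mid d) \;=\; \sum_{t \in T} p(w \mid t)\, p(t \mid d)
            \;=\; \sum_{t \in T} \phi_{wt}\, \theta_{td},
  \qquad
  F_{dw} \;\approx\; \sum_{t \in T} \phi_{wt}\, \theta_{td}.
\]
% The factorization is not unique: for any invertible S that keeps the
% factors stochastic, \Phi\Theta = (\Phi S)(S^{-1}\Theta).
```

Because only the product of the two factors is constrained, different (Φ, Θ) pairs can fit F about equally well, which is one common explanation for run-to-run instability and for why the excerpt calls regularization important.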
6
July 2016
SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 8, Downloads (12 Months): 123, Downloads (Overall): 123
Automated evaluation of topic quality remains an important unsolved problem in topic modeling and represents a major obstacle for development and evaluation of new topic models. Previous attempts at the problem have been formulated as variations on the coherence and/or mutual information of top words in a topic. In this ...
Keywords:
topic quality, text mining, topic modeling
CCS:
Document topic models
Primary CCS:
Document topic models
References:
D. M. Blei. Introduction to probabilistic topic models. Communications of the ACM, 2011.
J. Chang, J. Boyd-Graber, S. Gerrish, C. Wang, and D. M. Blei. Reading tea leaves: How humans interpret topic models. Advances in Neural Information Processing Systems, 20, 2009.
J. H. Lau, D. Newman, and T. Baldwin. Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality. In EACL, pages 530--539, 2014.
D. Mimno, H. M. Wallach, E. Talley, M. Leenders, and A. McCallum. Optimizing semantic coherence in topic models. In Proc. Conference on Empirical Methods in Natural Language Processing, pages 262--272, Stroudsburg, PA, USA, 2011. Association for Computational Linguistics.
S. I. Nikolenko, O. Koltsova, and S. Koltsov. Topic modelling for qualitative studies. Journal of Information Science, 2015.
K. Vorontsov. Additive regularization for topic models of text collections. Doklady Mathematics, 89(3):301--304, 2014.
K. V. Vorontsov and A. A. Potapenko. Additive regularization of topic models. Machine Learning, Special Issue on Data Analysis and Intelligent Optimization with Applications, 101(1):303--323, 2015.
Full Text:
... evaluation of topic quality remains an important unsolved problem in topic modeling and represents a major obstacle for development and evaluation of new ... gold standard in this case, than previously developed approaches. Keywords: topic quality; topic modeling; text mining 1. INTRODUCTION Evaluating topic quality has been an important problem in topic modeling since its very inception. The problem here is that while it ... the semantic information that they capture can be leveraged to evaluate topic models, with results significantly improving upon previously known techniques. We perform ... begin by surveying (very briefly due to space constraints) the topic models whose results we try to evaluate. Let D be a finite ... a finite set (vocabulary) of all terms from these texts. Probabilistic topic models represent the text collection as a sequence of triples (d_i, w_i, ...
... regularization of topic models (ARTM) [19], the basic pLSA model is augmented with additive ... it easy to devise new regularizers, adding desired properties to the topic model [19, 20]. The latent Dirichlet allocation (LDA) model [3, 4, 8] introduces ... additional presumed dependencies. In each case, the result of learning a topic model can be represented as the Φ and Θ matrices. In this ...
... distributed word representations for the purpose of evaluating topic quality in topic modeling results. We have shown that the new metrics outperform previously used topic ... and T. Baldwin. Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality. In EACL, pages 530--539, 2014. [12] C. X. Ling, J. Huang, ...
... 2014. [20] K. V. Vorontsov and A. A. Potapenko. Additive regularization of topic models. Machine Learning, Special Issue on Data Analysis and Intelligent Optimization with ...
7
October 2016
CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 25, Downloads (12 Months): 321, Downloads (Overall): 321
Developing text classifiers often requires a large number of labeled documents as training examples. However, manually labeling documents is costly and time-consuming. Recently, a few methods have been proposed to label documents by using a small set of relevant keywords for each category, known as dataless text classification. In ...
Keywords:
dataless text classification, text analysis, topic modeling
Title:
Effective Document Labeling with Very Few Seed Words: A Topic Model Approach
CCS:
Document topic models
Abstract:
... dataless text classification. In this paper, we propose a Seed-Guided Topic Model (named STM) for the dataless text classification task. Given a ...
Primary CCS:
Document topic models
References:
D. Andrzejewski, X. Zhu, and M. Craven. Incorporating domain knowledge into topic modeling via dirichlet forest priors. In ICML, 2009.
D. M. Blei and J. D. McAuliffe. Supervised topic models. In NIPS, 2007.
C. Chemudugunta, P. Smyth, and M. Steyvers. Modeling general and specific aspects of documents with a probabilistic topic model. In NIPS, 2006.
Z. Chen, A. Mukherjee, B. Liu, M. Hsu, M. Castellanos, and R. Ghosh. Leveraging multi-domain prior knowledge in topic models. In IJCAI, 2013.
J. Jagarlamudi, H. D. III, and R. Udupa. Incorporating lexical priors into topic models. In EACL, 2012.
C. Li, H. Wang, Z. Zhang, A. Sun, and Z. Ma. Topic Modeling for Short Texts with Auxiliary Word Embeddings. In SIGIR, 2016.
D. Mimno, H. M. Wallach, E. Talley, M. Leenders, and A. McCallum. Optimizing semantic coherence in topic models. In EMNLP, 2011.
P. Xie and E. P. Xing. Integrating document clustering and topic modeling. In UAI, 2013.
Full Text:
Effective Document Labeling with Very Few Seed Words: A Topic Modeling Approach Chenliang Li1, Jian Xing1, Aixin Sun2, Zongyang Ma2 1State Key Lab ... as dataless text classification. In this paper, we propose a Seed-Guided Topic Model (named STM) for the dataless text classification task. Given a collection ...
... techniques [7, 17, 18], in this paper, we propose a Seed-guided Topic Model, named STM, for dataless text classification. Given a collection ...
... proposed to incorporate the semantical relations between word pairs into the topic model ...
... over the large external corpus is incorporated for better short text topic modeling in [23]. A seeded topic model was proposed to extract the ...
... together in the corresponding category-topic. Because STM is a probabilistic topic model, a wrong category-topic may be sampled for some documents. Given ...
... and different number of iterations. ... the probabilistic sampling nature of the topic model. We therefore need a larger ? to make the relevant words ...
... information of different categories. 5. CONCLUSION In this paper, we propose a seed-guided topic model for dataless text classification, named STM. Without any labeled documents, STM takes ... D. Andrzejewski, X. Zhu, and M. Craven. Incorporating domain knowledge into topic modeling via dirichlet forest priors. In ICML, 2009. [2] D. M. Blei and J. D. McAuliffe. Supervised topic models. In NIPS, 2007. [3] D. M. Blei, A. Y. Ng, and ... M. Steyvers. Modeling general and specific aspects of documents with a probabilistic topic model. In NIPS, 2006. [7] X. Chen, Y. Xia, P. Jin, ...
... Jagarlamudi, H. D. III, and R. Udupa. Incorporating lexical priors into topic models. In EACL, 2012. [21] M. J. Kusner, Y. Sun, N. ...
8
October 2016
CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 12, Downloads (12 Months): 169, Downloads (Overall): 169
Mining topics in short texts (e.g. tweets, instant messages) can help people grasp essential information and understand key contents, and is widely used in many applications related to social media and text analysis. The sparsity and noise of short texts often restrict the performance of traditional topic models like LDA. ...
Keywords:
topic model, Bayesian nonparametric model, text mining
Title:
A Non-Parametric Topic Model for Short Texts Incorporating Word Coherence Knowledge
CCS:
Document topic models
Abstract:
... noise of short texts often restrict the performance of traditional topic models like LDA. Recently proposed Biterm Topic Model (BTM) which models word co-occurrence patterns directly, is revealed effective ... tackle these problems, in this paper, we propose a non-parametric topic model npCTM with the above distinction. Our model incorporates the Chinese ...
Primary CCS:
Document topic models
References:
Chang, J., Gerrish, S., Wang, C., Boyd-graber, J.L. and Blei, D.M. 2009. Reading Tea Leaves: How Humans Interpret Topic Models. Advances in Neural Information Processing Systems 22, 288--296.
Chen, W., Wang, J., Zhang, Y., Yan, H. and Li, X. 2015. User Based Aggregation for Biterm Topic Model. 489--494.
Harvey, M., Crestani, F. and Carman, M.J. 2013. Building user profiles from topic models for personalised search. Proceedings of the 22nd ACM international conference on Conference on information and knowledge management, 2309--2314.
Hong, L. and Davison, B.D. 2010. Empirical study of topic modeling in twitter. Proceedings of the First Workshop on Social Media Analytics, 80--88.
Hu, B. and Ester, M. 2013. Spatial Topic Modeling in Online Social Media for Location Recommendation. Proceedings of the 7th ACM Conference on Recommender Systems, 25--32.
Lau, J.H., Baldwin, T. and Newman, D. 2013. On Collocations and Topic Models. ACM Transactions on Speech and Language Processing, 10, no. 3 (2013): 10:1--10:14.
Lau, J.H., Newman, D. and Baldwin, T. 2014. Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality. Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, 530--539.
Wallach, H.M., Murray, I., Salakhutdinov, R. and Mimno, D. 2009. Evaluation Methods for Topic Models. Proceedings of the 26th Annual International Conference on Machine Learning, 1105--1112.
Yan, X., Guo, J., Lan, Y. and Cheng, X. 2013. A Biterm Topic Model for Short Texts. Proceedings of the 22nd International Conference on World Wide Web, 1445--1456.
Zhao, W.X., Jiang, J., Weng, J., He, J., Lim, E.-P., Yan, H. and Li, X. 2011. Comparing Twitter and Traditional Media Using Topic Models. Advances in Information Retrieval, 338--349.
Full Text:
A Non-Parametric Topic Model for Short Texts Incorporating Word Coherence Knowledge Yuhao Zhang1 Wenji ... noise of short texts often restrict the performance of traditional topic models like LDA. Recently proposed Biterm Topic Model (BTM) which models word co-occurrence patterns directly, is revealed effective ... tackle these problems, in this paper, we propose a non-parametric topic model npCTM with the above distinction. Our model incorporates the Chinese ... coherent topics compared with the baseline methods. Keywords Text Mining; Topic Model; Bayesian Nonparametric Model. 1. INTRODUCTION Short texts such as ... the data is sparse and usually noisy as well. Traditional topic models like PLSI [5], LDA [2] and HDP [6] represent documents ... short texts together as pseudo documents and then use conventional topic model like LDA for topic detection. These strategies make conventional topic models perform better than using them directly in short texts. As ... co-occurrence, which is proved useful in practice [18]. The Biterm Topic Model (BTM) [17] further extends it to a more principle approach ... The distinction is both important and effective in practice for topic modeling [4, 19]. Twitter-BTM attempts to address this issue by considering ...
... short texts, in this paper we propose a non-parametric coherent topic model (npCTM) that differentiates general words and topical words at the ... word coherence knowledge from corpus. Finally, we present our non-parametric topic model npCTM and the parameters inference of the model. 2.1 Chinese ... 2.3 The npCTM Model We now present our non-parametric topic model npCTM. Our method uses CRP as a non-parametric prior to ...
... to stop words, these words will influence the performance of topic models. We also remove words occurring less than 20 times ... lower case. We randomly sample 300,000 tweets used for training topic models and use 3,101,271 tweets left as reference dataset for the ... are LDA, BTM, HDP, and npCTM-UB. LDA is a classic topic model which has been widely used. BTM is a representative topic ... method for short texts. HDP is a widely used non-parametric topic model based on Dirichlet process. We also compare our model with ...
... 3.2 Evaluation of Topic Quality A traditional way to evaluate topic models is comparing the perplexity or marginal likelihood on a held-out ... number in advance. In this paper, we propose a non-parametric topic model ... Table 1. Results on the comparison of topic coherence for ...
... and Blei, D.M. 2009. Reading Tea Leaves: How Humans Interpret Topic Models. Advances in Neural Information Processing Systems 22, 288--296. [4] ... H. and Li, X. 2015. User Based Aggregation for Biterm Topic Model. 489--494. [5] Ding, C., Li, T. and Peng, W. ... Crestani, F. and Carman, M.J. 2013. Building user profiles from topic models for personalised search. Proceedings of the 22nd ACM international conference ... [9] Hong, L. and Davison, B.D. 2010. Empirical study of topic modeling in twitter. Proceedings of the First Workshop on Social Media ... Analytics, 80--88. [10] Hu, B. and Ester, M. 2013. Spatial Topic Modeling in Online Social Media for Location Recommendation. Proceedings of the ... J.H., Baldwin, T. and Newman, D. 2013. On Collocations and Topic Models. ACM Transactions on Speech and Language Processing, 10, no. ... 2014. Machine Reading Tea Leaves: Automatically Evaluating Topic Coherence and Topic Model Quality. Proceedings of the 14th Conference of the European Chapter ... I., Salakhutdinov, R. and Mimno, D. 2009. Evaluation Methods for Topic Models. Proceedings of the 26th Annual International Conference on Machine ... Guo, J., Lan, Y. and Cheng, X. 2013. A Biterm Topic Model for Short Texts. Proceedings of the 22nd International Conference on ... and Li, X. 2011. Comparing Twitter and Traditional Media Using Topic Models.
9
April 2016
WWW '16: Proceedings of the 25th International Conference on World Wide Web
Publisher: International World Wide Web Conferences Steering Committee
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 7, Downloads (12 Months): 90, Downloads (Overall): 129
Full text available:
PDF
Our proposal, $N$-gram over Context (NOC), is a nonparametric topic model that aims to help our understanding of a given corpus, and be applied to many text mining applications. Like other topic models, NOC represents each document as a mixture of topics and generates each word from one topic. Unlike ...
Keywords:
graphical models, nonparametric models, topic models, latent variable models, mapreduce, N-gram topic model
CCS:
Document topic models
Abstract:
Our proposal, $N$-gram over Context (NOC), is a nonparametric topic model that aims to help our understanding of a given corpus, ... and be applied to many text mining applications. Like other topic models, NOC represents each document as a mixture of topics ...
Primary CCS:
Document topic models
References:
D. Blei, T. Griffiths, M. Jordan, and J. Tenenbaum. Hierarchical topic models and the nested chinese restaurant process. NIPS, 16, 2004.
J. Boyd-Graber and D. M. Blei. Syntactic topic models. In NIPS, pages 185--192, 2008.
N. Kawamae. Supervised N-gram topic model. In WSDM, pages 473--482, 2014.
R. V. Lindsey, W. P. Headden, III, and M. J. Stipicevic. A phrase-discovering topic model using hierarchical pitman-yor processes. In EMNLP-CoNLL, pages 214--222, 2012.
D. Newman, A. Asuncion, P. Smyth, and M. Welling. Distributed algorithms for topic models. JMLR, 10:1801--1828, dec 2009.
A. Smola and S. Narayanamurthy. An architecture for parallel topic models. Proc. VLDB Endow, 3(1-2):703--710, 2010.
H. Wallach. Topic modeling: Beyond bag-of-words. In ICML, pages 977--984, 2006.
J. Yuan, F. Gao, Q. Ho, W. Dai, J. Wei, X. Zheng, E. P. Xing, T.-Y. Liu, and W.-Y. Ma. Lightlda: Big topic models on modest computer clusters. In WWW, pages 1351--1361, 2015.
K. Zhai, J. Boyd-Graber, N. Asadi, and M. L. Alkhouja. Mr. lda: a flexible large scale topic modeling package using variational inference in mapreduce. In WWW, pages 879--888, 2012.
Full Text:
... 261-0023 kawamae@gmail.com ABSTRACT Our proposal, N-gram over Context (NOC), is a nonparametric topic model that aims to help our understanding of a given corpus, and ... and be applied to many text mining applications. Like other topic models, NOC represents each document as a mixture of topics and ... by the help of an open-source distributed machine learning framework. Keywords Nonparametric models, Topic models, Latent variable models, Graphical models, N-gram topic model, MapReduce 1. INTRODUCTION As the sheer volume of user generated content ... content on the Web now exceeds the individual human processing capabilities, statistical topic models are an essential component of natural language processing for human-computer ... than topics. Although introducing both these linguistic structures simultaneously would make topic models complicated, they are essential for applications to capture the thematic coherence ... behind documents. This is the motivation why we propose a new topic model to maintain both these linguistic structures, since no previous topic models perfectly fit for these requirements. Motivated by these requirements, our proposal new ...
... for easy implementations. It conquers the complex dependencies included in hierarchical topic models, and could be the approximated Gibbs sampling of NOC. Since this ... For example, Identifying Sentiments over N-gram (ISN) model [14] extends Bigram topic model (BTM) [29] to obtain N-grams reflecting dependency on nearby context. By ... exhibit the power law characteristics, known as Zipf's law [33] in linguistics, topic models could gain the appropriate structure of a given text data. Phrase ... [24] hierarchy into the process of forming phrases. Supervised N-gram topic model (SNT) [15] extends ISN by combining both PYP and supervision. ... between words by using a hidden Markov Model (HMM), Syntactic Topic Model (STM) [6] uses parse trees as syntactic information, and generates ... than the latter. 2.2 Scalable topic models A promising approach to scaling topic models over large data sets is to distribute and parallelize both the ...
... clear how these approaches can be applied to non-parametric hierarchical topic models [4], since they contain complex dependencies which render the inference hard ... hard to parallelize and impose costly alignment overheads between nodes. Although applying topic models to web scale datasets is not as straightforward as one ... challenge is to continue this approach for more complicated hierarchical topic modeling using collapsed Gibbs sampling, build D-NOC on top of an open-source ... core (38), distributed computing (36), memory bandwidth (25), large scale (24) topic model topic model (19), Gibbs sampling (17), nonparametric bays (15), Dirichlet process (12), submodular ...
... This table shows that these groups seem to correspond topics of conventional topic models, and have a hierarchical semantic relationship. For example, both ... than the data management. As the number of tokens assigned to topic model/submodular is fewer than the number of tokens associated with ... of w following u on k. This problem leads N-gram topic models without the topic structure to miss these informative words/phrases and decrease ... -grams seems to depend on the topic structure, we need a topic model to discover a fine grained topic tree from a given corpus, ... to maintain the parent-child topic relationships more precisely than nHDP. Summary: Like other topic models, NOC represents each document as a mixture of topics and ...
... following data sets were used in comparative quantitative evaluations against previous topic models. 4 Memcached: http://memcached.org/ Algorithm 4 Map Phase for D-NOC // Initialization // Input <Key,Value>:=<document ID ... Can NOC represent a given corpus more efficiently than conventional topic models? Subsection 5.2.1 5 http://dl.acm.org/ 6 Amazon Product Review Data (Huge): http://www.cs.uic.edu/~liub/NBS/sentiment-analysis.html 7 Amazon reviews: https://snap.stanford.edu/data/web-Amazon.html 8 Twitter: http://twitter.com Algorithm ...
... more syntactic information than the other models. 6. DISCUSSION The disadvantages of parallelized topic models can be partially solved as follows: The difference between processing ...
... -grams than the others. 7. CONCLUSION This paper shows a N-gram topic model that employs a semantic topic hierarchical structure as the thematic coherence, ... ACM, 57(2):7:1--7:30, 2010. [5] D. Blei, T. Griffiths, M. Jordan, and J. Tenenbaum. Hierarchical topic models and the nested chinese restaurant process. NIPS, 16, 2004. [6] J. Boyd-Graber ... n-gram. In WWW, pages 541--542, 2012. [15] N. Kawamae. Supervised N-gram topic model. In WSDM, pages 473--482, 2014. [16] N. Kawamae. Real time recommendations ... Lindsey, W. P. Headden, III, and M. J. Stipicevic. A phrase-discovering topic model using hierarchical pitman-yor processes. In EMNLP-CoNLL, pages 214--222, 2012. [19] J. McAuley and ... Newman, A. Asuncion, P. Smyth, and M. Welling. Distributed algorithms for topic models. JMLR, 10:1801--1828, dec 2009. [23] J. Paisley, C. Wang, D. M. ... 4:639--650, 1994. [26] A. Smola and S. Narayanamurthy. An architecture for parallel topic models. Proc. VLDB Endow, 3(1-2):703--710, 2010. [27] Y. W. Teh. A hierarchical ... and D. M. Blei. Hierarchical dirichlet processes. JASA, 101(476):1566--1581, 2006. [29] H. Wallach. Topic modeling: Beyond bag-of-words. In ICML, pages 977--984, 2006. [30] X. Wei and ...
10
February 2017
WSDM '17: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 24, Downloads (12 Months): 64, Downloads (Overall): 64
Full text available:
PDF
Document network is a kind of intriguing dataset which provides both topical (texts) and topological (links) information. Most previous work assumes that documents closely linked with each other share common topics. However, the associations among documents are usually complex, which are not limited to the homophily (i.e., tendency to link ...
Keywords:
topic models, copula, document networks, plsa
CCS:
Topic modeling
Primary CCS:
Topic modeling
References:
J. Chang and D. M. Blei. Relational topic models for document networks. In AISTATS 2009, pages 81--88.
J. Chang, S. Gerrish, C. Wang, J. L. Boyd-graber, and D. M. Blei. Reading tea leaves: How humans interpret topic models. In NIPS 2009, pages 288--296.
N. Chen, J. Zhu, F. Xia, and B. Zhang. Generalized relational topic models with data augmentation. In AAAI 2013, pages 1273--1279.
A. Hefny, G. Gordon, and K. Sycara. Random walk features for network-aware topic models. 2013.
Q. Mei, D. Cai, D. Zhang, and C. Zhai. Topic modeling with network regularization. In WWW 2008.
D. Mimno and A. McCallum. Topic models conditioned on arbitrary features with dirichlet-multinomial regression. arXiv preprint.
D. Mimno, H. M. Wallach, E. Talley, M. Leenders, and A. McCallum. Optimizing semantic coherence in topic models. In EMNLP 2011, pages 262--272.
M. Wahabzada, Z. Xu, and K. Kersting. Topic models conditioned on relations. In Machine Learning and Knowledge Discovery in Databases 2010.
Z. Yin, L. Cao, Q. Gu, and J. Han. Latent community topic analysis: Integration of community discovery with topic modeling. ACM Trans. Intell. Syst. Technol. 2012, page 63.
Full Text:
... General; H.3.5 [Information Storage and Retrieval]: Online Information Services Keywords Document Networks, Copula, Topic Models, PLSA Permission to make digital or hard copies of all ...
... propose a unified model for document network by regularizing a statistical topic model PLSA [23] with the tree-averaged copula regularizer. For brevity, we name it ... It is also the first time that copulas are applied in topic modeling. The integration of tree-averaged copula and PLSA makes it possible ... Automatic extraction of summary from document collections has received considerable attention. Topic models such as LDA [4] and PLSA [23] have been successfully used ...
... the DMR extensions, Chang et al. [6] propose the famous Relational Topic Model (RTM) to model document networks. In RTM, the link between ...
... state-of-the-art baselines. By incorporating relational information, we expect the performance of topic modeling to achieve significant improvements. Meanwhile, the network structure is well embedded ... a unique train of thought in modeling document networks: RTM: Relational Topic Model (RTM) [6] is a famous extension of LDA in modeling ...
... graph structure. LCTA: Latent Community Topic Analysis (LCTA) [41] simultaneously performs topic modeling and community detection. It reports great improvements over many baselines, such ... with the perplexity measure. To better reflect the interpretability of topic models, Mimno et al. [32] propose the topic coherence metric. They point ...
... proposes a novel model copulaPLSA which regularizes the traditional statistical topic model PLSA with a tree-averaged copula regularizer. The introduction of copulas makes ...
... university press, 2004. [6] J. Chang and D. M. Blei. Relational topic models for document networks. In AISTATS 2009, pages 81--88. [7] J. Chang, S. ... N. Chen, J. Zhu, F. Xia, and B. Zhang. Generalized relational topic models with data augmentation. In AAAI 2013, pages 1273--1279. [9] X. Chen and ... Hefny, G. Gordon, and K. Sycara. Random walk features for network-aware topic models. 2013. [23] T. Hofmann. Probabilistic latent semantic indexing. In SIGIR 1999, ...
11
October 2016
CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 1, Downloads (12 Months): 41, Downloads (Overall): 41
Full text available:
PDF
We study the problem of generating DAG-structured category hierarchies over a given set of documents associated with "importance" scores. Example application includes automatically generating Wikipedia disambiguation pages for a set of articles having click counts associated with them. Unlike previous works, which focus on clustering the set of documents using ...
Keywords:
topic model, hierarchical categorisation, gibbs sampling
CCS:
Document topic models
References:
David M. Blei, Thomas L. Griffiths, Michael I. Jordan, and Joshua B. Tenenbaum. Hierarchical topic models and the nested chinese restaurant process. NIPS, 2004.
Saurabh S. Kataria, Krishnan S. Kumar, Rajeev R. Rastogi, Prithviraj Sen, and Srinivasan H. Sengamedu. Entity disambiguation with hierarchical topic models. KDD, 2011.
Qiaozhu Mei, Xuehua Shen, and ChengXiang Zhai. Automatic labeling of multinomial topic models. KDD, 2007.
Daniel Ramage, David Hall, Ramesh Nallapati, and Christopher D Manning. Labeled lda: A supervised topic model for credit attribution in multi-labeled corpora. EMNLP, 2009.
Mark Steyvers and Tom Griffiths. Probabilistic topic models. 2007.
Full Text:
... Thomas L. Griffiths, Michael I. Jordan, and Joshua B. Tenenbaum. Hierarchical topic models and the nested chinese restaurant process. NIPS, 2004. [7] David M Blei, ...
... R Rastogi, Prithviraj Sen, and Srinivasan H Sengamedu. Entity disambiguation with hierarchical topic models. KDD, 2011. [16] Wei Li and Andrew McCallum. Pachinko allocation: Dag-structured ... Qiaozhu Mei, Xuehua Shen, and ChengXiang Zhai. Automatic labeling of multinomial topic models. KDD, 2007. [21] Rada Mihalcea and Andras Csomai. Wikify!: Linking documents ... for tag recommendations. KM, 2011. [30] Mark Steyvers and Tom Griffiths. Probabilistic topic models. 2007. [31] Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. Yago: A ...
12
April 2017
WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion
Publisher: International World Wide Web Conferences Steering Committee
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 3, Downloads (12 Months): 12, Downloads (Overall): 12
Full text available:
PDF
Predicting the future is hard, more so in active research areas. In this paper, we customize an established model for citation prediction of research papers and apply it on research topics. We argue that research topics, rather than individual publications, have wider relevance in the research ecosystem, for individuals as ...
Keywords:
software engineering publication, topic model, citation prediction
CCS:
Document topic models
Full Text:
... utility for individual researchers, as well as research groups. Keywords software engineering publication, topic model, citation prediction 1. INTRODUCTION 1.1 Background Across domains, researchers are deeply interested ...
... abstraction than papers. The modeling effort is nontrivial as the topic model derived from papers is a probabilistic one based on the similarity ... our earlier work [7, 8], we have shown that the topic model based on the information content of papers work reasonably well in terms ...
13
April 2017
SAC '17: Proceedings of the Symposium on Applied Computing
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 15, Downloads (12 Months): 19, Downloads (Overall): 19
Full text available:
PDF
Topic modeling is an important area which aims at indexing and exploring massive data streams. In this paper we introduce a discrete Dynamic Topic Modeling (dDTM) algorithm, which is able to model a dynamic topic that is not necessarily present over all time slices in a stream of documents. Our ...
Keywords:
dynamic topic modeling, news mining, stream mining
CCS:
Topic modeling
Abstract:
Topic modeling is an important area which aims at indexing and exploring ... data streams. In this paper we introduce a discrete Dynamic Topic Modeling (dDTM) algorithm, which is able to model a dynamic topic ... representative of the contents of documents than the original Dynamic Topic Modeling (DTM) in terms of likelihood on held-out data. Furthermore, we ...
Primary CCS:
Topic modeling
References:
D. M. Blei and J. D. Lafferty. Dynamic topic models. In Proc., ICML '06, pages 113--120, 2006.
C. Wang, D. Blei, and D. Heckerman. Continuous time dynamic topic models. Proc. of UAI, 2008.
Full Text:
... massive data streams. In this paper we introduce a discrete Dynamic Topic Modeling (dDTM) algorithm, which is able to model a dynamic topic ... in streaming data. CCS Concepts Information systems → Data streams; Computing methodologies → Topic modeling; Keywords Dynamic Topic Modeling, Stream Mining, News Mining 1. INTRODUCTION With the rapid proliferation of ... indexing large datasets of digitized text documents have become increasingly important. Topic modeling is one of these methods, and it has been used in ... summarization and trend analysis [6] as well as information retrieval [19]. Topic models are hierarchical Bayesian models of discrete data [17], where each topic ... a fixed vocabulary, representing a high level concept. One basic topic model is the Permission to make digital or hard copies of all ... Allocation (LDA) [7] that has set the basis for many other topic modeling methods since its introduction. LDA is a generative probabilistic topic model which assumes that all documents are exchangeable in the entire collection, i.e., ...
... of this paper are: We present a new algorithm for topic modeling which overcomes the limitation of the original DTM algorithm. That is, ... in Section 2. Furthermore, Section 3 presents necessary background information about dynamic topic modeling. In Section 4 we present our novel dDTM method. Section ... model, depicted in Figure 1, which represents the basis for other topic models including ours. LDA is a Bayesian network that generates each document ... is generated by randomly sampling from a per-topic multinomial distribution. Dynamic Topic Modeling (DTM) [6] represents the state-of-the-art for modeling evolving topics over ... the evolution of a topic will be a discrete process. Additionally, another topic model that tracks the evolution of topics is the Topics Over ... [6] and many other previous work in the domain of topic modeling, use likelihood on held-out data as a standard measure of ...
... explaining our model. 3.1 Latent Dirichlet Allocation LDA is a generative probabilistic topic model which discovers topics present in a given text corpus. LDA ... topic. However, in the next section we propose a novel topic model which overcomes the limitations of DTM, hence it suites the above-mentioned ... evolutions. The reason behind using LDA is that, compared with other topic-modeling algorithms, it has shown strong results for detecting topics in texts ...
... time slice. This evaluation method is standard in the domain of topic modeling, and it was also used in the original paper on ...
... 2006. [17] C. Wang, D. Blei, and D. Heckerman. Continuous time dynamic topic models. Proc. of UAI, 2008. [18] X. Wang and A. McCallum. ...
14
April 2016
WWW '16: Proceedings of the 25th International Conference on World Wide Web
Publisher: International World Wide Web Conferences Steering Committee
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 16, Downloads (12 Months): 292, Downloads (Overall): 375
Full text available:
PDF
Various topic models have been developed for sentiment analysis tasks. But the simple topic-sentiment mixture assumption prohibits them from finding fine-grained dependency between topical aspects and sentiments. In this paper, we build a Hidden Topic Sentiment Model (HTSM) to explicitly capture topic coherence and sentiment consistency in an opinionated text ...
Keywords:
topic modeling, aspect detection, sentiment analysis
CCS:
Topic modeling
Abstract:
Various topic models have been developed for sentiment analysis tasks. But the simple ...
Primary CCS:
Topic modeling
References:
J. Chang, S. Gerrish, C. Wang, J. L. Boyd-Graber, and D. M. Blei. Reading tea leaves: How humans interpret topic models. In Advances in neural information processing systems, pages 288--296, 2009.
Y. Fang, L. Si, N. Somasundaram, and Z. Yu. Mining contrastive opinions on political texts using cross-perspective topic model. In Proceedings of the fifth ACM international conference on Web search and data mining, pages 63--72. ACM, 2012.
J. D. Mcauliffe and D. M. Blei. Supervised topic models. In Advances in neural information processing systems, pages 121--128, 2008.
D. Mimno and A. McCallum. Topic models conditioned on arbitrary features with dirichlet-multinomial regression. The 24th Conference on Uncertainty in Artificial Intelligence, pages 411--418, 2008.
M. Steyvers and T. Griffiths. Probabilistic topic models. Handbook of latent semantic analysis, 427(7):424--440.
H. Wang, D. Zhang, and C. Zhai. Structural topic model for latent topical structure analysis. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1, pages 1526--1535. Association for Computational Linguistics, 2011.
Full Text:
... Wang Department of Computer Science University of Virginia Charlottesville VA, 22903 USA {mr4xb, hw5x}@virginia.edu ABSTRACT Various topic models have been developed for sentiment analysis tasks. But the simple topic-sentiment ... in political science [6], and many more. One fundamental assumption in topic models is exchangeability, i.e., topics are infinitely exchangeable within a given ... increases the complication of topic and sentiment mixture. For example, most topic models for sentiment analysis assume the selection of topics are independent given ...
... the effectiveness of the proposed model. A set of state-of-the-art topic models for sentiment analysis are employed as baselines to compare the quality ... in this paper are as follows, We develop a unified topic model to explicitly capture topic coherence and sentiment consistency in opinionated text ... topics [22]. Significant research effort has been paid on building statistic topic models to mine user-generated opinion data. According to the notion proposed in ... Mimno and McCallum's work [19], we can categorize most of existing topic models for sentiment analysis as upstream models and downstream models. Upstream models assume ...
... on the LDA model [3], Lin and He proposed a joint sentiment/topic model (JST) for sentiment analysis [15]. In JST, the combination of topic ... sentiment is modeled as a Cartesian product between a set of topic models and sentiment models. Accordingly, each document exhibits distinct topic mixtures under different ... analysis. Another line of related work is introducing Markov model into topic modeling. Aspect-HMM model [2] combines pLSA with a hidden Markov model [23] to ... sequence, while latent topics are treated the same as in other topic models. Hidden Topic Markov Model (HTMM) [8] is the most similar ...
... Based on this, HTSM drops the simple mixture assumption employed in conventional topic models [3, 9], and explicitly models topic transition in successive sentences ...
... ?i, ?i, ?)Ni?n=1p(wn|?zi) The above joint distribution differentiates HTSM from conventional topic models for sentiment analysis, which are built on the simple topic mixture ...
... (HTMM) [8], Aspect and Sentiment Unification model (ASUM) [12], and Joint Sentiment/Topic model (JST) [15] as baselines. Among these baseline models, ASUM and JST ... and HTMM and ASUM explicitly model sentences in a document. As unsupervised topic models, both ASUM and JST require sentiment seed words as input. ... in Dirichlet priors to 1.01 and 1.001 for all the topic models. ... with increasing training size on four different review document sets. 4.2 Topic modeling evaluation We first compare the quality of learned topics from all ... number of word in a test document Dtest. We trained all the topic models (HTSM, HTMM, LDA, JST and ASUM) on the described corpora to ... clear from Figure 3 that HTSM outperformed all the other topic models on all four datasets, except HTMM. There are two possible explanations. ...
... setting of the number of topics in HTSM and all baseline topic models, and we fix this setting in all our following experiments. 4.2.2 ... all our following experiments. 4.2.2 Word intrusion comparisons Perplexity only measures the quality topic modeling from density estimation perspective; it is also necessary to evaluate ... we employ word intrusion discussed in [4] to evaluate four different topic models, namely LDA, HTMM, ASUM and HTSM (because ASUM and ... total we have seven words for each topic zk from every topic model: among those, five are regular words, one is intra-topic intruding ... different annotators. Since we have four different categories and four different topic models, for this task we take feedback from twenty four annotators. ...
... our HTSM model is inferring more human interpretable topics than other topic models. However, in terms of intra-topic intrusion, the performance of HTSM ...
... a thorough evaluation of sentiment classification, we also tested all the topic models
... 0.477 phone 0.118 0.439 0.443 tv 0.173 0.338 0.489 we choose three different topic models, including HTMM, ASUM and HTSM, given they all explicitly model ... top two most probable sentences under each topic from every selected topic models, i.e., rank by p(t|z). Since there are many different products ... user study. Once we have all sentences generated from those three topic models, we randomly interleave those sentences (to avoid position bias ...
... quality in both aspect recognition and sentiment polarities than the other topic models for aspect-based contrastive summarization, which is of particular value for customers ... to guide topic and sentiment transitions. In contrast to the traditional sentiment-topic models which are built on simple topic mixture assumptions, HTSM captures the ... been performed to compare the performance of HTSM against several state-of-the-arts topic models on four categories of product reviews from Amazon and NewEgg. ... sentiment classification performance are achieved. This work opens new direction in topic modeling for sentiment analysis. The current HTSM only captures the first order ...
... Li. Smart stopword list, 2004. [15] C. Lin and Y. He. Joint sentiment/topic model for sentiment analysis. In Proceedings of the 18th ACM conference on Information ... ACM, 2013. [17] J. D. Mcauliffe and D. M. Blei. Supervised topic models. In Advances in neural information processing systems, pages 121--128, 2008. [18] Q. ... Web, pages 171--180. ACM, 2007. [19] D. Mimno and A. McCallum. Topic models conditioned on arbitrary features with dirichlet-multinomial regression. The 24th Conference on Uncertainty ... 618--626. ACM, 2011. [29] H. Wang, D. Zhang, and C. Zhai. Structural topic model for latent topical structure analysis. In Proceedings of the 49th Annual Meeting ...
15
April 2017
WWW '17: Proceedings of the 26th International Conference on World Wide Web
Publisher: International World Wide Web Conferences Steering Committee
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 9, Downloads (12 Months): 51, Downloads (Overall): 51
Full text available:
PDF
Topic modeling has traditionally been studied for single text collections and applied to social media data represented in the form of text documents. With the emergence of many social media platforms, users find themselves using different social media for posting content and for social interaction. While many topics may be ...
Keywords:
user preference, topic modeling, multiple social networks
CCS:
Document topic models
Topic modeling
Keywords:
topic modeling
Abstract:
Topic modeling has traditionally been studied for single text collections and applied ... well as platform preferences of users, we propose a new topic model known as MultiPlatform-LDA (MultiLDA). Instead of just merging all posts ... Our experiments results show that the MultiLDA outperforms in both topic modeling and platform choice prediction tasks. We also show empirically that ...
Primary CCS:
Topic modeling
References:
Guo, W., Wu, S., Wang, L., and Tan, T. Social-relational topic model for social networks. In CIKM.
Hong, L., and Davison, B. D. Empirical study of topic modeling in twitter. In Proceedings of the first workshop on social media analytics (2010).
Mehrotra, R., Sanner, S., Buntine, W., and Xie, L. Improving lda topic models for microblogs via tweet pooling and automatic labeling. In SIGIR (2013).
Qiu, M., Zhu, F., and Jiang, J. It is not just what we say, but how we say them: Lda-based behavior-topic model. In SIAM SDM (2013).
Rosen-Zvi, M., Griffiths, T., Steyvers, M., and Smyth, P. The author-topic model for authors and documents. In UAI (2004).
Zhao, W. X., Jiang, J., Weng, J., He, J., Lim, E.-P., Yan, H., and Li, X. Comparing twitter and traditional media using topic models. In ECIR (2011).
Full Text:
... well as platform preferences of users, we propose a new topic model known as MultiPlatform-LDA (MultiLDA). Instead of just merging all posts from different ... common users. Our experiments results show that the MultiLDA outperforms in both topic modeling and platform choice prediction tasks. We also show empirically that among ... social media platforms belonging to the same users. The latter is topic modeling in the multi-platform context where heterogeneous media types and users' platform ... to perform multi-platform topic modeling is to apply an existing topic model such as LDA [2] on the directly combined content of the ... shows the methodology used in our research. We first construct a topic model
platforms. In this paper, we propose MultiPlatform-LDA (MultiLDA), a topic model that jointly learns the topical interests and platform preferences of ... have accounts on Twitter, Instagram and Tumblr. Finally, we evaluate the multi-platform topic model(s). We perform two sets of experiments to evaluate MultiLDA: (i) we ... content from multiple social media platforms, MultiLDA outperforms TwitterLDA, another state-of-the-art topic model for modeling social media text. In the prediction of users' platform ... in their tweets [18]. Hong et. al. applied LDA model and author-topic model [22] to discover the topic interests of Twitter users [10]. Further ...
... instead of behaviors. 2.1.3 Multiple Platforms There are also works that apply topic models on multiple social media platforms. Guo et. al. proposed a model ... designed a model that incorporates users' social interactions and attributes for topic modeling and applied their model on six social media platforms [5]. ...
... experiments to evaluate MultiLDA and to compare with TwitterLDA, the state-of-the-art topic model for short social media posts. We first elaborate how we obtain ...
... have to be converted to text content before we can apply topic modeling on them. One possible way to extract the user annotated text ... videos. The generated tags will then replace the photos and videos in topic modeling. In the case of Tumblr, we thus have posts that ... use the TwitterLDA as our baseline. While TwitterLDA is the state-of-the-art topic model for tweet posts, they can be easily adapted to "tag" posts. ...
... global and user preferences. 5. CONCLUSION In this paper, we proposed a novel topic model known as MultiPlatform-LDA (MultiLDA), which jointly models social media topics as ... datasets from three social media platforms and benchmarked against the state-of-the-art topic model. Our experiment results have shown that MultiLDA outperforms TwitterLDA in ...
... (2014). [7] Guo, W., Wu, S., Wang, L., and Tan, T. Social-relational topic model for social networks. In CIKM. [8] Hoang, T.-A., and Lim, E.-P. On ... Mehrotra, R., Sanner, S., Buntine, W., and Xie, L. Improving lda topic models for microblogs via tweet pooling and automatic labeling. In SIGIR (2013). [17] ... is not just what we say, but how we say them: Lda-based behavior-topic model. In SIAM SDM (2013). [22] Rosen-Zvi, M., Griffiths, T., Steyvers, ... (2013). [22] Rosen-Zvi, M., Griffiths, T., Steyvers, M., and Smyth, P. The author-topic model for authors and documents. In UAI (2004). [23] Vosecky, J., Hong, D., ... Yan, H., and Li, X. Comparing twitter and traditional media using topic models. In ECIR (2011). [30] Zhao, W. X., Li, S., He, ...
16
March 2017
LAK '17: Proceedings of the Seventh International Learning Analytics & Knowledge Conference
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 16, Downloads (12 Months): 58, Downloads (Overall): 58
Full text available:
PDF
Student knowledge modeling is an important part of modern personalized learning systems, but typically relies upon valid models of the structure of the content and skill in a domain. These models are often developed through expert tagging of skills to items. However, content creators in crowdsourced personalized learning systems often ...
Keywords:
correlational topic modeling, mathematics education, intelligent tutoring systems, natural language processing, topic modeling
Title:
Using correlational topic modeling for automated topic identification in intelligent tutoring systems
CCS:
Document topic models
Keywords:
correlational topic modeling
topic modeling
Abstract:
... labeling skills in a crowdsourced personalized learning system using correlated topic modeling, a natural language processing approach, to analyze the linguistic ...
Primary CCS:
Document topic models
References:
Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77--84.
Blei, D. M., & Lafferty, J. D. (2007). A correlated topic model of science. The Annals of Applied Statistics, 17--35.
Chang, J., Gerrish, S., Wang, C., Boyd-Graber, J. L., & Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models. In Advances in neural information processing systems, 288--296.
Full Text:
Using Correlational Topic Modeling for Automated Topic Identification in Intelligent Tutoring Systems Stefan Slater ... labeling skills in a crowdsourced personalized learning system using correlated topic modeling, a natural language processing approach, to analyze the linguistic ... for mathematics problem-solving. CCS Concepts: Information systems~Document topic models. Keywords: Topic Modeling; Correlational Topic Modeling; Natural Language Processing; Mathematics Education; Intelligent Tutoring Systems 1. ...
... to methods utilizing patterns of correct and incorrect answers is topic modeling approaches, such as Latent Semantic Analysis (LSA; [16]), Latent Dirichlet ... Analysis (LSA; [16]), Latent Dirichlet Analysis (LDA; [5]), and Correlational Topic Modeling (CTM; [4]). Topic models are not dependent on human tagging of skills - the ... required is the textual content of the problem itself. Using topic modeling approaches to label skills and topics utilizes both the relationships ... similarity that exists within problems that share a common skill. Topic modeling is a form of natural language processing that utilizes word ... words, called "topics", that appear in large collections of documents. Topic modeling can be loosely characterized as factor analysis conducted on words, ... as factor analysis conducted on words, rather than numerical variables. Topic models have been used for a range of applications, such as ... pass [6]. From this family of models, we select correlated topic modeling (CTM) [4], which models the intercorrelations of words in text ...
... perplexity of each model. Perplexity scores measure the ability of topic models to generalize to new and unseen text (in the case ... of the three models are presented in Table 2. The 25-topic model was found to have the lowest perplexity among the three ... content within ASSISTments. Table 2. Perplexity scores for the three topic models: K = 5, perplexity 319.91; K = 15, perplexity 227.40; K = 25, perplexity 189.28. The ...
... semantic content of problems have been used before [21], and topic modeling may be an additional approach to identifying themes that are ...
... mathematics items in text could greatly enhance the ability of topic models to successfully distinguish individual skills. Future work in this domain ... as BKT or PFA. If CTM and other forms of topic modeling can achieve an acceptable level of agreement with expert skill ... Springer Berlin Heidelberg, 603-611. [3] Blei, D. M. (2012). Probabilistic topic models. Communications of the ACM, 55(4), 77-84. [4] Blei, D. ... Blei, D. M., & Lafferty, J. D. (2007). A correlated topic model of science. The Annals of Applied Statistics, 17-35. [5] Blei, ... Blei, D. M. (2009). Reading tea leaves: How humans interpret topic models. In Advances in neural information processing systems, 288-296. [8] ...
17
April 2017
ASPLOS '17: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 31, Downloads (12 Months): 110, Downloads (Overall): 110
Full text available:
PDF
Latent Dirichlet Allocation (LDA) is a popular tool for analyzing discrete count data such as text and images. Applications require LDA to handle both large datasets and a large number of topics. Though distributed CPU systems have been used, GPU-based systems have emerged as a promising alternative because of the ...
Keywords:
lda, parallel computing, topic model, gpu
Also published in:
May 2017
ACM SIGOPS Operating Systems Review - SIGOPS Member Plus: Volume 51 Issue 2, June 2017
May 2017
ACM SIGPLAN Notices - ASPLOS '17: Volume 52 Issue 4, April 2017
May 2017
ACM SIGARCH Computer Architecture News - Asplos'17: Volume 45 Issue 1, March 2017
CCS:
Document topic models
Keywords:
topic model
Title:
SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs
Primary CCS:
Document topic models
References:
J. L. Boyd-Graber, D. M. Blei, and X. Zhu. A topic model for word sense disambiguation. In EMNLP-CoNLL, 2007.
L. Cao and L. Fei-Fei. Spatially coherent latent topic model for concurrent segmentation and classification of objects and scenes. In ICCV, 2007.
J. Chang and D. Blei. Relational topic models for document networks. In AISTATS, 2009.
N. Chen, J. Zhu, F. Xia, and B. Zhang. Discriminative relational topic models. IEEE Trans. on Pattern Analysis and Machine Intelligence, 37 (5): 973--986, 2015.
T. Iwata, T. Yamada, and N. Ueda. Probabilistic latent semantic visualization: topic model for visualizing documents. In KDD, 2008.
A. Q. Li, A. Ahmed, S. Ravi, and A. J. Smola. Reducing the sampling complexity of topic models. In KDD, 2014.
H. M. Wallach, I. Murray, R. Salakhutdinov, and D. Mimno. Evaluation methods for topic models. In Proceedings of the 26th Annual International Conference on Machine Learning, pages 1105--1112. ACM, 2009.
Y. Wang, X. Zhao, Z. Sun, H. Yan, L. Wang, Z. Jin, L. Wang, Y. Gao, J. Zeng, Q. Yang, et al. Towards topic modeling for big data. ACM Transactions on Intelligent Systems and Technology, 2014.
L. Yao, D. Mimno, and A. McCallum. Efficient methods for topic model inference on streaming document collections. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 937--946. ACM, 2009.
H.-F. Yu, C.-J. Hsieh, H. Yun, S. Vishwanathan, and I. S. Dhillon. A scalable asynchronous distributed algorithm for topic modeling. In Proceedings of the 24th International Conference on World Wide Web, pages 1340--1350. International World Wide Web Conferences Steering Committee, 2015.
J. Yuan, F. Gao, Q. Ho, W. Dai, J. Wei, X. Zheng, E. P. Xing, T.-Y. Liu, and W.-Y. Ma. Lightlda: Big topic models on modest compute clusters. In WWW, 2015.
J. Zhu, A. Ahmed, and E. Xing. Medlda: maximum margin supervised topic models. Journal of Machine Learning Research, 13: 2237--2278, 2012.
Full Text:
SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs. Kaiwei Li1, Jianfei Chen1,2, Wenguang Chen1,3, Jun Zhu1,2 {likw14,chenjian14}@mails.tsinghua.edu.cn {cwg,dcszj}@tsinghua.edu.cn 1Department ... than dense matrices and tensors. In this paper, we focus on topic modeling, an important subclass of PGMs and demonstrate the challenges and ... and demonstrate the challenges and solutions encountered on accelerating PGMs. Topic models provide a suite of widely adopted statistical tools for feature ... semantically correlated. Latent Dirichlet Allocation (LDA) [3] is the most popular of topic models due to its simplicity, and has been deployed as a key ...
... of CPUs, large clusters are typically required to learn large topic models [1, 25, 26]. For example, a 32-machine cluster is used to ...
... J. L. Boyd-Graber, D. M. Blei, and X. Zhu. A topic model for word sense disambiguation. In EMNLP-CoNLL, 2007. [5] L. Cao and L. ... and scenes. In ICCV, 2007. [6] J. Chang and D. Blei. Relational topic models for document networks. In AISTATS, 2009. [7] J. Chen, K. Li, J. ... Chen, J. Zhu, F. Xia, and B. Zhang. Discriminative relational topic models. IEEE Trans. on Pattern Analysis and Machine Intelligence, 37(5):973--986, 2015. [9] ... Iwata, T. Yamada, and N. Ueda. Probabilistic latent semantic visualization: topic model for visualizing documents. In KDD, 2008. [13] Z. G. Kingsley. Selective studies ... S. Ravi, and A. J. Smola. Reducing the sampling complexity of topic models. In KDD, 2014. [15] J. D. O. Mark Harris, Shubhabrata ... Wallach, I. Murray, R. Salakhutdinov, and D. Mimno. Evaluation methods for topic models. In Proceedings of the 26th Annual International Conference on Machine ...
... J. Zhu, A. Ahmed, and E. Xing. Medlda: maximum margin supervised topic models. Journal of Machine Learning Research, 13:2237--2278, 2012. [31] J. Zhu, ...
18
November 2016
WAMA 2016: Proceedings of the International Workshop on App Market Analytics
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 9, Downloads (12 Months): 94, Downloads (Overall): 94
Full text available:
PDF
Does the advertised behavior of apps correlate with what a user sees on a screen? In this paper, we introduce a technique to statically extract the text from the user interface definitions of an Android app. We use this technique to compare the natural language topics of an app’s user ...
Keywords:
Android, App mining, Topic models, UI Anomalies
CCS:
Document topic models
Keywords:
Topic models
Full Text:
... systems → Document topic models; Computing methodologies → Anomaly detection; Keywords Android; App mining; Topic models; UI Anomalies; 1. INTRODUCTION When a user decides whether to install ...
... data (as part of its app market metadata). We then apply topic modeling ... to the corpus of descriptions, and later use the inferred topic model on the user interface data. Thus, for each app we obtain ...
... the topics they belong to. We leverage LDA to build a topic model of our corpus of apps. Topic modeling provides a simple way ... then be described with probabilities of belonging to each topic. We leverage topic modeling by first training the model on the corpus of app descriptions. ...
19
July 2016
SIGIR '16: Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
Publisher: ACM
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 5, Downloads (12 Months): 135, Downloads (Overall): 135
Full text available:
PDF
Social media has become a major source for analyzing all aspects of daily life. Thanks to dedicated latent topic analysis methods such as the Ailment Topic Aspect Model (ATAM), public health can now be observed on Twitter. In this work, we are interested in monitoring people's health over time. Recently, ...
Keywords:
social media, ailments, public health, topic models
CCS:
Topic modeling
Keywords:
topic models
Primary CCS:
Topic modeling
References:
D. M. Blei and J. D. Lafferty. Dynamic Topic Models. In ICML, pages 113--120, 2006.
H. M. Wallach, I. Murray, R. Salakhutdinov, and D. Mimno. Evaluation methods for topic models. In ICML, pages 1105--1112, 2009.
W. X. Zhao, J. Jiang, J. Weng, J. He, E. Lim, H. Yan, and X. Li. Comparing Twitter and Traditional Media Using Topic Models. In ECIR, pages 338--349, 2011.
Full Text:
... as measuring behavioral risk factors and triggering public health campaigns. Popular probabilistic topic modeling methods such as Latent Dirichlet Allocation [2] and pLSA [4] ... to discover ailments from tweets [5]. While the primary goal of probabilistic topic modeling is to learn topic models, an equally interesting objective is to examine topic transitions. A ... new model, coined TM-ATAM. Our model is different from dynamic topic models such as [1,7], as it is designed to learn topic transition ... is designed to learn topic transition patterns from temporally-ordered posts, while dynamic topic models focus on changing word distributions of topics over time. TM-ATAM learns ...
... models. If ratio computes to be less than "1" for competitor topic model, TM-ATAM is performing worse. If ratio is more than ... Wallach, I. Murray, R. Salakhutdinov, and D. Mimno. Evaluation methods for topic models. In ICML, pages 1105--1112, 2009. [7] X. Wang and A. McCallum. ... Lim, H. Yan, and X. Li. Comparing Twitter and Traditional Media Using Topic Models.
20
April 2016
WWW '16: Proceedings of the 25th International Conference on World Wide Web
Publisher: International World Wide Web Conferences Steering Committee
Bibliometrics:
Citation Count: 1
Downloads (6 Weeks): 11, Downloads (12 Months): 161, Downloads (Overall): 208
Full text available:
PDF
Dynamic topic models (DTMs) are very effective in discovering topics and capturing their evolution trends in time series data. To do posterior inference of DTMs, existing methods are all batch algorithms that scan the full dataset before each update of the model and make inexact variational approximations with mean-field assumptions. ...
Keywords:
MCMC, topic model, MPI, dynamic topic model, large scale machine learning, parallel computing
Title:
Scaling up Dynamic Topic Models
CCS:
Document topic models
Keywords:
topic model
dynamic topic model
Abstract:
Dynamic topic models (DTMs) are very effective in discovering topics and capturing their ... large-scale applications. We are able to learn the largest Dynamic Topic Model to our knowledge, and learned the dynamics of 1,000 topics ...
Primary CCS:
Document topic models
References:
D. Blei and J. Lafferty. Correlated topic models. Advances in Neural Information Processing Systems (NIPS), 2006.
D. M. Blei and J. D. Lafferty. Dynamic topic models. In International Conference on Machine Learning (ICML), 2006.
J. Chen, J. Zhu, Z. Wang, X. Zheng, and B. Zhang. Scalable inference for logistic-normal topic models. In Advances in Neural Information Processing Systems (NIPS), 2013.
A. Q. Li, A. Ahmed, S. Ravi, and A. J. Smola. Reducing the sampling complexity of topic models. In International Conference on Knowledge Discovery and Data mining (SIGKDD), 2014.
H. M. Wallach, I. Murray, R. Salakhutdinov, and D. Mimno. Evaluation methods for topic models. In International Conference on Machine Learning (ICML). ACM, 2009.
Y. Wang, X. Zhao, Z. Sun, H. Yan, L. Wang, Z. Jin, L. Wang, Y. Gao, J. Zeng, Q. Yang, et al. Towards topic modeling for big data. ACM Transactions on Intelligent Systems and Technology, 2014.
P. Xie and E. P. Xing. Integrating document clustering and topic modeling. arXiv preprint arXiv:1309.6874, 2013.
L. Yao, D. Mimno, and A. McCallum. Efficient methods for topic model inference on streaming document collections. In International Conference on Knowledge Discovery and Data mining (SIGKDD), 2009.
J. Yuan, F. Gao, Q. Ho, W. Dai, J. Wei, X. Zheng, E. P. Xing, T.-Y. Liu, and W.-Y. Ma. Lightlda: Big topic models on modest compute clusters. arXiv:1412.1576, 2014.
J. Zhu, A. Ahmed, and E. P. Xing. MedLDA: Maximum margin supervised topic models. Journal of Machine Learning Research, 13:2237--2278, 2012.
J. Zhu, N. Chen, H. Perkins, and B. Zhang. Gibbs max-margin topic models with data augmentation. Journal of Machine Learning Research, 15:1073--1110, 2014.
Full Text:
... of Software, Tsinghua University, Beijing, 100084 China abhadury@flipboard.com; chenjian14@mails.tsinghua.edu.cn; {dcszj, shixia}@tsinghua.edu.cn ABSTRACT Dynamic topic models (DTMs) are very effective in discovering topics and capturing their ... to large-scale applications. We are able to learn the largest Dynamic Topic Model to our knowledge, and learned the dynamics of 1,000 topics from ... but also achieves lower perplexity. General Terms: Algorithms, Experimentation, Performance. Keywords: Topic Model; Dynamic Topic Model; Large Scale Machine Learning; Parallel Computing; MCMC; MPI 1. INTRODUCTION Surrounded by ... Scale Machine Learning; Parallel Computing; MCMC; MPI 1. INTRODUCTION Surrounded by data, statistical topic models have become some of the most useful machine learning tools to ... 978-1-4503-4143-1/16/04. DOI: http://dx.doi.org/10.1145/2872427.2883046 . text documents and images under some bag-of-words representations. Topic models can capture thematic structure that exists within a data corpus and ... 34), and data visualization (16). One of the most popular topic models, Latent Dirichlet Allocation (LDA) (5), has seen large amounts of application ... and the temporal evolution of topics in data streams. Correlated Topic Model (CTM) (3) is one such extension to LDA that introduces non-conjugate ... to better performance in terms of both time efficiency and testing likelihood/perplexity. Dynamic Topic Model (DTM) (4) is another extension to LDA that discovers topics and ...
... variational methods, inferring the variational distribution over topics for a word in topic modeling is typically of O(K) complexity, where K is the number ... et. al. (26) capturing a large number of topic trends in topic modeling is extremely important as it improves tasks such as advertisement and ... and further present a parallel algorithm to learn large Dynamic Topic Models on multiple machines. Our algorithm is very close to being ... with the number of time slices. We learn a large dynamic topic model from a 9GB dataset consisting of 2.6 million documents in ... briefly review some related work on the vanilla LDA and dynamic topic models. 2.1 Latent Dirichlet Allocation Latent Dirichlet Allocation (LDA) is a probabilistic ...
... fast sampler of topic indices for DTM. 2.2 Dynamic Topic Models Dynamic Topic Model (DTM) (4) is a topic model that is used to model time series data. Since a Dirichlet ... even faster Gibbs Sampler that is efficient in capturing large dynamic topic models with many topics. 3. GIBBS SAMPLER FOR DTM We now present our ...
... specifies the number of cores. over Time (ToT) (24) but the Dynamic Topic Model proposed by Blei et. al has the distinct advantage of ... correlation graphs of topics over multiple time slices (Dynamic Correlated Topic Models), and improve the visualization of the learned evolution of topics from ... inference. 7. CONCLUSIONS We propose a scalable and efficient inference method of Dynamic Topic Models. Our algorithm is a novel combination of Stochastic Gradient ... applicable for both researchers and industries as a large scale Dynamic Topic Model can capture very interesting trends. Acknowledgments The work was supported by the ... Z. Wang, X. Zheng, and B. Zhang. Scalable inference for logistic-normal topic models. In Advances in Neural Information Processing Systems (NIPS), 2013. [9] T. Chen, ...
... Ravi, and A. J. Smola. Reducing the sampling complexity of topic models. In International Conference on Knowledge Discovery and Data mining (SIGKDD), 2014. [16] ... Wallach, I. Murray, R. Salakhutdinov, and D. Mimno. Evaluation methods for topic models. In International Conference on Machine Learning (ICML). ACM, 2009. [23] C. ... 2011. [29] P. Xie and E. P. Xing. Integrating document clustering and topic modeling. arXiv preprint arXiv:1309.6874, 2013. [30] L. Yao, D. Mimno, and A. ... L. Yao, D. Mimno, and A. McCallum. Efficient methods for topic model inference on streaming document collections. In International Conference on Knowledge Discovery and ... Zhu, A. Ahmed, and E. P. Xing. MedLDA: Maximum margin supervised topic models. Journal of Machine Learning Research, 13:2237--2278, 2012. [34] J. Zhu, N. ... Zhu, N. Chen, H. Perkins, and B. Zhang. Gibbs max-margin topic models with data augmentation. Journal of Machine Learning Research, 15:1073--1110, 2014. Introduction Related Work Latent ...