 David M Mimno

Bibliometrics: publication history
Average citations per article: 24.93
Citation Count: 698
Publication count: 28
Publication years: 2004-2017
Available for download: 16
Average downloads per article: 667.25
Downloads (cumulative): 10,676
Downloads (12 Months): 1,484
Downloads (6 Weeks): 138
29 results found

Result 1 – 20 of 29

1
June 2017 Journal of the Association for Information Science and Technology: Volume 68 Issue 6, June 2017
Publisher: John Wiley & Sons, Inc.
Bibliometrics:
Citation Count: 0

Researchers in information science and related areas have developed various methods for analyzing textual data, such as survey responses. This article describes the application of analysis methods from two distinct fields, one method from interpretive social science and one method from statistical machine learning, to the same survey data. The ...

2
June 2017 Journal of the Association for Information Science and Technology: Volume 68 Issue 6, June 2017
Publisher: John Wiley & Sons, Inc.
Bibliometrics:
Citation Count: 0

Researchers in information science and related areas have developed various methods for analyzing textual data, such as survey responses. This article describes the application of analysis methods from two distinct fields, one method from interpretive social science and one method from statistical machine learning, to the same survey data. The ...

3
April 2017 WWW '17: Proceedings of the 26th International Conference on World Wide Web
Publisher: International World Wide Web Conferences Steering Committee
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 13,   Downloads (12 Months): 106,   Downloads (Overall): 106

Full text available: PDF
The content of today's social media is becoming more and more rich, increasingly mixing text, images, videos, and audio. It is an intriguing research question to model the interplay between these different modes in attracting user attention and engagement. But in order to pursue this study of multimodal content, we ...
Keywords: reddit, multimodal, image processing, language modeling, social media

4
December 2016 NIPS'16: Proceedings of the 30th International Conference on Neural Information Processing Systems
Publisher: Curran Associates Inc.
Bibliometrics:
Citation Count: 0
Downloads (6 Weeks): 0,   Downloads (12 Months): 0,   Downloads (Overall): 0

Full text available: PDF
Many online communities present user-contributed responses such as reviews of products and answers to questions. User-provided helpfulness votes can highlight the most useful responses, but voting is a social process that can gain momentum based on the popularity of responses and the polarity of existing votes. We propose the Chinese ...

5 published by ACM
November 2016 GROUP '16: Proceedings of the 19th International Conference on Supporting Group Work
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 21,   Downloads (12 Months): 157,   Downloads (Overall): 202

Full text available: PDF
Grounded Theory Method (GTM) and Machine Learning (ML) are often considered to be quite different. In this note, we explore unexpected convergences between these methods. We propose new research directions that can further clarify the relationships between these methods, and that can use those relationships to strengthen our ability to ...
Keywords: grounded theory, supervised learning, unsupervised learning, axial coding, coding families, machine learning

6
December 2015 NIPS'15: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2
Publisher: MIT Press
Bibliometrics:
Citation Count: 0

Spectral inference provides fast algorithms and provable optimality for latent topic analysis. But for real data these algorithms require additional ad-hoc heuristics, and even then often produce unusable results. We explain this poor performance by casting the problem of topic inference in the framework of Joint Stochastic Matrix Factorization (JSMF) ...

7
June 2013 ICML'13: Proceedings of the 30th International Conference on Machine Learning - Volume 28
Publisher: JMLR.org
Bibliometrics:
Citation Count: 1

Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been ...

8
December 2012 NIPS'12: Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 2
Publisher: Curran Associates Inc.
Bibliometrics:
Citation Count: 5

We develop a scalable algorithm for posterior inference of overlapping communities in large networks. Our algorithm is based on stochastic variational inference in the mixed-membership stochastic blockmodel (MMSB). It naturally interleaves subsampling the network with estimating its community structure. We apply our algorithm on ten large, real-world networks with up ...

9
June 2012 ICML'12: Proceedings of the 29th International Conference on Machine Learning
Publisher: Omnipress
Bibliometrics:
Citation Count: 2

We present a hybrid algorithm for Bayesian topic models that combines the efficiency of sparse Gibbs sampling with the scalability of online stochastic inference. We used our algorithm to analyze a corpus of 1.2 million books (33 billion words) with thousands of topics. Our approach reduces the bias of variational ...

10 published by ACM
June 2012 JCDL '12: Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries
Publisher: ACM
Bibliometrics:
Citation Count: 2
Downloads (6 Weeks): 3,   Downloads (12 Months): 33,   Downloads (Overall): 303

Full text available: PDF
Concept taxonomies such as MeSH, the ACM Computing Classification System, and the NY Times Subject Headings are frequently used to help organize data. They typically consist of a set of concept names organized in a hierarchy. However, these names and structure are often not sufficient to fully capture the intended ...
Keywords: taxonomy browsing, topic modeling, taxonomy annotation

11 published by ACM
April 2012 Journal on Computing and Cultural Heritage (JOCCH): Volume 5 Issue 1, April 2012
Publisher: ACM
Bibliometrics:
Citation Count: 7
Downloads (6 Weeks): 10,   Downloads (12 Months): 51,   Downloads (Overall): 577

Full text available: PDF
More than a century of modern Classical scholarship has created a vast archive of journal publications that is now becoming available online. Most of this work currently receives little, if any, attention. The collection is too large to be read by any single person and mostly not of sufficient interest ...

12
July 2011 EMNLP '11: Proceedings of the Conference on Empirical Methods in Natural Language Processing
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 8
Downloads (6 Weeks): 2,   Downloads (12 Months): 53,   Downloads (Overall): 344

Full text available: PDF
Real document collections do not fit the independence assumptions asserted by most statistical topic models, but how badly do they violate them? We present a Bayesian method for measuring how well a topic model fits a corpus. Our approach is based on posterior predictive checking, a method for diagnosing Bayesian ...

13
July 2011 EMNLP '11: Proceedings of the Conference on Empirical Methods in Natural Language Processing
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 104
Downloads (6 Weeks): 34,   Downloads (12 Months): 366,   Downloads (Overall): 1,287

Full text available: PDF
Latent variable models have the potential to add value to large document collections by discovering interpretable, low-dimensional subspaces. In order for people to use such models, however, they must trust them. Unfortunately, typical dimensionality reduction methods for text, such as latent Dirichlet allocation, often produce low-dimensional subspaces (topics) that are ...

14
July 2011 UAI'11: Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence
Publisher: AUAI Press
Bibliometrics:
Citation Count: 0

A database of objects discovered in houses in the Roman city of Pompeii provides a unique view of ordinary life in an ancient city. Experts have used this collection to study the structure of Roman households, exploring the distribution and variability of tasks in architectural spaces, but such approaches are ...

15
January 2011
Bibliometrics:
Citation Count: 0

Text documents are generally accompanied by non-textual information, such as authors, dates, publication sources, and, increasingly, automatically recognized named entities. Work in text analysis has often involved predicting these non-text values based on text data for tasks such as document classification and author identification. This thesis considers the opposite problem: ...

16
December 2009 NIPS'09: Proceedings of the 22nd International Conference on Neural Information Processing Systems
Publisher: Curran Associates Inc.
Bibliometrics:
Citation Count: 48

Implementations of topic models typically use symmetric Dirichlet priors with fixed concentration parameters, with the implicit assumption that such "smoothing parameters" have little practical effect. In this paper, we explore several classes of structured priors for topic models. We find that an asymmetric Dirichlet prior over the document-topic distributions has ...

17
August 2009 EMNLP '09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 2
Publisher: Association for Computational Linguistics
Bibliometrics:
Citation Count: 79
Downloads (6 Weeks): 11,   Downloads (12 Months): 83,   Downloads (Overall): 632

Full text available: PDF
Topic models are a useful tool for analyzing large text collections, but have previously been applied in only monolingual, or at most bilingual, contexts. Meanwhile, massive collections of interlinked documents in dozens of languages, such as Wikipedia, are now widely available, calling for tools that can characterize content in many ...

18 published by ACM
June 2009 KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Publisher: ACM
Bibliometrics:
Citation Count: 97
Downloads (6 Weeks): 9,   Downloads (12 Months): 154,   Downloads (Overall): 1,457

Full text available: PDF
Topic models provide a powerful tool for analyzing large text collections by representing high dimensional data in a low dimensional subspace. Fitting a topic model given a set of training documents requires approximate inference techniques that are computationally expensive. With today's large-scale, constantly expanding document collections, it is useful to ...
Keywords: inference, topic modeling

19 published by ACM
June 2009 ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
Publisher: ACM
Bibliometrics:
Citation Count: 125
Downloads (6 Weeks): 16,   Downloads (12 Months): 304,   Downloads (Overall): 1,875

Full text available: PDF
A natural evaluation metric for statistical topic models is the probability of held-out documents given a trained model. While exact computation of this probability is intractable, several estimators for this probability have been used in the topic modeling literature, including the harmonic mean method and empirical likelihood method. In this ...

20
December 2008 ESCIENCE '08: Proceedings of the 2008 Fourth IEEE International Conference on eScience
Publisher: IEEE Computer Society
Bibliometrics:
Citation Count: 0

As network-enabled scholarship produces huge quantities of formal and informal research outputs in a variety of formats and varying levels of access, it is "enhanced" science that will facilitate the discovery, selection, and analysis of information that are a necessary part of the scientific research cycle particularly among interdisciplinary research ...
Keywords: information work, network-enabled research, nanomanufacturing, interdisciplinary research, research cycle, information discovery, information selection



The ACM Digital Library is published by the Association for Computing Machinery. Copyright © 2018 ACM, Inc.