Concepts inDiscriminative models for information retrieval
Discriminative model
Discriminative models are a class of models used in machine learning for modeling the dependence of an unobserved variable on an observed variable . Within a statistical framework, this is done by modeling the conditional probability distribution, which can be used for predicting from . Discriminative models differ from generative models in that they do not allow one to generate samples from the joint distribution of and .
more from Wikipedia
Information retrieval
Information retrieval (IR) is the area of study concerned with searching for documents, for information within documents, and for metadata about documents, as well as that of searching structured storage, relational databases, and the World Wide Web. There is overlap in the usage of the terms data retrieval, document retrieval, information retrieval, and text retrieval, but each also has its own body of literature, theory, praxis, and technologies.
more from Wikipedia
Generative model
In probability and statistics, a generative model is a model for randomly generating observable data, typically given some hidden parameters. It specifies a joint probability distribution over observation and label sequences. Generative models are used in machine learning for either modeling data directly (i.e. , modeling observed draws from a probability density function), or as an intermediate step to forming a conditional probability density function.
more from Wikipedia
Principle of maximum entropy
In Bayesian probability theory, the principle of maximum entropy is an axiom. It states that, subject to precisely stated prior data, which must be a proposition that expresses testable information, the probability distribution which best represents the current state of knowledge is the one with largest information theoretical entropy. Let some precisely stated prior data or testable information about a probability distribution function be given.
more from Wikipedia
Language model
A statistical language model assigns a probability to a sequence of m words by means of a probability distribution. Language modeling is used in many natural language processing applications such as speech recognition, machine translation, part-of-speech tagging, parsing and information retrieval. In speech recognition and in data compression, such a model tries to capture the properties of a language, and to predict the next word in a speech sequence.
more from Wikipedia
Machine learning
Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases. A learner can take advantage of examples (data) to capture characteristics of interest of their unknown underlying probability distribution. Data can be seen as examples that illustrate relations between observed variables.
more from Wikipedia
Support vector machine
A support vector machine (SVM) is a concept in statistics and computer science for a set of related supervised learning methods that analyze data and recognize patterns, used for classification and regression analysis. The standard SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the input, making the SVM a non-probabilistic binary linear classifier.
more from Wikipedia
Entropy
Entropy is a thermodynamic property that can be used to determine the energy not available for work in a thermodynamic process, such as in energy conversion devices, engines, or machines. Such devices can only be driven by convertible energy, and have a theoretical maximum efficiency when converting energy to work. During this work, entropy accumulates in the system, which then dissipates in the form of waste heat.
more from Wikipedia