Concepts in Generative model-based clustering of directional data
Mixture model
In statistics, a mixture model is a probabilistic model for representing the presence of sub-populations within an overall population, without requiring that an observed data set identify the sub-population to which an individual observation belongs. Formally, a mixture model corresponds to the mixture distribution that represents the probability distribution of observations in the overall population.
more from Wikipedia
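As a brief illustration of the mixture distribution described above, a finite mixture density is commonly written as follows. The notation here ($k$ components, mixing weights $\alpha_h$, component densities $f_h$ with parameters $\theta_h$) is assumed for this sketch and does not come from the excerpt.

```latex
p(x \mid \Theta) = \sum_{h=1}^{k} \alpha_h \, f_h(x \mid \theta_h),
\qquad \alpha_h \ge 0, \qquad \sum_{h=1}^{k} \alpha_h = 1
```

Each observation is treated as drawn from one of the $k$ component densities, with the component label left unobserved.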
Directional antenna
A directional antenna or beam antenna is an antenna which radiates greater power in one or more directions, allowing for increased performance on transmit and receive and reduced interference from unwanted sources. Directional antennas like Yagi-Uda antennas provide increased performance over dipole antennas when a greater concentration of radiation in a certain direction is desired.
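Von Mises–Fisher distribution
In directional statistics, the von Mises–Fisher distribution is a probability distribution on the $(p-1)$-dimensional sphere in $\mathbb{R}^p$. If $p = 2$, the distribution reduces to the von Mises distribution on the circle. The probability density function of the von Mises–Fisher distribution for the random $p$-dimensional unit vector $\mathbf{x}$ is given by $f_p(\mathbf{x}; \boldsymbol{\mu}, \kappa) = C_p(\kappa) \exp(\kappa \boldsymbol{\mu}^T \mathbf{x})$, where $\kappa \ge 0$, $\|\boldsymbol{\mu}\| = 1$, and the normalization constant is equal to $C_p(\kappa) = \frac{\kappa^{p/2-1}}{(2\pi)^{p/2} I_{p/2-1}(\kappa)}$, where $I_v$ denotes the modified Bessel function of the first kind and order $v$.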
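The density above can be evaluated numerically. The following is a minimal sketch, assuming the standard form f_p(x; mu, kappa) = C_p(kappa) exp(kappa mu^T x) with C_p(kappa) = kappa^(p/2-1) / ((2 pi)^(p/2) I_(p/2-1)(kappa)). The function names `bessel_i` and `vmf_log_density` are hypothetical, and the Bessel function is approximated by a truncated power series, which is an assumption suited only to moderate values of kappa.

```python
import math


def bessel_i(v, x, terms=200):
    """Approximate the modified Bessel function of the first kind I_v(x), x > 0.

    Uses the power series sum_m (x/2)^(2m+v) / (m! * Gamma(m+v+1)),
    truncated after `terms` terms (an assumption; adequate for moderate x only).
    """
    total = 0.0
    for m in range(terms):
        log_term = ((2 * m + v) * math.log(x / 2.0)
                    - math.lgamma(m + 1) - math.lgamma(m + v + 1))
        total += math.exp(log_term)
    return total


def vmf_log_density(x, mu, kappa):
    """Log-density of a von Mises-Fisher distribution at the unit vector x.

    x and mu are sequences of equal length p (both assumed to have unit norm),
    and kappa >= 0 is the concentration parameter.
    """
    p = len(x)
    if kappa == 0:
        # Uniform density on the sphere: Gamma(p/2) / (2 * pi^(p/2)).
        return math.lgamma(p / 2.0) - math.log(2.0) - (p / 2.0) * math.log(math.pi)
    v = p / 2.0 - 1.0
    log_c = (v * math.log(kappa)
             - (p / 2.0) * math.log(2.0 * math.pi)
             - math.log(bessel_i(v, kappa)))
    dot = sum(a * b for a, b in zip(mu, x))
    return log_c + kappa * dot
```

Working in log space keeps the normalization constant manageable; for large kappa or high dimension, an exponentially scaled Bessel routine from a numerical library would likely be a better choice than this series.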
Cosine similarity
Cosine similarity measures the similarity between two vectors as the cosine of the angle between them. The cosine of 0 is 1, and it is less than 1 for any other angle; its lowest value is -1. The cosine of the angle thus indicates whether two vectors point in roughly the same direction. It is often used to compare documents in text mining and to measure cohesion within clusters in data mining.
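A small sketch of the measure described above, computed as the dot product divided by the product of the vector norms. The function name `cosine_similarity` is chosen for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)
```

For vectors already scaled to unit length, as in directional data, the value reduces to the plain dot product.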
Clustering high-dimensional data
Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional data spaces are often encountered in areas such as medicine, where DNA microarray technology can produce a large number of measurements at once, and the clustering of text documents, where, if a word-frequency vector is used, the number of dimensions equals the size of the dictionary.
K-means clustering
In data mining, k-means clustering is a method of cluster analysis that aims to partition n observations into k clusters, in which each observation belongs to the cluster with the nearest mean. This results in a partitioning of the data space into Voronoi cells. The problem is computationally difficult; however, efficient heuristic algorithms are commonly employed and converge quickly to a local optimum.
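One such heuristic is the alternating assign-and-update procedure often called Lloyd's algorithm. The sketch below is an illustration under stated assumptions, not a reference implementation: the function names, the random initialization from the data points, and the fixed iteration cap are all choices made for this example.

```python
import random


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centroids[j])),
    )


def kmeans(points, k, iterations=100, seed=0):
    """Partition points (a list of equal-length tuples) into k clusters.

    Returns (centroids, labels). Converges to a local optimum only.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialization: k random data points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels
```

An empty cluster keeps its previous centroid here; other strategies, such as re-seeding, are equally plausible.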
Expectation–maximization algorithm
In statistics, an expectation–maximization (EM) algorithm is an iterative method for finding maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models, where the model depends on unobserved latent variables.
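To make the iteration concrete, here is a minimal sketch of EM for a one-dimensional Gaussian mixture, where the latent variable is the unobserved component label of each observation. The Gaussian component family, the function names, the variance floor, and the fixed iteration count are assumptions made for this illustration; the same E-step and M-step structure applies to other component densities.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, iterations=50):
    """Maximum likelihood fit of a 1-D Gaussian mixture from initial guesses."""
    means, variances, weights = list(means), list(variances), list(weights)
    k, n = len(means), len(data)
    for _ in range(iterations):
        # E-step: posterior probability of each component for each observation.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([q / total for q in joint])
        # M-step: re-estimate parameters from the weighted observations.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor (assumed) to avoid degenerate components
    return means, variances, weights
```

A possible call is `em_gmm_1d([0.9, 1.0, 1.1, 4.9, 5.0, 5.1], [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])`, which starts from two rough guesses for the component means. This sketch does not guard against a component that receives no responsibility or an observation with zero density under every component.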