Concepts in "A fast k-means implementation using coresets"
Cluster analysis
Cluster analysis, or clustering, is the task of grouping a set of objects so that the objects in the same group (called a cluster) are more similar, in some sense, to each other than to those in other clusters. Clustering is a main task of exploratory data mining and a common technique for statistical data analysis used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics.
more from Wikipedia
Algorithm
In mathematics and computer science, an algorithm (/ˈælɡərɪðəm/; the word originates from al-Khwārizmī, after the mathematician Muḥammad ibn Mūsā al-Khwārizmī) is a step-by-step procedure for calculations. Algorithms are used for calculation, data processing, and automated reasoning. More precisely, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function.
K-means clustering
In data mining, k-means clustering is a method of cluster analysis that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. This results in a partitioning of the data space into Voronoi cells. The problem is computationally difficult (NP-hard); however, efficient heuristic algorithms are commonly employed and converge quickly to a local optimum.
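The best-known heuristic of this kind is Lloyd's algorithm, which alternates between assigning each observation to its nearest center and moving each center to the mean of its assigned observations. The following is a minimal sketch in Python, not the coreset-based implementation named in the page title; the function names, the `max_iter` and `seed` parameters, and the random choice of initial centers are assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_center(point, centers):
    """Index of the center closest to `point` (ties go to the lowest index)."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def lloyd_kmeans(points, k, max_iter=100, seed=0):
    """Illustrative Lloyd iteration; converges to a local optimum only."""
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct input points chosen at random.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each observation joins the cluster with the nearest mean.
        new_labels = [nearest_center(p, centers) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the means will not change either
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # an empty cluster keeps its previous center
                centers[i] = mean(members)
    return centers, labels


def kmeans_cost(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(squared_distance(p, centers[label])
               for p, label in zip(points, labels))
```

Calling `lloyd_kmeans(points, 2)` on a list of coordinate tuples returns the final centers and one cluster label per point. Because the result depends on the initial centers, a common practice is to run the iteration with several seeds and keep the run with the lowest `kmeans_cost`.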
Data set
A data set (or dataset) is a collection of data, usually presented in tabular form. Each column represents a particular variable. Each row corresponds to a given member of the data set in question. It lists values for each of the variables, such as height and weight of an object. Each value is known as a datum. The data set may comprise data for one or more members, corresponding to the number of rows.
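As a small illustration of the row-and-column layout described above, the sketch below stores a data set as a tuple of column names plus a list of rows. The member identifiers, the measurements, and the helper name `column` are all hypothetical and chosen only for this example.

```python
# Hypothetical data set: each column is a variable, each row is one member.
columns = ("member", "height_cm", "weight_kg")
rows = [
    ("a", 170.0, 65.0),
    ("b", 182.5, 80.0),
    ("c", 158.0, 52.5),
]


def column(rows, columns, name):
    """Return every value (datum) recorded for the variable `name`."""
    idx = columns.index(name)
    return [row[idx] for row in rows]


heights = column(rows, columns, "height_cm")
```

Here `len(rows)` is the number of members, and each entry of `heights` is a single datum.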
Set (mathematics)
A set is a collection of well-defined and distinct objects, considered as an object in its own right. Sets are one of the most fundamental concepts in mathematics. Developed at the end of the 19th century, set theory is now a ubiquitous part of mathematics, and can be used as a foundation from which nearly all of mathematics can be derived.
Data
Data are values of qualitative or quantitative variables, belonging to a set of items. Data in computing are often represented by a combination of items organized in rows and multiple variables organized in columns. Data are typically the results of measurements and can be visualised using graphs or images. Data as an abstract concept can be viewed as the lowest level of abstraction from which information and then knowledge are derived. Raw data, i.e.
Algorithmic efficiency
In computer science, efficiency is used to describe properties of an algorithm relating to how much of various types of resources it consumes. Algorithmic efficiency can be thought of as analogous to engineering productivity for a repeating or continuous process, where the goal is to reduce resource consumption, including time to completion, to some acceptable, optimal level.
Randomness
Randomness has somewhat differing meanings as used in various fields. It also has common meanings which are connected to the notion of predictability (or lack thereof) of events. The Oxford English Dictionary defines 'random' as "Having no definite aim or purpose; not sent or guided in a particular direction; made, done, occurring, etc., without method or conscious choice; haphazard."