Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD) is a relatively young, interdisciplinary field of computer science concerned with discovering new patterns in large data sets. It uses methods at the intersection of artificial intelligence, machine learning, statistics, and database systems.
more from Wikipedia
Privacy
Privacy (from Latin: privatus "separated from the rest, deprived of something, esp. office, participation in the government", from privo "to deprive") is the ability of an individual or group to seclude themselves or information about themselves and thereby reveal themselves selectively. The boundaries and content of what is considered private differ among cultures and individuals, but share basic common themes.
Statistical classification
In machine learning and statistics, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known. The individual observations are analyzed into a set of quantifiable properties, known variously as explanatory variables, features, etc. These properties may variously be categorical (e.g.
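The classification setup described above (a labeled training set, observations reduced to numeric features, and a category assigned to a new observation) can be illustrated with a small Python sketch. It uses a nearest-centroid rule, which is only one simple technique chosen for illustration. The training data, the labels `small` and `large`, and the function names `fit_centroids` and `classify` are all hypothetical.

```python
from math import dist
from statistics import mean

# Hypothetical training set: (features, category) pairs whose category is known.
training = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]

def fit_centroids(samples):
    """Compute the per-category mean of each feature (the category centroid)."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }

def classify(observation, centroids):
    """Assign the category whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(observation, centroids[label]))

centroids = fit_centroids(training)
label = classify((2.0, 1.5), centroids)  # a new, unlabeled observation
print(label)
```

Real classifiers differ mainly in how they turn the training set into a decision rule; the centroid rule here is just the shortest one to show.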
Record (computer science)
In computer science, records (also called tuples, structs, or compound data) are among the simplest data structures. A record is a value that contains other values, typically in fixed number and sequence and typically indexed by names. The elements of records are usually called fields or members. For example, a date could be stored as a record containing a numeric year field, a month field represented as a string, and a numeric day-of-month field.
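As a rough illustration, the date example can be written as a Python dataclass. The type name `DateRecord`, its field names, and the sample values are assumptions made for this sketch.

```python
from dataclasses import dataclass, astuple

@dataclass(frozen=True)
class DateRecord:
    """Hypothetical record type mirroring the date example."""
    year: int    # numeric year field
    month: str   # month field represented as a string
    day: int     # numeric day-of-month field

d = DateRecord(year=2012, month="March", day=14)
print(d.year, d.month, d.day)  # fields are accessed by name
print(astuple(d))              # fields also keep a fixed number and sequence
```

Marking the dataclass `frozen=True` makes the record an immutable value, which fits the description of a record as "a value that contains other values".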
Probability distribution
In probability theory, a probability mass, probability density, or probability distribution is a function that describes the probability of a random variable taking certain values. For a more precise definition one needs to distinguish between discrete and continuous random variables. In the discrete case, one can easily assign a probability to each possible value: when throwing a die, each of the six values 1 to 6 has the probability 1/6.
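The die example for the discrete case can be sketched in Python as a mapping from each outcome to its probability. The names `die_pmf` and `probability` are illustrative, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die: each face maps to 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def probability(event, pmf):
    """Sum the probabilities of the outcomes contained in `event`."""
    return sum(pmf.get(outcome, Fraction(0)) for outcome in event)

total = sum(die_pmf.values())              # probabilities over all outcomes
p_even = probability({2, 4, 6}, die_pmf)   # probability of an even face
print(total, p_even)
```

The probabilities over all six faces sum to 1, and the probability of any event is the sum over the outcomes it contains.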
Aggregate data
In statistics, aggregate data describes data combined from several measurements. When you aggregate data, you replace groups of observations with summary statistics based on those observations. In economics, aggregate data (or data aggregates) are high-level data composed from a combination of more granular, individual data.
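Replacing groups of observations with summary statistics might look like the following Python sketch. The observations, the group names, and the choice of count and mean as summaries are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual observations: (group, measurement) pairs.
observations = [
    ("north", 10.0),
    ("north", 14.0),
    ("south", 7.0),
    ("south", 9.0),
    ("south", 11.0),
]

def aggregate(rows):
    """Replace each group of observations with its count and mean."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    return {
        group: {"count": len(values), "mean": mean(values)}
        for group, values in groups.items()
    }

summary = aggregate(observations)
print(summary)
```

After aggregation the individual measurements are no longer recoverable from `summary`; only the per-group statistics remain.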