Introduction
An ML model may be predictive, to make predictions about the future, or descriptive, to gain knowledge from data, or both. Accordingly, we can distinguish predictive machine learning from descriptive machine learning.
Examples of ML Applications
Learning Associations
Finding an association rule amounts to learning a conditional probability $P(Y|X)$, where $Y$ is the product we would like to condition on $X$, the product or set of products the customer has already bought. To distinguish among customers, we may instead estimate $P(Y|X,D)$, where $D$ is the set of customer attributes.
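As a rough illustration of estimating $P(Y|X)$, the sketch below counts co-occurrences in a handful of made-up shopping baskets. The function name `conditional_probability`, the basket contents, and the simple relative-frequency estimate are all assumptions chosen for the example, not something prescribed by the text.

```python
def conditional_probability(transactions, x, y):
    """Estimate P(y | x) as (#baskets with x and y) / (#baskets with x)."""
    with_x = [basket for basket in transactions if x in basket]
    if not with_x:
        # x never occurs, so the estimate is undefined
        return None
    return sum(1 for basket in with_x if y in basket) / len(with_x)


# Hypothetical baskets, invented purely for illustration
transactions = [
    {"chips", "soda"},
    {"chips", "soda", "bread"},
    {"chips", "bread"},
    {"milk", "bread"},
]

p = conditional_probability(transactions, "chips", "soda")
print(p)  # two of the three baskets containing chips also contain soda
```

Conditioning on customer attributes $D$ as well would mean restricting `transactions` to the baskets of customers who share those attributes before counting.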
Classification and Regression
Both classification and regression are supervised learning problems: there is an input $X$, an output $Y$, and the task is to learn the mapping from the input to the output, $$y=g(x|\theta)$$ where $g(\cdot)$ is the model and $\theta$ are its parameters. $Y$ is a number in regression and a class code (e.g. 0/1) in classification. In classification, $g(\cdot)$ is the discriminant function (the criterion function used directly to classify pattern samples) separating the instances of different classes. In statistics, classification is called discriminant analysis.
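To make $y=g(x|\theta)$ concrete, here is a minimal regression sketch that assumes a linear model $g(x|\theta)=w_1x+w_0$ with $\theta=(w_0,w_1)$, fitted by ordinary least squares. The linear form, the helper names `fit_line` and `g`, and the toy data are assumptions for illustration; the text does not commit to any particular model.

```python
def fit_line(xs, ys):
    """Return theta = (w0, w1) minimizing the squared error of w1*x + w0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w0, w1


def g(x, theta):
    """The model g(x | theta), assumed linear here."""
    w0, w1 = theta
    return w1 * x + w0


# Toy data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

theta = fit_line(xs, ys)
prediction = g(4.0, theta)
print(theta, prediction)
```

For classification, the same $g$ could be thresholded (for example, predict class 1 when $g(x|\theta) > 0$), which is one simple way a discriminant function separates two classes.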
Unsupervised Learning
In unsupervised learning, the aim is to find the regularities in the input. In statistics, this is called density estimation. One method for density estimation is clustering, with applications such as customer segmentation, image compression, document clustering, and learning motifs (short sequences that occur frequently) by clustering DNA sequences.
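The text does not name a clustering algorithm, so the sketch below uses k-means as one common choice, applied to a made-up one-dimensional customer segmentation problem. The function `kmeans_1d`, the spending figures, and the initial centers are hypothetical.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on scalars: assign to nearest center, then re-average."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty
        centers = [
            sum(c) / len(c) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
    return centers


# Hypothetical monthly spending of six customers
spend = [10.0, 12.0, 11.0, 95.0, 100.0, 105.0]

centers = kmeans_1d(spend, centers=[0.0, 50.0])
print(centers)  # one center per customer segment
```

No labels are used anywhere: the two segments emerge only from the structure of the input, which is the sense in which clustering finds regularities in the data.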
Reinforcement Learning
Notes
Dedicated journals in ML are Machine Learning, Journal of Machine Learning Research, Neural Computation, Neural Networks, and IEEE Transactions on Neural Networks. Statistics journals such as Annals of Statistics and Journal of the American Statistical Association also publish ML papers, as does IEEE Transactions on Pattern Analysis and Machine Intelligence. Journals on data mining include Data Mining and Knowledge Discovery, IEEE Transactions on Knowledge and Data Engineering, and ACM Special Interest Group on Knowledge Discovery and Data Mining Explorations Journal.
Dedicated conferences in ML include Neural Information Processing Systems, Uncertainty in Artificial Intelligence, International Conference on Machine Learning, European Conference on Machine Learning, Computational Learning Theory, and International Joint Conference on Artificial Intelligence.
Supervised Learning
Binary Classification
Suppose the input is 2D, i.e., $\textbf{x}=\begin{bmatrix} x_{1} \\ x_{2}\end{bmatrix}$ with label $r=\begin{cases}1 & \text{if } \textbf{x} \text{ is a positive example}\\0 & \text{if } \textbf{x} \text{ is a negative example}\end{cases}$. The training set contains $N$ such examples, $X=\{\textbf{x}^t,r^t\}^{N}_{t=1}$, where $t$ indexes the examples in the set; each example $t$ is a data point at $(x^t_1,x^t_2)$ with label $r^t$.
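One possible way to hold such a training set in code is a list of `((x1, x2), r)` pairs, as sketched below. The type alias `Example` and the three data points are invented for illustration.

```python
from typing import List, Tuple

# One example: the 2D input (x1, x2) paired with its label r in {0, 1}
Example = Tuple[Tuple[float, float], int]

# Hypothetical training set; the index t of the text is the list position
training_set: List[Example] = [
    ((1.0, 2.0), 1),
    ((3.5, 0.5), 0),
    ((2.0, 2.5), 1),
]

N = len(training_set)
positives = [x for x, r in training_set if r == 1]
negatives = [x for x, r in training_set if r == 0]
print(N, positives, negatives)
```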