Machine Learning

Over the holidays I went through a few books on machine learning; here I have collected some of the more important core formulas.

Model Description

Under the feature-space assumption, we look for linear coefficients $\theta$ in the hope of approximating the target vector with a linear function.
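A standard way to write that linear function, with $\theta_0$ as the bias term and $\theta_1, \dots, \theta_n$ the feature weights:

$$\hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n = \theta^T \mathbf{x}$$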

How well the approximation fits is measured by a cost function; the MSE listed below is one such choice.
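The MSE cost over $m$ training instances, written the usual way:

$$\mathrm{MSE}(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(\theta^T \mathbf{x}^{(i)} - y^{(i)}\right)^2$$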

    Linear Regression

Gradient Descent

Each batch gradient descent step moves $\theta$ in the direction opposite the gradient of the MSE cost.
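A sketch of the usual update rule (learning rate $\eta$) together with the MSE gradient in matrix form:

$$\theta^{(\text{next})} = \theta - \eta\,\nabla_{\theta}\,\mathrm{MSE}(\theta), \qquad \nabla_{\theta}\,\mathrm{MSE}(\theta) = \frac{2}{m}\,\mathbf{X}^T\left(\mathbf{X}\theta - \mathbf{y}\right)$$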

Adding a regularization term to the MSE cost gives the variants below; their standard forms are sketched after the list.

    • Ridge Regression
    • LASSO
    • Elastic Net
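One common convention, with regularization strength $\alpha$ and Elastic Net mixing ratio $r$ (the bias $\theta_0$ is left unregularized):

$$J_{\text{Ridge}}(\theta) = \mathrm{MSE}(\theta) + \alpha\,\frac{1}{2}\sum_{i=1}^{n}\theta_i^2$$

$$J_{\text{LASSO}}(\theta) = \mathrm{MSE}(\theta) + \alpha\sum_{i=1}^{n}\lvert\theta_i\rvert$$

$$J_{\text{ElasticNet}}(\theta) = \mathrm{MSE}(\theta) + r\,\alpha\sum_{i=1}^{n}\lvert\theta_i\rvert + \frac{1-r}{2}\,\alpha\sum_{i=1}^{n}\theta_i^2$$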
sklearn-Linear Regression
from sklearn.linear_model import LinearRegression

lr = LinearRegression()
lr.fit(X, y)                    # X: feature matrix, y: target vector
lr.intercept_, lr.coef_         # learned bias term and weight vector

from sklearn.metrics import mean_squared_error
mean_squared_error(y, lr.predict(X))   # training-set MSE

# Stochastic gradient descent variant
from sklearn.linear_model import SGDRegressor
sgd_reg = SGDRegressor()
sgd_reg.fit(X, y)

Logistic Regression

$\sigma(t)$ is the sigmoid function.
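In the usual formulation the model estimates a probability $\hat{p}$ by passing the linear score through the sigmoid:

$$\hat{p} = \sigma\left(\theta^T \mathbf{x}\right), \qquad \sigma(t) = \frac{1}{1 + e^{-t}}$$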

    Logistic Regression cost function (log loss)
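Its usual form over $m$ instances with labels $y^{(i)} \in \{0, 1\}$:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(\hat{p}^{(i)}\right) + \left(1 - y^{(i)}\right)\log\left(1 - \hat{p}^{(i)}\right)\right]$$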

    Logistic cost function partial derivatives
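The partial derivative with respect to $\theta_j$ takes the familiar form:

$$\frac{\partial}{\partial\theta_j} J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(\sigma\left(\theta^T \mathbf{x}^{(i)}\right) - y^{(i)}\right) x_j^{(i)}$$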

    sklearn-Logistic Regression
    from sklearn.linear_model import LogisticRegression

    log_reg = LogisticRegression()
    log_reg.fit(X, y)

    Softmax Regression
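A sketch of the standard formulation: a per-class score, the softmax function that turns scores into class probabilities, and the cross-entropy cost (assuming $K$ classes and one-hot targets $y_k^{(i)}$):

$$s_k(\mathbf{x}) = \theta_k^T\,\mathbf{x}, \qquad \hat{p}_k = \frac{\exp\left(s_k(\mathbf{x})\right)}{\sum_{j=1}^{K}\exp\left(s_j(\mathbf{x})\right)}, \qquad J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K} y_k^{(i)}\log\left(\hat{p}_k^{(i)}\right)$$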

Support Vector Machine

• Decision Functions and Predictions
• Hard Margin Classification
• Soft Margin Classification
• Dual Problem

Each of the margin formulations is a constrained optimization problem; standard forms are sketched below.
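A sketch using weights $\mathbf{w}$, bias $b$, class targets $t^{(i)} \in \{-1, 1\}$, slack variables $\zeta^{(i)}$, penalty $C$, and dual variables $\alpha^{(i)}$. Decision function and prediction:

$$\hat{y} = \begin{cases} 0 & \text{if } \mathbf{w}^T\mathbf{x} + b < 0 \\ 1 & \text{if } \mathbf{w}^T\mathbf{x} + b \ge 0 \end{cases}$$

Hard margin:

$$\min_{\mathbf{w},\,b}\ \frac{1}{2}\mathbf{w}^T\mathbf{w} \quad \text{subject to} \quad t^{(i)}\left(\mathbf{w}^T\mathbf{x}^{(i)} + b\right) \ge 1$$

Soft margin:

$$\min_{\mathbf{w},\,b,\,\zeta}\ \frac{1}{2}\mathbf{w}^T\mathbf{w} + C\sum_{i=1}^{m}\zeta^{(i)} \quad \text{subject to} \quad t^{(i)}\left(\mathbf{w}^T\mathbf{x}^{(i)} + b\right) \ge 1 - \zeta^{(i)},\ \ \zeta^{(i)} \ge 0$$

Dual problem:

$$\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha^{(i)}\alpha^{(j)} t^{(i)} t^{(j)}\,\mathbf{x}^{(i)T}\mathbf{x}^{(j)} - \sum_{i=1}^{m}\alpha^{(i)} \quad \text{subject to} \quad \alpha^{(i)} \ge 0,\ \ \sum_{i=1}^{m}\alpha^{(i)} t^{(i)} = 0$$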

    LinearSVC

    import numpy as np
    from sklearn import datasets
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC
    iris = datasets.load_iris()
    X = iris["data"][:, (2, 3)] # petal length, petal width
    y = (iris["target"] == 2).astype(np.float64) # Iris-Virginica

svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=1, loss="hinge")),
])

    svm_clf.fit(X, y)

Common kernels (standard forms sketched below)

    • Linear
    • Polynomial
    • Gaussian RBF
    • Sigmoid
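In their common parameterizations, with degree $d$, scale $\gamma$, and offset $r$:

$$\text{Linear: } K(\mathbf{a}, \mathbf{b}) = \mathbf{a}^T\mathbf{b} \qquad \text{Polynomial: } K(\mathbf{a}, \mathbf{b}) = \left(\gamma\,\mathbf{a}^T\mathbf{b} + r\right)^d$$

$$\text{Gaussian RBF: } K(\mathbf{a}, \mathbf{b}) = \exp\left(-\gamma\,\lVert\mathbf{a} - \mathbf{b}\rVert^2\right) \qquad \text{Sigmoid: } K(\mathbf{a}, \mathbf{b}) = \tanh\left(\gamma\,\mathbf{a}^T\mathbf{b} + r\right)$$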

From a single tree to a forest.

Decision Trees

    • Gini impurity
    • Entropy
    • CART cost function for regression

where each node's MSE is computed over the training instances that fall into it; the standard definitions are sketched below.
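With $p_{i,k}$ the ratio of class-$k$ instances in node $i$ and $(k, t_k)$ a candidate feature/threshold split:

$$G_i = 1 - \sum_{k=1}^{K} p_{i,k}^2, \qquad H_i = -\sum_{k=1}^{K} p_{i,k}\log_2\left(p_{i,k}\right)$$

$$J(k, t_k) = \frac{m_{\text{left}}}{m}\,\mathrm{MSE}_{\text{left}} + \frac{m_{\text{right}}}{m}\,\mathrm{MSE}_{\text{right}}, \qquad \mathrm{MSE}_{\text{node}} = \frac{1}{m_{\text{node}}}\sum_{i \in \text{node}}\left(\hat{y}_{\text{node}} - y^{(i)}\right)^2$$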

    DecisionTreeClassifier
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    iris = load_iris()
    X = iris.data[:, 2:] # petal length and width
    y = iris.target

    tree_clf = DecisionTreeClassifier(max_depth=2)
    tree_clf.fit(X, y)

    Random Forests

Random Forests are, in my view, the classic representative of ensemble learning.

Taking classifiers as an example: given the same data, different classifiers (for instance a Logistic Regression classifier, a Random Forest classifier, and a K-Nearest Neighbors classifier) may reach different decisions.

Naturally, a voting strategy can then be introduced to make the final decision.

VotingClassifier
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    log_clf = LogisticRegression()
    rnd_clf = RandomForestClassifier()
    svm_clf = SVC()

voting_clf = VotingClassifier(
    estimators=[('lr', log_clf), ('rf', rnd_clf), ('svc', svm_clf)],
    voting='hard')
voting_clf.fit(X_train, y_train)

    Boosting

AdaBoost
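A sketch of the usual AdaBoost quantities: the weighted error rate $r_j$ of the $j$-th predictor, its predictor weight $\alpha_j$ (learning rate $\eta$), and the instance-weight update applied before training the next predictor (weights are then renormalized):

$$r_j = \frac{\sum_{i:\,\hat{y}_j^{(i)} \ne y^{(i)}} w^{(i)}}{\sum_{i=1}^{m} w^{(i)}}, \qquad \alpha_j = \eta\,\log\frac{1 - r_j}{r_j}$$

$$w^{(i)} \leftarrow \begin{cases} w^{(i)} & \text{if } \hat{y}_j^{(i)} = y^{(i)} \\ w^{(i)}\exp\left(\alpha_j\right) & \text{if } \hat{y}_j^{(i)} \ne y^{(i)} \end{cases}$$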

    Gradient Boosting
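Gradient boosting instead fits each new predictor to the residual errors of the current ensemble (the negative gradient of the loss). For squared error loss, one common way to write the update with learning rate $\eta$ is:

$$F_t(\mathbf{x}) = F_{t-1}(\mathbf{x}) + \eta\,h_t(\mathbf{x}), \qquad h_t = \arg\min_{h}\sum_{i=1}^{m}\left(y^{(i)} - F_{t-1}\left(\mathbf{x}^{(i)}\right) - h\left(\mathbf{x}^{(i)}\right)\right)^2$$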

Evaluation Metrics

Metrics determine the direction in which the model should converge; there are several of them for both continuous and discrete models.

    Classification

$F_1$ is the harmonic mean of the two (precision and recall).
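With TP, FP, and FN the counts of true positives, false positives, and false negatives:

$$\text{precision} = \frac{TP}{TP + FP}, \qquad \text{recall} = \frac{TP}{TP + FN}, \qquad F_1 = 2\cdot\frac{\text{precision}\times\text{recall}}{\text{precision} + \text{recall}}$$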

    precision_score and recall_score
from sklearn.metrics import precision_score, recall_score
precision_score(y_true, y_pred), recall_score(y_true, y_pred)   # y_true: true labels, y_pred: model predictions

    Regression

    • MSE