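  • Binary Classification Example

    Classification exercise

    1. Plotting the training set

    Before classifying, it is still necessary to look at the features of the training samples in order to choose a suitable model.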

    data = load("ex2data2.txt");
    x = data(:,[1,2]);
    y = data(:,3);
    plotData(x,y);
    
    x = mapFeature(x(:,1),x(:,2));
    
    initial_theta = zeros(size(x, 2), 1);
    
    fprintf("Paused. Press any key to continue...\n");
    pause;
    % Set regularization parameter lambda (100 here; you should vary this)
    lambda = 100;
    
    % Set Options
    options = optimset('GradObj', 'on', 'MaxIter', 400);
    
    % Optimize
    [theta, J, exit_flag] = ...
    	fminunc(@(t)(costFunction(t, x, y, lambda)), initial_theta, options);
    
    % Plot Boundary
    plotDecisionBoundary(theta, x, y);
    hold on;
    title(sprintf('lambda = %g', lambda))
    
    % Labels and Legend
    xlabel('Microchip Test 1')
    ylabel('Microchip Test 2')
    
    legend('y = 1', 'y = 0', 'Decision boundary')
    hold off;
    
    

    Figure1

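    From the distribution of the samples, we can estimate that the decision boundary should be roughly elliptical.

    2. Feature mapping

    Feature mapping expands the sample data into the terms of a polynomial.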

    function out = mapFeature(X1, X2)
    % MAPFEATURE Feature mapping function to polynomial features
    %
    %   MAPFEATURE(X1, X2) maps the two input features
    %   to polynomial features (up to degree 6) used in the regularization exercise.
    %
    %   Returns a new feature array with more features, comprising
    %   1, X1, X2, X1.^2, X1.*X2, X2.^2, X1.^3, etc.
    %
    %   Inputs X1, X2 must be the same size
    %
    
    degree = 6;
    out = ones(size(X1(:,1)));
    for i = 1:degree
        for j = 0:i
            out(:, end+1) = (X1.^(i-j)).*(X2.^j);
        end
    end
    
    end
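
    As a quick sanity check on the size of the expansion, the sketch below repeats the same double loop on two made-up sample points (the values are hypothetical, not taken from ex2data2.txt). A degree-6 expansion of two features should give (6+1)(6+2)/2 = 28 columns, including the leading column of ones.

```matlab
% Hypothetical sample points, for illustration only
X1 = [0.5; -0.2];
X2 = [0.1;  0.7];

degree = 6;
out = ones(size(X1));                  % column of ones for the intercept
for i = 1:degree
    for j = 0:i
        % term of total degree i: X1^(i-j) * X2^j
        out(:, end+1) = (X1.^(i-j)) .* (X2.^j);
    end
end

n_cols = size(out, 2);                            % features after mapping
expected_cols = (degree + 1) * (degree + 2) / 2;  % closed-form count
fprintf('%d columns (expected %d)\n', n_cols, expected_cols);
```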
    

    The figure below shows the result of the expansion.

    (Screenshot of the expanded feature matrix)

    3. Fitting

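    This problem can also be fitted with gradient descent, but I do not understand why, in my gradient descent implementation, the regularization parameter \(\lambda\) has almost no effect: whether \(\lambda\) is 0 or 1, the resulting decision boundary looks like a normal fit. Only when I raised the learning rate to 1 and the number of iterations to 40000 did the overfitting at \(\lambda = 0\) begin to show, and only slightly. I asked in the Coursera community, and a teaching assistant replied that feature normalization should be applied first if gradient descent is used on this problem. Even so, the overfitting was still not very pronounced.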
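
    For reference, a batch gradient descent loop for regularized logistic regression might look like the sketch below. It is not the implementation discussed above: the data set, the learning rate alpha, and the iteration count are made-up values, and the intercept theta(1) is left out of the penalty, which is the usual convention.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical data: a column of ones plus two features
X = [1  0.5  1.2;
     1 -0.3  0.8;
     1  1.5 -0.7;
     1 -1.1 -0.4];
y = [1; 1; 0; 0];
m = numel(y);

alpha = 1;          % learning rate (assumed value)
lambda = 1;         % regularization parameter
num_iters = 400;    % iteration count (assumed value)

theta = zeros(size(X, 2), 1);
J_history = zeros(num_iters, 1);
for k = 1:num_iters
    h = sigmoid(X * theta);
    % gradient of the regularized cost; theta(1) is not penalized
    grad = (1/m) * X' * (h - y) + (lambda/m) * [0; theta(2:end)];
    theta = theta - alpha * grad;

    % record the cost after the update to monitor convergence
    h = sigmoid(X * theta);
    J_history(k) = -1/m * sum(y .* log(h) + (1 - y) .* log(1 - h)) ...
                   + lambda/(2*m) * sum(theta(2:end).^2);
end
```

    Plotting J_history against the iteration number is a simple way to see whether the chosen learning rate and iteration count are sufficient.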

    Computing the cost function and gradient
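
    For reference, the regularized cost and its gradient are usually written as follows. This is the standard formulation, in which the intercept \(\theta_0\) (theta(1) in the code) is excluded from the penalty; \(h_\theta\) denotes the sigmoid hypothesis and \(n\) the number of non-intercept features.

```latex
h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)})
            + (1 - y^{(i)}) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \right]
            + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^{2}

\frac{\partial J}{\partial \theta_0} = \frac{1}{m} \sum_{i=1}^{m}
            \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x_0^{(i)}

\frac{\partial J}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m}
            \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x_j^{(i)}
            + \frac{\lambda}{m}\, \theta_j \qquad (j \ge 1)
```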

    function [J, grad] = costFunction(theta, x, y, lambda)
    %COSTFUNCTION Compute cost and gradient for logistic regression with regularization
    %   [J, grad] = COSTFUNCTION(theta, x, y, lambda) computes the cost of using
    %   theta as the parameter for regularized logistic regression and the
    %   gradient of the cost w.r.t. the parameters. 
    
    % Initialize some useful values
    m = length(y); % number of training examples
    
    % You need to return the following variables correctly 
    J = 0;
    grad = zeros(size(theta));
    
    % ====================== YOUR CODE HERE ======================
    % Instructions: Compute the cost of a particular choice of theta.
    %               You should set J to the cost.
    %               Compute the partial derivatives and set grad to the partial
    %               derivatives of the cost w.r.t. each parameter in theta
    
    
    h = sigmoid(x*theta);  % predicted probabilities
    % By convention the intercept theta(1) is excluded from the penalty
    J = -1/m * sum(y.*log(h)+(1-y).*log(1-h)) + lambda/(2*m)*sum(theta(2:end).^2);
    grad = 1/m * x' * (h-y) + lambda / m * [0; theta(2:end)];
    
    % =============================================================
    
    end
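
    One way to gain confidence in an analytic gradient like this is to compare it with a central finite-difference estimate. The sketch below does so on a small made-up data set; the data, the test point theta, and the step size epsilon are assumptions, and the cost used here leaves the intercept out of the penalty.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical data: a column of ones plus two features
X = [1  0.5  1.2;
     1 -0.3  0.8;
     1  1.5 -0.7;
     1 -1.1 -0.4];
y = [1; 1; 0; 0];
m = numel(y);
lambda = 1;
theta = [0.1; -0.2; 0.3];   % arbitrary point at which to check the gradient

% Regularized cost as a function of theta only
costJ = @(t) -1/m * sum(y .* log(sigmoid(X*t)) + (1 - y) .* log(1 - sigmoid(X*t))) ...
             + lambda/(2*m) * sum(t(2:end).^2);

% Analytic gradient
grad = 1/m * X' * (sigmoid(X*theta) - y) + lambda/m * [0; theta(2:end)];

% Central finite differences, one coordinate at a time
epsilon = 1e-4;
num_grad = zeros(size(theta));
for k = 1:numel(theta)
    e = zeros(size(theta));
    e(k) = epsilon;
    num_grad(k) = (costJ(theta + e) - costJ(theta - e)) / (2 * epsilon);
end

max_diff = max(abs(num_grad - grad));
fprintf('largest difference between the two gradients: %g\n', max_diff);
```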
    
    

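    The solution provided by Coursera uses the MATLAB function 'fminunc' for the fit. The usage of fminunc is introduced first.

    fminunc: minimizing an unconstrained multivariable function

    Function handles

    A variable that can be used to call a function indirectly

    A function handle is a MATLAB® data type that represents a function. A typical use of function handles is to pass a function to another function. For example, you can use function handles as input arguments to functions that evaluate mathematical expressions over a range of values.

    Function handles can represent either named or anonymous functions. To create a function handle, use the @ operator. For example, create a handle to an anonymous function that evaluates the expression x^2 - y^2: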

    f = @(x,y) (x.^2 - y.^2);
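
    A short sketch of how such a handle behaves may help. The variable names other than f are made up for illustration. Note that an anonymous function captures the value of outside variables at the moment it is created, which is how lambda, x, and y reach the cost function in the fminunc call used in this exercise.

```matlab
f = @(x,y) (x.^2 - y.^2);

% Call the handle like an ordinary function
v = f(3, 2);                              % 9 - 4 = 5

% Pass a handle to another function
squares = arrayfun(@(k) f(k, 0), 1:3);    % [1 4 9]

% Variables are captured by value when the handle is created
c = 10;
g = @(t) f(t, 1) + c;
c = 0;                                    % does not affect g
w = g(2);                                 % 4 - 1 + 10 = 13
```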
    

    Using fminunc

    Official documentation
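
    As a minimal illustration of the call, the sketch below minimizes a made-up quadratic whose minimum is known to be at (1, 2). The objective and starting point are hypothetical. No gradient is supplied here, so 'GradObj' is set to 'off'; with 'GradObj' set to 'on', fminunc expects the objective to return the gradient as its second output, as costFunction does. In MATLAB, fminunc requires the Optimization Toolbox.

```matlab
% Hypothetical objective: squared distance to the point (1, 2)
target = [1; 2];
objective = @(t) sum((t - target).^2);

% No gradient is supplied, so fminunc estimates it numerically
options = optimset('GradObj', 'off', 'MaxIter', 400, 'Display', 'off');

% Returns the minimizer, the objective value there, and a status flag
[t_min, f_min, exit_flag] = fminunc(objective, [0; 0], options);
fprintf('minimum near (%.4f, %.4f), objective %.2e, exit flag %d\n', ...
        t_min(1), t_min(2), f_min, exit_flag);
```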

    Calling fminunc to fit the model

    initial_theta = zeros(size(x, 2), 1);
    
    % Set regularization parameter lambda (100 here; you should vary this)
    lambda = 100;
    
    % Set Options
    options = optimset('GradObj', 'on', 'MaxIter', 400);
    
    % Optimize
    [theta, J, exit_flag] = ...
    	fminunc(@(t)(costFunction(t, x, y, lambda)), initial_theta, options);
    
    

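    4. Plotting the decision boundary

    The decision boundary in this problem can only be drawn by evaluating points on a grid. The idea is to fix ranges for x1 and x2, map each (x1, x2) pair to its polynomial expansion, and multiply the result by the fitted \(\Theta\). This gives the decision function \(X\Theta\). Drawing the contour of this function at level 0 (passed to contour as the range [0, 0]) then yields the curve \(X\Theta = 0\), which is the decision boundary.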

    function plotDecisionBoundary(theta, X, y)
    %PLOTDECISIONBOUNDARY Plots the data points X and y into a new figure with
    %the decision boundary defined by theta
    %   PLOTDECISIONBOUNDARY(theta, X,y) plots the data points with + for the 
    %   positive examples and o for the negative examples. X is assumed to be 
    %   either 
    %   1) Mx3 matrix, where the first column is an all-ones column for the 
    %      intercept.
    %   2) MxN, N>3 matrix, where the first column is all-ones
    
    % Plot Data
    plotData(X(:,2:3), y);
    hold on
    
    if size(X, 2) <= 3
        % Only need 2 points to define a line, so choose two endpoints
        plot_x = [min(X(:,2))-2,  max(X(:,2))+2];
    
        % Calculate the decision boundary line
        plot_y = (-1./theta(3)).*(theta(2).*plot_x + theta(1));
    
        % Plot, and adjust axes for better viewing
        plot(plot_x, plot_y)
        
        % Legend, specific for the exercise
        legend('Admitted', 'Not admitted', 'Decision Boundary')
        axis([30, 100, 30, 100])
    else
        % Here is the grid range
        u = linspace(-1, 1.5, 50);
        v = linspace(-1, 1.5, 50);
    
        z = zeros(length(u), length(v));
        % Evaluate z = theta*x over the grid
        for i = 1:length(u)
            for j = 1:length(v)
                z(i,j) = mapFeature(u(i), v(j))*theta;
            end
        end
        z = z'; % important to transpose z before calling contour
    
        % Plot z = 0
        % Notice you need to specify the range [0, 0]
        contour(u, v, z, [0, 0], 'LineWidth', 2)
    end
    hold off
    
    end
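
    The grid-and-contour idea can be tried in isolation on a function whose zero level set is known. The sketch below uses a made-up decision function that is zero on the ellipse u^2 + 4*v^2 = 1, and uses contourc (the non-plotting counterpart of contour) so the extracted points can be checked numerically. The function f and the grid range are assumptions for illustration.

```matlab
u = linspace(-1.5, 1.5, 50);
v = linspace(-1.5, 1.5, 50);

% Hypothetical decision function: zero on the ellipse u^2 + 4*v^2 = 1
f = @(a, b) a.^2 + 4 * b.^2 - 1;

z = zeros(length(u), length(v));
for i = 1:length(u)
    for j = 1:length(v)
        z(i, j) = f(u(i), v(j));   % rows follow u, columns follow v
    end
end

% contourc, like contour, expects rows to follow the y-axis (v), hence z'
C = contourc(u, v, z', [0, 0]);

% Walk the contour matrix: each segment starts with a [level; count] column
pts = zeros(2, 0);
k = 1;
while k <= size(C, 2)
    n = C(2, k);
    pts = [pts, C(:, k+1:k+n)];
    k = k + n + 1;
end

% Every extracted point should lie close to the ellipse
max_residual = max(abs(pts(1, :).^2 + 4 * pts(2, :).^2 - 1));
fprintf('largest deviation from the ellipse: %g\n', max_residual);
```

    Because the ellipse is not symmetric in u and v, omitting the transpose would produce points on the wrong curve, which is why the transpose matters in plotDecisionBoundary as well.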
    
    

    5. Trying more values of the regularization parameter

    Figure2

    Figure3

    Figure4

    As can be seen, \(\lambda\) = 1, 0, and 100 give Fitting / Overfitting / Underfitting results, respectively.
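
    The shrinking effect of \(\lambda\) can also be seen numerically. The sketch below fits a tiny made-up one-feature data set for the same three values and records the size of the fitted slope. The data are hypothetical, the intercept is not penalized, and fminunc estimates the gradient numerically here; it is only meant to show the trend, not to reproduce the figures above.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical, non-separable 1-D data with an intercept column
X = [ones(6, 1), [-2; -1; 0; 1; 2; 3]];
y = [0; 0; 1; 0; 1; 1];
m = numel(y);

% Regularized cost; the intercept t(1) is not penalized
cost = @(t, lambda) ...
    -1/m * sum(y .* log(sigmoid(X*t)) + (1 - y) .* log(1 - sigmoid(X*t))) ...
    + lambda/(2*m) * sum(t(2:end).^2);

options = optimset('GradObj', 'off', 'MaxIter', 400, 'Display', 'off');
lambdas = [0, 1, 100];
slopes = zeros(size(lambdas));
for k = 1:numel(lambdas)
    lambda = lambdas(k);
    theta = fminunc(@(t) cost(t, lambda), zeros(2, 1), options);
    slopes(k) = abs(theta(2));     % magnitude of the penalized weight
    fprintf('lambda = %g, |theta(2)| = %.4f\n', lambda, slopes(k));
end
```

    A larger \(\lambda\) pulls the penalized weights toward zero, which flattens the decision function and eventually leads to underfitting.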

  • Original article: https://www.cnblogs.com/popodynasty/p/13684297.html