Chapter 4: Introduction to Neural Networks
When there are very many features, the number of polynomial terms blows up (it grows geometrically).
- The brain may need only one learning algorithm to mimic:
If visual signals are rerouted to the part of the brain that normally processes hearing, that region learns to process vision after a while, suggesting vision and hearing are handled by the same algorithm.
A blind person fitted with a head-mounted camera whose signal is relayed to the brain through the tongue can learn to "see".
- Neuron: multiple inputs (dendrites), a cell body, and an output nerve fiber (axon).
- How a neural network computes:
4. The last step of a neural network is like logistic regression, or rather it is logistic regression, except that the input features are not plugged in directly: the inputs are first transformed into the hidden layer, and the hidden activations are plugged in instead.
5. Multi-class problems: one-vs-all. The final output is a vector whose entries each lie in [0, 1]. Note that these outputs do not necessarily sum to 1!
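The two points above can be sketched in a few lines. The course uses Octave, but here is a plain-Python illustration in which every weight, size, and name is made up for the example: the final layer is just a logistic regression applied to the hidden-layer activations, and the per-class outputs each land in (0, 1) without summing to 1.

```python
import math

def sigmoid(z):
    """Logistic function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """One forward pass: input -> hidden layer -> output layer.
    The last step is exactly a logistic regression, but applied to
    the hidden-layer activations a1 instead of the raw features x."""
    a1 = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
          for row, b in zip(W1, b1)]
    a2 = [sigmoid(sum(w * ai for w, ai in zip(row, a1)) + b)
          for row, b in zip(W2, b2)]
    return a2

# Tiny 2-input, 2-hidden-unit, 3-class network with made-up weights:
W1 = [[0.5, -0.2], [0.1, 0.8]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0], [0.3, 0.3], [-0.5, 0.9]]
b2 = [0.0, 0.0, 0.0]

out = forward([1.0, 2.0], W1, b1, W2, b2)
print(out)       # three values, each in (0, 1)
print(sum(out))  # in general NOT equal to 1
```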
Programming exercise
- Handwritten digit recognition (digits 0-9; the label 10 stands for 0)
- Overall code:
The training set consists of 5000 grayscale 20x20 handwritten-digit images; in the data file each image is unrolled into a single row, forming a 5000x400 matrix. The digit 0 is labeled as 10, i.e. 10 stands for a handwritten 0.
```matlab
input_layer_size = 400;
num_labels = 10;
load('ex3data1.mat'); % .mat is the Octave/MATLAB matrix format, and the variables inside are already named!
m = size(X, 1);

% Display a random selection of 100 examples
rand_indices = randperm(m);
sel = X(rand_indices(1:100), :);
displayData(sel);

% Train the one-vs-all classifiers and measure training accuracy
lambda = 0.1;
[all_theta] = oneVsAll(X, y, num_labels, lambda);
pred = predictOneVsAll(all_theta, X);
fprintf('\nTraining Set Accuracy: %f\n', mean(double(pred == y)) * 100);
```
The displayData function renders what the training set looks like:
```matlab
function [h, display_array] = displayData(X, example_width)
%DISPLAYDATA Display 2D data in a nice grid
%   [h, display_array] = DISPLAYDATA(X, example_width) displays 2D data
%   stored in X in a nice grid. It returns the figure handle h and the
%   displayed array if requested.

% Set example_width automatically if not passed in
if ~exist('example_width', 'var') || isempty(example_width)
    example_width = round(sqrt(size(X, 2)));
end

% Gray Image
colormap(gray);

% Compute rows, cols
[m n] = size(X);
example_height = (n / example_width);

% Compute number of items to display
display_rows = floor(sqrt(m));
display_cols = ceil(m / display_rows);

% Between images padding
pad = 1;

% Setup blank display
display_array = - ones(pad + display_rows * (example_height + pad), ...
                       pad + display_cols * (example_width + pad));

% Copy each example into a patch on the display array
curr_ex = 1;
for j = 1:display_rows
    for i = 1:display_cols
        if curr_ex > m,
            break;
        end
        % Copy the patch
        % Get the max value of the patch
        max_val = max(abs(X(curr_ex, :)));
        display_array(pad + (j - 1) * (example_height + pad) + (1:example_height), ...
                      pad + (i - 1) * (example_width + pad) + (1:example_width)) = ...
                      reshape(X(curr_ex, :), example_height, example_width) / max_val;
        curr_ex = curr_ex + 1;
    end
    if curr_ex > m,
        break;
    end
end

% Display Image
h = imagesc(display_array, [-1 1]);

% Do not show axis
axis image off

drawnow;
end
```
3. oneVsAll: solving for theta in a multi-class "network" with only two layers (input and output), i.e. one logistic regression per class:

```matlab
function [all_theta] = oneVsAll(X, y, num_labels, lambda)
m = size(X, 1);
n = size(X, 2);

% You need to return the following variables correctly
all_theta = zeros(num_labels, n + 1);

% Add ones to the X data matrix
X = [ones(m, 1) X];

% fmincg works similarly to fminunc, but is more efficient when we
% are dealing with large number of parameters.
for i = 1:num_labels  % one-vs-all: fit a separate logistic regression for each output class
    initial_theta = zeros(n + 1, 1);
    options = optimset('GradObj', 'on', 'MaxIter', 50);
    [all_theta(i,:)] = ...
        fmincg(@(t)(lrCostFunction(t, X, (y == i), lambda)), ...
               initial_theta, options);
end
end
```
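oneVsAll hands fmincg a cost handle, lrCostFunction, whose body is not listed in these notes. As a sketch of what that cost computes, here is a plain-Python version (the function and variable names here are my own, not from the exercise): the regularized logistic-regression cost and gradient, with the bias term theta[0] excluded from the regularization, as in the course.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lr_cost(theta, X, y, lam):
    """Regularized logistic-regression cost and gradient, mirroring
    what lrCostFunction hands to fmincg. Rows of X already include
    the leading 1; theta[0] (the bias) is not regularized."""
    m = len(y)
    h = [sigmoid(sum(t * xj for t, xj in zip(theta, x))) for x in X]
    J = -sum(yi * math.log(hi) + (1 - yi) * math.log(1 - hi)
             for yi, hi in zip(y, h)) / m
    J += lam / (2 * m) * sum(t * t for t in theta[1:])
    grad = [sum((hi - yi) * x[j] for hi, yi, x in zip(h, y, X)) / m
            for j in range(len(theta))]
    for j in range(1, len(theta)):
        grad[j] += lam / m * theta[j]
    return J, grad

# With theta = 0 every prediction is 0.5, so the cost is log(2)
# regardless of the (made-up) data:
X = [[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]]
y = [1, 0, 1]
J, grad = lr_cost([0.0, 0.0], X, y, 0.1)
print(J)  # log(2) ~ 0.6931
```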
4. predictOneVsAll: use the theta values obtained in the previous step to make predictions; accuracy is then computed outside the function by comparing against y:

```matlab
function p = predictOneVsAll(all_theta, X)
m = size(X, 1);
num_labels = size(all_theta, 1);
p = zeros(size(X, 1), 1);
X = [ones(m, 1) X];
% M is the largest value in each row, p its column index. Since the
% sigmoid is an increasing function and only the index is needed, the
% step "apply sigmoid, then compare" is skipped.
[M, p] = max(X * all_theta', [], 2);
end
```
5. Computing accuracy with a three-layer network:

```matlab
function p = predict(Theta1, Theta2, X)
m = size(X, 1);
num_labels = size(Theta2, 1);
p = zeros(size(X, 1), 1);
X = [ones(m, 1) X];
a2 = sigmoid(Theta1 * (X'));      % hidden-layer activations, one column per example
a2 = [ones(size(a2, 2), 1) a2'];  % transpose and prepend the bias column
% As above, the final sigmoid is skipped before taking the max
[M, p] = max((Theta2 * a2')', [], 2);
end
```
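Both predict functions skip the final sigmoid before taking the row-wise max. That shortcut is safe because the sigmoid is strictly increasing, so it cannot change which score is largest. A minimal plain-Python check with made-up scores:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def argmax(xs):
    """Index of the largest element (0-based)."""
    return max(range(len(xs)), key=lambda i: xs[i])

# Made-up raw output-layer scores (z = Theta2 * a2) for one example:
z = [-1.2, 0.4, 2.1, -0.3]

# sigmoid preserves the ordering of the scores, so the predicted
# class is the same with or without applying it:
print(argmax(z))                        # index of the raw maximum
print(argmax([sigmoid(v) for v in z]))  # same index
```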