• Task5. Implementing L1 and L2 Regularization and Dropout in PyTorch

1. Understanding how Dropout works

Deep networks have a very large number of parameters, so they overfit easily and are slow to train. To address this, Hinton proposed Dropout in his 2012 paper "Improving neural networks by preventing co-adaptation of feature detectors" and showed experimentally that it is effective at preventing overfitting.

Dropout means that during training, a fraction of the network's units is temporarily dropped with some probability; this is equivalent to training a thinner sub-network sampled from the original one.

Concretely, a random binary mask sets some neurons' outputs to 0 and leaves the rest unchanged, so a network with N units can realize up to 2^N different thinned sub-networks that share weights.
For the full details see the paper; implementations are given below.
When to use it: Dropout is mainly useful when the training data is limited and the model overfits easily. A minimal PyTorch usage sketch follows.
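A minimal sketch using the built-in nn.Dropout module, which applies the random mask (with inverted-dropout rescaling by 1/(1-p)) only in training mode:

    import torch
    import torch.nn as nn

    drop = nn.Dropout(p=0.5)   # each unit is zeroed with probability 0.5 during training
    x = torch.ones(1, 8)

    drop.train()               # training mode: random units are zeroed, survivors scaled by 1/(1-p)
    print(drop(x))             # e.g. tensor([[2., 0., 2., 0., 0., 2., 2., 0.]])

    drop.eval()                # evaluation mode: dropout is a no-op
    print(drop(x))             # tensor([[1., 1., 1., 1., 1., 1., 1., 1.]])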

L1 and L2 regularization both reduce the structural risk (the training loss plus a penalty on model complexity); a short sketch contrasting the two penalty terms follows.
Specifically:
L1 yields sparse parameters: many weights become exactly 0.
L2 spreads the weights out and shrinks them toward 0 without zeroing them, so the resulting model is smoother and simpler.
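As a rough sketch of how each penalty term is formed from the parameters (the model, stand-in data loss, and coefficient below are placeholders, not from the original example):

    import torch
    import torch.nn as nn

    model = nn.Linear(8, 1)                  # any model with parameters would do
    data_loss = torch.tensor(0.0)            # stand-in for the ordinary training loss
    lam = 0.001                              # illustrative regularization strength

    l1_penalty = sum(p.abs().sum() for p in model.parameters())   # sum of |w|: drives weights to exactly 0
    l2_penalty = sum((p ** 2).sum() for p in model.parameters())  # sum of w^2: shrinks weights toward 0

    loss = data_loss + lam * l1_penalty      # structural risk = data loss + penalty (use l2_penalty for Ridge)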


2. Implementing the regularizers in code (L1, L2, Dropout)

The L1 norm

The L1 norm is the sum of the absolute values of the elements of the parameter matrix W. Unlike the L0 norm, whose minimization is an NP-hard problem, the L1 norm is the tightest convex relaxation of the L0 norm and is therefore much easier to optimize. L1 regularization is commonly called LASSO.

    for epoch in range(EPOCHS):
        y_pred = model(x_train)
        classify_loss = criterion(y_pred, y_train.float().view(-1, 1))

        # recompute the L1 penalty each step, since the parameters change
        regularization_loss = 0
        for param in model.parameters():
            regularization_loss += torch.sum(torch.abs(param))

        loss = classify_loss + 0.001 * regularization_loss  # classification loss + L1 term

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

The L2 norm

The L2 penalty is the sum of the squares of the elements of the parameter matrix W (the squared L2 norm). Unlike L1, it does not drive parameters exactly to zero; instead it makes most of them small and close to zero. L1 pursues sparsity and therefore discards some features (their weights become 0), while L2 only pushes weights toward 0 and keeps all features. L2 regularization is known as Ridge.

    criterion = torch.nn.BCELoss()  # define the loss function
    # a positive weight_decay makes the optimizer apply an L2 penalty to the parameters
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0, dampening=0, weight_decay=0.01)
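weight_decay folds the L2 term into the optimizer's update rule. The same idea can be written out explicitly in the loss, reusing model, y_pred, y_train, criterion and optimizer from the snippets above (a sketch; the 0.01 coefficient is only illustrative, and the scaling differs from weight_decay by a factor of 2, since SGD's weight_decay corresponds to penalizing weight_decay/2 * sum(w^2)):

    # explicit L2 penalty added to the loss, instead of relying on weight_decay
    l2_loss = 0
    for param in model.parameters():
        l2_loss = l2_loss + torch.sum(param ** 2)

    loss = criterion(y_pred, y_train.float().view(-1, 1)) + 0.01 * l2_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()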

3. A numpy implementation of Dropout

    import numpy as np

    # toy data: 4 samples with 3 features, XOR-like targets
    X = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])
    y = np.array([[0,1,1,0]]).T

    alpha, hidden_dim, dropout_percent, do_dropout = (0.5, 4, 0.2, True)

    # weight matrices initialized uniformly in [-1, 1)
    synapse_0 = 2*np.random.random((3, hidden_dim)) - 1
    synapse_1 = 2*np.random.random((hidden_dim, 1)) - 1

    for j in range(60000):
        # forward pass with sigmoid activations
        layer_1 = 1/(1 + np.exp(-np.dot(X, synapse_0)))
        if do_dropout:
            # inverted dropout: zero units with prob. dropout_percent and rescale the survivors
            layer_1 *= np.random.binomial([np.ones((len(X), hidden_dim))], 1 - dropout_percent)[0] * (1.0/(1 - dropout_percent))
        layer_2 = 1/(1 + np.exp(-np.dot(layer_1, synapse_1)))

        # backpropagation (sigmoid derivative is a*(1-a))
        layer_2_delta = (layer_2 - y) * (layer_2 * (1 - layer_2))
        layer_1_delta = layer_2_delta.dot(synapse_1.T) * (layer_1 * (1 - layer_1))

        synapse_1 -= alpha * layer_1.T.dot(layer_2_delta)
        synapse_0 -= alpha * X.T.dot(layer_1_delta)
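At test time no mask is applied; the 1/(1 - dropout_percent) rescaling during training (inverted dropout) already keeps the expected activations consistent, so inference is just a plain forward pass, for example:

    # inference: no dropout mask, just a forward pass through the trained weights
    layer_1 = 1/(1 + np.exp(-np.dot(X, synapse_0)))
    layer_2 = 1/(1 + np.exp(-np.dot(layer_1, synapse_1)))
    print(np.round(layer_2, 3))   # predicted probabilities for the four training samples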

4. Full example

    import torch
    from torch import nn
    from torch.autograd import Variable
    import torch.nn.functional as F
    import torch.nn.init as init
    import math
    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import classification_report
    import numpy as np
    import pandas as pd

    # load the data: 8 feature columns, binary label in the last column
    data = pd.read_csv(r'C:\Users\etty\Desktop\pytorch学习\data.txt')
    x, y = data.iloc[:, :8], data.iloc[:, -1]

    # 20% test set, 80% training set
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

    x_train = Variable(torch.from_numpy(np.array(x_train)).float())
    y_train = Variable(torch.from_numpy(np.array(y_train).reshape(-1, 1)).float())

    x_test = Variable(torch.from_numpy(np.array(x_test)).float())
    y_test = Variable(torch.from_numpy(np.array(y_test).reshape(-1, 1)).float())

    print(x_train.data.shape)
    print(y_train.data.shape)
    print(x_test.data.shape)
    print(y_test.data.shape)

    class Model(torch.nn.Module):
        def __init__(self):
            super(Model, self).__init__()
            self.l1 = torch.nn.Linear(8, 200)
            self.l2 = torch.nn.Linear(200, 50)
            self.l3 = torch.nn.Linear(50, 1)

        def forward(self, x):
            out1 = F.relu(self.l1(x))
            # training=self.training makes dropout active only in training mode
            out2 = F.dropout(out1, p=0.5, training=self.training)
            out3 = F.relu(self.l2(out2))
            out4 = F.dropout(out3, p=0.5, training=self.training)
            y_pred = torch.sigmoid(self.l3(out4))
            return y_pred

    model = Model()

    criterion = torch.nn.BCELoss()
    # weight_decay adds the L2 penalty
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.1)

    Loss = []
    for epoch in range(2000):
        y_pred = model(x_train)
        loss = criterion(y_pred, y_train)
        if epoch % 400 == 0:
            print("epoch =", epoch, "loss", loss.item())
            Loss.append(loss.item())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # model evaluation: threshold the predicted probabilities at 0.5
    def label_flag(data):
        for i in range(len(data)):
            if data[i] > 0.5:
                data[i] = 1.0
            else:
                data[i] = 0.0
        return data

    model.eval()  # switch off dropout before evaluating
    y_pred = label_flag(model(x_train))
    print(classification_report(y_train.detach().numpy(), y_pred.detach().numpy()))

    # test set
    y_test_pred = model(x_test)
    y_test_pred = label_flag(y_test_pred)
    print(classification_report(y_test.detach().numpy(), y_test_pred.detach().numpy()))
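The full example regularizes only through weight_decay (L2) and dropout. If you also want the explicit L1 term from section 2, the training loop could be adapted along these lines (a sketch; the 0.001 coefficient is just illustrative):

    for epoch in range(2000):
        y_pred = model(x_train)
        l1_penalty = sum(torch.sum(torch.abs(p)) for p in model.parameters())
        loss = criterion(y_pred, y_train) + 0.001 * l1_penalty  # BCE loss + L1 term
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()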

Dataset download link: https://pan.baidu.com/s/1LrJktjVQ1OM9mYt_cuE-FQ (extraction code: hatv)

Source article: https://blog.csdn.net/wehung/article/details/89283583
