  • IMPROVING ADVERSARIAL ROBUSTNESS REQUIRES REVISITING MISCLASSIFIED EXAMPLES

    Wang Y, Zou D, Yi J, et al. Improving Adversarial Robustness Requires Revisiting Misclassified Examples[C]. International Conference on Learning Representations, 2020.

    @inproceedings{wang2020improving,
    title={Improving Adversarial Robustness Requires Revisiting Misclassified Examples},
    author={Wang, Yisen and Zou, Difan and Yi, Jinfeng and Bailey, James and Ma, Xingjun and Gu, Quanquan},
    booktitle={International Conference on Learning Representations},
    year={2020}}

    The authors argue that misclassified examples matter for improving a network's robustness, and propose a new loss function inspired by this observation.

    Main Content

    Notation

    \(h_{\theta}\): the neural network with parameters \(\theta\);
    \((x, y) \in \mathbb{R}^d \times \{1, \ldots, K\}\): an input and its label;

    \[\tag{2} h_{\boldsymbol{\theta}}\left(\mathbf{x}_{i}\right)=\underset{k=1, \ldots, K}{\arg \max }\ \mathbf{p}_{k}\left(\mathbf{x}_{i}, \boldsymbol{\theta}\right), \quad \mathbf{p}_{k}\left(\mathbf{x}_{i}, \boldsymbol{\theta}\right)=\exp \left(\mathbf{z}_{k}\left(\mathbf{x}_{i}, \boldsymbol{\theta}\right)\right) / \sum_{k^{\prime}=1}^{K} \exp \left(\mathbf{z}_{k^{\prime}}\left(\mathbf{x}_{i}, \boldsymbol{\theta}\right)\right) \]

    Define the sets of correctly classified and misclassified examples:

    \[\mathcal{S}_{h_{\theta}}^+ = \{i : i \in [n],\ h_{\theta}(x_i) = y_i\} \quad \mathrm{and} \quad \mathcal{S}_{h_{\theta}}^- = \{i : i \in [n],\ h_{\theta}(x_i) \not= y_i\}. \]
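
    As a concrete illustration, here is a minimal PyTorch sketch of Eq. (2) and of the split into \(\mathcal{S}^+\) and \(\mathcal{S}^-\) (`model`, `x`, `y` stand for a classifier and a labelled batch; they are illustrative assumptions, not objects from the paper):

    ```python
    import torch
    import torch.nn.functional as F

    def predict(model, x):
        """Class probabilities p_k(x, theta) and hard labels h_theta(x), Eq. (2)."""
        logits = model(x)                    # z(x, theta), shape (batch, K)
        probs = F.softmax(logits, dim=1)     # softmax over the K classes
        return probs, probs.argmax(dim=1)    # h_theta(x) = argmax_k p_k(x, theta)

    # Split a batch into correctly classified (S^+) and misclassified (S^-) indices.
    probs, preds = predict(model, x)
    s_plus = (preds == y).nonzero(as_tuple=True)[0]    # S_{h_theta}^+
    s_minus = (preds != y).nonzero(as_tuple=True)[0]   # S_{h_theta}^-
    ```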

    MART

    The robust classification error over all examples is

    \[\tag{3} \mathcal{R}(h_{\theta}) = \frac{1}{n} \sum_{i=1}^n \max_{x_i' \in \mathcal{B}_{\epsilon}(x_i)} \mathbb{1}(h_{\theta}(x_i') \not= y_i), \]

    and the robust classification error on misclassified examples is defined as

    \[\tag{4} \mathcal{R}^-(h_{\theta}, x_i) := \mathbb{1}(h_{\theta}(\hat{x}_i') \not= y_i) + \mathbb{1}(h_{\theta}(x_i) \not= h_{\theta}(\hat{x}_i')) \]

    where

    \[\tag{5} \hat{x}_i' = \arg\max_{x_i' \in \mathcal{B}_{\epsilon}(x_i)} \mathbb{1}(h_{\theta}(x_i') \not= y_i). \]
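
    Eq. (5) cannot be solved exactly; in practice \(\hat{x}_i'\) is approximated by PGD, typically by ascending the cross-entropy loss inside the \(\ell_\infty\) ball. A minimal sketch under these assumptions (step size, step count, and the [0, 1] pixel range are illustrative choices):

    ```python
    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        """Approximate hat{x}' of Eq. (5) by PGD within the L-inf ball B_eps(x)."""
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() + alpha * grad.sign()   # ascend the loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)       # project back into B_eps(x)
            x_adv = x_adv.clamp(0, 1).detach()             # keep valid pixel range
        return x_adv
    ```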

    as well as the robust classification error on correctly classified examples:

    \[\tag{6} \mathcal{R}^+(h_{\theta}, x_i) := \mathbb{1}(h_{\theta}(\hat{x}_i') \not= y_i). \]

    Finally, the objective to minimize is the combination of the two errors:

    \[\tag{7} \begin{aligned} \min_{\boldsymbol{\theta}} \mathcal{R}_{\text{misc}}\left(h_{\boldsymbol{\theta}}\right) :&= \frac{1}{n}\left(\sum_{i \in \mathcal{S}_{h_{\boldsymbol{\theta}}}^{+}} \mathcal{R}^{+}\left(h_{\boldsymbol{\theta}}, \mathbf{x}_{i}\right) + \sum_{i \in \mathcal{S}_{h_{\boldsymbol{\theta}}}^{-}} \mathcal{R}^{-}\left(h_{\boldsymbol{\theta}}, \mathbf{x}_{i}\right)\right) \\ &= \frac{1}{n} \sum_{i=1}^{n}\left\{\mathbb{1}\left(h_{\boldsymbol{\theta}}\left(\hat{\mathbf{x}}_{i}^{\prime}\right) \neq y_{i}\right) + \mathbb{1}\left(h_{\boldsymbol{\theta}}\left(\mathbf{x}_{i}\right) \neq h_{\boldsymbol{\theta}}\left(\hat{\mathbf{x}}_{i}^{\prime}\right)\right) \cdot \mathbb{1}\left(h_{\boldsymbol{\theta}}\left(\mathbf{x}_{i}\right) \neq y_{i}\right)\right\} \end{aligned} \]

    To let gradients propagate, these indicator losses have to be "softened" with surrogate functions. The term \(\mathbb{1}(h_{\theta}(\hat{x}_i') \not= y_i)\) is replaced by the BCE loss

    \[\tag{8} \mathrm{BCE}(p(\hat{x}_i', \theta), y_i) = -\log(p_{y_i}(\hat{x}_i', \theta)) - \log(1 - \max_{k \not= y_i} p_k(\hat{x}_i', \theta)), \]

    The first term is the ordinary cross-entropy loss; the second term pushes for a larger margin around the decision boundary.
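
    A sketch of this BCE term (reusing `torch` from above; the 1e-12 offsets are a numerical-stability assumption):

    ```python
    def bce_term(probs_adv, y):
        """Eq. (8): -log p_y(x') - log(1 - max_{k != y} p_k(x'))."""
        p_y = probs_adv.gather(1, y.unsqueeze(1)).squeeze(1)   # p_{y_i}
        masked = probs_adv.clone()
        masked.scatter_(1, y.unsqueeze(1), 0.0)                # drop the true class
        p_wrong = masked.max(dim=1)[0]                         # max_{k != y_i} p_k
        return -torch.log(p_y + 1e-12) - torch.log(1.0 - p_wrong + 1e-12)
    ```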

    For the second term \(\mathbb{1}(h_{\theta}(x_i) \not= h_{\theta}(\hat{x}_i'))\), the KL divergence serves as the surrogate:

    \[\tag{9} \mathrm{KL}(p(x_i, \theta) \| p(\hat{x}_i', \theta)) = \sum_{k=1}^K p_k(x_i, \theta) \log \frac{p_k(x_i, \theta)}{p_k(\hat{x}_i', \theta)}. \]
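
    Per example, this can be computed as follows (a sketch consistent with Eq. (9), with the same stability offsets as above):

    ```python
    def kl_term(probs_nat, probs_adv):
        """Eq. (9): KL(p(x) || p(x_adv)), summed over the K classes."""
        return (probs_nat * (torch.log(probs_nat + 1e-12)
                             - torch.log(probs_adv + 1e-12))).sum(dim=1)
    ```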

    The last term \(\mathbb{1}(h_{\theta}(x_i) \not= y_i)\) is replaced by \(1 - p_{y_i}(x_i, \theta)\).

    The final loss function is therefore

    \[\tag{11} \mathcal{L}^{\mathrm{MART}}(\theta) = \frac{1}{n} \sum_{i=1}^n \ell(x_i, y_i, \theta), \]

    where

    \[\ell(x_i, y_i, \theta) := \mathrm{BCE}(p(\hat{x}_i', \theta), y_i) + \lambda \cdot \mathrm{KL}(p(x_i, \theta) \| p(\hat{x}_i', \theta)) \cdot (1 - p_{y_i}(x_i, \theta)). \]
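
    Putting the pieces together, a minimal sketch of one MART loss evaluation (reusing the helpers above; `lam` stands for the trade-off \(\lambda\), and its default value here is only illustrative):

    ```python
    def mart_loss(model, x, y, lam=5.0):
        """Eq. (11): BCE(p(x'), y) + lam * KL(p(x)||p(x')) * (1 - p_y(x)), averaged."""
        x_adv = pgd_attack(model, x, y)                    # hat{x}', Eq. (5)
        probs_nat = F.softmax(model(x), dim=1)             # p(x, theta)
        probs_adv = F.softmax(model(x_adv), dim=1)         # p(x', theta)
        p_y_nat = probs_nat.gather(1, y.unsqueeze(1)).squeeze(1)
        per_example = (bce_term(probs_adv, y)
                       + lam * kl_term(probs_nat, probs_adv) * (1.0 - p_y_nat))
        return per_example.mean()
    ```

    Note how the \((1 - p_{y_i}(x_i, \theta))\) factor up-weights examples the network already misclassifies, which is exactly the paper's motivation for revisiting misclassified examples.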

  • Original post: https://www.cnblogs.com/MTandHJ/p/13111604.html