zoukankan      html  css  js  c++  java
  • 论文笔记(4)—"Personalized Cross-Silo Federated Learning on Non-IID Data"

    Methods

    In their dissertation, in order to specialize clients' models, an intermediate result (U^k) was computed as an aggregated model and forced clients' model ({mathbf W}) to be closed to (U^k).

    Concretely, they solve the following problem:

    yrR2M8.png

    where (A) is a kind of similarity measure. The objective function is alternatively optimized by optimizing (U^k) and ({mathbf W}^k). First, (U^k) only depends on the second term (sum A(Vert{mathbf w}_i -{mathbf w}_j Vert^2)),where ({mathbf w}_i) denotes (i)-th column of ({mathbf W}) and the total number of clients is (m). (U^k) is computed by

    yrqWLR.png

    Then, ({mathbf W}^k) is the solution of

    yrXPqs.png

    The complete algorithm procedure is following

    yrXuM4.png

    Note: (W^{k}) is the optimal solution of the above problem in k-th step. In my opinion, it's unlikely to find a closed solution and needs multi-local steps to find ({mathbf W}^k). Because authors didn't publish their code, I can't check the optimal condition of ({mathbf W}^{k}).

    Convergence

    For simplicity, there is only a convergence analysis in convex setting in this blog. Before elaborating the process, let take a look at their assumption:

    yrjNt0.png

    (L-) smooth is the only required condition and their analysis is based on that solving ({mathbf W}^k) is feasible.

    As discussed in my previous bolg, we should prove ({mathbf W}^k) is convergent to ({mathbf W}^*) along with (k o infty). According to the definition of strongly convex, we have

    image-20210213164801692

    plugging in (U^k = {mathbf W}^{k-1}-alpha_k abla{mathcal A}({mathbf W}^{k-1})),

    [Vert {mathbf W}^{k}-{mathbf W}^*Vert^2leq -frac{2alpha_k}{lambda}({mathcal F}({mathbf W}^{k})-{mathcal F}({mathbf W}^{*}))+Vert{mathbf W}^{k-1}-alpha_k abla{mathcal A}({mathbf W}^{k-1}) -{mathbf W}^*Vert^2 ]

    we can have

    yrzEm4.png

    we still should find the relationship between ({mathcal F}({mathbf W}^{k})) and ({mathcal F}({mathbf W}^{k-1})).

    Summary

    • In their work, authors recognized that the similarity measure (Vert {mathbf w}_i-{mathbf w}_jVert) is improper in deep network.
    • As above said, find ({mathbf W}^k) in a closed form is difficult in most scenario
    • The message passing mechanism is ambiguous for me
  • 相关阅读:
    BOM 事件 navigator浏览器的判断
    闭包
    超简单超实用的走马灯效果实现
    对DOM的增删改查
    SVN版本回退与常用命令总结
    mongodb使用总结
    如何理解 IE 的文档兼容模式
    软件开发编码规范
    第一篇绑定数据 CMS
    关于IIS7.5下的web.config配置的一些问题
  • 原文地址:https://www.cnblogs.com/DemonHunter/p/14400433.html
Copyright © 2011-2022 走看看