  • Paper reading notes: A Latent Transformer for Disentangled Face Editing in Images and Videos

    Paper title: A Latent Transformer for Disentangled Face Editing in Images and Videos

    I. Introduction and related work (some key statements noted down)

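    (1) Studies have shown that, in the latent space of a generative model, moving a latent code along certain directions leads to variations of visual attributes in the corresponding generated image.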

    (2) Firstly, successful manipulations can only be achieved in well-disentangled and linearized latent spaces.

    (3) Manipulating facial attributes with linear transformations is very limited.

    (4) The state-of-the-art image generator used when projecting real images into a latent space: StyleGAN.

    (5) The transformation network generates disentangled, identity-preserving and controllable attribute editing results on real images.

    (6) Work related to disentangled representations:

    • One optimization-based method, Image2StyleGAN++, carried out local editing along with global semantic edits on images by applying masked interpolation on the activation features of StyleGAN. (Open question: what exactly is this?)
    • Collins et al. performed a k-means clustering on the activations of StyleGAN and detected a disentanglement of semantic objects, which enables further local semantic editing on the generated image.
    • For high-level semantic edits, GANalyze [13] learned a manifold in the latent space of BigGAN [5] to generate images of different memorability.
    • InterFaceGAN [35] proposed to learn a hyperplane for a binary classification in the latent space, which one can use to manipulate the target facial attribute by simple interpolation. Following their work, StyleSpace [42] carried out a quantitative study on the latent spaces of StyleGAN [21] and realized a highly localized and disentangled control of the visual attributes.
    • StyleFlow [3] achieved conditional exploration of the latent space by training conditional normalizing flows.
    • There are many more; see the related work section of the paper for details.
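
    The hyperplane idea attributed to InterFaceGAN in the list above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names, the 2-dimensional latent code, the zero bias and the hand-picked normal vector. It only shows that moving a code along the unit normal of a separating hyperplane changes its signed distance to that hyperplane by the step size, which is what lets a binary attribute flip.

```python
import math

def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def signed_distance(w, normal, bias=0.0):
    """Signed distance of w to the hyperplane normal . x + bias = 0 (normal is unit length)."""
    return sum(wi * ni for wi, ni in zip(w, normal)) + bias

def edit_along_normal(w, normal, alpha):
    """Move w by alpha along the hyperplane normal."""
    return [wi + alpha * ni for wi, ni in zip(w, normal)]

normal = unit([3.0, 4.0])   # hypothetical hyperplane normal, equals [0.6, 0.8]
w = [1.0, -2.0]             # toy latent code on the negative side of the hyperplane

before = signed_distance(w, normal)                               # approximately -1.0
after = signed_distance(edit_along_normal(w, normal, 2.0), normal)  # approximately +1.0
```

    The sign change from `before` to `after` corresponds to crossing the classification boundary. Note that the direction is the same for every latent code, which is the kind of linear edit the notes describe as limited.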

    II. Contributions

    We propose a latent transformation network for facial attribute editing, achieving disentangled and controllable manipulations on real images with good identity preservation. 

    Our method can carry out efficient sequential attribute editing on real images. 

    We introduce a pipeline to generalize the face editing to videos and generate realistic and stable manipulations on high resolution videos.

    III. Method

    1. We propose a framework to edit faces in real images and videos via the latent space of StyleGAN.

    2. Assume there are n attributes a in total; a separate transformer is trained for each attribute.

    3. To predict attributes from a latent code, a latent classifier C is used; C is pre-trained.

    Latent Classifier: To predict attributes on the manipulated latent codes, we train an attribute classifier C on the “latent code - label” pairs.

    The classifier consists of three fully connected layers with ReLU activations in between. C is fixed during the training of the latent transformer.

    The facial attribute classifier is cited from: (Harnessing synthesized abstraction images to improve facial attribute recognition)
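
    The classifier structure described above (three fully connected layers with ReLU in between) can be sketched as a plain forward pass. This is a minimal illustration, not the authors' implementation: the layer sizes, the random initialization, the helper names and the choice to return raw scores (one per attribute) are all assumptions.

```python
import random

def linear(x, weights, bias):
    """Fully connected layer; weights holds one row per output unit."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_layer(n_in, n_out, rng):
    """Hypothetical initialization: small uniform weights, zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def classifier_forward(w, layers):
    """Apply the layers in order, with ReLU between layers but not after the last one."""
    h = w
    for i, (weights, bias) in enumerate(layers):
        h = linear(h, weights, bias)
        if i < len(layers) - 1:
            h = relu(h)
    return h  # one raw score per attribute

rng = random.Random(0)
latent_dim, hidden_dim, n_attributes = 8, 16, 4   # toy sizes, not from the paper
layers = [init_layer(latent_dim, hidden_dim, rng),
          init_layer(hidden_dim, hidden_dim, rng),
          init_layer(hidden_dim, n_attributes, rng)]

w = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]   # toy latent code
scores = classifier_forward(w, layers)
```

    Because C is fixed while the latent transformer is trained, `layers` would be treated as read-only at that stage; only the transformer's parameters would be updated.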

    4. Given a latent code w ∈ W+, the latent transformer T generates the direction for a single attribute modification, where the amount of changes is controlled by a scaling factor α. The network is expressed with a single layer of linear transformation.
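
    One way to read this step is T(w, α) = w + α · f(w), where f is the single linear layer. The notes do not reproduce the formula, so the additive form is an assumption, as are the function names and the toy 3-dimensional weights below. The sketch shows that the edit direction is computed from w itself and that α scales how far the code moves.

```python
def linear(x, weights, bias):
    """Single linear layer; weights holds one row per output unit."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def latent_transformer(w, alpha, weights, bias):
    """Assumed form: w + alpha * f(w), with f a single linear layer."""
    direction = linear(w, weights, bias)   # direction depends on the input code
    return [wi + alpha * di for wi, di in zip(w, direction)]

# Hypothetical parameters for one attribute (a real model would learn these).
weights = [[0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0],
           [1.0, 0.0, 0.0]]
bias = [0.5, 0.0, -0.5]

w = [1.0, 2.0, 3.0]                                      # toy latent code
direction = linear(w, weights, bias)                     # [2.5, 3.0, 0.5]
edited = latent_transformer(w, 1.0, weights, bias)       # [3.5, 5.0, 3.5]
unchanged = latent_transformer(w, 0.0, weights, bias)    # alpha = 0 leaves w as it is
```

    With one such transformer per attribute, sequential editing would amount to feeding the output of one transformer into the next, each with its own α.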

    5. Loss function

    IV. Evaluation metrics

    1. Quantitative

    We compare our method quantitatively with GANSpace and InterFaceGAN using three metrics:

    (1) target attribute change rate

    (2) attribute preservation rate

    (3) identity preservation score

    2. Qualitative
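
    The notes list the three metrics without defining them. The sketch below shows one plausible reading, stated as an assumption rather than as the paper's exact definitions: the change rate is the fraction of samples whose predicted target attribute flips, the preservation rate is the fraction of non-target attribute predictions left unchanged among the samples that flipped, and the identity score is the mean cosine similarity between face embeddings before and after editing. All names and the toy data are hypothetical.

```python
import math

def target_change_rate(before, after, target):
    """Fraction of samples whose predicted target attribute changed."""
    changed = sum(1 for b, a in zip(before, after) if b[target] != a[target])
    return changed / len(before)

def attribute_preservation_rate(before, after, target):
    """Among samples whose target changed, fraction of other attributes left unchanged."""
    kept = total = 0
    for b, a in zip(before, after):
        if b[target] == a[target]:
            continue  # only count successfully edited samples
        for i in range(len(b)):
            if i != target:
                total += 1
                kept += int(b[i] == a[i])
    return kept / total if total else 0.0

def cosine_similarity(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def identity_preservation_score(emb_before, emb_after):
    """Mean cosine similarity between paired face embeddings."""
    sims = [cosine_similarity(u, v) for u, v in zip(emb_before, emb_after)]
    return sum(sims) / len(sims)

# Toy binary predictions for 4 images and 3 attributes; the target is index 0.
before = [[0, 1, 0], [0, 0, 1], [0, 1, 1], [0, 0, 0]]
after = [[1, 1, 0], [1, 1, 1], [0, 1, 1], [1, 0, 0]]

change = target_change_rate(before, after, 0)              # 3 of 4 samples flipped
preserve = attribute_preservation_rate(before, after, 0)   # 5 of 6 other labels kept
identity = identity_preservation_score([[1.0, 0.0], [0.0, 2.0]],
                                       [[2.0, 0.0], [0.0, 1.0]])
```

    In practice the attribute predictions would come from a classifier run on the generated images and the embeddings from a face recognition model; neither is specified in these notes.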

  • Original article: https://www.cnblogs.com/h694879357/p/15528988.html