  • Paper reading notes: A Latent Transformer for Disentangled Face Editing in Images and Videos

    Paper title: A Latent Transformer for Disentangled Face Editing in Images and Videos

    I. Introduction and related work (a few key sentences noted down)

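    (1) Studies have shown that, in the latent space of a generative model, moving a latent code along certain directions causes the visual attributes of the corresponding generated image to vary.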

    (2) Firstly, successful manipulations can only be achieved in well disentangled and linearized latent spaces.

    (3) Manipulating facial attributes with a linear transformation is very limiting.

    (4) The state-of-the-art image generator whose latent space real images are projected into: StyleGAN.

    (5) The transformation network generates disentangled, identity-preserving and controllable attribute editing results on real images.

    (6) Related work on disentangled representations

    • One optimization-based method, Image2StyleGAN++, carried out local editing along with global semantic edits on images by applying masked interpolation on the activation features of StyleGAN (? what does this mean)
    • Collins et al. performed a k-means clustering on the activations of StyleGAN and detected a disentanglement of semantic objects, which enables further local semantic editing on the generated image.
    • For high-level semantic edits, GANalyze [13] learned a manifold in the latent space of BigGAN [5] to generate images of different memorability.
    • InterFaceGAN [35] proposed to learn a hyperplane for a binary classification in the latent space, which one can use to manipulate the target facial attribute by simple interpolation. Following their work, StyleSpace [42] carried out a quantitative study on the latent spaces of StyleGAN [21] and realized a highly localized and disentangled control of the visual attributes.
    • StyleFlow [3] achieved conditional exploration of the latent space by training conditional normalizing flows.
    • There are many more; see the related work section of the paper for details.
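The hyperplane idea noted for InterFaceGAN above can be sketched as follows. This is only a toy illustration and not the InterFaceGAN code: the 4-dimensional latent code, the hyperplane normal and the step size `alpha` are all made-up values, and plain Python lists are used so that the sketch runs without any dependencies.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def signed_distance(w, normal, bias=0.0):
    """Signed distance from latent code w to the hyperplane normal . w + bias = 0."""
    return sum(n * x for n, x in zip(normal, w)) + bias

def edit(w, normal, alpha):
    """Move w by alpha along the unit normal of the attribute hyperplane."""
    return [x + alpha * n for x, n in zip(w, normal)]

# Hypothetical toy values: a 4-d latent code and a hyperplane normal.
normal = normalize([1.0, 0.0, 1.0, 0.0])
w = [0.5, -0.2, -1.5, 0.3]

before = signed_distance(w, normal)                           # negative side
after = signed_distance(edit(w, normal, alpha=2.0), normal)   # positive side
```

Because the normal has unit length, a step of `alpha` changes the signed distance by exactly `alpha`, so the edit here moves the code from the negative side of the hyperplane to the positive side.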

    II. Contributions

    We propose a latent transformation network for facial attribute editing, achieving disentangled and controllable manipulations on real images with good identity preservation. 

    Our method can carry out efficient sequential attribute editing on real images. 

    We introduce a pipeline to generalize the face editing to videos and generate realistic and stable manipulations on high resolution videos.

    III. Method

    1. We propose a framework to edit faces in real images and videos via the latent space of StyleGAN.

    2. Suppose there are n attributes a in total; a separate transformer is trained for each attribute.

    3. To predict attributes from a latent code, a latent classifier C is used; C is pre-trained.

    Latent Classifier: To predict attributes on the manipulated latent codes, we train an attribute classifier C on the "latent code - label" pairs.

    The classifier consists of three fully connected layers with ReLU activations in between. C is fixed during the training of the latent transformer.
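A minimal sketch of a classifier with this shape (three fully connected layers with ReLU in between) is given below. The layer sizes, the random initialisation and the final per-attribute sigmoid are assumptions made for illustration; the notes do not state them. Plain Python lists are used so the sketch runs without dependencies, whereas a real implementation would use a tensor library and trained weights.

```python
import math
import random

def linear(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def sigmoid(x):
    return [1.0 / (1.0 + math.exp(-v)) for v in x]

def init_layer(n_in, n_out, rng):
    """Random weights as a stand-in; a real classifier C would be trained."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def classify(w, layers):
    """Three fully connected layers with ReLU in between.
    The final sigmoid (one probability per attribute) is an assumption."""
    (w1, b1), (w2, b2), (w3, b3) = layers
    h = relu(linear(w, w1, b1))
    h = relu(linear(h, w2, b2))
    return sigmoid(linear(h, w3, b3))

rng = random.Random(0)
latent_dim, hidden_dim, n_attributes = 16, 8, 5  # toy sizes, not the paper's
layers = [init_layer(latent_dim, hidden_dim, rng),
          init_layer(hidden_dim, hidden_dim, rng),
          init_layer(hidden_dim, n_attributes, rng)]

w = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]  # stand-in latent code
probs = classify(w, layers)                           # one score per attribute
```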

    The facial attribute classifier is cited from: (Harnessing synthesized abstraction images to improve facial attribute recognition)

    4. Given a latent code w ∈ W+, the latent transformer T generates the direction for a single attribute modification, where the amount of change is controlled by a scaling factor α. The network is expressed with a single layer of linear transformation.
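One way to read this step is T(w, α) = w + α · f(w), where f is the single linear layer that produces the editing direction. The sketch below follows that reading; the exact form of T, the toy dimension, the random weights and the attribute names `smile` and `glasses` are assumptions for illustration only. It also shows one transformer per attribute and two edits applied in sequence.

```python
import random

def linear(x, weights, bias):
    """Single linear layer: weights @ x + bias, on plain Python lists."""
    return [sum(a * xi for a, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

class LatentTransformer:
    """Assumed form: T(w, alpha) = w + alpha * f(w), f a single linear layer."""

    def __init__(self, dim, rng):
        # Random weights as a stand-in; in practice these would be trained.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                        for _ in range(dim)]
        self.bias = [0.0] * dim

    def direction(self, w):
        """Editing direction predicted from the latent code itself."""
        return linear(w, self.weights, self.bias)

    def __call__(self, w, alpha):
        d = self.direction(w)
        return [x + alpha * di for x, di in zip(w, d)]

rng = random.Random(0)
dim = 8  # toy size; a flattened W+ code would be much larger

# One transformer per attribute (attribute names are hypothetical).
transformers = {"smile": LatentTransformer(dim, rng),
                "glasses": LatentTransformer(dim, rng)}

w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
unchanged = transformers["smile"](w, 0.0)            # alpha = 0 leaves w as is
edited = transformers["smile"](w, 1.0)               # edit one attribute
sequential = transformers["glasses"](edited, -0.5)   # then a second attribute
```

Unlike a fixed global direction, the direction here depends on w, so different faces can receive different edits for the same attribute.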

    5. Loss function

    IV. Evaluation metrics

    1. Quantitative

    We compare our method quantitatively with GANSpace and InterFaceGAN using three metrics:

    (1) target attribute change rate

    (2) attribute preservation rate

    (3) identity preservation score
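The notes do not define these three metrics, so the sketch below rests on assumed definitions: the change rate is the fraction of samples whose predicted target attribute flips after editing, the preservation rate is the fraction of non-target attribute predictions that stay the same, and the identity score is the mean cosine similarity between face embeddings before and after editing. The toy predictions and 2-d embeddings are made up.

```python
import math

def target_change_rate(before, after):
    """Fraction of samples whose predicted target attribute flipped."""
    flipped = sum(1 for b, a in zip(before, after) if b != a)
    return flipped / len(before)

def attribute_preservation_rate(before, after):
    """Fraction of (sample, non-target attribute) predictions left unchanged."""
    total = kept = 0
    for row_b, row_a in zip(before, after):
        for b, a in zip(row_b, row_a):
            total += 1
            kept += int(b == a)
    return kept / total

def cosine_similarity(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def identity_preservation_score(emb_before, emb_after):
    """Mean cosine similarity between embeddings of original and edited faces."""
    sims = [cosine_similarity(u, v) for u, v in zip(emb_before, emb_after)]
    return sum(sims) / len(sims)

# Toy binary predictions for 4 samples (hypothetical values).
target_before, target_after = [0, 0, 1, 0], [1, 1, 1, 0]
others_before = [[1, 0], [0, 0], [1, 1], [0, 1]]
others_after = [[1, 0], [0, 1], [1, 1], [0, 1]]

# Toy 2-d embeddings standing in for a face-recognition network's output.
emb_before = [[1.0, 0.0], [0.0, 1.0]]
emb_after = [[1.0, 0.0], [1.0, 1.0]]

change = target_change_rate(target_before, target_after)             # 2 of 4 flipped
preserved = attribute_preservation_rate(others_before, others_after) # 7 of 8 kept
identity = identity_preservation_score(emb_before, emb_after)
```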

    2. Qualitative

  • Original post: https://www.cnblogs.com/h694879357/p/15528988.html