Paper title: Pivotal Tuning for Latent-based Editing of Real Images (a tuning method built on StyleGAN)
1. Contributions
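In this paper, the authors introduce a new method that mitigates the distortion-editability trade-off, so that real images outside the generator's distribution can still be edited convincingly.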
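Argument: in an editing task, once a real image has been mapped into the latent space, its code is already out of domain, so the generated image shows artifacts. One remedy is to train the generator so that its input domain expands and the edited sample points also fall inside it. This, however, runs into the distortion-editability trade-off: the closer a code is to W (StyleGAN's native latent space), the better its editability, but the less faithful the reconstruction tends to be.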
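The paper therefore trains with pivotal tuning: the generator is adjusted slightly so that latent codes obtained from real images, which may be out of domain, also produce an image identical to the input. This keeps both the editing ability and the reconstruction ability.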
We introduce a novel approach to mitigate the distortion-editability trade-off, allowing convincing edits on real images that are out-of-distribution. Instead of projecting the input image into the learned manifold, we augment the manifold to include the image by slightly altering the generator, in a process we call pivotal tuning. This adjustment is analogous to shooting a dart and then shifting the board itself to compensate for a near hit.
2. Method
We present a two-step method for inverting real images to highly editable latent codes:
(1) First, we invert the given input to w_p in the native latent space of StyleGAN, W.
(2) Then, we apply a Pivotal Tuning on this pivot code w_p to tune the pre-trained StyleGAN to produce the desired image for input w_p.
The driving intuition here is that since w_p is close enough, training the generator to produce the input image from the pivot can be achieved through augmenting appearance-related weights only, without affecting the well-behaved structure of StyleGAN's latent space.
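Training has two main steps. First, GAN inversion maps the real image to w_p. Then the generator is trained with this w_p as input so that it produces the desired image. Because w_p already lands close enough to the real image, the reconstruction can be completed by adjusting only some appearance-related parameters, without disturbing the rest of StyleGAN's structure. (The intuition: the original GAN inversion first yields a similar-looking face, and fine-tuning then turns that similar face into the same face as the real image.)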
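The two steps can be sketched on a toy problem. This is only an illustration under strong assumptions: the "generator" is a hypothetical 3x2 linear map rather than StyleGAN, the loss is plain MSE as a stand-in for whatever reconstruction loss is used in practice, and the names `generate`, `invert`, and `pivotal_tune` are made up for this note. The target's third pixel cannot be produced by the pretrained weights, which plays the role of an out-of-domain real image.

```python
def generate(weights, w):
    """Toy linear 'generator': image = weights @ w (a stand-in for StyleGAN)."""
    return [sum(w_ij * w_j for w_ij, w_j in zip(row, w)) for row in weights]


def mse(a, b):
    """Mean squared error between two flattened images."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def invert(weights, target, steps=500, lr=0.1):
    """Step 1: optimise the latent code while the generator stays frozen."""
    n = len(target)
    w = [0.0] * len(weights[0])
    for _ in range(steps):
        residual = [g - t for g, t in zip(generate(weights, w), target)]
        # Gradient of the MSE with respect to w: (2 / n) * weights^T @ residual
        grad = [(2.0 / n) * sum(weights[i][j] * residual[i] for i in range(n))
                for j in range(len(w))]
        w = [w_j - lr * g_j for w_j, g_j in zip(w, grad)]
    return w


def pivotal_tune(weights, pivot, target, steps=500, lr=0.1):
    """Step 2: optimise the generator weights while the pivot code stays fixed."""
    n = len(target)
    tuned = [row[:] for row in weights]  # leave the pretrained weights untouched
    for _ in range(steps):
        residual = [g - t for g, t in zip(generate(tuned, pivot), target)]
        for i in range(n):
            for j in range(len(pivot)):
                # Gradient of the MSE with respect to tuned[i][j]
                tuned[i][j] -= lr * (2.0 / n) * residual[i] * pivot[j]
    return tuned


# Hypothetical pretrained generator: it can never light up the third pixel.
pretrained = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
target = [1.0, 2.0, 0.5]  # "real image" lying outside the generator's range

pivot = invert(pretrained, target)
error_after_inversion = mse(generate(pretrained, pivot), target)

tuned = pivotal_tune(pretrained, pivot, target)
error_after_tuning = mse(generate(tuned, pivot), target)

print("distortion after inversion only:", error_after_inversion)
print("distortion after pivotal tuning:", error_after_tuning)
```

In this toy, inversion alone leaves a residual error because no latent code can reach the target, while tuning the weights at the fixed pivot drives the error toward zero. Whether the surrounding latent space stays editable is exactly what the toy cannot show; that part rests on the paper's argument about appearance-related weights.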
3. Evaluation metrics
Pixel-wise distance using MSE
Perceptual similarity using LPIPS (Learned Perceptual Image Patch Similarity)
Structural similarity using MS-SSIM
Identity similarity by employing a pretrained face recognition network
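Two of these metrics are simple enough to sketch without any model. The snippet below computes MSE on flattened pixel lists and, as an assumption about how identity similarity is typically scored, the cosine similarity between two embedding vectors. The embeddings here are made-up numbers standing in for the output of a face recognition network; LPIPS and MS-SSIM need dedicated implementations and are left out.

```python
import math


def mse(image_a, image_b):
    """Mean squared error between two equally sized, flattened images."""
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(image_a, image_b)) / len(image_a)


def cosine_similarity(embedding_a, embedding_b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(embedding_a, embedding_b))
    norm_a = math.sqrt(sum(a * a for a in embedding_a))
    norm_b = math.sqrt(sum(b * b for b in embedding_b))
    return dot / (norm_a * norm_b)


# Flattened toy "images" with pixel values in [0, 1].
original = [0.0, 0.5, 1.0, 0.25]
reconstruction = [0.0, 0.5, 0.8, 0.25]
pixel_error = mse(original, reconstruction)

# Hypothetical identity embeddings (a real pipeline would take these
# from a pretrained face recognition network).
embedding_original = [3.0, 4.0]
embedding_reconstruction = [4.0, 3.0]
identity_score = cosine_similarity(embedding_original, embedding_reconstruction)

print("MSE:", pixel_error)
print("identity similarity:", identity_score)
```

Lower MSE means less distortion, while a higher identity score means the reconstructed face is closer to the original person in embedding space.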
https://zhuanlan.zhihu.com/p/381040616