CVPR2021 Review: Synthesis

Semantic Image Generation

PISE: Person Image Synthesis and Editing with Decoupled GAN

J. Zhang et al. paper | code


Person image synthesis


Proposes a two-stage model with per-region control to decouple the shape and style of clothing.
Proposes joint global and local per-region encoding and normalization to predict a plausible clothing style for invisible regions while preserving the original clothing style in the target image.
Proposes a spatial-aware normalization to retain the spatial context of the source image and transfer it by modulating the scale and bias of the generated image features.
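
The "modulating the scale and bias" idea can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function name `spatially_modulated_norm` is hypothetical, and `gamma` and `beta` are simply passed in, whereas in the paper's setting they would be predicted from source-image features.

```python
import numpy as np

def spatially_modulated_norm(feat, gamma, beta, eps=1e-5):
    """Normalize each channel, then apply a per-pixel scale and bias.

    feat, gamma, beta: arrays of shape (C, H, W).
    Assumption of this sketch: gamma and beta are given; a real model
    would predict them from the source image with a small network.
    """
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = feat.std(axis=(1, 2), keepdims=True)
    normalized = (feat - mean) / (std + eps)
    # Because gamma and beta vary per pixel, spatial structure can be
    # re-injected into the normalized feature map.
    return gamma * normalized + beta

rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8, 8))
gamma = np.full((4, 8, 8), 2.0)   # hypothetical per-pixel scale
beta = np.full((4, 8, 8), 3.0)    # hypothetical per-pixel bias
out = spatially_modulated_norm(feat, gamma, beta)
```

With constant `gamma` and `beta` this reduces to an ordinary affine normalization; the spatial behavior only appears once the two maps differ from pixel to pixel.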


Image Generator SEAN

Image Synthesis

"Image Generators with Conditionally-Independent Pixel Synthesis"

I. Anokhin et al. oral | paper | code

Task : Image Synthesis


Recent methods rely heavily on spatial convolutions and, optionally, self-attention blocks to gradually synthesize images in a coarse-to-fine manner.
Here, color value at pixel = G(random latent vector, pixel position): no convolutional layers or any other operation that propagates information across pixels.
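
A minimal sketch of per-pixel synthesis, assuming a plain MLP that receives the latent vector concatenated with the pixel coordinates. This is a simplification for illustration only: the helper names (`make_mlp`, `pixel_generator`, `pixel_grid`) and layer sizes are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

def make_mlp(rng, sizes):
    """Random (untrained) weights for a small MLP; sizes are hypothetical."""
    return [(rng.normal(scale=1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def pixel_generator(z, coords, params):
    """Map each (x, y) coordinate plus a shared latent z to an RGB value.

    coords: (N, 2), z: (D,). Returns (N, 3) with values in [-1, 1].
    Every row is processed independently: no operation mixes pixels.
    """
    z_tiled = np.broadcast_to(z, (coords.shape[0], z.shape[0]))
    h = np.concatenate([coords, z_tiled], axis=1)
    for i, (w, b) in enumerate(params):
        h = h @ w + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)  # ReLU on hidden layers
    return np.tanh(h)

def pixel_grid(height, width):
    """All pixel coordinates of an image, scaled to [-1, 1]."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, height),
                         np.linspace(-1, 1, width), indexing="ij")
    return np.stack([xs.ravel(), ys.ravel()], axis=1)

rng = np.random.default_rng(0)
latent_dim = 16
params = make_mlp(rng, [2 + latent_dim, 64, 64, 3])
z = rng.normal(size=latent_dim)
coords = pixel_grid(8, 8)
image = pixel_generator(z, coords, params).reshape(8, 8, 3)
```

Because each pixel depends only on `z` and its own coordinate, any subset of pixels can be generated on its own and will match the corresponding pixels of the full image.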

Position encoding
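
Raw coordinates alone tend to give overly smooth outputs, so the pixel position is usually encoded before it enters the generator. The sketch below shows a generic sinusoidal (Fourier feature) encoding with fixed frequencies; the function name `fourier_features` and the choice of frequencies are assumptions for illustration, not the paper's exact encoding.

```python
import numpy as np

def fourier_features(coords, num_freqs=4):
    """Encode (x, y) coordinates with sines and cosines at several frequencies.

    coords: (N, 2) in [-1, 1]. Returns (N, 4 * num_freqs).
    The fixed power-of-two frequencies are an assumption of this sketch.
    """
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi       # (F,)
    angles = coords[:, :, None] * freqs                  # (N, 2, F)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(coords.shape[0], -1)

coords = np.array([[0.0, 0.0], [0.5, -0.5]])
encoded = fourier_features(coords, num_freqs=4)
```

The encoded vector would then replace the raw `(x, y)` pair as the positional input of the per-pixel generator.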


Precision & Recall


Exploring Cross-Image Pixel Contrast for Semantic Segmentation

Wenguan Wang, Tianfei Zhou et al. paper | code


Recent works focus only on mining "local" context,
e.g., dependencies between pixels within individual images,
via context-aggregation modules (dilated convolution, neural attention) or structure-aware optimization objectives (IoU-like loss).
Thus, the "global" context of the training data is ignored,
e.g., rich semantic relations between pixels across different images.

Proposed Method

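Inspired by unsupervised contrastive representation learning.
Proposes a pixel-wise contrastive algorithm for semantic segmentation in the fully supervised setting: pixel embeddings of the same semantic class are pulled together and embeddings of different classes are pushed apart, across images.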
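
A supervised pixel-wise contrastive loss can be sketched as an InfoNCE-style objective in which ground-truth labels define positives and negatives. The version below is a minimal NumPy illustration under assumptions: the function name `pixel_contrast_loss`, the temperature value, and the flat `(N, D)` input layout (pixels sampled from several images) are all hypothetical choices, and no memory bank or hard-example sampling is included.

```python
import numpy as np

def pixel_contrast_loss(embeddings, labels, temperature=0.1):
    """InfoNCE-style loss over labeled pixel embeddings.

    embeddings: (N, D) pixel embeddings, possibly from different images.
    labels: (N,) ground-truth class id of each pixel.
    Positives are other pixels of the same class, negatives are pixels
    of any other class.
    """
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = np.exp(emb @ emb.T / temperature)               # (N, N)
    same = labels[:, None] == labels[None, :]
    pos_mask = same & ~np.eye(len(labels), dtype=bool)    # exclude self-pairs
    neg_sum = (sim * ~same).sum(axis=1, keepdims=True)    # (N, 1)
    # One term per (anchor, positive) pair, contrasted against all negatives.
    pair_loss = -np.log(sim / (sim + neg_sum))
    num_pos = pos_mask.sum(axis=1)
    valid = num_pos > 0                                    # anchors with a positive
    per_anchor = (pair_loss * pos_mask).sum(axis=1)[valid] / num_pos[valid]
    return per_anchor.mean()

emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_consistent = pixel_contrast_loss(emb, np.array([0, 0, 1, 1]))
loss_mismatched = pixel_contrast_loss(emb, np.array([0, 1, 0, 1]))
```

When the labels agree with the embedding clusters the loss is small; assigning the same embeddings mismatched labels makes it much larger, which is the signal that drives same-class pixels together during training.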

Representation Learning / Latent Space Discovery

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

Tianfei Zhou et al. paper | code

Learning Statistical Texture for Semantic Segmentation

L. Zhu et al. paper


Semantic segmentation

Proposed Method

Quantization and Count Operator (QCO)
First quantizes the input feature into multiple levels (texture statistics),
then counts the intensity of each level for texture feature encoding.
Texture Enhancement Module (TEM)
Inspired by histogram equalization,
TEM builds a graph to propagate information across all original quantization levels for texture enhancement.
Pyramid Texture Extraction Module (PTEM)
Exploits texture information from multiple scales with a texture feature extraction unit and a pyramid structure.
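
The quantize-then-count step can be sketched with a hard, histogram-like quantization in NumPy. This is only an illustration of the idea: the function name `quantize_and_count` and the uniform level spacing are assumptions, and a trainable module would need a differentiable (soft) quantization rather than the hard binning used here.

```python
import numpy as np

def quantize_and_count(values, num_levels=8):
    """Quantize values into uniform levels and count how many fall in each.

    values: array of any shape (for example a per-pixel feature map).
    Returns (centers, counts): level centers and normalized counts.
    Assumes values are not all identical, so the levels have nonzero width.
    """
    flat = values.ravel()
    edges = np.linspace(flat.min(), flat.max(), num_levels + 1)
    # Interior edges split the range into num_levels bins; the maximum
    # value falls into the last bin.
    idx = np.digitize(flat, edges[1:-1])
    counts = np.bincount(idx, minlength=num_levels).astype(float)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, counts / counts.sum()

values = np.array([0.0, 0.1, 0.2, 0.9, 1.0])
centers, counts = quantize_and_count(values, num_levels=2)
```

The resulting `(centers, counts)` pair is a compact statistical description of the input, comparable to an intensity histogram, which is what a histogram-equalization-style enhancement would then operate on.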

Transferable Semantic Augmentation for Domain Adaptation

Shuang Li et al. paper | code

Task: Domain Adaptation

Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation

Shuaijun Chen et al. paper | code

Task: Domain Adaptation for semantic segmentation

Few-shot learning

Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

Chenchen Zhu et al. paper | code