Generation Models

CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

Introduction

[ALGORITHM]

```bibtex
@inproceedings{zhu2017unpaired,
  title={Unpaired image-to-image translation using cycle-consistent adversarial networks},
  author={Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
  booktitle={Proceedings of the IEEE international conference on computer vision},
  pages={2223--2232},
  year={2017}
}
```

Results and Models

We use the Fréchet Inception Distance (FID) and Inception Score (IS) to evaluate the generation performance of CycleGAN.
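FID measures the Fréchet distance between Gaussian fits to Inception features of real and generated images. The metric itself can be sketched in a few lines of NumPy; this is an illustrative toy (random features standing in for Inception activations), not the evaluation code used for the tables below:

```python
import numpy as np

def fid(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 (sigma1 sigma2)^(1/2))."""
    diff = mu1 - mu2
    # Tr((S1 S2)^(1/2)) equals the sum of square roots of the eigenvalues
    # of S1 @ S2, which avoids an explicit matrix square root.
    eig = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.maximum(eig.real, 0)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2 * tr_covmean)

def stats(x):
    """Mean and covariance of a (num_samples, feature_dim) activation matrix."""
    return x.mean(axis=0), np.cov(x, rowvar=False)

# Toy activations standing in for Inception features of real/generated images.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))
fake = rng.normal(loc=0.5, size=(500, 8))
score = fid(*stats(real), *stats(fake))  # > 0; identical sets give ~0
```

In practice the features come from a fixed Inception network, and lower FID is better.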

FID evaluation:

| Dataset  | facades | facades-id0 | summer2winter | summer2winter-id0 | winter2summer | winter2summer-id0 | horse2zebra | horse2zebra-id0 | zebra2horse | zebra2horse-id0 | average |
| :------- | :-----: | :---------: | :-----------: | :---------------: | :-----------: | :---------------: | :---------: | :-------------: | :---------: | :-------------: | :-----: |
| official | 123.626 | 119.726     | 77.342        | 76.773            | 72.631        | 74.239            | 62.111      | 77.202          | 138.646     | 137.050         | 95.935  |
| ours     | 118.297 | 126.316     | 76.959        | 76.018            | 72.803        | 73.498            | 63.810      | 71.675          | 139.279     | 132.369         | 95.102  |

IS evaluation:

| Dataset  | facades | facades-id0 | summer2winter | summer2winter-id0 | winter2summer | winter2summer-id0 | horse2zebra | horse2zebra-id0 | zebra2horse | zebra2horse-id0 | average |
| :------- | :-----: | :---------: | :-----------: | :---------------: | :-----------: | :---------------: | :---------: | :-------------: | :---------: | :-------------: | :-----: |
| official | 1.638   | 1.697       | 2.762         | 2.750             | 3.293         | 3.110             | 1.375       | 1.584           | 3.186       | 3.047           | 2.444   |
| ours     | 1.584   | 1.957       | 2.768         | 2.735             | 3.069         | 3.130             | 1.430       | 1.542           | 3.093       | 2.958           | 2.427   |
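IS rewards classifier outputs that are confident for each generated image yet diverse across images: IS = exp(E_x KL(p(y|x) || p(y))). A minimal NumPy sketch over toy classifier outputs (not the Inception-based evaluation used for these tables):

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (num_images, num_classes) softmax outputs of a classifier.
    Returns exp of the mean KL divergence between each conditional
    distribution p(y|x) and the marginal p(y)."""
    p_y = probs.mean(axis=0, keepdims=True)  # marginal class distribution
    kl = (probs * (np.log(probs + eps) - np.log(p_y + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))

# Degenerate case: identical predictions for every image -> IS = 1 (worst).
low = inception_score(np.full((100, 10), 0.1))
# Confident and diverse: one-hot predictions over all 10 classes -> IS ≈ 10.
high = inception_score(np.eye(10))
```

IS ranges from 1 to the number of classes, and higher is better.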

Model and log downloads:

| Dataset  | facades      | facades-id0  | summer2winter | summer2winter-id0 | horse2zebra  | horse2zebra-id0 |
| :------- | :----------: | :----------: | :-----------: | :---------------: | :----------: | :-------------: |
| download | model \| log | model \| log | model \| log  | model \| log      | model \| log | model \| log    |

Note: With a larger identity loss weight, the image-to-image translation becomes more conservative, making fewer changes to the input. The original authors did not specify the best weight for the identity loss. Thus, in addition to the default setting, we also set the identity loss weight to 0 (denoted id0) for a more comprehensive comparison.
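The identity term penalizes the generator for altering images that already belong to its target domain. A hypothetical NumPy sketch of the term (the names `identity_loss`, `lam_idt`, and `lam_cyc` are illustrative; the default values follow the common CycleGAN convention of weighting the identity term relative to the cycle-consistency weight):

```python
import numpy as np

def identity_loss(generator, y, lam_idt=0.5, lam_cyc=10.0):
    """L_idt = lam_idt * lam_cyc * E_y ||G(y) - y||_1.
    `y` is a batch already in the generator's target domain; the loss is
    zero iff the generator leaves such images unchanged."""
    return float(lam_idt * lam_cyc * np.mean(np.abs(generator(y) - y)))

y = np.ones((4, 3, 8, 8))               # toy target-domain batch
assert identity_loss(lambda x: x, y) == 0.0   # identity mapping: no penalty
# lam_idt=0 disables the term entirely, which is the id0 setting above.
assert identity_loss(lambda x: x, y, lam_idt=0.0) == 0.0
```

A larger `lam_idt` therefore pushes the generator toward leaving inputs untouched, which is exactly the conservative behavior noted above.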

pix2pix: Image-To-Image Translation With Conditional Adversarial Networks

Introduction

[ALGORITHM]

```bibtex
@inproceedings{isola2017image,
  title={Image-to-image translation with conditional adversarial networks},
  author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={1125--1134},
  year={2017}
}
```

Results and Models

We use FID and IS metrics to evaluate the generation performance of pix2pix.

FID evaluation:

| Dataset  | facades | maps-a2b | maps-b2a | edges2shoes | average |
| :------- | :-----: | :------: | :------: | :---------: | :-----: |
| official | 119.135 | 149.731  | 102.072  | 75.774      | 111.678 |
| ours     | 127.792 | 118.552  | 92.798   | 85.413      | 106.139 |

IS evaluation:

| Dataset  | facades | maps-a2b | maps-b2a | edges2shoes | average |
| :------- | :-----: | :------: | :------: | :---------: | :-----: |
| official | 1.650   | 2.529    | 3.552    | 2.766       | 2.624   |
| ours     | 1.745   | 2.689    | 3.473    | 2.747       | 2.664   |

Model and log downloads:

| Dataset  | facades      | maps-a2b     | maps-b2a     | edges2shoes  |
| :------- | :----------: | :----------: | :----------: | :----------: |
| download | model \| log | model \| log | model \| log | model \| log |

Note: We strictly follow the setting in Section 3.3 of the paper: “At inference time, we run the generator net in exactly the same manner as during the training phase. This differs from the usual protocol in that we apply dropout at test time, and we apply batch normalization using the statistics of the test batch, rather than aggregated statistics of the training batch.” That is, we run the generator in model.train() mode, so inference results may differ slightly between runs.
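The batch-normalization part of that quote can be sketched in NumPy: the layer is normalized with statistics computed from the test batch itself rather than aggregated training statistics, so the output for one image depends on which images it is batched with. This is an illustrative toy, not the actual pix2pix layers:

```python
import numpy as np

def batchnorm_test_batch(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize x (batch, features) with the *test batch's own* mean and
    variance, mimicking running batch norm in training mode at inference,
    instead of using running averages collected during training."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# A test batch whose statistics differ from any training-time averages.
batch = np.random.default_rng(0).normal(loc=3.0, scale=2.0, size=(16, 4))
out = batchnorm_test_batch(batch)
# out is standardized w.r.t. this batch (per-feature mean ~0, std ~1),
# so changing the batch composition changes every sample's output.
```

Together with dropout being left active, this is why repeated inference runs on the same input can produce slightly different results.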