# Generation Models
## CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

### Introduction
[ALGORITHM]
```bibtex
@inproceedings{zhu2017unpaired,
  title={Unpaired image-to-image translation using cycle-consistent adversarial networks},
  author={Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
  booktitle={Proceedings of the IEEE international conference on computer vision},
  pages={2223--2232},
  year={2017}
}
```
### Results and Models
We use FID and IS metrics to evaluate the generation performance of CycleGAN; a minimal sketch of each metric follows its table below.

FID evaluation:
Dataset | facades | facades-id0 | summer2winter | summer2winter-id0 | winter2summer | winter2summer-id0 | horse2zebra | horse2zebra-id0 | zebra2horse | zebra2horse-id0 | average |
---|---|---|---|---|---|---|---|---|---|---|---|
official | 123.626 | 119.726 | 77.342 | 76.773 | 72.631 | 74.239 | 62.111 | 77.202 | 138.646 | 137.050 | 95.935 |
ours | 118.297 | 126.316 | 76.959 | 76.018 | 72.803 | 73.498 | 63.810 | 71.675 | 139.279 | 132.369 | 95.102 |
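For reference, FID is the Fréchet distance between Gaussians fitted to Inception features of real and generated images. A minimal NumPy/SciPy sketch of the distance computation (feature extraction is omitted; the `mu*`/`sigma*` names are placeholders, not this codebase's API):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between two Gaussians N(mu1, sigma1) and N(mu2, sigma2).

    mu*: (D,) feature means; sigma*: (D, D) feature covariances,
    typically computed from Inception-v3 pool3 activations.
    """
    diff = mu1 - mu2
    # Matrix square root of the covariance product; numerical error can
    # introduce a small imaginary component, which we discard.
    covmean, _ = linalg.sqrtm(sigma1.dot(sigma2), disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return diff.dot(diff) + np.trace(sigma1 + sigma2 - 2.0 * covmean)
```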
IS evaluation:
Dataset | facades | facades-id0 | summer2winter | summer2winter-id0 | winter2summer | winter2summer-id0 | horse2zebra | horse2zebra-id0 | zebra2horse | zebra2horse-id0 | average |
---|---|---|---|---|---|---|---|---|---|---|---|
official | 1.638 | 1.697 | 2.762 | 2.750 | 3.293 | 3.110 | 1.375 | 1.584 | 3.186 | 3.047 | 2.444 |
ours | 1.584 | 1.957 | 2.768 | 2.735 | 3.069 | 3.130 | 1.430 | 1.542 | 3.093 | 2.958 | 2.427 |
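IS, in turn, exponentiates the average KL divergence between the classifier's conditional label distribution p(y|x) and its marginal p(y). A small sketch, assuming `probs` holds softmax outputs of an Inception classifier on the generated images (single split, for brevity):

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (N, num_classes) softmax outputs of an Inception classifier
    on N generated images."""
    marginal = probs.mean(axis=0, keepdims=True)           # p(y)
    kl = probs * (np.log(probs + eps) - np.log(marginal + eps))
    return float(np.exp(kl.sum(axis=1).mean()))            # exp(E_x KL(p(y|x) || p(y)))
```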
Model and log downloads:
Dataset | facades | facades-id0 | summer2winter | summer2winter-id0 | horse2zebra | horse2zebra-id0 |
---|---|---|---|---|---|---|
download | model \| log | model \| log | model \| log | model \| log | model \| log | model \| log |
Note: With a larger weight on the identity loss, the image-to-image translation becomes more conservative, making fewer changes to the input. The original authors did not specify the best weight for the identity loss. Thus, in addition to the default setting, we also set the weight of the identity loss to 0 (denoted `id0`) for a more comprehensive comparison; see the sketch below for how this term enters the objective.
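The sketch below illustrates where the identity term sits in the CycleGAN generator objective; `g_ab`, `g_ba`, and the weight names are placeholders for illustration, not this repository's config keys:

```python
import torch
import torch.nn.functional as F

def generator_losses(g_ab, g_ba, real_a, real_b,
                     lambda_cycle=10.0, lambda_idt=0.5):
    """Cycle-consistency and identity terms of the CycleGAN objective.

    With lambda_idt = 0 (the `id0` setting) the identity term vanishes,
    so the generators are free to alter the input more aggressively.
    """
    fake_b = g_ab(real_a)
    fake_a = g_ba(real_b)
    # Cycle consistency: translating there and back should reconstruct the input.
    loss_cycle = F.l1_loss(g_ba(fake_b), real_a) + F.l1_loss(g_ab(fake_a), real_b)
    # Identity: feeding a target-domain image should (roughly) return it unchanged.
    loss_idt = F.l1_loss(g_ab(real_b), real_b) + F.l1_loss(g_ba(real_a), real_a)
    return lambda_cycle * loss_cycle + lambda_cycle * lambda_idt * loss_idt
```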
## pix2pix: Image-to-Image Translation with Conditional Adversarial Networks

### Introduction
[ALGORITHM]
```bibtex
@inproceedings{isola2017image,
  title={Image-to-image translation with conditional adversarial networks},
  author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={1125--1134},
  year={2017}
}
```
### Results and Models
We use FID and IS metrics to evaluate the generation performance of pix2pix (see the metric sketches in the CycleGAN section above).

FID evaluation:
Dataset | facades | maps-a2b | maps-b2a | edges2shoes | average |
---|---|---|---|---|---|
official | 119.135 | 149.731 | 102.072 | 75.774 | 111.678 |
ours | 127.792 | 118.552 | 92.798 | 85.413 | 106.139 |
IS evaluation:
Dataset | facades | maps-a2b | maps-b2a | edges2shoes | average |
---|---|---|---|---|---|
official | 1.650 | 2.529 | 3.552 | 2.766 | 2.624 |
ours | 1.745 | 2.689 | 3.473 | 2.747 | 2.664 |
Model and log downloads:
Dataset | facades | maps-a2b | maps-b2a | edges2shoes |
---|---|---|---|---|
download | model \| log | model \| log | model \| log | model \| log |
Note: We strictly follow the setting of Section 3.3 of the paper: “At inference time, we run the generator net in exactly the same manner as during the training phase. This differs from the usual protocol in that we apply dropout at test time, and we apply batch normalization using the statistics of the test batch, rather than aggregated statistics of the training batch.” That is, we keep the generator in `model.train()` mode at test time, which may lead to slightly different inference results on each run; see the sketch below.
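A minimal PyTorch sketch of this inference protocol; `generator` and `translate` are placeholder names, not this repository's API:

```python
import torch

@torch.no_grad()
def translate(generator, inputs):
    """Run pix2pix inference as described in Section 3.3 of the paper.

    Keeping the network in train() mode leaves dropout active and makes
    BatchNorm use the statistics of the current test batch, so repeated
    calls on the same inputs can produce slightly different outputs.
    """
    generator.train()  # deliberately NOT eval(): dropout + test-batch BN stats
    return generator(inputs)
```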