Generation Models

CycleGAN: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

Introduction

[ALGORITHM]

```bibtex
@inproceedings{zhu2017unpaired,
  title={Unpaired image-to-image translation using cycle-consistent adversarial networks},
  author={Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
  booktitle={Proceedings of the IEEE international conference on computer vision},
  pages={2223--2232},
  year={2017}
}
```

Results and Models

We use the Fréchet Inception Distance (FID) and Inception Score (IS) to evaluate the generation performance of CycleGAN.
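FID measures the Fréchet distance between Gaussian fits to Inception features of real and generated images. The metric itself can be sketched in a few lines of NumPy; this is an illustrative toy (random features standing in for Inception activations), not the evaluation code used for the tables below:

```python
import numpy as np

def fid(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 (sigma1 sigma2)^(1/2))."""
    diff = mu1 - mu2
    # Tr((S1 S2)^(1/2)) equals the sum of square roots of the eigenvalues
    # of S1 @ S2, which avoids an explicit matrix square root.
    eig = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.maximum(eig.real, 0)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2 * tr_covmean)

def stats(x):
    """Mean and covariance of a (num_samples, feature_dim) activation matrix."""
    return x.mean(axis=0), np.cov(x, rowvar=False)

# Toy activations standing in for Inception features of real/generated images.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))
fake = rng.normal(loc=0.5, size=(500, 8))
score = fid(*stats(real), *stats(fake))  # > 0; identical sets give ~0
```

In practice the features come from a fixed Inception network, and lower FID is better.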

FID evaluation:

| Dataset  | facades | facades-id0 | summer2winter | summer2winter-id0 | winter2summer | winter2summer-id0 | horse2zebra | horse2zebra-id0 | zebra2horse | zebra2horse-id0 | average |
| :------- | :-----: | :---------: | :-----------: | :---------------: | :-----------: | :---------------: | :---------: | :-------------: | :---------: | :-------------: | :-----: |
| official | 123.626 | 119.726     | 77.342        | 76.773            | 72.631        | 74.239            | 62.111      | 77.202          | 138.646     | 137.050         | 95.935  |
| ours     | 118.297 | 126.316     | 76.959        | 76.018            | 72.803        | 73.498            | 63.810      | 71.675          | 139.279     | 132.369         | 95.102  |

IS evaluation:

| Dataset  | facades | facades-id0 | summer2winter | summer2winter-id0 | winter2summer | winter2summer-id0 | horse2zebra | horse2zebra-id0 | zebra2horse | zebra2horse-id0 | average |
| :------- | :-----: | :---------: | :-----------: | :---------------: | :-----------: | :---------------: | :---------: | :-------------: | :---------: | :-------------: | :-----: |
| official | 1.638   | 1.697       | 2.762         | 2.750             | 3.293         | 3.110             | 1.375       | 1.584           | 3.186       | 3.047           | 2.444   |
| ours     | 1.584   | 1.957       | 2.768         | 2.735             | 3.069         | 3.130             | 1.430       | 1.542           | 3.093       | 2.958           | 2.427   |
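IS rewards classifier outputs that are confident for each generated image yet diverse across images: IS = exp(E_x KL(p(y|x) || p(y))). A minimal NumPy sketch over toy classifier outputs (not the Inception-based evaluation used for these tables):

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (num_images, num_classes) softmax outputs of a classifier.
    Returns exp of the mean KL divergence between each conditional
    distribution p(y|x) and the marginal p(y)."""
    p_y = probs.mean(axis=0, keepdims=True)  # marginal class distribution
    kl = (probs * (np.log(probs + eps) - np.log(p_y + eps))).sum(axis=1)
    return float(np.exp(kl.mean()))

# Degenerate case: identical predictions for every image -> IS = 1 (worst).
low = inception_score(np.full((100, 10), 0.1))
# Confident and diverse: one-hot predictions over all 10 classes -> IS ≈ 10.
high = inception_score(np.eye(10))
```

IS ranges from 1 to the number of classes, and higher is better.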

Model and log downloads:

| Dataset  | facades      | facades-id0  | summer2winter | summer2winter-id0 | horse2zebra  | horse2zebra-id0 |
| :------- | :----------: | :----------: | :-----------: | :---------------: | :----------: | :-------------: |
| download | model \| log | model \| log | model \| log  | model \| log      | model \| log | model \| log    |

Note: With a larger identity loss weight, the image-to-image translation becomes more conservative, making fewer changes to the input. The original authors did not specify the best weight for the identity loss. Thus, in addition to the default setting, we also set the identity loss weight to 0 (denoted id0) for a more comprehensive comparison.
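The identity term penalizes the generator for altering images that already belong to its target domain. A hypothetical NumPy sketch of the term (the names `identity_loss`, `lam_idt`, and `lam_cyc` are illustrative; the default values follow the common CycleGAN convention of weighting the identity term relative to the cycle-consistency weight):

```python
import numpy as np

def identity_loss(generator, y, lam_idt=0.5, lam_cyc=10.0):
    """L_idt = lam_idt * lam_cyc * E_y ||G(y) - y||_1.
    `y` is a batch already in the generator's target domain; the loss is
    zero iff the generator leaves such images unchanged."""
    return float(lam_idt * lam_cyc * np.mean(np.abs(generator(y) - y)))

y = np.ones((4, 3, 8, 8))               # toy target-domain batch
assert identity_loss(lambda x: x, y) == 0.0   # identity mapping: no penalty
# lam_idt=0 disables the term entirely, which is the id0 setting above.
assert identity_loss(lambda x: x, y, lam_idt=0.0) == 0.0
```

A larger `lam_idt` therefore pushes the generator toward leaving inputs untouched, which is exactly the conservative behavior noted above.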

pix2pix: Image-To-Image Translation With Conditional Adversarial Networks

Introduction

[ALGORITHM]

```bibtex
@inproceedings{isola2017image,
  title={Image-to-image translation with conditional adversarial networks},
  author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  pages={1125--1134},
  year={2017}
}
```

Results and Models

We use FID and IS metrics to evaluate the generation performance of pix2pix.

FID evaluation:

| Dataset  | facades | maps-a2b | maps-b2a | edges2shoes | average |
| :------- | :-----: | :------: | :------: | :---------: | :-----: |
| official | 119.135 | 149.731  | 102.072  | 75.774      | 111.678 |
| ours     | 127.792 | 118.552  | 92.798   | 85.413      | 106.139 |

IS evaluation:

| Dataset  | facades | maps-a2b | maps-b2a | edges2shoes | average |
| :------- | :-----: | :------: | :------: | :---------: | :-----: |
| official | 1.650   | 2.529    | 3.552    | 2.766       | 2.624   |
| ours     | 1.745   | 2.689    | 3.473    | 2.747       | 2.664   |

Model and log downloads:

| Dataset  | facades      | maps-a2b     | maps-b2a     | edges2shoes  |
| :------- | :----------: | :----------: | :----------: | :----------: |
| download | model \| log | model \| log | model \| log | model \| log |

Note: We strictly follow the setting in Section 3.3 of the paper: “At inference time, we run the generator net in exactly the same manner as during the training phase. This differs from the usual protocol in that we apply dropout at test time, and we apply batch normalization using the statistics of the test batch, rather than aggregated statistics of the training batch.” That is, we run the generator in model.train() mode, so inference results may differ slightly between runs.
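The batch-normalization part of that quote can be sketched in NumPy: the layer is normalized with statistics computed from the test batch itself rather than aggregated training statistics, so the output for one image depends on which images it is batched with. This is an illustrative toy, not the actual pix2pix layers:

```python
import numpy as np

def batchnorm_test_batch(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize x (batch, features) with the *test batch's own* mean and
    variance, mimicking running batch norm in training mode at inference,
    instead of using running averages collected during training."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

# A test batch whose statistics differ from any training-time averages.
batch = np.random.default_rng(0).normal(loc=3.0, scale=2.0, size=(16, 4))
out = batchnorm_test_batch(batch)
# out is standardized w.r.t. this batch (per-feature mean ~0, std ~1),
# so changing the batch composition changes every sample's output.
```

Together with dropout being left active, this is why repeated inference runs on the same input can produce slightly different results.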