Motivation

Data augmentation is used in machine learning to expand the number of examples in a dataset. For images, it is common to use image processing techniques like flipping, rotation, additive noise, alpha blending, and many others to slightly modify an image and add the edited image to the dataset.

Problem description

Your task is to design a basic library to perform data augmentations for paired image-to-image translation. Datasets for paired image-to-image translation consist of images from a source domain A and images from one or more target domains B_i (i ≥ 1). That is, the dataset contains a set of N input images from domain A and one or more sets B_i, each containing the N corresponding images in another modality. For example, let’s say you are training an image-to-image GAN which takes images of line drawings as input and generates images of watercolor paintings and oil paintings. Here the line drawings form domain A, and the watercolor and oil paintings form domains B₁ and B₂.

Requirements

Your solution should:

be written in Python
use a common image processing library such as OpenCV (Python), Pillow, or scikit-image
use appropriate and efficient data structures
be designed as a stand-alone command-line application
support running on 1 or more sets of images (where all sets have the same filenames but are contained in different directories)
use a random number generator for each of the manipulations
support at least 2 different image manipulations (we gave links above to flipping and rotating)
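
One way to meet the random-number requirement is to draw the parameters of each manipulation from a seeded generator and store them in a small record. The sketch below only illustrates that idea: the names `AugmentParams` and `sample_params`, the 50% flip probability, and the ±30 degree rotation range are assumptions, not part of the exercise.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AugmentParams:
    """Hypothetical record of one randomly drawn augmentation."""
    flip_horizontal: bool
    angle_degrees: float


def sample_params(rng: random.Random, max_angle: float = 30.0) -> AugmentParams:
    """Draw one set of parameters; the ranges are illustrative assumptions."""
    return AugmentParams(
        flip_horizontal=rng.random() < 0.5,
        angle_degrees=rng.uniform(-max_angle, max_angle),
    )


if __name__ == "__main__":
    rng = random.Random(1234)  # a fixed seed makes a run reproducible
    for _ in range(3):
        print(sample_params(rng))
```

Keeping the drawn values in a record, rather than calling the generator inside each manipulation, makes it straightforward to reuse exactly the same values for every image set.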

Data

You can download some sample data here. It contains RGB images (which you may think of as set A above), as well as depth and normal images (which you may think of as sets B₁ and B₂ above). This is a made-up dataset of synthetic 3D scenes. Your script need not care about the data types.

It contains the following files:

data
├── rgb
│   ├── 0000.png
│   ├── 0001.png
│   ├── 0002.png
│   ├── 0003.png
│   ├── 0004.png
│   ├── 0005.png
│   ├── 0006.png
│   ├── 0007.png
│   ├── 0008.png
│   └── 0009.png
├── normal
│   ├── 0000.png
│   ├── 0001.png
│   ├── 0002.png
│   ├── 0003.png
│   ├── 0004.png
│   ├── 0005.png
│   ├── 0006.png
│   ├── 0007.png
│   ├── 0008.png
│   └── 0009.png
└── depth
    ├── 0000.png
    ├── 0001.png
    ├── 0002.png
    ├── 0003.png
    ├── 0004.png
    ├── 0005.png
    ├── 0006.png
    ├── 0007.png
    ├── 0008.png
    └── 0009.png
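
Because every set is expected to hold the same filenames, a solution may want to check this before augmenting anything. The following is a minimal sketch using only the standard library; the helper names `common_filenames` and `check_same_filenames` are hypothetical, and raising `ValueError` on a mismatch is just one possible policy.

```python
from pathlib import Path


def common_filenames(directories: list[Path]) -> list[str]:
    """Return the sorted filenames that are present in every directory."""
    if not directories:
        return []
    name_sets = [
        {p.name for p in d.iterdir() if p.is_file()} for d in directories
    ]
    return sorted(set.intersection(*name_sets))


def check_same_filenames(directories: list[Path]) -> list[str]:
    """Raise ValueError if any directory holds a file the others lack."""
    shared = common_filenames(directories)
    for d in directories:
        extra = {p.name for p in d.iterdir() if p.is_file()} - set(shared)
        if extra:
            raise ValueError(f"{d} has unmatched files: {sorted(extra)}")
    return shared
```

For the sample data, `check_same_filenames([Path("data/rgb"), Path("data/normal"), Path("data/depth")])` would be expected to return the ten names `0000.png` through `0009.png`.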

Example

For example, your application could be run like this:

python3 augment.py data/rgb data/normal data/depth --output augmentations_output --count 10 

This would create 10 different augmentations of each of the three directories of images. In each of these 10 augmentations, some random amount of rotation and scaling (for example) is applied to the image sets, and the results are saved in sub-directories of augmentations_output.
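
A command line of this shape could be parsed with the standard-library `argparse` module. The sketch below is one possible layout, not a prescribed interface: the default of 10 for `--count` and the extra `--seed` option are assumptions that go beyond the example command.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="augment.py",
        description="Apply the same random augmentations to paired image sets.",
    )
    parser.add_argument("inputs", nargs="+", type=Path,
                        help="one or more directories with matching filenames")
    parser.add_argument("--output", type=Path, required=True,
                        help="directory that receives the augmented sets")
    # The default value is an assumption; the example passes --count explicitly.
    parser.add_argument("--count", type=int, default=10,
                        help="number of augmentations per input directory")
    # --seed is an assumed extra, not part of the example command.
    parser.add_argument("--seed", type=int, default=None,
                        help="optional seed for reproducible runs")
    return parser


# Parse the example invocation from an explicit list instead of sys.argv.
args = build_parser().parse_args(
    ["data/rgb", "data/normal", "data/depth",
     "--output", "augmentations_output", "--count", "10"]
)
print(args.inputs, args.output, args.count)
```

Using `nargs="+"` for the positional argument is what lets the same script accept one directory or several.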

For example, here’s an abbreviated file list that may result:

augmentations_output
├── rgb-00
│   ├── 0000.png
│   ├── ...
│   └── 0009.png
├── ...
├── rgb-09
│   ├── 0000.png
│   ├── ...
│   └── 0009.png
├── normal-00
│   ├── 0000.png
│   ├── ...
│   └── 0009.png
├── ...
├── normal-09
│   ├── 0000.png
│   ├── ...
│   └── 0009.png
├── depth-00
│   ├── 0000.png
│   ├── ...
│   └── 0009.png
├── ...
└── depth-09
    ├── 0000.png
    ├── ...
    └── 0009.png
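
The sub-directory names in this listing combine the input directory name with a zero-padded augmentation index. A small helper along the following lines could build them; the function name `output_dir` and the rule for widening the padding beyond two digits are assumptions made for illustration.

```python
from pathlib import Path


def output_dir(output_root: Path, input_dir: Path, index: int, count: int) -> Path:
    """Build e.g. augmentations_output/rgb-03 for input data/rgb and index 3."""
    # At least two digits, as in the listing; wider only if count requires it.
    width = max(2, len(str(count - 1)))
    return output_root / f"{input_dir.name}-{index:0{width}d}"


for i in (0, 9):
    print(output_dir(Path("augmentations_output"), Path("data/rgb"), i, 10))
```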

Remember that, for example, rgb-01/0004.png, normal-01/0004.png, and depth-01/0004.png will all have had the same modifications applied to them (as shown in the figure at the top).
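
The pairing constraint can be sketched as "draw the random parameters once, then apply them to the matching image of every set". In the illustration below, nested lists stand in for images so that it runs without any third-party library; in a real solution the two manipulation functions would call the chosen image library instead. All function names, and the choice of a horizontal flip plus quarter-turn rotations, are assumptions.

```python
import random

Grid = list[list[int]]  # stand-in for an image: rows of pixel values


def flip_horizontal(img: Grid) -> Grid:
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def rotate_quarter_turns(img: Grid, turns: int) -> Grid:
    """Rotate counter-clockwise by 90 degrees, `turns` times."""
    for _ in range(turns % 4):
        # Transpose, then reverse the row order: one counter-clockwise turn.
        img = [list(row) for row in zip(*img)][::-1]
    return img


def augment_sets(sets: dict[str, Grid], rng: random.Random) -> dict[str, Grid]:
    """Draw parameters once, then apply them to the matching image of every set."""
    do_flip = rng.random() < 0.5
    turns = rng.randrange(4)
    out = {}
    for name, img in sets.items():
        if do_flip:
            img = flip_horizontal(img)
        out[name] = rotate_quarter_turns(img, turns)
    return out


rgb = [[1, 2, 3], [4, 5, 6]]
depth = [[10, 20, 30], [40, 50, 60]]
print(augment_sets({"rgb": rgb, "depth": depth}, random.Random(0)))
```

The important design point is that the generator is consulted outside the loop over sets, so corresponding pixels stay aligned across all outputs.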

If you have any questions, please let us know.

augmenter

Batch image manipulation exercise