tensorpack.dataflow.imgaug package

class tensorpack.dataflow.imgaug.Augmentor[source]

Bases: object

Base class for an augmentor

__repr__()[source]

Produce something like: “imgaug.MyAugmentor(field1={self.field1}, field2={self.field2})”

__str__()

Produce something like: “imgaug.MyAugmentor(field1={self.field1}, field2={self.field2})”

augment(d)[source]

Perform augmentation on the data.

augment_return_params(d)[source]

Returns: augmented data, augmentation params

reset_state()[source]

Reset the rng and other state.

class tensorpack.dataflow.imgaug.ImageAugmentor[source]

Bases: tensorpack.dataflow.imgaug.base.Augmentor

ImageAugmentor should take images of type uint8 in range [0, 255], or floating point images in range [0, 1] or [0, 255].

augment_coords(coords, param)[source]

Augment the coordinates given the param. By default, an augmentor keeps coordinates unchanged. If a subclass changes coordinates but cannot implement this method, it should raise NotImplementedError().

Parameters: coords – Nx2 floating point numpy array where each row is (x, y)

Returns: new coords

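As a rough illustration of what a subclass's augment_coords might compute, the sketch below mirrors (x, y) points for a horizontal flip. The helper name flip_coords_horizontally and its width argument are hypothetical and not part of tensorpack, and the convention x' = width - x is an assumption of this sketch.

```python
import numpy as np

def flip_coords_horizontally(coords, width):
    """Hypothetical helper: mirror Nx2 (x, y) points across the vertical axis."""
    coords = np.asarray(coords, dtype=np.float32).copy()  # do not modify the input
    coords[:, 0] = width - coords[:, 0]                   # only x changes
    return coords

points = np.array([[10.0, 5.0], [0.0, 20.0]], dtype=np.float32)
flipped = flip_coords_horizontally(points, width=100)
```

The y column is left untouched, which matches the idea that an augmentor only rewrites the coordinates it actually moves.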
class tensorpack.dataflow.imgaug.AugmentorList(augmentors)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Augment by a list of augmentors

__init__(augmentors)[source]

Parameters: augmentors (list) – list of ImageAugmentor instances to be applied.

reset_state()[source]

Resets the state of each augmentor.

class tensorpack.dataflow.imgaug.ColorSpace(mode, keepdims=True)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Convert into another color space.

__init__(mode, keepdims=True)[source]
Parameters:
  • mode – OpenCV color space conversion code (e.g., cv2.COLOR_BGR2HSV)

  • keepdims (bool) – keep the dimension of image unchanged if OpenCV changes it.

class tensorpack.dataflow.imgaug.Grayscale(keepdims=True, rgb=False)[source]

Bases: tensorpack.dataflow.imgaug.convert.ColorSpace

Convert image to grayscale.

__init__(keepdims=True, rgb=False)[source]
Parameters:
  • keepdims (bool) – return image of shape [H, W, 1] instead of [H, W]

  • rgb (bool) – interpret input as RGB instead of the default BGR

class tensorpack.dataflow.imgaug.ToUint8[source]

Bases: tensorpack.dataflow.imgaug.meta.MapImage

Convert image to uint8. Useful to reduce communication overhead.

class tensorpack.dataflow.imgaug.ToFloat32[source]

Bases: tensorpack.dataflow.imgaug.meta.MapImage

Convert image to float32. This may improve the quality of subsequent augmentors.

class tensorpack.dataflow.imgaug.RandomCrop(crop_shape)[source]

Bases: tensorpack.dataflow.imgaug.transform.TransformAugmentorBase

Randomly crop the image into a smaller one

__init__(crop_shape)[source]

Parameters: crop_shape – (h, w) tuple or an int

class tensorpack.dataflow.imgaug.CenterCrop(crop_shape)[source]

Bases: tensorpack.dataflow.imgaug.transform.TransformAugmentorBase

Crop the image at the center

__init__(crop_shape)[source]

Parameters: crop_shape – (h, w) tuple or an int

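The cropping arithmetic can be sketched with plain NumPy slicing. This is an illustrative sketch, not tensorpack's implementation; the function name center_crop and the choice to raise ValueError for an oversized crop are assumptions.

```python
import numpy as np

def center_crop(img, crop_shape):
    """Hypothetical helper: crop an HxW[xC] array around its center."""
    if isinstance(crop_shape, int):
        crop_shape = (crop_shape, crop_shape)  # an int means a square crop
    h, w = crop_shape
    H, W = img.shape[:2]
    if h > H or w > W:
        raise ValueError("crop_shape must not exceed the image shape")
    y0 = (H - h) // 2
    x0 = (W - w) // 2
    return img[y0:y0 + h, x0:x0 + w]

img = np.arange(6 * 8).reshape(6, 8)
crop = center_crop(img, (2, 4))
```

A random crop differs only in how y0 and x0 are chosen: they would be sampled from the valid offsets instead of computed from the center.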
class tensorpack.dataflow.imgaug.RandomCropRandomShape(wmin, hmin, wmax=None, hmax=None, max_aspect_ratio=None)[source]

Bases: tensorpack.dataflow.imgaug.transform.TransformAugmentorBase

Random crop with a random shape

__init__(wmin, hmin, wmax=None, hmax=None, max_aspect_ratio=None)[source]

Randomly crop a box of shape (h, w), sampled from [min, max] (both inclusive). If max is None, will use the input image shape.

Parameters:
  • wmin, hmin, wmax, hmax – range to sample shape.

  • max_aspect_ratio (float) – the upper bound of max(w,h)/min(w,h).

class tensorpack.dataflow.imgaug.Shift(horiz_frac=0, vert_frac=0, border=cv2.BORDER_REPLICATE, border_value=0)[source]

Bases: tensorpack.dataflow.imgaug.transform.TransformAugmentorBase

Random horizontal and vertical shifts

__init__(horiz_frac=0, vert_frac=0, border=cv2.BORDER_REPLICATE, border_value=0)[source]
Parameters:
  • horiz_frac (float) – max abs fraction for horizontal shift

  • vert_frac (float) – max abs fraction for vertical shift

  • border – cv2 border method

  • border_value – cv2 border value for border=cv2.BORDER_CONSTANT

class tensorpack.dataflow.imgaug.Rotation(max_deg, center_range=(0, 1), interp=cv2.INTER_LINEAR, step_deg=None, border_value=0)[source]

Bases: tensorpack.dataflow.imgaug.transform.TransformAugmentorBase

Randomly rotate the image w.r.t. a random center

__init__(max_deg, center_range=(0, 1), interp=cv2.INTER_LINEAR, step_deg=None, border_value=0)[source]
Parameters:
  • max_deg (float) – max abs value of the rotation angle (in degree).

  • center_range (tuple) – (min, max) range of the random rotation center.

  • interp – cv2 interpolation method

  • border – cv2 border method

  • step_deg (float) – if not None, the stepping of the rotation angle. The rotation angle will be a multiple of step_deg. This option requires max_deg==180, and step_deg has to be a divisor of 180.

  • border_value – cv2 border value for border=cv2.BORDER_CONSTANT

class tensorpack.dataflow.imgaug.RotationAndCropValid(max_deg, interp=cv2.INTER_LINEAR, step_deg=None)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Randomly rotate and then crop the largest possible rectangle. Note that this will produce images of different shapes.

__init__(max_deg, interp=cv2.INTER_LINEAR, step_deg=None)[source]

Parameters: max_deg, interp, step_deg – same as Rotation

static largest_rotated_rect(w, h, angle)[source]

Get the largest rectangle after rotation. See http://stackoverflow.com/questions/16702966/rotate-image-and-crop-out-black-borders

class tensorpack.dataflow.imgaug.Affine(scale=None, translate_frac=None, rotate_max_deg=0.0, shear=0.0, interp=cv2.INTER_LINEAR, border_value=0)[source]

Bases: tensorpack.dataflow.imgaug.transform.TransformAugmentorBase

Random affine transform of the image w.r.t. the image center. Transformations involve:

  • Translation (“move” image on the x-/y-axis)

  • Rotation

  • Scaling (“zoom” in/out)

  • Shear (move one side of the image, turning a square into a trapezoid)

__init__(scale=None, translate_frac=None, rotate_max_deg=0.0, shear=0.0, interp=cv2.INTER_LINEAR, border_value=0)[source]
Parameters:
  • scale (tuple of 2 floats) – scaling factor interval (a, b). The scale is randomly sampled from the range a <= scale <= b. Keeps the original scale by default.

  • translate_frac (tuple of 2 floats) – tuple of max abs fraction for horizontal and vertical translation. For example, with translate_frac=(a, b), the horizontal shift is randomly sampled in the range 0 < dx < img_width * a and the vertical shift is randomly sampled in the range 0 < dy < img_height * b. Does not translate by default.

  • shear (float) – max abs shear value in degrees, between 0 and 180

  • interp – cv2 interpolation method

  • border – cv2 border method

  • border_value – cv2 border value for border=cv2.BORDER_CONSTANT

class tensorpack.dataflow.imgaug.Hue(range=(0, 180), rgb=None)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Randomly change color hue.

__init__(range=(0, 180), rgb=None)[source]
Parameters:
  • range (list or tuple) – range from which the applied hue offset is selected (maximum [-90,90] or [0,180])

  • rgb (bool) – whether input is RGB or BGR.

class tensorpack.dataflow.imgaug.Brightness(delta, clip=True)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Adjust brightness by adding a random number.

__init__(delta, clip=True)[source]
Parameters:
  • delta (float) – Randomly add a value within [-delta,delta]

  • clip (bool) – clip results to [0,255].
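A minimal NumPy sketch of the additive brightness change described above. The function name add_brightness and the explicit rng argument are hypothetical; tensorpack manages its own random state internally.

```python
import numpy as np

def add_brightness(img, delta, rng, clip=True):
    """Hypothetical helper: add one random offset in [-delta, delta] to every pixel."""
    offset = rng.uniform(-delta, delta)
    out = img.astype(np.float32) + offset  # work in float to avoid uint8 wrap-around
    if clip:
        out = np.clip(out, 0, 255)
    return out

rng = np.random.default_rng(0)
img = np.full((2, 2, 3), 250, dtype=np.uint8)
brighter = add_brightness(img, delta=30, rng=rng)
```

Converting to float before adding matters: adding to a uint8 array directly would wrap around instead of saturating.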

class tensorpack.dataflow.imgaug.BrightnessScale(range, clip=True)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Adjust brightness by scaling by a random factor.

__init__(range, clip=True)[source]
Parameters:
  • range (tuple) – Randomly scale the image by a factor in (range[0], range[1])

  • clip (bool) – clip results to [0,255].

class tensorpack.dataflow.imgaug.Contrast(factor_range, clip=True)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Apply x = (x - mean) * contrast_factor + mean to each channel.

__init__(factor_range, clip=True)[source]
Parameters:
  • factor_range (list or tuple) – an interval to randomly sample the contrast_factor.

  • clip (bool) – clip to [0, 255] if True.
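The formula x = (x - mean) * contrast_factor + mean can be sketched as follows. This is an illustration under the assumption that the mean is taken per channel over all pixels; the function name adjust_contrast is hypothetical, and the factor is passed in directly rather than sampled from factor_range.

```python
import numpy as np

def adjust_contrast(img, factor, clip=True):
    """Hypothetical helper: scale each channel's deviation from its mean."""
    img = img.astype(np.float32)
    mean = img.mean(axis=(0, 1), keepdims=True)  # one mean per channel
    out = (img - mean) * factor + mean
    if clip:
        out = np.clip(out, 0, 255)
    return out

# A 2x2 single-channel image with mean 150.
img = np.array([[[100.0], [200.0]], [[100.0], [200.0]]])
softer = adjust_contrast(img, 0.5)   # pulls values toward the mean
harsher = adjust_contrast(img, 3.0)  # pushes values apart, then clips
```

A factor below 1 reduces contrast, a factor above 1 increases it, and a factor of exactly 1 leaves the image unchanged.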

class tensorpack.dataflow.imgaug.MeanVarianceNormalize(all_channel=True)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Linearly scales the image to have zero mean and unit norm. x = (x - mean) / adjusted_stddev where adjusted_stddev = max(stddev, 1.0/sqrt(num_pixels * channels))

This augmentor always returns float32 images.

__init__(all_channel=True)[source]

Parameters: all_channel (bool) – if True, normalize all channels together; otherwise normalize each channel separately.

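The normalization formula above, including the adjusted_stddev floor, can be sketched in NumPy. The function name mean_variance_normalize is hypothetical, and this is a sketch of the stated formula rather than tensorpack's code.

```python
import numpy as np

def mean_variance_normalize(img, all_channel=True):
    """Hypothetical helper: x = (x - mean) / max(stddev, 1/sqrt(num_elements))."""
    img = img.astype(np.float32)
    axes = None if all_channel else (0, 1)  # None: statistics over the whole image
    mean = img.mean(axis=axes, keepdims=True)
    std = img.std(axis=axes, keepdims=True)
    # The floor keeps the division finite for constant images.
    adjusted_std = np.maximum(std, 1.0 / np.sqrt(img.size))
    return ((img - mean) / adjusted_std).astype(np.float32)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(8, 8, 3)).astype(np.uint8)
normalized = mean_variance_normalize(img)
per_channel = mean_variance_normalize(img, all_channel=False)
flat = mean_variance_normalize(np.full((4, 4, 3), 7, dtype=np.uint8))
```

For a constant image the standard deviation is zero, so the floor takes over and the result is all zeros instead of NaN.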
class tensorpack.dataflow.imgaug.GaussianBlur(max_size=3)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Gaussian blur the image with random window size

__init__(max_size=3)[source]

Parameters: max_size (int) – the maximum possible Gaussian window size is 2 * max_size + 1

class tensorpack.dataflow.imgaug.Gamma(range=(-0.5, 0.5))[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Randomly adjust gamma

__init__(range=(-0.5, 0.5))[source]

Parameters: range (list or tuple) – gamma range

class tensorpack.dataflow.imgaug.Clip(min=0, max=255)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Clip the pixel values

__init__(min=0, max=255)[source]

Parameters: min, max – the clip range

class tensorpack.dataflow.imgaug.Saturation(alpha=0.4, rgb=True)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Randomly adjust saturation. Follows the implementation in fb.resnet.torch.

__init__(alpha=0.4, rgb=True)[source]
Parameters:
  • alpha (float) – maximum saturation change.

  • rgb (bool) – whether input is RGB or BGR.

class tensorpack.dataflow.imgaug.Lighting(std, eigval, eigvec)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Lighting noise, as in the paper ImageNet Classification with Deep Convolutional Neural Networks. The implementation follows fb.resnet.torch.

__init__(std, eigval, eigvec)[source]
Parameters:
  • std (float) – maximum standard deviation

  • eigval – a vector of (3,). The eigenvalues of 3 channels.

  • eigvec – a 3x3 matrix. Each column is one eigenvector.
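One common way to realize this kind of PCA lighting noise is to draw one random coefficient per eigenvector and add the resulting per-channel offset to every pixel. The sketch below assumes that formulation; the function name lighting_offset is hypothetical, and the identity eigvec and unit eigval are placeholder values, not real dataset statistics.

```python
import numpy as np

def lighting_offset(std, eigval, eigvec, rng):
    """Hypothetical helper: per-channel offset eigvec @ (alpha * eigval), alpha ~ N(0, std^2)."""
    alpha = rng.normal(0.0, std, size=3)   # one coefficient per eigenvector
    return eigvec @ (alpha * eigval)       # columns of eigvec are the eigenvectors

rng = np.random.default_rng(0)
eigval = np.ones(3)   # placeholder eigenvalues
eigvec = np.eye(3)    # placeholder eigenvectors
offset = lighting_offset(0.1, eigval, eigvec, rng)

img = np.zeros((4, 4, 3), dtype=np.float32)
noisy = img + offset  # the same 3-vector is broadcast to every pixel
```

Because the offset is a single 3-vector, the whole image shifts in color together rather than pixel by pixel.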

class tensorpack.dataflow.imgaug.MinMaxNormalize(min=0, max=255, all_channel=True)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Linearly scales the image to the range [min, max].

This augmentor always returns float32 images.

__init__(min=0, max=255, all_channel=True)[source]
Parameters:
  • max (float) – The new maximum value

  • min (float) – The new minimum value

  • all_channel (bool) – if True, normalize all channels together; otherwise normalize each channel separately.
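A linear rescale to [min, max] can be sketched as below. The function name min_max_normalize and the argument names new_min and new_max are hypothetical, and the sketch assumes the image is not constant (otherwise the denominator is zero).

```python
import numpy as np

def min_max_normalize(img, new_min=0.0, new_max=255.0, all_channel=True):
    """Hypothetical helper: map the image's value range linearly onto [new_min, new_max]."""
    img = img.astype(np.float32)
    axes = None if all_channel else (0, 1)  # None: one range for the whole image
    lo = img.min(axis=axes, keepdims=True)
    hi = img.max(axis=axes, keepdims=True)
    # Assumes hi > lo; a constant image would divide by zero here.
    out = (new_max - new_min) * (img - lo) / (hi - lo) + new_min
    return out.astype(np.float32)

img = np.array([[[0], [50]], [[100], [200]]], dtype=np.uint8)
scaled = min_max_normalize(img, 0.0, 1.0)
```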

class tensorpack.dataflow.imgaug.RandomChooseAug(aug_lists)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Randomly choose one from a list of augmentors

__init__(aug_lists)[source]

Parameters: aug_lists (list) – list of augmentors, or list of (augmentor, probability) tuples

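The two accepted forms of aug_lists can be illustrated with plain callables standing in for augmentors. The function name choose_and_apply is hypothetical, and treating a list without probabilities as uniform is an assumption of this sketch.

```python
import numpy as np

def choose_and_apply(img, aug_lists, rng):
    """Hypothetical helper: pick one callable, optionally weighted, and apply it."""
    if isinstance(aug_lists[0], (tuple, list)):
        augs = [aug for aug, _ in aug_lists]
        probs = [prob for _, prob in aug_lists]
    else:
        augs = list(aug_lists)
        probs = [1.0 / len(augs)] * len(augs)  # assumed: uniform when no weights given
    idx = rng.choice(len(augs), p=probs)
    return augs[idx](img)

rng = np.random.default_rng(0)
img = np.zeros((2, 2))
# With probability 1.0 on the first entry, the second is never chosen.
result = choose_and_apply(img, [(lambda x: x + 1, 1.0), (lambda x: x - 1, 0.0)], rng)
either = choose_and_apply(img, [lambda x: x + 1, lambda x: x - 1], rng)
```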
reset_state()[source]

reset rng and other state

class tensorpack.dataflow.imgaug.MapImage(func, coord_func=None)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Map the image array by a function.

__init__(func, coord_func=None)[source]

Parameters: func – a function which takes an image array and returns an augmented one

class tensorpack.dataflow.imgaug.Identity[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

A no-op augmentor

class tensorpack.dataflow.imgaug.RandomApplyAug(aug, prob)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Randomly apply the augmentor with a probability. Otherwise do nothing

__init__(aug, prob)[source]

Parameters:
  • aug (ImageAugmentor) – an augmentor.

  • prob (float) – the probability of applying it.

reset_state()[source]

reset rng and other state

class tensorpack.dataflow.imgaug.RandomOrderAug(aug_lists)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Apply the augmentors with randomized order.

__init__(aug_lists)[source]

Parameters: aug_lists (list) – list of augmentors. The augmentors are assumed to not change the shape of images.

reset_state()[source]

reset rng and other state

class tensorpack.dataflow.imgaug.Flip(horiz=False, vert=False, prob=0.5)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Randomly flip the image either horizontally or vertically.

__init__(horiz=False, vert=False, prob=0.5)[source]
Parameters:
  • horiz (bool) – use horizontal flip.

  • vert (bool) – use vertical flip.

  • prob (float) – probability of flip.
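A NumPy sketch of a probabilistic flip along one axis. The function name random_flip and the explicit rng argument are hypothetical, and requiring exactly one of horiz or vert is an assumption of this sketch.

```python
import numpy as np

def random_flip(img, rng, horiz=False, vert=False, prob=0.5):
    """Hypothetical helper: flip along one axis with probability `prob`."""
    if horiz == vert:
        raise ValueError("set exactly one of horiz or vert")
    if rng.random() < prob:
        # Horizontal flip reverses columns; vertical flip reverses rows.
        return img[:, ::-1] if horiz else img[::-1]
    return img

rng = np.random.default_rng(0)
img = np.arange(6).reshape(2, 3)
always = random_flip(img, rng, horiz=True, prob=1.0)
never = random_flip(img, rng, vert=True, prob=0.0)
```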

class tensorpack.dataflow.imgaug.Resize(shape, interp=cv2.INTER_LINEAR)[source]

Bases: tensorpack.dataflow.imgaug.transform.TransformAugmentorBase

Resize image to a target size

__init__(shape, interp=cv2.INTER_LINEAR)[source]
Parameters:
  • shape – (h, w) tuple or an int

  • interp – cv2 interpolation method

class tensorpack.dataflow.imgaug.RandomResize(xrange, yrange=None, minimum=(0, 0), aspect_ratio_thres=0.15, interp=cv2.INTER_LINEAR)[source]

Bases: tensorpack.dataflow.imgaug.transform.TransformAugmentorBase

Randomly rescale width and height of the image.

__init__(xrange, yrange=None, minimum=(0, 0), aspect_ratio_thres=0.15, interp=cv2.INTER_LINEAR)[source]
Parameters:
  • xrange (tuple) – a (min, max) tuple. If the values are floating point, the tuple defines the range of the scaling ratio of the new width, e.g. (0.9, 1.2). If the values are integers, the tuple defines the range of the new width in pixels, e.g. (200, 350).

  • yrange (tuple) – similar to xrange, but for height. Should be None when aspect_ratio_thres==0.

  • minimum (tuple) – (xmin, ymin) in pixels. To avoid scaling down too much.

  • aspect_ratio_thres (float) – discard samples that change the aspect ratio by more than this threshold. Set to 0 to keep the aspect ratio.

  • interp – cv2 interpolation method

class tensorpack.dataflow.imgaug.ResizeShortestEdge(size, interp=cv2.INTER_LINEAR)[source]

Bases: tensorpack.dataflow.imgaug.transform.TransformAugmentorBase

Resize the shortest edge to a certain number while keeping the aspect ratio.

__init__(size, interp=cv2.INTER_LINEAR)[source]

Parameters: size (int) – the size to resize the shortest edge to.

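The target shape for such a resize follows from a single scale factor. The helper below is a hypothetical sketch of that arithmetic only (no resampling); rounding to the nearest integer is an assumption.

```python
def shortest_edge_shape(h, w, size):
    """Hypothetical helper: new (h, w) with the shorter edge equal to `size`."""
    scale = size / min(h, w)
    if h < w:
        return size, int(round(w * scale))
    return int(round(h * scale)), size

landscape = shortest_edge_shape(480, 640, 240)
portrait = shortest_edge_shape(640, 480, 240)
```

The resulting (h, w) pair would then be handed to an ordinary resize with the chosen interpolation method.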
class tensorpack.dataflow.imgaug.Transpose(prob=0.5)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Randomly transpose the image

__init__(prob=0.5)[source]

Parameters: prob (float) – probability of transpose.

class tensorpack.dataflow.imgaug.JpegNoise(quality_range=(40, 100))[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Random JPEG noise.

__init__(quality_range=(40, 100))[source]

Parameters: quality_range (tuple) – range to sample JPEG quality

class tensorpack.dataflow.imgaug.GaussianNoise(sigma=1, clip=True)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Add random Gaussian noise N(0, sigma^2) of the same shape to img.

__init__(sigma=1, clip=True)[source]
Parameters:
  • sigma (float) – stddev of the Gaussian distribution.

  • clip (bool) – clip the result to [0,255] in the end.
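Adding N(0, sigma^2) noise of the same shape as the image can be sketched in NumPy as follows. The function name add_gaussian_noise and the explicit rng argument are hypothetical.

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng, clip=True):
    """Hypothetical helper: add independent N(0, sigma^2) noise to every element."""
    noise = rng.normal(0.0, sigma, size=img.shape)
    out = img.astype(np.float32) + noise
    if clip:
        out = np.clip(out, 0, 255)
    return out

rng = np.random.default_rng(0)
img = np.full((4, 4, 3), 128, dtype=np.uint8)
noisy = add_gaussian_noise(img, sigma=10.0, rng=rng)
unchanged = add_gaussian_noise(img, sigma=0.0, rng=rng)
```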

class tensorpack.dataflow.imgaug.SaltPepperNoise(white_prob=0.05, black_prob=0.05)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Salt and pepper noise. Randomly set some elements in the image to 0 or 255, regardless of their channel.

__init__(white_prob=0.05, black_prob=0.05)[source]

Parameters: white_prob (float), black_prob (float) – probabilities of setting an element to 255 or 0, respectively.

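Element-wise salt and pepper noise can be sketched by thresholding one uniform random array. The function name salt_pepper and the explicit rng argument are hypothetical; this is an illustration rather than tensorpack's code.

```python
import numpy as np

def salt_pepper(img, rng, white_prob=0.05, black_prob=0.05):
    """Hypothetical helper: set random elements to 255 (salt) or 0 (pepper)."""
    u = rng.uniform(0.0, 1.0, size=img.shape)  # one draw per element, channels included
    out = img.copy()
    out[u > 1.0 - white_prob] = 255
    out[u < black_prob] = 0
    return out

rng = np.random.default_rng(0)
img = np.full((8, 8, 3), 128, dtype=np.uint8)
noisy = salt_pepper(img, rng)
untouched = salt_pepper(img, rng, white_prob=0.0, black_prob=0.0)
all_black = salt_pepper(img, rng, white_prob=0.0, black_prob=1.0)
```

Because the draw is per element, a pixel can have one channel replaced while the others keep their values.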
class tensorpack.dataflow.imgaug.CenterPaste(background_shape, background_filler=None)[source]

Bases: tensorpack.dataflow.imgaug.base.ImageAugmentor

Paste the image onto the center of a background canvas.

__init__(background_shape, background_filler=None)[source]
Parameters:
  • background_shape (tuple) – shape of the background canvas.

  • background_filler (BackgroundFiller) – How to fill the background. Defaults to zero-filler.
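Pasting onto the center of a constant canvas can be sketched with NumPy. The function name center_paste and its value argument are hypothetical stand-ins for the augmentor and a constant filler; raising ValueError when the image is larger than the canvas is an assumption.

```python
import numpy as np

def center_paste(img, background_shape, value=0):
    """Hypothetical helper: place `img` at the center of a constant (h, w) canvas."""
    h, w = img.shape[:2]
    bh, bw = background_shape[:2]
    if h > bh or w > bw:
        raise ValueError("image must fit inside the background")
    # Keep any channel dimension of the input image.
    canvas = np.full((bh, bw) + img.shape[2:], value, dtype=img.dtype)
    y0 = (bh - h) // 2
    x0 = (bw - w) // 2
    canvas[y0:y0 + h, x0:x0 + w] = img
    return canvas

img = np.ones((2, 2, 3), dtype=np.uint8)
canvas = center_paste(img, (4, 6))
```

A random paste would differ only in sampling y0 and x0 from the valid offsets instead of computing them from the center.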

class tensorpack.dataflow.imgaug.BackgroundFiller[source]

Bases: object

Base class for all BackgroundFiller

fill(background_shape, img)[source]

Return a proper background image of background_shape, given img.

Parameters:
  • background_shape (tuple) – a shape (h, w)

  • img – an image

Returns:

a background image

class tensorpack.dataflow.imgaug.ConstantBackgroundFiller(value)[source]

Bases: tensorpack.dataflow.imgaug.paste.BackgroundFiller

Fill the background by a constant

__init__(value)[source]

Parameters: value (float) – the value to fill the background.

class tensorpack.dataflow.imgaug.RandomPaste(background_shape, background_filler=None)[source]

Bases: tensorpack.dataflow.imgaug.paste.CenterPaste

Randomly paste the image onto a background canvas.