# doctr.transforms

Data transformations are part of both the training and inference procedures. Drawing inspiration from the design of torchvision, we express transformations as composable modules.

## Supported transformations

Here are all transformations that are available through docTR:

class doctr.transforms.Resize(output_size: Union[int, Tuple[int, int]], method: str = 'bilinear', preserve_aspect_ratio: bool = False, symmetric_pad: bool = False)

Resizes a tensor to a target size

```
>>> import tensorflow as tf
>>> from doctr.transforms import Resize
>>> transfo = Resize((32, 32))
>>> out = transfo(tf.random.uniform(shape=[64, 64, 3], minval=0, maxval=1))
```
Parameters
• output_size – expected output size

• method – interpolation method

• preserve_aspect_ratio – if True, preserve aspect ratio and pad the rest with zeros

• symmetric_pad – if True while preserving aspect ratio, the padding will be done symmetrically
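The interaction between preserve_aspect_ratio and symmetric_pad can be illustrated with a small, library-free sketch. The helper `fit_with_padding` below is hypothetical and not part of docTR; in particular, padding only at the bottom/right when symmetric_pad is False is an assumption made for illustration.

```python
from typing import Tuple


def fit_with_padding(
    in_hw: Tuple[int, int],
    out_hw: Tuple[int, int],
    symmetric_pad: bool = False,
) -> Tuple[Tuple[int, int], Tuple[int, int, int, int]]:
    """Return the resized (h, w) and the (top, bottom, left, right) zero padding.

    Hypothetical helper: it only computes sizes, it does not touch pixels.
    """
    in_h, in_w = in_hw
    out_h, out_w = out_hw
    # Largest uniform scale for which the image still fits in the target.
    scale = min(out_h / in_h, out_w / in_w)
    new_h = min(out_h, max(1, round(in_h * scale)))
    new_w = min(out_w, max(1, round(in_w * scale)))
    pad_h, pad_w = out_h - new_h, out_w - new_w
    if symmetric_pad:
        # Split the padding evenly on both sides.
        top, left = pad_h // 2, pad_w // 2
    else:
        # Assumption: all padding goes to the bottom/right.
        top, left = 0, 0
    return (new_h, new_w), (top, pad_h - top, left, pad_w - left)


resized, padding = fit_with_padding((64, 128), (32, 32))
resized_sym, padding_sym = fit_with_padding((64, 128), (32, 32), symmetric_pad=True)
```

Here a 64x128 input is scaled to 16x32 and the remaining 16 rows are filled with zeros, either all on one side or split 8/8.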

class doctr.transforms.Normalize(mean: , std: )

Normalize a tensor to a Gaussian distribution for each channel

```
>>> import tensorflow as tf
>>> from doctr.transforms import Normalize
>>> transfo = Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
>>> out = transfo(tf.random.uniform(shape=[8, 64, 64, 3], minval=0, maxval=1))
```
Parameters
• mean – average value per channel

• std – standard deviation per channel
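As a rough illustration of what per-channel normalization computes, the following NumPy sketch subtracts the channel mean and divides by the channel standard deviation. The function `normalize` is a stand-in written for this example (channels-last layout assumed), not docTR's implementation.

```python
import numpy as np


def normalize(img: np.ndarray, mean, std) -> np.ndarray:
    """Channel-wise (img - mean) / std for a channels-last array."""
    mean_arr = np.asarray(mean, dtype=np.float32)
    std_arr = np.asarray(std, dtype=np.float32)
    # Broadcasting applies one (mean, std) pair per entry of the last axis.
    return (img.astype(np.float32) - mean_arr) / std_arr


rng = np.random.default_rng(0)
batch = rng.uniform(0.0, 1.0, size=(8, 64, 64, 3)).astype(np.float32)
out = normalize(batch, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
```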

class doctr.transforms.LambdaTransformation(fn: Callable[[Tensor], Tensor])

Applies a user-defined function to the input tensor

```
>>> import tensorflow as tf
>>> from doctr.transforms import LambdaTransformation
>>> transfo = LambdaTransformation(lambda x: x / 255.)
>>> out = transfo(tf.random.uniform(shape=[8, 64, 64, 3], minval=0, maxval=1))
```
Parameters

fn – the function to be applied to the input tensor

class doctr.transforms.ToGray(num_output_channels: int = 1)

Convert an RGB tensor (batch of images or image) to a grayscale tensor with num_output_channels channels

```
>>> import tensorflow as tf
>>> from doctr.transforms import ToGray
>>> transfo = ToGray()
>>> out = transfo(tf.random.uniform(shape=[8, 64, 64, 3], minval=0, maxval=1))
```
class doctr.transforms.ColorInversion(min_val: float = 0.5)

Applies the following transformation to a tensor (image or batch of images): convert to grayscale, colorize (shift 0-values randomly), and then invert colors

```
>>> import tensorflow as tf
>>> from doctr.transforms import ColorInversion
>>> transfo = ColorInversion(min_val=0.6)
>>> out = transfo(tf.random.uniform(shape=[8, 64, 64, 3], minval=0, maxval=1))
```
Parameters

min_val – range [min_val, 1] to colorize RGB pixels

class doctr.transforms.RandomBrightness(max_delta: float = 0.3)

Randomly adjust brightness of a tensor (batch of images or image) by adding a delta to all pixels

```
>>> import tensorflow as tf
>>> from doctr.transforms import RandomBrightness
>>> transfo = RandomBrightness()
>>> out = transfo(tf.random.uniform(shape=[8, 64, 64, 3], minval=0, maxval=1))
```
Parameters
• max_delta – offset to add to each pixel is randomly picked in [-max_delta, max_delta]

• p – probability to apply transformation

class doctr.transforms.RandomContrast(delta: float = 0.3)

Randomly adjust contrast of a tensor (batch of images or image) by adjusting each pixel: (img - mean) * contrast_factor + mean.

```
>>> import tensorflow as tf
>>> from doctr.transforms import RandomContrast
>>> transfo = RandomContrast()
>>> out = transfo(tf.random.uniform(shape=[8, 64, 64, 3], minval=0, maxval=1))
```
Parameters

delta – multiplicative factor is picked in [1-delta, 1+delta] (reduce contrast if factor<1)
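The formula (img - mean) * contrast_factor + mean can be sketched in NumPy as follows. The helper `adjust_contrast` is hypothetical, and computing one mean per channel over the spatial axes is an assumption for this sketch rather than a statement about docTR's internals.

```python
import numpy as np


def adjust_contrast(img: np.ndarray, factor: float) -> np.ndarray:
    """(img - mean) * factor + mean, with one mean per channel."""
    # Average over the two spatial axes; assumes a channels-last layout.
    mean = img.mean(axis=(-3, -2), keepdims=True)
    return (img - mean) * factor + mean


rng = np.random.default_rng(0)
delta = 0.3
# The multiplicative factor is drawn uniformly in [1 - delta, 1 + delta].
factor = rng.uniform(1 - delta, 1 + delta)
img = rng.uniform(0.0, 1.0, size=(64, 64, 3))
out = adjust_contrast(img, factor)
```

The per-channel mean is unchanged by the operation, while the spread of pixel values around it is scaled by the factor.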

class doctr.transforms.RandomSaturation(delta: float = 0.5)

Randomly adjust saturation of a tensor (batch of images or image) by converting to HSV and scaling saturation by a factor.

```
>>> import tensorflow as tf
>>> from doctr.transforms import RandomSaturation
>>> transfo = RandomSaturation()
>>> out = transfo(tf.random.uniform(shape=[8, 64, 64, 3], minval=0, maxval=1))
```
Parameters

delta – multiplicative factor is picked in [1-delta, 1+delta] (reduce saturation if factor<1)
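To make the HSV round trip concrete, here is a single-pixel sketch using Python's standard `colorsys` module. The function `scale_saturation` is hypothetical, and clamping the saturation to 1.0 is an assumption of this example.

```python
import colorsys
from typing import Tuple


def scale_saturation(rgb: Tuple[float, float, float], factor: float) -> Tuple[float, float, float]:
    """Scale the HSV saturation of one RGB pixel (floats in [0, 1])."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    s = min(1.0, s * factor)  # keep saturation in its valid range
    return colorsys.hsv_to_rgb(h, s, v)


pixel = (0.8, 0.4, 0.2)
washed_out = scale_saturation(pixel, 0.5)  # factor < 1 reduces saturation
unchanged = scale_saturation(pixel, 1.0)
gray = scale_saturation(pixel, 0.0)  # zero saturation collapses to a gray level
```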

class doctr.transforms.RandomHue(max_delta: float = 0.3)

Randomly adjust hue of a tensor (batch of images or image) by converting to HSV and adding a delta

```
>>> import tensorflow as tf
>>> from doctr.transforms import RandomHue
>>> transfo = RandomHue()
>>> out = transfo(tf.random.uniform(shape=[8, 64, 64, 3], minval=0, maxval=1))
```
Parameters

max_delta – offset to add to each pixel is randomly picked in [-max_delta, max_delta]

class doctr.transforms.RandomGamma(min_gamma: float = 0.5, max_gamma: float = 1.5, min_gain: float = 0.8, max_gain: float = 1.2)

Randomly performs gamma correction on a tensor (batch of images or image)

```
>>> import tensorflow as tf
>>> from doctr.transforms import RandomGamma
>>> transfo = RandomGamma()
>>> out = transfo(tf.random.uniform(shape=[8, 64, 64, 3], minval=0, maxval=1))
```
Parameters
• min_gamma – non-negative real number, lower bound for gamma param

• max_gamma – non-negative real number, upper bound for gamma

• min_gain – lower bound for constant multiplier

• max_gain – upper bound for constant multiplier
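Gamma correction is commonly written as out = gain * img ** gamma. The NumPy sketch below assumes that form and uniform sampling of both parameters within their bounds; `adjust_gamma` is a hypothetical helper, not docTR code.

```python
import numpy as np


def adjust_gamma(img: np.ndarray, gamma: float, gain: float) -> np.ndarray:
    """Gamma correction in its usual form: gain * img ** gamma."""
    return gain * np.power(img, gamma)


rng = np.random.default_rng(0)
# Assumption: both parameters are drawn uniformly within their bounds.
gamma = rng.uniform(0.5, 1.5)
gain = rng.uniform(0.8, 1.2)
img = rng.uniform(0.0, 1.0, size=(64, 64, 3))
out = adjust_gamma(img, gamma, gain)
```

With inputs in [0, 1], a gamma below 1 brightens mid-tones and a gamma above 1 darkens them; the gain then rescales the result.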

class doctr.transforms.RandomJpegQuality(min_quality: int = 60, max_quality: int = 100)

Randomly adjust the JPEG quality of a 3-dimensional RGB image

```
>>> import tensorflow as tf
>>> from doctr.transforms import RandomJpegQuality
>>> transfo = RandomJpegQuality()
>>> out = transfo(tf.random.uniform(shape=[64, 64, 3], minval=0, maxval=1))
```
Parameters
• min_quality – lower bound of the JPEG quality, an int in [0, 100]

• max_quality – upper bound of the JPEG quality, an int in [0, 100]

class doctr.transforms.RandomRotate(max_angle: float = 5.0, expand: bool = False)

Randomly rotate a tensor image and its boxes

Parameters
• max_angle – maximum angle for rotation, in degrees. Angles will be uniformly picked in [-max_angle, max_angle]

• expand – whether the image should be padded before the rotation

class doctr.transforms.RandomCrop(scale: = (0.08, 1.0), ratio: = (0.75, 1.33))

Randomly crop a tensor image and its boxes

Parameters
• scale – tuple of floats, relative (min_area, max_area) of the crop

• ratio – tuple of floats, relative (min_ratio, max_ratio) where ratio = h/w
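One way to turn scale and ratio into a concrete crop window is sketched below in plain Python. The function `sample_crop`, the uniform sampling, and the clamping strategy are all assumptions for illustration; the actual docTR sampling procedure may differ.

```python
import math
import random
from typing import Optional, Tuple


def sample_crop(
    img_hw: Tuple[int, int],
    scale: Tuple[float, float] = (0.08, 1.0),
    ratio: Tuple[float, float] = (0.75, 1.33),
    rng: Optional[random.Random] = None,
) -> Tuple[int, int, int, int]:
    """Return (top, left, crop_h, crop_w) in pixels. Hypothetical helper."""
    if rng is None:
        rng = random.Random()
    img_h, img_w = img_hw
    # Target crop area as a fraction of the image area.
    area = rng.uniform(*scale) * img_h * img_w
    r = rng.uniform(*ratio)  # r = h / w
    # h * w = area and h / w = r  =>  h = sqrt(area * r), w = sqrt(area / r).
    # Clamping to the image can alter the sampled area and ratio slightly.
    crop_h = min(img_h, max(1, round(math.sqrt(area * r))))
    crop_w = min(img_w, max(1, round(math.sqrt(area / r))))
    top = rng.randint(0, img_h - crop_h)
    left = rng.randint(0, img_w - crop_w)
    return top, left, crop_h, crop_w


top, left, crop_h, crop_w = sample_crop((64, 64), rng=random.Random(0))
```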

class doctr.transforms.GaussianBlur(kernel_shape: Union[int, Iterable[int]], std: )

Applies Gaussian blur to a 3-dimensional RGB image, with a standard deviation randomly picked between the given bounds

```
>>> import tensorflow as tf
>>> from doctr.transforms import GaussianBlur
>>> transfo = GaussianBlur(3, (.1, 5))
>>> out = transfo(tf.random.uniform(shape=[64, 64, 3], minval=0, maxval=1))
```
Parameters
• kernel_shape – size of the blurring kernel

• std – min and max value of the standard deviation

class doctr.transforms.ChannelShuffle

Randomly shuffle channel order of a given image
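The effect of shuffling channel order can be shown with a short NumPy sketch. `shuffle_channels` is a hypothetical stand-in that assumes a channels-last image; it is not the docTR implementation.

```python
import numpy as np


def shuffle_channels(img: np.ndarray, rng: np.random.Generator):
    """Reorder the last axis with a random permutation; return image and order."""
    order = rng.permutation(img.shape[-1])
    return img[..., order], order


rng = np.random.default_rng(0)
img = rng.uniform(0.0, 1.0, size=(64, 64, 3))
out, order = shuffle_channels(img, rng)
```

Pixel values are untouched; only the position of each channel along the last axis changes.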

class doctr.transforms.GaussianNoise(mean: float = 0.0, std: float = 1.0)

Adds Gaussian noise to the input tensor

```
>>> import tensorflow as tf
>>> from doctr.transforms import GaussianNoise
>>> transfo = GaussianNoise(0., 1.)
>>> out = transfo(tf.random.uniform(shape=[64, 64, 3], minval=0, maxval=1))
```
Parameters
• mean – mean of the Gaussian distribution

• std – standard deviation of the Gaussian distribution

class doctr.transforms.RandomHorizontalFlip(p: float)

Randomly flips the input tensor/np.ndarray horizontally

```
>>> import numpy as np
>>> import tensorflow as tf
>>> from doctr.transforms import RandomHorizontalFlip
>>> transfo = RandomHorizontalFlip(p=0.5)
>>> image = tf.random.uniform(shape=[64, 64, 3], minval=0, maxval=1)
>>> target = {
...     "boxes": np.array([[0.1, 0.1, 0.4, 0.5]], dtype=np.float32),
...     "labels": np.ones(1, dtype=np.int64),
... }
>>> out = transfo(image, target)
```
Parameters

p – probability of Horizontal Flip
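When a flip is applied, the boxes in the target have to be mirrored along with the image. The NumPy sketch below assumes relative (xmin, ymin, xmax, ymax) boxes, as the example values above suggest; `hflip` is a hypothetical helper and not docTR's implementation.

```python
import numpy as np


def hflip(image: np.ndarray, boxes: np.ndarray):
    """Mirror an (H, W, C) image and its relative (xmin, ymin, xmax, ymax) boxes."""
    flipped_img = image[:, ::-1, :]
    flipped_boxes = boxes.copy()
    # The old right edge becomes the new left edge, and vice versa.
    flipped_boxes[:, 0] = 1.0 - boxes[:, 2]
    flipped_boxes[:, 2] = 1.0 - boxes[:, 0]
    return flipped_img, flipped_boxes


image = np.arange(2 * 4 * 3, dtype=np.float32).reshape(2, 4, 3)
boxes = np.array([[0.1, 0.1, 0.4, 0.5]], dtype=np.float32)
flipped_img, flipped_boxes = hflip(image, boxes)
```

The vertical coordinates are left unchanged, and applying the flip twice returns the original image and boxes.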

```
>>> import tensorflow as tf
>>> out = transfo(tf.random.uniform(shape=[64, 64, 3], minval=0, maxval=1))
```
Parameters

opacity_range – minimum and maximum opacity of the shade

## Composing transformations

It is common to require several transformations to be performed consecutively.

class doctr.transforms.Compose(transforms: List[Callable[[Any], Any]])

Implements a wrapper that will apply transformations sequentially

```
>>> import tensorflow as tf
>>> from doctr.transforms import Compose, Resize
>>> transfos = Compose([Resize((32, 32))])
>>> out = transfos(tf.random.uniform(shape=[64, 64, 3], minval=0, maxval=1))
```
Parameters

transforms – list of transformation modules
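Sequential application means that each transformation receives the output of the previous one, so order matters. A minimal, library-free sketch of the idea follows; the class `SequentialTransforms` is a hypothetical stand-in, not the docTR class.

```python
from typing import Any, Callable, List


class SequentialTransforms:
    """Hypothetical stand-in: feed each callable the previous output."""

    def __init__(self, transforms: List[Callable[[Any], Any]]) -> None:
        self.transforms = transforms

    def __call__(self, x: Any) -> Any:
        for transform in self.transforms:
            x = transform(x)
        return x


def scale(x: float) -> float:
    return x / 255.0


def invert(x: float) -> float:
    return 1.0 - x


# Scale first, then invert: 255 -> 1.0 -> 0.0
result = SequentialTransforms([scale, invert])(255.0)
# Invert first, then scale: 255 -> -254 -> -254 / 255
reversed_result = SequentialTransforms([invert, scale])(255.0)
```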

class doctr.transforms.OneOf(transforms: List[Callable[[Any], Any]])

Randomly apply one of the input transformations

```
>>> import tensorflow as tf
>>> from doctr.transforms import OneOf, RandomGamma, RandomJpegQuality
>>> transfo = OneOf([RandomJpegQuality(), RandomGamma()])
>>> out = transfo(tf.random.uniform(shape=[64, 64, 3], minval=0, maxval=1))
```
Parameters

transforms – list of transformations, only one of which will be picked
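The selection logic can be sketched with the standard `random` module. The class `PickOne` is hypothetical, and picking uniformly among the candidates is an assumption of this sketch.

```python
import random
from typing import Any, Callable, List, Optional


class PickOne:
    """Hypothetical stand-in: apply exactly one randomly chosen callable."""

    def __init__(
        self,
        transforms: List[Callable[[Any], Any]],
        rng: Optional[random.Random] = None,
    ) -> None:
        self.transforms = transforms
        self.rng = rng if rng is not None else random.Random()

    def __call__(self, x: Any) -> Any:
        # Assumption: every candidate is equally likely to be chosen.
        return self.rng.choice(self.transforms)(x)


pick = PickOne([lambda x: x + 1, lambda x: x * 10], rng=random.Random(0))
outputs = [pick(2) for _ in range(100)]
```

Every output is the result of a single transformation (3 or 20 here), never of both chained together.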

class doctr.transforms.RandomApply(transform: Callable[[Any], Any], p: float = 0.5)

Apply the input transformation with probability p

```
>>> import tensorflow as tf
>>> from doctr.transforms import RandomApply, RandomGamma
>>> transfo = RandomApply(RandomGamma(), p=.5)
>>> out = transfo(tf.random.uniform(shape=[64, 64, 3], minval=0, maxval=1))
```
Parameters
• transform – transformation to apply

• p – probability to apply
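Applying a transformation with probability p amounts to a single random draw per call. The sketch below uses a hypothetical class `MaybeApply` built on the standard `random` module; it illustrates the behavior and is not docTR's code.

```python
import random
from typing import Any, Callable, Optional


class MaybeApply:
    """Hypothetical stand-in: apply `transform` with probability `p`."""

    def __init__(
        self,
        transform: Callable[[Any], Any],
        p: float = 0.5,
        rng: Optional[random.Random] = None,
    ) -> None:
        self.transform = transform
        self.p = p
        self.rng = rng if rng is not None else random.Random()

    def __call__(self, x: Any) -> Any:
        # random() is in [0, 1), so p=1.0 always applies and p=0.0 never does.
        if self.rng.random() < self.p:
            return self.transform(x)
        return x


always = MaybeApply(lambda x: -x, p=1.0)
never = MaybeApply(lambda x: -x, p=0.0)
sometimes = MaybeApply(lambda x: -x, p=0.5, rng=random.Random(0))
results = [sometimes(3) for _ in range(50)]
```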