Image Functions

JPEG Decoding
Hybrid JPEG decoding on CPU and GPU using Nvjpeg.
class augpy.Decoder(device_padding: int = 16777216, host_padding: int = 8388608, gpu_huffman: bool = False)

    Bases: augpy._augpy.pybind11_object

    Wrapper for Nvjpeg-based JPEG decoding, created on the current_device.

    __init__(self, device_padding: int = 16777216, host_padding: int = 8388608, gpu_huffman: bool = False) → None
    decode(self, data: str, buffer: augpy._augpy.CudaTensor = None) → augpy._augpy.CudaTensor

        Decode a JPEG image using Nvjpeg. Output is in \((H,W,C)\) format and resides on the GPU device.

        - Parameters
            data (str) – compressed JPEG image as a JFIF string, i.e., the full file contents
            buffer (CudaTensor) – optional buffer to use; may be None; if not None, it must be big enough to contain the decoded image

        - Returns
            new tensor with decoded image on GPU in \((H,W,C)\) format

        - Return type
            CudaTensor
Affine Warp
Functions to apply affine transformations on 2D images.
augpy.make_transform(source_size: Tuple[int, int], target_size: Tuple[int, int], angle: float = 0, scale: float = 1, aspect: float = 1, shift: Optional[Tuple[float, float]] = None, shear: Optional[Tuple[float, float]] = None, hmirror: bool = False, vmirror: bool = False, scale_mode: Union[str, augpy._augpy.WarpScaleMode] = WarpScaleMode.WARP_SCALE_SHORTEST, max_supersampling: int = 3, out: Optional[numpy.ndarray] = None) → Tuple[numpy.ndarray, int]

    Convenience wrapper for make_affine_matrix().

    - See:
        make_affine_matrix() – slightly faster, less convenient version of this function.

    - Parameters
        source_size (Tuple[int, int]) – source height (\(h_s\)) and width (\(w_s\))
        target_size (Tuple[int, int]) – target height (\(h_t\)) and width (\(w_t\))
        angle (float) – clockwise angle in degrees with the image center as rotation axis
        scale (float) – scale factor relative to output size; 1 means fill the target height-wise or width-wise, depending on scale_mode and whichever side is longest/shortest; larger values crop, smaller values leave empty space in the output canvas
        aspect (float) – controls the aspect ratio; 1 means same as input, values greater than 1 increase the width and reduce the height
        shift (Optional[Tuple[float, float]]) – (shifty, shiftx), or None for (0, 0); shifts the image in y (vertical) and x (horizontal) direction; 0 centers the image on the output canvas; -1 means shift up/left as much as possible; 1 means shift down/right as much as possible; the maximum distance to shift is \(\max(scale \cdot h_s - h_t, h_t - scale \cdot h_s)\)
        shear (Optional[Tuple[float, float]]) – (sheary, shearx), or None for (0, 0); controls up/down and left/right shear; for every pixel in the x direction, move sheary pixels in the y direction, and likewise for the y direction
        hmirror (bool) – if True, flip the image horizontally
        vmirror (bool) – if True, flip the image vertically
        scale_mode (Union[str, WarpScaleMode]) – if WarpScaleMode.WARP_SCALE_SHORTEST, scale is relative to the shortest side; this fills the output canvas, cropping the image if necessary; if WarpScaleMode.WARP_SCALE_LONGEST, scale is relative to the longest side; this ensures the image is contained inside the output canvas, but leaves empty space
        max_supersampling (int) – upper limit for recommended supersampling
        out (Optional[numpy.ndarray]) – optional \(2 \times 3\) float output array

    - Returns
        transformation matrix and suggested supersampling factor

    - Return type
        Tuple[numpy.ndarray, int]
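To make the inverse-map convention concrete, here is a minimal CPU sketch of how such a matrix can be assembled for rotation and scale only. The helper make_affine_matrix_sketch is hypothetical and not augpy's implementation, which additionally handles aspect, shear, mirroring, and shift.

```python
import math

def make_affine_matrix_sketch(src_h, src_w, tgt_h, tgt_w, angle=0.0, scale=1.0):
    """Hypothetical, simplified stand-in for make_affine_matrix():
    rotation + scale only, producing an inverse map (target -> source)
    as used by OpenCV's warpAffine with WARP_INVERSE_MAP."""
    # WARP_SCALE_SHORTEST behaviour: scale=1 fills the canvas along
    # the shortest side, cropping the other side if necessary
    f = scale * max(tgt_h / src_h, tgt_w / src_w)
    t = math.radians(angle)
    # inverse of "rotate by angle, then scale by f"
    a, b = math.cos(t) / f, -math.sin(t) / f
    c, d = math.sin(t) / f, math.cos(t) / f
    # translate so the target centre maps onto the source centre
    tx = src_w / 2 - (a * tgt_w / 2 + b * tgt_h / 2)
    ty = src_h / 2 - (c * tgt_w / 2 + d * tgt_h / 2)
    return [[a, b, tx], [c, d, ty]]

# identity case: same sizes, no rotation, scale 1
m = make_affine_matrix_sketch(100, 100, 100, 100)
```

With equal source and target sizes, no rotation, and scale 1, this yields the identity map, as expected of an inverse-map matrix.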
augpy.make_affine_matrix(out: buffer, source_height: int, source_width: int, target_height: int, target_width: int, angle: float = 0.0, scale: float = 1.0, aspect: float = 1.0, shifty: float = 0.0, shiftx: float = 0.0, sheary: float = 0.0, shearx: float = 0.0, hmirror: bool = False, vmirror: bool = False, scale_mode: augpy._augpy.WarpScaleMode = WarpScaleMode.WARP_SCALE_SHORTEST, max_supersampling: int = 3) → int

    Create a \(2 \times 3\) matrix for a set of affine transformations. This matrix is compatible with the warpAffine function of OpenCV with the WARP_INVERSE_MAP flag set.

    Transforms are applied in the following order:

    1. shear
    2. scale & aspect ratio
    3. horizontal & vertical mirror
    4. rotation
    5. horizontal & vertical shift

    - See:
        make_transform() – a more convenient version of this function.

    - Parameters
        out (buffer) – output buffer that the matrix is written to; must be a writeable \(2 \times 3\) float buffer
        source_height (int) – \(h_s\) height of the image in pixels
        source_width (int) – \(w_s\) width of the image in pixels
        target_height (int) – \(h_t\) height of the output canvas in pixels
        target_width (int) – \(w_t\) width of the output canvas in pixels
        angle (float) – clockwise angle in degrees with the image center as rotation axis
        scale (float) – scale factor relative to output size; 1 means fill the target height-wise or width-wise, depending on scale_mode and whichever side is longest/shortest; larger values crop, smaller values leave empty space in the output canvas
        aspect (float) – controls the aspect ratio; 1 means same as input, values greater than 1 increase the width and reduce the height
        shifty (float) – shift the image in y direction (vertical); 0 centers the image on the output canvas; -1 means shift up as much as possible; 1 means shift down as much as possible; the maximum distance to shift is \(\max(scale \cdot h_s - h_t, h_t - scale \cdot h_s)\)
        shiftx (float) – same as shifty, but in x direction (horizontal)
        sheary (float) – controls up/down shear; for every pixel in the x direction, move sheary pixels in the y direction
        shearx (float) – same as sheary, but controls left/right shear
        hmirror (bool) – if True, flip the image horizontally
        vmirror (bool) – if True, flip the image vertically
        scale_mode (WarpScaleMode) – if WarpScaleMode.WARP_SCALE_SHORTEST, scale is relative to the shortest side; this fills the output canvas, cropping the image if necessary; if WarpScaleMode.WARP_SCALE_LONGEST, scale is relative to the longest side; this ensures the image is contained inside the output canvas, but leaves empty space
        max_supersampling (int) – upper limit for recommended supersampling

    - Returns
        recommended supersampling factor for the warp

    - Return type
        int
augpy.warp_affine(src: augpy._augpy.CudaTensor, dst: augpy._augpy.CudaTensor, matrix: buffer, background: augpy._augpy.CudaTensor, supersampling: int) → None

    Takes an image in channels-last format \((H, W, C)\) and affine warps it into a given output tensor in channels-first format \((C, H, W)\). Any blank canvas is filled with a background color. The warp is performed with bi-linear interpolation and supersampling.

    - Parameters
        src (CudaTensor) – image tensor
        dst (CudaTensor) – target tensor
        matrix (buffer) – \(2 \times 3\) float transformation matrix, see make_affine_matrix() for details
        background (CudaTensor) – background color to fill empty canvas
        supersampling (int) – supersampling factor, e.g., 3 means 9 samples are taken in a \(3 \times 3\) grid

    - Return type
        None
class augpy.WarpScaleMode(arg0: int)

    Bases: object

    Enum whether to scale relative to the shortest or longest side of the image.

    Members:
        WARP_SCALE_SHORTEST – scaling is relative to the shortest side of the image.
        WARP_SCALE_LONGEST – scaling is relative to the longest side of the image.
    property WARP_SCALE_LONGEST
        Scaling is relative to the longest side of the image.

    property WARP_SCALE_SHORTEST
        Scaling is relative to the shortest side of the image.
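The practical difference between the two members can be illustrated with a hypothetical helper in plain Python (not part of augpy):

```python
def forward_scale(src_h, src_w, tgt_h, tgt_w, scale, mode):
    """Hypothetical helper illustrating WarpScaleMode semantics:
    returns the source-to-target scale factor."""
    ratios = (tgt_h / src_h, tgt_w / src_w)
    if mode == "shortest":
        # fill the canvas completely, cropping the longer side
        return scale * max(ratios)
    # "longest": fit the whole image inside the canvas, leaving empty space
    return scale * min(ratios)

# 100x200 source into a 100x100 canvas:
# "shortest" fills the canvas and crops the width,
# "longest" fits the whole image and leaves empty vertical space
print(forward_scale(100, 200, 100, 100, 1.0, "shortest"))  # 1.0
print(forward_scale(100, 200, 100, 100, 1.0, "longest"))   # 0.5
```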
Lighting

The following functions change the lighting of 2D images. Both input and output must be contiguous and in channel-first format \((C,H,W)\) (channel, height, width). All dtypes and an arbitrary number of channels are supported.

The output tensor out may be None, in which case a new tensor of the same shape and dtype as the input is created and returned. If an output tensor is given, it must have the same shape and dtype as the input.
augpy.lighting(imtensor: augpy._augpy.CudaTensor, gammagrays: augpy._augpy.CudaTensor, gammacolors: augpy._augpy.CudaTensor, contrasts: augpy._augpy.CudaTensor, vmin: float, vmax: float, out: augpy._augpy.CudaTensor = None) → augpy._augpy.CudaTensor

    Apply lighting augmentation to a batch of images. This is a four-step process:

    1. Normalize values \(v_{norm} = \frac{v - v_{min}}{v_{max}-v_{min}}\) with \(v_{min}\) the minimum and \(v_{max}\) the maximum lightness value
    2. Apply contrast change
    3. Apply gamma correction
    4. Denormalize values \(v' = v_{norm} \cdot (v_{max}-v_{min}) + v_{min}\)

    To change contrast, two reference functions are used. With contrast \(\mathcal{c} \ge 0\), i.e., increased contrast, the following function is used:

    \[f_{pos}(v) = \frac{1.0037575963899724}{1 + \exp(6.279 - v \cdot 12.558)} - 0.0018787981949862\]

    With contrast \(\mathcal{c} < 0\), i.e., decreased contrast, the following function is used:

    \[f_{neg}(v) = 0.1755606108304832 \cdot \mathrm{atanh}(v \cdot 1.986608 - 0.993304) + 0.5\]

    The final value is \(v' = (1-\mathcal{c}) \cdot v + \mathcal{c} \cdot f(v)\).

    Brightness and color changes are done via gamma correction:

    \[v' = v^{\gamma_{gray} \cdot \gamma_c}\]

    with \(\gamma_{gray}\) the gamma for overall lightness and \(\gamma_{c}\) the per-channel gamma.

    - Parameters
        imtensor (CudaTensor) – image tensor in \((N,C,H,W)\) format
        gammagrays (CudaTensor) – tensor of \(N\) gamma gray values
        gammacolors (CudaTensor) – tensor of \(C \cdot N\) gamma values in the format \(\gamma_{1,1}, \gamma_{1,2}, ..., \gamma_{1,C}, \gamma_{2,1}, \gamma_{2,2}, ..., \gamma_{N,C-1}, \gamma_{N,C}\)
        contrasts (CudaTensor) – tensor of \(N\) contrast values in \([-1, 1]\)
        vmin (float) – minimum lightness value in images
        vmax (float) – maximum lightness value in images
        out (CudaTensor) – output tensor (may be None)

    - Returns
        new tensor if out is None, else out

    - Return type
        CudaTensor
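The four steps can be traced on a single value in plain Python. This is a hypothetical per-pixel sketch, not augpy's GPU implementation; in particular, the blend weight for negative contrast is assumed here to be the magnitude |c|:

```python
import math

def lighting_pixel_sketch(v, gamma_gray, gamma_color, contrast, vmin=0.0, vmax=255.0):
    """Hypothetical per-pixel sketch of the four lighting steps."""
    # 1. normalize to [0, 1]
    vn = (v - vmin) / (vmax - vmin)
    # 2. contrast: blend with the S-curve f_pos (c >= 0) or f_neg (c < 0);
    #    using |c| as the blend weight is an assumption of this sketch
    if contrast >= 0:
        f = 1.0037575963899724 / (1 + math.exp(6.279 - vn * 12.558)) - 0.0018787981949862
    else:
        f = 0.1755606108304832 * math.atanh(vn * 1.986608 - 0.993304) + 0.5
    c = abs(contrast)
    vn = (1 - c) * vn + c * f
    # 3. gamma correction for overall brightness and per-channel color
    vn = vn ** (gamma_gray * gamma_color)
    # 4. denormalize back to [vmin, vmax]
    return vn * (vmax - vmin) + vmin
```

With contrast 0 and both gammas 1 the pixel passes through unchanged; positive contrast pushes values away from the midtone, negative contrast pulls them toward it.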
Blur

The following functions apply different types of blur to 2D images. Both input and output must be contiguous and in channel-first format \((C,H,W)\) (channel, height, width). All dtypes are supported. Edge values are repeated for locations that fall outside the input image.

The output tensor out may be None, in which case a new tensor of the same shape and dtype as the input is created and returned. If an output tensor is given, it must have the same shape and dtype as the input.
augpy.box_blur_single(input: augpy._augpy.CudaTensor, ksize: int, out: augpy._augpy.CudaTensor = None) → augpy._augpy.CudaTensor

    Apply box blur to a single image.

    The kernel size describes both the width and height in pixels of the area in the input that is averaged for each output pixel. Odd values are recommended for best results. For even values, the center of the kernel is below and to the right of the true center, which means the output is shifted up and left by half a pixel.

    - Parameters
        input (CudaTensor) – image tensor in channel-first format
        ksize (int) – kernel size in pixels
        out (CudaTensor) – output tensor (may be None)

    - Returns
        new tensor if out is None, else out

    - Return type
        CudaTensor
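For illustration, a hypothetical CPU sketch of a single-channel box blur with replicated edges (augpy runs the equivalent per channel on the GPU):

```python
def box_blur_sketch(img, ksize):
    """Hypothetical CPU box blur on a 2D list of floats (one channel)."""
    h, w = len(img), len(img[0])
    # for even ksize the kernel centre lies below/right of the true centre
    r = (ksize - 1) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, ksize - r):
                for dx in range(-r, ksize - r):
                    # clamp coordinates: repeat edge values outside the image
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += img[yy][xx]
            out[y][x] = acc / (ksize * ksize)
    return out
```

A constant image stays constant, and a single bright pixel is spread evenly over the \(ksize \times ksize\) neighborhood.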
augpy.gaussian_blur_single(input: augpy._augpy.CudaTensor, sigma: float, out: augpy._augpy.CudaTensor = None) → augpy._augpy.CudaTensor

    Apply Gaussian blur to a single image.

    The kernel size is calculated as follows:

        ksize = max(3, int(sigma * 6.6 - 2.3) | 1)

    I.e., ksize is at least 3 and always odd.

    - Parameters
        input (CudaTensor) – image tensor in channel-first format
        sigma (float) – standard deviation of the kernel
        out (CudaTensor) – output tensor (may be None)

    - Returns
        new tensor if out is None, else out

    - Return type
        CudaTensor
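The kernel-size formula can be checked directly in plain Python (gaussian_ksize is a hypothetical helper name):

```python
def gaussian_ksize(sigma):
    # kernel size used for Gaussian blur: at least 3 and always odd
    # (the "| 1" sets the lowest bit, rounding even values up to the next odd)
    return max(3, int(sigma * 6.6 - 2.3) | 1)

print(gaussian_ksize(0.5))  # 3
print(gaussian_ksize(1.0))  # 5
print(gaussian_ksize(2.0))  # 11
print(gaussian_ksize(5.0))  # 31
```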
augpy.gaussian_blur(input: augpy._augpy.CudaTensor, sigmas: augpy._augpy.CudaTensor, max_ksize: int, out: augpy._augpy.CudaTensor = None) → augpy._augpy.CudaTensor

    Apply Gaussian blur to a batch of images.

    The maximum kernel size can be calculated as follows:

        ksize = max(3, int(max(sigmas) * 6.6 - 2.3) | 1)

    I.e., ksize is at least 3 and always odd. max_ksize defines the upper limit: the actual kernel size per image is calculated with the formula above and clipped at the given maximum. Smaller values can be given to trade quality for speed; bigger values typically do not visibly improve quality.

    Odd values are strongly recommended for best results. For even values, the center of the kernel is below and to the right of the true center, which means the output is shifted up and left by half a pixel. This can lead to inconsistencies between images in the batch: images with large sigmas may be shifted, while smaller sigmas mean no shift occurs.

    - Parameters
        input (CudaTensor) – batch tensor with images in first dimension
        sigmas (CudaTensor) – float tensor with one sigma value per image in the batch
        max_ksize (int) – maximum kernel size in pixels
        out (CudaTensor) – output tensor (may be None)

    - Returns
        new tensor if out is None, else out

    - Return type
        CudaTensor
augpy.image

class augpy.image.DecodeWarp(batch_size: int, shape: Tuple[int, int, int], background: augpy._augpy.CudaTensor = None, dtype: augpy._augpy.DLDataType = <augpy._augpy.DLDataType object>, cpu_threads: int = 1, num_buffers: int = 2, decode_buffer_size: Optional[int] = None)

    Bases: object

    Use Decoder.decode() to decode JPEG images in memory and apply warp_affine() into batch tensor buffers. DecodeWarp instances allocate buffers and decoders on the current_device.

    - Parameters
        batch_size (int) – number of samples in a batch
        shape (Tuple[int, int, int]) – shape of one image in the batch \((C,H,W)\)
        background (CudaTensor) – tensor with \(C\) background color values
        dtype (DLDataType) – type used for buffer tensors
        cpu_threads (int) – number of parallel decoders
        num_buffers (int) – number of buffer tensors
        decode_buffer_size (Optional[int]) – size of pre-allocated buffer for decoding; must be larger than the number of subpixels; if None, a new buffer is allocated every time

    __call__(batch: dict) → dict

        Decode a list of JPEG images under the 'image' key and warp them into a batch tensor with the parameters defined by a list of augmentation dicts under the 'augmentation' key. Each set of augmentation parameters is a dict that contains values for the parameters of the warp_affine() function. Additional parameters are ignored.
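As a hypothetical illustration of the batch layout: the 'image' and 'augmentation' keys are taken from the docstring above, while the parameter names and values inside the augmentation dicts are invented for this sketch.

```python
# hypothetical batch as consumed by DecodeWarp.__call__;
# the byte strings stand in for real JFIF file contents
batch = {
    "image": [b"\xff\xd8...", b"\xff\xd8..."],  # raw JPEG bytes, one per sample
    "augmentation": [
        {"angle": 10.0, "scale": 1.2, "hmirror": True},  # warp params, sample 0
        {"angle": -5.0, "shift": (0.1, -0.3)},           # warp params, sample 1
    ],
}

# one augmentation dict per image
assert len(batch["image"]) == len(batch["augmentation"])
```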
class augpy.image.Lighting(batch_size: int, channels: int = 3, min_value: Union[int, float] = 0, max_value: Union[int, float] = 255)

    Bases: object

    Apply the lighting() function to a batch of images. The batch tensor must have format \((N,C,H,W)\), with batch size \(N\), number of channels \(C\), height \(H\), and width \(W\). Lighting instances allocate buffers on the current_device.

    - Parameters
        batch_size (int) – number of samples in a batch
        channels (int) – number of channels per image
        min_value (Union[int, float]) – minimum lightness value in images
        max_value (Union[int, float]) – maximum lightness value in images

    __call__(batch)

        Apply lighting augmentation to a batch of images in a tensor under the 'image' key, with parameters defined by a list of augmentation dicts under the 'augmentation' key. Modifies the image tensor in-place. Each set of augmentation parameters is a dict that contains values for the parameters of the lighting() function. Additional parameters are ignored.

        - Parameters
            batch – dict {'image': batch_tensor, 'augmentation': [params, params, ...]}

        - Returns
            batch, where 'image' has been modified in-place according to the given augmentation parameters.