models.segmentation
This file provides the definition of the convolutional heads used to predict masks, as well as the losses.
Classes

- MaskHeadSmallConv – Simple convolutional head, using group norm.
- MHAttentionMap – A 2D attention module that returns only the attention softmax (no multiplication by value).
- PostProcessPanoptic – Converts the output of the model to the final panoptic result, in the format expected by the COCO panoptic API.

Functions

- dice_loss – Compute the DICE loss, similar to generalized IoU for masks.
- sigmoid_focal_loss – Loss used in RetinaNet for dense detection: https://arxiv.org/abs/1708.02002.
Module Contents
- class models.segmentation.MaskHeadSmallConv(dim, fpn_dims, context_dim)[source]
Simple convolutional head, using group norm.
Upsampling is done using an FPN approach.
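The precise layer stack lives in the source; as a rough sketch of the idea (a lateral 1x1 projection of an FPN feature, fused with the upsampled coarse map, then refined by a GroupNorm'd convolution), one such stage could look like the following. All names and dimensions here are hypothetical, not the actual attributes of MaskHeadSmallConv.

    import torch
    import torch.nn.functional as F
    from torch import nn

    class FPNUpsampleStage(nn.Module):
        """Hypothetical single upsampling stage in the style described above."""

        def __init__(self, fpn_dim: int, out_dim: int):
            super().__init__()
            self.adapter = nn.Conv2d(fpn_dim, out_dim, kernel_size=1)         # lateral projection
            self.refine = nn.Conv2d(out_dim, out_dim, kernel_size=3, padding=1)
            self.gn = nn.GroupNorm(8, out_dim)                                 # group norm, 8 groups

        def forward(self, coarse: torch.Tensor, fpn_feat: torch.Tensor) -> torch.Tensor:
            lateral = self.adapter(fpn_feat)
            # Upsample the coarse map to the lateral resolution and fuse.
            fused = lateral + F.interpolate(coarse, size=lateral.shape[-2:], mode="nearest")
            return F.relu(self.gn(self.refine(fused)))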
- class models.segmentation.MHAttentionMap(query_dim, hidden_dim, num_heads, dropout=0.0, bias=True)[source]
A 2D attention module that returns only the attention softmax (no multiplication by value).
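Conceptually, this is standard multi-head dot-product attention between object queries and image features, stopped right after the softmax. A minimal sketch of that computation (the function name, shapes, and softmax layout are assumptions for illustration, not the module's actual internals):

    import torch
    import torch.nn.functional as F

    def attention_softmax_only(q: torch.Tensor, k: torch.Tensor, num_heads: int) -> torch.Tensor:
        """Per-head attention maps over spatial positions, with no value tensor.

        q: [batch, num_queries, hidden_dim]  projected query embeddings
        k: [batch, hidden_dim, h, w]         projected image features
        """
        bs, nq, hidden_dim = q.shape
        h, w = k.shape[-2:]
        head_dim = hidden_dim // num_heads
        qh = q.view(bs, nq, num_heads, head_dim)
        kh = k.view(bs, num_heads, head_dim, h, w)
        # Scaled dot product between each query head and every spatial location.
        attn = torch.einsum("bqnc,bnchw->bqnhw", qh * head_dim ** -0.5, kh)
        # Softmax over the flattened spatial dimensions, then restore h x w.
        return F.softmax(attn.flatten(3), dim=-1).view(bs, nq, num_heads, h, w)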
- models.segmentation.dice_loss(inputs, targets, num_boxes)[source]
Compute the DICE loss, similar to generalized IoU for masks.
- Parameters:
inputs – A float tensor of arbitrary shape. The predictions for each example.
targets – A float tensor with the same shape as inputs. Stores the binary classification label for each element in inputs (0 for the negative class and 1 for the positive class).
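For reference, the soft DICE loss on logits is typically computed per mask as 1 - (2·|p∩t| + 1) / (|p| + |t| + 1). A minimal sketch, assuming inputs are raw logits and the summed loss is normalized by num_boxes (that normalization convention is an assumption here):

    import torch

    def dice_loss_sketch(inputs: torch.Tensor, targets: torch.Tensor, num_boxes: float) -> torch.Tensor:
        """Soft DICE loss on logits, reduced to one scalar over num_boxes."""
        probs = inputs.sigmoid().flatten(1)                 # [N, H*W]
        targets = targets.flatten(1)
        numerator = 2 * (probs * targets).sum(-1)
        denominator = probs.sum(-1) + targets.sum(-1)
        loss = 1 - (numerator + 1) / (denominator + 1)      # +1 smoothing avoids 0/0
        return loss.sum() / num_boxes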
- models.segmentation.sigmoid_focal_loss(inputs, targets, num_boxes, alpha: float = 0.25, gamma: float = 2)[source]
Loss used in RetinaNet for dense detection: https://arxiv.org/abs/1708.02002.
- Parameters:
inputs – A float tensor of arbitrary shape. The predictions for each example.
targets – A float tensor with the same shape as inputs. Stores the binary classification label for each element in inputs (0 for the negative class and 1 for the positive class).
alpha – (optional) Weighting factor in range (0, 1) to balance positive vs negative examples. Defaults to 0.25; a value of -1 disables weighting.
gamma – Exponent of the modulating factor (1 - p_t), used to balance easy vs hard examples.
- Returns:
Loss tensor
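Focal loss rescales per-element binary cross-entropy by the modulating factor (1 - p_t)^gamma so that easy, well-classified examples contribute less. A minimal sketch under the same assumptions as above (logits in, per-mask mean, normalized by num_boxes):

    import torch
    import torch.nn.functional as F

    def sigmoid_focal_loss_sketch(inputs, targets, num_boxes, alpha: float = 0.25, gamma: float = 2):
        """Focal loss on logits; inputs/targets assumed [num_masks, num_pixels]."""
        prob = inputs.sigmoid()
        ce_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
        p_t = prob * targets + (1 - prob) * (1 - targets)   # probability of the true class
        loss = ce_loss * (1 - p_t) ** gamma                 # down-weight easy examples
        if alpha >= 0:                                      # alpha = -1 disables weighting
            alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
            loss = alpha_t * loss
        return loss.mean(1).sum() / num_boxes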
- class models.segmentation.PostProcessPanoptic(is_thing_map, threshold=0.85)[source]
This class converts the output of the model to the final panoptic result, in the format expected by the COCO panoptic API.
- forward(outputs, processed_sizes, target_sizes=None)[source]
This function computes the panoptic prediction from the model’s predictions.
- Parameters:
outputs – This is a dict coming directly from the model. See the model doc for the content.
processed_sizes – This is a list of tuples (or torch tensors) giving the sizes of the images that were passed to the model, i.e. the size after data augmentation but before batching.
target_sizes – This is a list of tuples (or torch tensors) corresponding to the requested final size of each prediction. If left to None, it defaults to processed_sizes.
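For orientation, a hedged usage sketch; the is_thing_map contents and the sizes below are placeholders, and model / images are assumed to already exist:

    from models.segmentation import PostProcessPanoptic

    # Hypothetical is_thing_map: category id -> True for "thing" (instance)
    # classes, False for "stuff"; the ids here are placeholders.
    is_thing_map = {i: i <= 90 for i in range(201)}
    postprocessor = PostProcessPanoptic(is_thing_map, threshold=0.85)

    outputs = model(images)             # raw dict from the panoptic model
    processed_sizes = [(800, 1066)]     # sizes after augmentation, before batching
    target_sizes = [(480, 640)]         # requested size of each final prediction
    predictions = postprocessor(outputs, processed_sizes, target_sizes)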