models.segmentation

This file provides the definition of the convolutional heads used to predict masks, as well as the losses.

Classes

MaskHeadSmallConv

Simple convolutional head, using group norm.

MHAttentionMap

This is a 2D attention module, which only returns the attention softmax (no multiplication by the value).

PostProcessPanoptic

This class converts the output of the model to the final panoptic result, in the format expected by the coco panoptic API.

Functions

dice_loss(inputs, targets, num_boxes)

Compute the DICE loss, similar to generalized IOU for masks.

sigmoid_focal_loss(inputs, targets, num_boxes[, ...])

Loss used in RetinaNet for dense detection: https://arxiv.org/abs/1708.02002.

Module Contents

class models.segmentation.MaskHeadSmallConv(dim, fpn_dims, context_dim)[source]

Simple convolutional head, using group norm.

Upsampling is done using an FPN approach.
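The core FPN-style combination step — upsample the coarser feature map and add the finer one — can be sketched in plain Python (a hypothetical simplification: the real head operates on torch tensors and interleaves convolutions and group norm between steps):

```python
def upsample2x_nearest(fm):
    """Nearest-neighbor 2x upsample of a 2D map given as a list of rows."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in (0, 1)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                   # repeat each row
    return out

def fpn_combine(coarse, fine):
    """Upsample `coarse` to the resolution of `fine` and add elementwise."""
    up = upsample2x_nearest(coarse)
    return [[u + f for u, f in zip(ur, fr)] for ur, fr in zip(up, fine)]

coarse = [[1.0, 2.0],
          [3.0, 4.0]]
fine = [[0.1] * 4 for _ in range(4)]
merged = fpn_combine(coarse, fine)  # 4x4 map: upsampled coarse + fine
```

Each stage of the head repeats this pattern at progressively finer resolutions until the mask logits reach the target size.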

class models.segmentation.MHAttentionMap(query_dim, hidden_dim, num_heads, dropout=0.0, bias=True)[source]

This is a 2D attention module, which only returns the attention softmax (no multiplication by the value).
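Conceptually, the module scores a query against a key at every spatial location and softmaxes over those locations, returning only the weights. A plain-Python sketch of that softmax map (hypothetical shapes; the real module is multi-headed and uses learned linear projections on torch tensors):

```python
import math

def attention_map(query, keys):
    """Softmax over scaled dot-product scores between one query vector and
    a list of key vectors (one per spatial position). Returns only the
    attention weights -- no multiplication by a value tensor."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Three spatial positions; the first key aligns best with the query.
weights = attention_map([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
```

In the model, these per-position weights serve as coarse attention masks that are then refined by MaskHeadSmallConv.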

models.segmentation.dice_loss(inputs, targets, num_boxes)[source]

Compute the DICE loss, similar to generalized IOU for masks.

Parameters:
  • inputs – A float tensor of arbitrary shape. The predictions for each example.

  • targets – A float tensor with the same shape as inputs. Stores the binary classification label for each element in inputs (0 for the negative class and 1 for the positive class).

  • num_boxes – The number of boxes used to normalize the summed loss.
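The formula can be illustrated numerically. A plain-Python sketch under the usual soft-Dice convention (a +1 smoothing term in numerator and denominator, per-example losses summed and divided by num_boxes; the actual implementation operates on sigmoided, flattened torch tensors):

```python
def dice_loss(probs, targets, num_boxes):
    """probs/targets: one flattened list per example; probs already in [0, 1]."""
    total = 0.0
    for p, t in zip(probs, targets):
        numerator = 2.0 * sum(pi * ti for pi, ti in zip(p, t))
        denominator = sum(p) + sum(t)
        total += 1.0 - (numerator + 1.0) / (denominator + 1.0)
    return total / num_boxes

# A perfect prediction drives the per-example loss to 0.
loss = dice_loss([[1.0, 1.0, 0.0, 0.0]], [[1.0, 1.0, 0.0, 0.0]], num_boxes=1)
```

Unlike a per-pixel cross-entropy, the Dice term scores the overlap of the whole mask, which is why it behaves like a generalized IoU.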

models.segmentation.sigmoid_focal_loss(inputs, targets, num_boxes, alpha: float = 0.25, gamma: float = 2)[source]

Loss used in RetinaNet for dense detection: https://arxiv.org/abs/1708.02002.

Parameters:
  • inputs – A float tensor of arbitrary shape. The predictions for each example.

  • targets – A float tensor with the same shape as inputs. Stores the binary classification label for each element in inputs (0 for the negative class and 1 for the positive class).

  • num_boxes – The number of boxes used to normalize the summed loss.

  • alpha – (optional) Weighting factor in range (0, 1) to balance positive vs negative examples. Default = 0.25.

  • gamma – (optional) Exponent of the modulating factor (1 - p_t) to balance easy vs hard examples. Default = 2.

Returns:

Loss tensor
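A plain-Python sketch of the per-element focal term (sigmoid cross-entropy scaled by the modulating factor (1 - p_t)^gamma and the alpha balance; the real function works on torch tensors and normalizes the per-example losses by num_boxes):

```python
import math

def sigmoid_focal_term(logit, target, alpha=0.25, gamma=2.0):
    """Focal loss for a single element; `target` is 0.0 or 1.0."""
    p = 1.0 / (1.0 + math.exp(-logit))  # sigmoid probability
    ce = -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    p_t = p * target + (1.0 - p) * (1.0 - target)
    loss = ce * (1.0 - p_t) ** gamma    # down-weight easy examples
    if alpha >= 0:
        alpha_t = alpha * target + (1.0 - alpha) * (1.0 - target)
        loss = alpha_t * loss
    return loss

# An uncertain positive (logit 0 -> p = 0.5):
val = sigmoid_focal_term(0.0, 1.0)
```

With gamma = 2, a well-classified element (p_t near 1) contributes almost nothing, which is the point of the focal formulation for dense prediction.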

class models.segmentation.PostProcessPanoptic(is_thing_map, threshold=0.85)[source]

This class converts the output of the model to the final panoptic result, in the format expected by the coco panoptic API.

forward(outputs, processed_sizes, target_sizes=None)[source]

This function computes the panoptic prediction from the model’s predictions.

Parameters:
  • outputs – This is a dict coming directly from the model. See the model doc for the content.

  • processed_sizes – This is a list of tuples (or torch tensors) of sizes of the images that were passed to the model, i.e. the size after data augmentation but before batching.

  • target_sizes – This is a list of tuples (or torch tensors) corresponding to the requested final size of each prediction. If left to None, it defaults to processed_sizes.
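The heart of the merge is a per-pixel argmax over the predicted mask scores: each pixel is assigned to the segment whose mask scores highest at that location. A plain-Python sketch (a hypothetical simplification; the real class also filters segments by class confidence, resizes masks to the target size, and emits the PNG/JSON pair the coco panoptic API expects):

```python
def merge_masks(mask_scores):
    """mask_scores: one 2D score map (list of rows) per candidate segment.
    Returns a 2D map of winning segment indices."""
    h, w = len(mask_scores[0]), len(mask_scores[0][0])
    seg = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            scores = [m[y][x] for m in mask_scores]
            seg[y][x] = scores.index(max(scores))
    return seg

seg = merge_masks([
    [[0.9, 0.1],
     [0.9, 0.1]],   # segment 0 dominates the left column
    [[0.2, 0.8],
     [0.2, 0.8]],   # segment 1 dominates the right column
])
```

Because every pixel gets exactly one segment id, the result is a valid panoptic segmentation with no overlapping masks.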