models.segmentation
This file provides the definition of the convolutional heads used to predict masks, as well as the losses.
Classes

- MaskHeadSmallConv – Simple convolutional head, using group norm.
- MHAttentionMap – A 2D attention module that returns only the attention softmax (no multiplication by value).
- PostProcessPanoptic – Converts the output of the model to the final panoptic result, in the format expected by the COCO panoptic API.

Functions

- dice_loss – Compute the DICE loss, similar to generalized IoU for masks.
- sigmoid_focal_loss – Loss used in RetinaNet for dense detection: https://arxiv.org/abs/1708.02002.
Module Contents
- class models.segmentation.MaskHeadSmallConv(dim, fpn_dims, context_dim)[source]
Simple convolutional head, using group norm.
Upsampling is done using an FPN approach.
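The precise layer stack lives in the source; as a rough sketch of the idea (a lateral 1x1 projection of an FPN feature, fused with the upsampled coarse map, then refined by a GroupNorm'd convolution), one such stage could look like the following. All names and dimensions here are hypothetical, not the actual attributes of MaskHeadSmallConv.

    import torch
    import torch.nn.functional as F
    from torch import nn

    class FPNUpsampleStage(nn.Module):
        """Hypothetical single upsampling stage in the style described above."""

        def __init__(self, fpn_dim: int, out_dim: int):
            super().__init__()
            self.adapter = nn.Conv2d(fpn_dim, out_dim, kernel_size=1)         # lateral projection
            self.refine = nn.Conv2d(out_dim, out_dim, kernel_size=3, padding=1)
            self.gn = nn.GroupNorm(8, out_dim)                                 # group norm, 8 groups

        def forward(self, coarse: torch.Tensor, fpn_feat: torch.Tensor) -> torch.Tensor:
            lateral = self.adapter(fpn_feat)
            # Upsample the coarse map to the lateral resolution and fuse.
            fused = lateral + F.interpolate(coarse, size=lateral.shape[-2:], mode="nearest")
            return F.relu(self.gn(self.refine(fused)))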
- class models.segmentation.MHAttentionMap(query_dim, hidden_dim, num_heads, dropout=0.0, bias=True)[source]
A 2D attention module that returns only the attention softmax (no multiplication by value).
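Conceptually, this is standard multi-head dot-product attention between object queries and image features, stopped right after the softmax. A minimal sketch of that computation (the function name, shapes, and softmax layout are assumptions for illustration, not the module's actual internals):

    import torch
    import torch.nn.functional as F

    def attention_softmax_only(q: torch.Tensor, k: torch.Tensor, num_heads: int) -> torch.Tensor:
        """Per-head attention maps over spatial positions, with no value tensor.

        q: [batch, num_queries, hidden_dim]  projected query embeddings
        k: [batch, hidden_dim, h, w]         projected image features
        """
        bs, nq, hidden_dim = q.shape
        h, w = k.shape[-2:]
        head_dim = hidden_dim // num_heads
        qh = q.view(bs, nq, num_heads, head_dim)
        kh = k.view(bs, num_heads, head_dim, h, w)
        # Scaled dot product between each query head and every spatial location.
        attn = torch.einsum("bqnc,bnchw->bqnhw", qh * head_dim ** -0.5, kh)
        # Softmax over the flattened spatial dimensions, then restore h x w.
        return F.softmax(attn.flatten(3), dim=-1).view(bs, nq, num_heads, h, w)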
- models.segmentation.dice_loss(inputs, targets, num_boxes)[source]
Compute the DICE loss, similar to generalized IoU for masks.
- Parameters:
inputs – A float tensor of arbitrary shape. The predictions for each example.
targets – A float tensor with the same shape as inputs. Stores the binary classification label for each element in inputs (0 for the negative class and 1 for the positive class).
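For reference, the soft DICE loss on logits is typically computed per mask as 1 - (2·|p∩t| + 1) / (|p| + |t| + 1). A minimal sketch, assuming inputs are raw logits and the summed loss is normalized by num_boxes (that normalization convention is an assumption here):

    import torch

    def dice_loss_sketch(inputs: torch.Tensor, targets: torch.Tensor, num_boxes: float) -> torch.Tensor:
        """Soft DICE loss on logits, reduced to one scalar over num_boxes."""
        probs = inputs.sigmoid().flatten(1)                 # [N, H*W]
        targets = targets.flatten(1)
        numerator = 2 * (probs * targets).sum(-1)
        denominator = probs.sum(-1) + targets.sum(-1)
        loss = 1 - (numerator + 1) / (denominator + 1)      # +1 smoothing avoids 0/0
        return loss.sum() / num_boxes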
- models.segmentation.sigmoid_focal_loss(inputs, targets, num_boxes, alpha: float = 0.25, gamma: float = 2)[source]
Loss used in RetinaNet for dense detection: https://arxiv.org/abs/1708.02002.
- Parameters:
inputs – A float tensor of arbitrary shape. The predictions for each example.
targets – A float tensor with the same shape as inputs. Stores the binary classification label for each element in inputs (0 for the negative class and 1 for the positive class).
alpha – (optional) Weighting factor in range (0, 1) to balance positive vs negative examples. Defaults to 0.25; a value of -1 disables weighting.
gamma – Exponent of the modulating factor (1 - p_t), used to balance easy vs hard examples.
- Returns:
Loss tensor
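Focal loss rescales per-element binary cross-entropy by the modulating factor (1 - p_t)^gamma so that easy, well-classified examples contribute less. A minimal sketch under the same assumptions as above (logits in, per-mask mean, normalized by num_boxes):

    import torch
    import torch.nn.functional as F

    def sigmoid_focal_loss_sketch(inputs, targets, num_boxes, alpha: float = 0.25, gamma: float = 2):
        """Focal loss on logits; inputs/targets assumed [num_masks, num_pixels]."""
        prob = inputs.sigmoid()
        ce_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
        p_t = prob * targets + (1 - prob) * (1 - targets)   # probability of the true class
        loss = ce_loss * (1 - p_t) ** gamma                 # down-weight easy examples
        if alpha >= 0:                                      # alpha = -1 disables weighting
            alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
            loss = alpha_t * loss
        return loss.mean(1).sum() / num_boxes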
- class models.segmentation.PostProcessPanoptic(is_thing_map, threshold=0.85)[source]
This class converts the output of the model to the final panoptic result, in the format expected by the COCO panoptic API.
- forward(outputs, processed_sizes, target_sizes=None)[source]
This function computes the panoptic prediction from the model’s predictions.
- Parameters:
outputs – This is a dict coming directly from the model. See the model doc for the content.
processed_sizes – This is a list of tuples (or torch tensors) giving the sizes of the images that were passed to the model, i.e. the size after data augmentation but before batching.
target_sizes – This is a list of tuples (or torch tensors) corresponding to the requested final size of each prediction. If left to None, it defaults to processed_sizes.
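For orientation, a hedged usage sketch; the is_thing_map contents and the sizes below are placeholders, and model / images are assumed to already exist:

    from models.segmentation import PostProcessPanoptic

    # Hypothetical is_thing_map: category id -> True for "thing" (instance)
    # classes, False for "stuff"; the ids here are placeholders.
    is_thing_map = {i: i <= 90 for i in range(201)}
    postprocessor = PostProcessPanoptic(is_thing_map, threshold=0.85)

    outputs = model(images)             # raw dict from the panoptic model
    processed_sizes = [(800, 1066)]     # sizes after augmentation, before batching
    target_sizes = [(480, 640)]         # requested size of each final prediction
    predictions = postprocessor(outputs, processed_sizes, target_sizes)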