Deep Neural Network Library (DNNL)
1.2.0

Performance library for Deep Learning

Pooling

The pooling primitive performs a forward or backward max or average pooling operation on 1D, 2D, or 3D spatial data.

The pooling operation is defined by the following formulas. The formulas are shown for 2D spatial data only; they generalize straightforwardly to higher and lower dimensions. Variable names follow the standard Naming Conventions.

Max pooling:

\[ dst(n, c, oh, ow) = \max\limits_{kh, kw} \left( src(n, c, oh \cdot SH + kh - ph_0, ow \cdot SW +kw - pw_0) \right) \]

Average pooling:

\[ dst(n, c, oh, ow) = \frac{1}{DENOM} \sum\limits_{kh, kw} src(n, c, oh \cdot SH + kh - ph_0, ow \cdot SW +kw - pw_0) \]

where \(ph_0, pw_0\) are `padding_l[0]` and `padding_l[1]` respectively, and output spatial dimensions are calculated similarly to how they are done in convolution.
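The max pooling formula above can be sketched as a plain C++ reference loop. This is an illustration only, not the library's implementation: the struct `pool2d_params` and the functions `out_size` and `max_pool2d_ref` are hypothetical names, the tensor is assumed to be a dense NCHW `float` buffer, and window positions that fall into the padding area are assumed to be skipped. The output size follows the usual convolution rule without dilation, \(OH = (IH - KH + ph_0 + ph_1) / SH + 1\), where \(ph_1\) stands for the right-side padding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical parameter bundle for this sketch (not a DNNL type).
struct pool2d_params {
    int KH, KW;   // kernel height and width
    int SH, SW;   // strides
    int ph0, pw0; // padding_l[0], padding_l[1]
    int ph1, pw1; // right-side padding along H and W
};

// Output size along one spatial axis (convolution rule, no dilation).
inline int out_size(int in, int k, int s, int pad_l, int pad_r) {
    assert(k > 0 && s > 0 && in + pad_l + pad_r >= k);
    return (in - k + pad_l + pad_r) / s + 1;
}

// Reference max pooling over a dense NCHW float tensor.
// Assumption: taps that land in the padding area are skipped.
std::vector<float> max_pool2d_ref(const std::vector<float> &src,
        int N, int C, int IH, int IW, const pool2d_params &p) {
    assert(src.size() == static_cast<std::size_t>(N) * C * IH * IW);
    const int OH = out_size(IH, p.KH, p.SH, p.ph0, p.ph1);
    const int OW = out_size(IW, p.KW, p.SW, p.pw0, p.pw1);
    std::vector<float> dst(static_cast<std::size_t>(N) * C * OH * OW);

    for (int n = 0; n < N; ++n)
    for (int c = 0; c < C; ++c)
    for (int oh = 0; oh < OH; ++oh)
    for (int ow = 0; ow < OW; ++ow) {
        float m = std::numeric_limits<float>::lowest();
        for (int kh = 0; kh < p.KH; ++kh)
        for (int kw = 0; kw < p.KW; ++kw) {
            const int ih = oh * p.SH + kh - p.ph0;
            const int iw = ow * p.SW + kw - p.pw0;
            if (ih < 0 || ih >= IH || iw < 0 || iw >= IW) continue;
            const std::size_t off
                    = (static_cast<std::size_t>(n * C + c) * IH + ih) * IW + iw;
            m = std::max(m, src[off]);
        }
        dst[(static_cast<std::size_t>(n * C + c) * OH + oh) * OW + ow] = m;
    }
    return dst;
}
```

With a 4x4 input, a 2x2 kernel, stride 2, and no padding, `out_size` gives a 2x2 output, and each output element is the maximum of one non-overlapping 2x2 block.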

Average pooling supports two algorithms:

- dnnl_pooling_avg_include_padding, in which case \(DENOM = KH \cdot KW\),
- dnnl_pooling_avg_exclude_padding, in which case \(DENOM\) equals the size of the overlap between the averaging window and the image.
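The difference between the two denominators can be illustrated with a small C++ sketch. The enum `avg_kind` and the functions `overlap_1d` and `avg_denom` are hypothetical names introduced here; they are not part of the DNNL API. The sketch assumes the window for output position `o` along one axis covers input positions `o * S - pad_l` through `o * S - pad_l + K - 1`, as in the average pooling formula above.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical enum mirroring the two average pooling algorithm kinds.
enum class avg_kind { include_padding, exclude_padding };

// Number of window taps along one axis that fall inside the image [0, in).
inline int overlap_1d(int o, int k, int s, int pad_l, int in) {
    const int start = o * s - pad_l; // first tap, may be negative
    const int end = start + k;       // one past the last tap
    return std::max(0, std::min(end, in) - std::max(start, 0));
}

// DENOM for the output position (oh, ow) of a 2D average pooling.
inline int avg_denom(avg_kind kind, int oh, int ow, int KH, int KW,
        int SH, int SW, int ph0, int pw0, int IH, int IW) {
    assert(KH > 0 && KW > 0 && SH > 0 && SW > 0);
    if (kind == avg_kind::include_padding) return KH * KW;
    return overlap_1d(oh, KH, SH, ph0, IH) * overlap_1d(ow, KW, SW, pw0, IW);
}
```

For a 3x3 window with stride 1 and padding 1 on a 4x4 image, the corner output position has a denominator of 9 when padding is included and 4 when it is excluded, because only a 2x2 part of the window overlaps the image there.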

TODO: a picture would be nice here.

- Max pooling requires a `workspace` output for the dnnl_forward_training propagation kind, and doesn't require it for dnnl_forward_inference (see details below).

The backward propagation computes \(diff\_src(n, c, h, w)\), based on \(diff\_dst(n, c, h, w)\) and (in case of max pooling) `workspace`.

- During training, max pooling requires a workspace on forward (dnnl_forward_training) and backward passes to save indices where a maximum was found. The workspace format is opaque, and the indices cannot be restored from it. However, one can use backward pooling to perform up-sampling (used in some detection topologies).
- A user can use memory format tag dnnl_format_tag_any for the `dst` memory descriptor when creating pooling forward propagation. The library would derive the appropriate format from the `src` memory descriptor. However, the `src` itself must be defined. Similarly, a user can use memory format tag dnnl_format_tag_any for the `diff_src` memory descriptor when creating pooling backward propagation.
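The role of the workspace in max pooling can be sketched conceptually in C++. The library's workspace format is opaque, so the plain index array below is purely an assumption for illustration, as are the names `max_pool1d_result`, `max_pool1d_fwd`, and `max_pool1d_bwd`. The sketch uses 1D data without padding: the forward pass records where each maximum was found, and the backward pass routes each `diff_dst` value to that position.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: pooled values plus a readable "workspace".
struct max_pool1d_result {
    std::vector<float> dst;
    std::vector<int> ws; // index of the maximum for each output element
};

// Forward 1D max pooling (no padding) that also records argmax indices.
max_pool1d_result max_pool1d_fwd(const std::vector<float> &src, int K, int S) {
    const int IW = static_cast<int>(src.size());
    assert(K > 0 && S > 0 && IW >= K);
    const int OW = (IW - K) / S + 1;
    max_pool1d_result r;
    r.dst.resize(OW);
    r.ws.resize(OW);
    for (int ow = 0; ow < OW; ++ow) {
        int best = ow * S;
        for (int kw = 1; kw < K; ++kw) {
            const int iw = ow * S + kw;
            if (src[iw] > src[best]) best = iw;
        }
        r.dst[ow] = src[best];
        r.ws[ow] = best;
    }
    return r;
}

// Backward pass: each diff_dst value flows only to the recorded maximum.
std::vector<float> max_pool1d_bwd(const std::vector<float> &diff_dst,
        const std::vector<int> &ws, int IW) {
    assert(diff_dst.size() == ws.size());
    std::vector<float> diff_src(IW, 0.0f);
    for (std::size_t ow = 0; ow < diff_dst.size(); ++ow)
        diff_src[ws[ow]] += diff_dst[ow];
    return diff_src;
}
```

Because the backward pass scatters values into a larger, mostly zero tensor, the same mechanism acts as a form of up-sampling, which is the use mentioned above.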

The pooling primitive supports the following combinations of data types:

Propagation | Source / Destination | Accumulation data type (used for average pooling only) |
---|---|---|
forward / backward | f32, bf16 | f32 |
forward | f16 | f16 |
forward | s8, u8, s32 | s32 |

Warning: There might be hardware and/or implementation specific restrictions. Check the Implementation Limitations section below.

Like other CNN primitives, the pooling primitive expects data to be an \(N \times C \times W\) tensor for the 1D spatial case, an \(N \times C \times H \times W\) tensor for the 2D spatial case, and an \(N \times C \times D \times H \times W\) tensor for the 3D spatial case.

The pooling primitive is optimized for the following memory formats:

Spatial | Logical tensor | Data type | Implementations optimized for memory formats |
---|---|---|---|
1D | NCW | f32 | dnnl_ncw (dnnl_abc), dnnl_nwc (dnnl_acb), optimized^ |
1D | NCW | s32, s8, u8 | dnnl_nwc (dnnl_acb), optimized^ |
2D | NCHW | f32 | dnnl_nchw (dnnl_abcd), dnnl_nhwc (dnnl_acdb), optimized^ |
2D | NCHW | s32, s8, u8 | dnnl_nhwc (dnnl_acdb), optimized^ |
3D | NCDHW | f32 | dnnl_ncdhw (dnnl_abcde), dnnl_ndhwc (dnnl_acdeb), optimized^ |
3D | NCDHW | s32, s8, u8 | dnnl_ndhwc (dnnl_acdeb), optimized^ |

Here *optimized^* means the format that comes out of any preceding compute-intensive primitive.
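To make the difference between the plain formats in the table concrete, the following C++ sketch computes the physical offset of the logical element \((n, c, h, w)\) for the 2D case. The helper names `offset_nchw` and `offset_nhwc` are hypothetical, and the sketch assumes dense buffers without padding or extra strides. In dnnl_nchw (dnnl_abcd) the width index varies fastest, while in dnnl_nhwc (dnnl_acdb) the channel index varies fastest.

```cpp
#include <cassert>
#include <cstddef>

// Offset of logical element (n, c, h, w) in a dense nchw (abcd) buffer.
inline std::size_t offset_nchw(std::size_t n, std::size_t c, std::size_t h,
        std::size_t w, std::size_t C, std::size_t H, std::size_t W) {
    assert(c < C && h < H && w < W);
    return ((n * C + c) * H + h) * W + w;
}

// Offset of the same logical element in a dense nhwc (acdb) buffer.
inline std::size_t offset_nhwc(std::size_t n, std::size_t c, std::size_t h,
        std::size_t w, std::size_t C, std::size_t H, std::size_t W) {
    assert(c < C && h < H && w < W);
    return ((n * H + h) * W + w) * C + c;
}
```

For example, with \(C = 3\) and \(H = W = 2\), moving from channel 0 to channel 1 at a fixed spatial position advances the offset by 4 elements in nchw but by only 1 element in nhwc.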

The pooling primitive does not support any post-ops or attributes.

- No primitive-specific limitations. Refer to Data Types for limitations related to data type support.

N/A