intel_npu_acceleration_library.functional package#
Submodules#
intel_npu_acceleration_library.functional.scaled_dot_product_attention module#
- intel_npu_acceleration_library.functional.scaled_dot_product_attention.scaled_dot_product_attention(query: Tensor, key: Tensor, value: Tensor, attn_mask: Tensor | None = None, dropout_p: float = 0.0, is_causal: bool = False, scale: float | None = None) Tensor #
Execute the scaled dot product attention (SDPA) kernel.
- Parameters:
query (torch.Tensor) – query tensor
key (torch.Tensor) – key tensor
value (torch.Tensor) – value tensor
attn_mask (torch.Tensor, optional) – attention mask tensor. Defaults to None.
dropout_p (float, optional) – optional dropout. Defaults to 0.0.
is_causal (bool, optional) – enable causal mask. Defaults to False.
scale (Optional[float], optional) – custom scale. Defaults to None.
- Raises:
RuntimeError – if the SDPA kernel cannot be executed on the NPU
- Returns:
attention output tensor
- Return type:
torch.Tensor
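For reference, the computation this kernel accelerates is softmax(Q·Kᵀ·scale + mask)·V. The sketch below is a minimal pure-NumPy rendering of that math, not the NPU implementation; the helper name `sdpa_reference` and its default scale of 1/√E (E being the embedding dimension, as in the standard SDPA definition) are assumptions for illustration.

```python
import numpy as np

def sdpa_reference(query, key, value, attn_mask=None, is_causal=False, scale=None):
    """Illustrative NumPy SDPA: softmax(Q @ K.T * scale + mask) @ V.

    This is a hypothetical reference for the math only; it does not call
    the NPU kernel and ignores dropout.
    """
    L, E = query.shape[-2], query.shape[-1]  # query length, embedding dim
    S = key.shape[-2]                        # key/value length
    scale = scale if scale is not None else 1.0 / np.sqrt(E)
    scores = query @ np.swapaxes(key, -1, -2) * scale
    if is_causal:
        # Query position i may only attend to key positions <= i.
        future = np.triu(np.ones((L, S), dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    if attn_mask is not None:
        scores = scores + attn_mask
    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ value
```

With `is_causal=True`, the first query position can only attend to the first key, so the first output row equals `value[0]` exactly; this is a handy sanity check for any SDPA backend.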
Module contents#
- intel_npu_acceleration_library.functional.scaled_dot_product_attention(query: Tensor, key: Tensor, value: Tensor, attn_mask: Tensor | None = None, dropout_p: float = 0.0, is_causal: bool = False, scale: float | None = None) → Tensor#
Execute the scaled dot product attention (SDPA) kernel.
- Parameters:
query (torch.Tensor) – query tensor
key (torch.Tensor) – key tensor
value (torch.Tensor) – value tensor
attn_mask (torch.Tensor, optional) – attention mask tensor. Defaults to None.
dropout_p (float, optional) – optional dropout. Defaults to 0.0.
is_causal (bool, optional) – enable causal mask. Defaults to False.
scale (Optional[float], optional) – custom scale. Defaults to None.
- Raises:
RuntimeError – if the SDPA kernel cannot be executed on the NPU
- Returns:
attention output tensor
- Return type:
torch.Tensor