fastchat.serve.monkey_patch_non_inplace
Monkey patches the LLaMA implementation in the Hugging Face transformers library to avoid bugs in the MPS backend by not using in-place operations.
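As a rough illustration of the idea, the sketch below rebinds a method on a class at runtime so that an in-place computation is swapped for an out-of-place one. The names `Attention`, `forward_non_inplace`, and `apply_patch` are hypothetical stand-ins and are not taken from fastchat or transformers. Plain Python lists are used in place of tensors so the example runs without any dependencies.

```python
# Hypothetical stand-in for a library class; NOT the real transformers API.
class Attention:
    def forward(self, values):
        # In-place style: mutates the caller's list.
        for i in range(len(values)):
            values[i] *= 2
        return values


def forward_non_inplace(self, values):
    # Out-of-place replacement: builds a new list and leaves the input untouched.
    return [v * 2 for v in values]


def apply_patch():
    # Monkey patch: rebind the method on the class at runtime, so every
    # existing and future instance picks up the replacement.
    Attention.forward = forward_non_inplace


apply_patch()
data = [1, 2, 3]
result = Attention().forward(data)  # data is not modified by the patched method
```

The patch has to be applied before the model's forward pass runs. Because it replaces the method on the class itself, it affects the whole process rather than a single instance.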
Module Contents
Functions
Rotates half the hidden dims of the input.
Avoid bugs in mps backend by not using in-place operations.
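To make "rotates half the hidden dims" concrete, here is a minimal sketch on a plain Python list rather than a tensor. It assumes the usual rotary-embedding convention: the last dimension is split into two halves, and the result is the negated second half followed by the first half. The name `rotate_half` is borrowed from the helper with this description in the transformers LLaMA code. The function name is not shown in the listing above, so treat it as an assumption here. The sketch also assumes an even-length input.

```python
def rotate_half(x):
    """Return a new list: the negated second half followed by the first half."""
    half = len(x) // 2
    first, second = x[:half], x[half:]
    # Build a fresh list instead of writing into x (no in-place mutation).
    return [-v for v in second] + first


hidden = [1.0, 2.0, 3.0, 4.0]
rotated = rotate_half(hidden)  # [-3.0, -4.0, 1.0, 2.0]; hidden is unchanged
```

On tensors the same operation is typically expressed by slicing along the last dimension and concatenating, which likewise yields a new tensor instead of modifying the input.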