fastchat.serve.compression

Module Contents

Classes

CompressionConfig

Configuration for group-wise quantization.

CLinear

Compressed Linear Layer.

Functions

compress(tensor, config)

Simulate group-wise quantization.

decompress(packed_data, config)

Simulate group-wise dequantization.

class fastchat.serve.compression.CompressionConfig

Configuration for group-wise quantization.

class fastchat.serve.compression.CLinear(weight=None, bias=None, device=None)

Compressed Linear Layer.

fastchat.serve.compression.compress(tensor, config)

Simulate group-wise quantization.
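
To illustrate what group-wise quantization means in general, here is a minimal pure-Python sketch. It is not the FastChat implementation, which operates on tensors and takes a config object; the helper name `quantize_groups`, its `num_bits` and `group_size` parameters, and the list-of-pairs return layout are all assumptions made for this example. The sketch uses the symmetric variant: each group of values is scaled by its own maximum magnitude and rounded to signed integers.

```python
def quantize_groups(values, num_bits=8, group_size=4):
    """Symmetric group-wise quantization of a flat list of floats.

    Hypothetical helper for illustration only; not FastChat's compress().
    Returns a list of (codes, scale) pairs, one pair per group.
    """
    bound = 2 ** (num_bits - 1) - 1  # largest code, e.g. 127 for 8 bits
    packed = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        max_abs = max(abs(v) for v in group)
        # Each group gets its own scale, so a large outlier only costs
        # precision inside the group that contains it.
        scale = bound / max_abs if max_abs > 0 else 1.0
        codes = [max(-bound, min(bound, round(v * scale))) for v in group]
        packed.append((codes, scale))
    return packed


# The second group holds much larger values than the first, yet both
# groups use the full integer range because they are scaled separately.
packed = quantize_groups([0.4, -1.0, 0.2, 0.0, 100.0, 40.0, -20.0, 0.0])
```

With a single scale for the whole list, the small values in the first group would all collapse to codes near zero; per-group scales avoid that at the cost of storing one extra number per group.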

fastchat.serve.compression.decompress(packed_data, config)

Simulate group-wise dequantization.
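
As a rough illustration of group-wise dequantization in general, the sketch below maps integer codes back to approximate floats by dividing each group by its own scale. It is not the FastChat implementation: the helper name `dequantize_groups` and the packed layout (a list of `(codes, scale)` pairs) are assumptions made for this example and do not describe the actual format of `packed_data`.

```python
def dequantize_groups(packed):
    """Invert symmetric group-wise quantization.

    `packed` is a list of (codes, scale) pairs, one per group. This layout
    is an assumption for the sketch, not FastChat's packed format.
    """
    values = []
    for codes, scale in packed:
        # Dividing by the group's scale maps integer codes back to floats.
        values.extend(code / scale for code in codes)
    return values


# Two hand-written groups of 8-bit codes (range [-127, 127]), each with
# its own scale. The originals were roughly [0.4, -1.0, 0.2, 0.0] and
# [100.0, 40.0, -20.0, 0.0].
packed = [([51, -127, 25, 0], 127.0), ([127, 51, -25, 0], 1.27)]
restored = dequantize_groups(packed)
```

The restored values are approximations, not exact copies. When no clipping occurs, rounding moves each code by at most 0.5, so the error per element is at most `0.5 / scale` for that element's group.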