kvcdn quant

Quantize a .kv artifact to a lower-precision dtype.

Usage

kvcdn quant --input <FILE> --output <FILE> --dtype <DTYPE> [OPTIONS]

Options

Option      Required  Description
--input     yes       Path to the source .kv artifact
--output    yes       Path to write the quantized artifact
--dtype     yes       Target data type, e.g. F16, BF16, I8

Example

kvcdn quant --input ./context_F32_3tok.kv --output ./context_F16_3tok.kv --dtype F16
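
To quantize several artifacts at once, the command can be wrapped in a small shell script. The sketch below is illustrative only: quant_out_name and quant_plan are hypothetical helpers, not part of kvcdn. It also assumes file names carry the dtype as a _F32_ token, a convention inferred from the example above. The script only prints the commands it would run and uses only the three documented flags.

```shell
#!/bin/sh
# Hypothetical helper (not part of kvcdn): derive an output path by swapping
# the dtype token in the file name, e.g. context_F32_3tok.kv -> context_F16_3tok.kv.
# Assumes the dtype appears as _<DTYPE>_ in the name, as in the example above.
quant_out_name() (
  input=$1; from=$2; to=$3
  dir=$(dirname -- "$input")
  base=$(basename -- "$input" | sed "s/_${from}_/_${to}_/")
  printf '%s/%s\n' "$dir" "$base"
)

# Hypothetical helper: print (do not execute) one `kvcdn quant` command per
# *_F32_*.kv file in a directory. Paths are printed unquoted, so review the
# plan before running it if any path contains spaces.
quant_plan() (
  dir=$1; to=$2
  for f in "$dir"/*_F32_*.kv; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    out=$(quant_out_name "$f" F32 "$to")
    printf 'kvcdn quant --input %s --output %s --dtype %s\n' "$f" "$out" "$to"
  done
)
```

Calling quant_plan ./artifacts F16 prints the planned commands for review. Piping that output to sh would execute them, assuming kvcdn is on PATH.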

See also