Model support
kvcdn and the hosted service treat the artifact format as a model-agnostic container. The CLI verifies, quantizes, and uploads .kv files without requiring model weights. Model-specific generation is not yet supported; it is on the roadmap.
Current behavior
You produce .kv artifacts with your own inference stack. The artifact metadata includes model_name, dtype, and num_tokens. The actual tensor contents are opaque to the portal; only the SHA-256 checksum is verified on upload.
Roadmap
Planned future work includes:
- Integrating a local transformer inference pipeline so the CLI can generate artifacts with real key/value tensors.
- Supporting common decoder-only architectures such as Llama, Mistral, Mixtral, Qwen, and Yi.
- Dispatching on the model’s config.json architectures field to select the right adapter.
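The dispatch step in the last item could look roughly like the sketch below. It is only an illustration of the idea: `AdapterKind` and `select_adapter` are hypothetical names, not part of kvcdn, and the sketch assumes the `architectures` list has already been read out of config.json. The architecture strings are the class names that Hugging Face checkpoints for these families commonly declare.

```rust
/// Hypothetical adapter families; these names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AdapterKind {
    Llama,
    Mistral,
    Mixtral,
    Qwen2,
}

/// Pick an adapter from the `architectures` list of a model's config.json.
/// Returns the first entry that matches a known architecture, or `None`
/// if no entry is recognized.
pub fn select_adapter(architectures: &[&str]) -> Option<AdapterKind> {
    architectures.iter().find_map(|name| match *name {
        // Yi checkpoints commonly declare the Llama architecture,
        // so they would be served by the same adapter.
        "LlamaForCausalLM" => Some(AdapterKind::Llama),
        "MistralForCausalLM" => Some(AdapterKind::Mistral),
        "MixtralForCausalLM" => Some(AdapterKind::Mixtral),
        "Qwen2ForCausalLM" => Some(AdapterKind::Qwen2),
        _ => None,
    })
}
```

With this sketch, `select_adapter(&["MistralForCausalLM"])` returns `Some(AdapterKind::Mistral)`, and an unrecognized list returns `None`, which a caller could turn into an "unsupported architecture" error.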
Adding a new family
When real generation is implemented, adding support for a new decoder-only architecture will require:
- Implementing a CausalLM trait in the generation module.
- Registering the architecture name in a model registry.
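Since generation is not implemented yet, the shape of the trait and registry is not fixed. The sketch below shows one way the two steps might fit together, under stated assumptions: the methods on `CausalLM`, the `ModelRegistry` type, and the `ExampleLm` adapter are all hypothetical and chosen only to make the example self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical trait; the real `CausalLM` trait may expose different methods.
pub trait CausalLM {
    /// Architecture name as it would appear in config.json.
    fn architecture(&self) -> &'static str;
    /// Number of decoder layers, each contributing one key/value pair.
    fn num_layers(&self) -> usize;
}

/// Illustrative adapter for a new model family.
pub struct ExampleLm {
    layers: usize,
}

impl CausalLM for ExampleLm {
    fn architecture(&self) -> &'static str {
        "ExampleForCausalLM"
    }

    fn num_layers(&self) -> usize {
        self.layers
    }
}

/// Constructor function stored in the registry.
pub type Factory = fn() -> Box<dyn CausalLM>;

/// Hypothetical registry mapping architecture names to adapter constructors.
#[derive(Default)]
pub struct ModelRegistry {
    factories: HashMap<&'static str, Factory>,
}

impl ModelRegistry {
    /// Register a constructor under an architecture name.
    pub fn register(&mut self, name: &'static str, factory: Factory) {
        self.factories.insert(name, factory);
    }

    /// Build the adapter registered under `name`, if any.
    pub fn create(&self, name: &str) -> Option<Box<dyn CausalLM>> {
        self.factories.get(name).map(|factory| factory())
    }
}

/// Constructor for the illustrative adapter (the layer count is arbitrary).
pub fn new_example() -> Box<dyn CausalLM> {
    Box::new(ExampleLm { layers: 32 })
}
```

In this arrangement, supporting a new family means writing one `impl CausalLM` block and adding one `register` call, for example `registry.register("ExampleForCausalLM", new_example)`. Looking up an unregistered name returns `None` instead of panicking.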
Until then, produce .kv files with your own tooling and use kvcdn verify, kvcdn upload, and the dashboard to manage them.