Local development workflow
This repository is the hosted web application: the Rust portal, the Cloudflare Worker, and the dashboard frontend. The customer CLI is maintained separately in kvcachestore/kvcdn-cli so its release cycle can follow inference-tooling changes without coupling to the web service.
When you are working on the portal or Worker, you often need a .kv file to exercise upload, metadata parsing, storage routing, and dashboard workflows. The kvcdn-cli repository includes an offline placeholder generator that writes a syntactically valid .kv artifact with the same JSON envelope but no tensor data.
Why a placeholder generator exists
Real KV caches are large, model-specific, and produced by a transformer forward pass. The right place to build them is inside the inference stack that owns the model weights and tokenizer. The customer CLI’s job is to validate, package, and transport artifacts once they exist. The placeholder generator exists only to unblock development and integration testing. It emits a valid artifact with correct metadata so you can test presigned uploads, Worker routing, artifact listing, and visibility controls without a GPU.
Generate a placeholder artifact
Build the CLI from the external repository and run the development generator:
Options
| Option | Required | Default | Description |
|---|---|---|---|
| --model | yes | — | Model name, used in the artifact metadata |
| --dtype | yes | — | Data type, e.g. F32, F16, BF16, I8 |
| --embedding | yes | — | Embedding name or variant |
| --d | yes | — | Model dimension / hidden size |
| --r | yes | — | Number of KV heads or rank |
| --prompt | no | prompt | Prompt text used to infer token count for metadata |
| --output | no | . | Directory where the artifact file is written |
The output file is named <model>_<dtype>_<N>tok.kv, with /, \, and : replaced by _ in the model name. The token count N is the number of whitespace-separated words in --prompt, so it approximates a real tokenizer's count rather than matching it.
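The naming rule above can be sketched in a few lines of Python. This is an illustration only, not the CLI's actual code: `placeholder_filename` is a hypothetical helper, and it assumes the dtype is written into the name exactly as passed.

```python
def placeholder_filename(model: str, dtype: str, prompt: str = "prompt") -> str:
    """Hypothetical helper that mirrors the documented naming rule."""
    # Replace path-hostile characters in the model name with underscores.
    safe_model = model
    for ch in ("/", "\\", ":"):
        safe_model = safe_model.replace(ch, "_")
    # Token count is the number of whitespace-separated words in the prompt.
    n_tokens = len(prompt.split())
    return f"{safe_model}_{dtype}_{n_tokens}tok.kv"


print(placeholder_filename("acme/demo-model:v1", "F16", "hello world from kvcdn"))
# acme_demo-model_v1_F16_4tok.kv
```

With the default --prompt value of prompt, the same rule yields a count of 1.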
Placeholder format
The generated file is JSON with the following structure:
The kv array is empty in the placeholder. To produce real KV-cache tensors, integrate the artifact writer with your local transformer inference pipeline so it serializes key/value tensors in the same JSON envelope.
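When writing tests against the portal or Worker, it can help to confirm that a fixture really is a placeholder before uploading it. The sketch below assumes that `kv` is a top-level key of the JSON object; `is_placeholder` is a hypothetical helper, and no other fields of the envelope are inspected.

```python
import json
from pathlib import Path


def is_placeholder(path) -> bool:
    """Return True if the file is a JSON object whose `kv` array is empty.

    Assumption: `kv` sits at the top level of the envelope.
    """
    doc = json.loads(Path(path).read_text(encoding="utf-8"))
    return isinstance(doc, dict) and doc.get("kv") == []
```

A file that lacks the `kv` key, or carries any tensor data in it, returns False.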
Upload the artifact
Once you have a .kv file, upload it with the customer CLI:
View uploaded artifacts
Open the dashboard at https://kvcachestore.com/app to see artifacts in your active project.
Next steps
- Read the per-command CLI reference.
- Learn how to upload artifacts to the hosted service.
- Understand the architecture of the CLI, portal, and Worker.