Quickstart

This guide walks you through installing kvcdn and uploading your first KV cache artifact to KV Cache Store.

1. Install the CLI

kvcdn ships as an x86-64 Linux binary; ARM64 support is planned for a future release. Download it, make it executable, and move it onto your PATH:
curl -L https://kvcachestore.com/download/kvcdn -o kvcdn
chmod +x kvcdn
sudo mv kvcdn /usr/local/bin/
kvcdn --version
See Installation for build-from-source options.
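
If you script the install, it can help to check the platform first, since the binary targets x86-64 Linux only. The following is a minimal sketch, assuming uname reports Linux and x86_64 on supported hosts. The helper name kvcdn_platform_ok is illustrative and not part of kvcdn.

```shell
# Hypothetical helper, not part of kvcdn: succeed only for x86-64 Linux.
# The kernel name and machine type are passed as arguments so the check
# does not depend on the host it happens to run on.
kvcdn_platform_ok() {
  [ "$1" = "Linux" ] && [ "$2" = "x86_64" ]
}

# Usage: guard the download with the current host's values.
if kvcdn_platform_ok "$(uname -s)" "$(uname -m)"; then
  echo "platform supported: continue with the download"
else
  echo "unsupported platform: $(uname -s) $(uname -m)" >&2
fi
```

On an unsupported host, the build-from-source options are the likely fallback.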

2. Sign in to the portal

Open https://kvcachestore.com and sign in with the configured OIDC provider. The first time you sign in, KV Cache Store creates:
  • a customer record,
  • a default organization,
  • a default project,
  • and a storage namespace for your uploads.

3. Create an API key

In the dashboard, go to Settings > API Keys and create a key. Copy the secret right away; it is shown only once. Store it in the CLI, then confirm that it works:
kvcdn api-key set kv_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
kvcdn api-key verify

4. Generate a KV cache artifact

kvcdn verify runs a model against a context, produces a reusable .kv artifact, and checks that loading the artifact gives the same result as a full prefill:
kvcdn verify \
  --kv-path ./context.kv \
  --context-file ./context.txt \
  --question "Summarize the key claim."
The command prints the generated artifact path. The file contains the KV-cache tensors in safetensors format plus a JSON sidecar with model_name, dtype, and num_tokens.

5. Upload the artifact

kvcdn upload ./context.kv --name "Qwen system prompt" --visibility private
kvcdn upload reads the artifact metadata, asks the portal for a presigned upload URL, PUTs the file to object storage, and then confirms the upload.
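
To make the three network steps concrete, here is a rough sketch of the same flow using curl. It is not how kvcdn is implemented: the endpoint URLs, the JSON request body, the bearer-token header, and the KVCDN_API_KEY variable are all hypothetical placeholders that you pass in yourself. Only the overall shape (request a presigned URL, PUT the bytes, confirm) comes from the description above.

```shell
# Sketch only: endpoints, request bodies, the auth header, and KVCDN_API_KEY
# are hypothetical placeholders, not the documented kvcdn or portal API.

# Step 1: ask the portal for a presigned upload URL.
#   $1 = portal endpoint (hypothetical), $2 = JSON request body
request_upload_url() {
  curl -fsS -X POST "$1" \
    -H "Authorization: Bearer ${KVCDN_API_KEY:-}" \
    -H "Content-Type: application/json" \
    -d "$2"
}

# Step 2: PUT the artifact bytes directly to object storage.
#   $1 = presigned URL returned by step 1, $2 = local artifact path
put_artifact() {
  curl -fsS -T "$2" "$1"
}

# Step 3: tell the portal the upload finished.
#   $1 = confirmation endpoint (hypothetical)
confirm_upload() {
  curl -fsS -X POST "$1" \
    -H "Authorization: Bearer ${KVCDN_API_KEY:-}"
}
```

With a presigned URL, the file bytes in step 2 go straight to object storage rather than through the portal, which is why the upload is split into separate request and confirm calls.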

6. View the artifact in the dashboard

Go to https://kvcachestore.com/app. The artifact appears with its model, dtype, token count, size, and visibility. You can toggle visibility between private and public from the artifact page.

7. Share or consume the artifact

For a public artifact, the dashboard shows a public URL:
https://x.kvcdn.io/{org_slug}/{project_slug}/{artifact_id}
Anyone can fetch it anonymously:
curl -L -o context.kv https://x.kvcdn.io/my-org-a18620ee/default/198dc22d-...
Then load the KV cache into your inference pipeline before generating tokens. Because the model does not re-run prefill over the cached context, you pay prefill cost only for the new tokens that follow it.
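
If you fetch public artifacts from a script, the URL pattern shown in this step can be assembled from its three parts. The sketch below assumes the slugs and artifact ID need no URL escaping. The function names are illustrative helpers, not kvcdn commands.

```shell
# Build the public URL from its three parts, following the pattern
# https://x.kvcdn.io/{org_slug}/{project_slug}/{artifact_id}.
#   $1 = org slug, $2 = project slug, $3 = artifact id
kvcdn_public_url() {
  printf 'https://x.kvcdn.io/%s/%s/%s\n' "$1" "$2" "$3"
}

# Download a public artifact to a local path.
# -f makes curl exit non-zero on an HTTP error instead of saving the error page.
#   $1 = org slug, $2 = project slug, $3 = artifact id, $4 = output path
fetch_public_artifact() {
  curl -fL -o "$4" "$(kvcdn_public_url "$1" "$2" "$3")"
}
```

Adding -f matters when a later step loads the downloaded file, because a failed fetch then stops the script instead of leaving an error page saved as context.kv.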

Next steps