Skip to content

Silly Tavern × SiliconFlow: chat, embeddings, Global / China

SiliconFlow offers OpenAI-compatible inference and embedding; the catalog often includes open and Chinese models—handy for Chinese workflows. Silly Tavern 1.17.0+ adds SiliconFlow under Chat Completion and supports SiliconFlow in the Vectors extension—one key and either Global (api.siliconflow.com) or China (api.siliconflow.cn) for both.

Important: In ST, SiliconFlow lives under Chat Completion, not Text Completion → Ollama.


Website screenshots (not the ST UI)

Registered user views on siliconflow.cn / .com—for orientation only.

SiliconFlow site banner

Model list excerpt

MiniMax on SiliconFlow

See also SiliconFlow’s blog post and API docs.


Before you start

  1. Silly Tavern ≥ 1.17.0, Node.js 20+ (per release notes).
  2. Register and create an API Key on SiliconFlow.
  3. Pricing and rate limits—only in the provider console.

Step 1: Chat Completion → SiliconFlow

  1. API Connections → main API: Chat Completion.
  2. Data source: SiliconFlow.
  3. Paste API Key, Connect.
  4. SiliconFlow Endpoint:
    • Global → https://api.siliconflow.com/v1
    • China → https://api.siliconflow.cn/v1
  5. SiliconFlow Model — your chat model.

Typical failures: wrong key, account region vs endpoint mismatch, network to the chosen host.


Step 2 (optional): embeddings in Vectors

In the Vectors extension pick SiliconFlow:

  • Same key as Chat Completion.
  • Endpoint follows siliconflow_endpoint from chat.
  • Choose an embedding model from ST’s list; empty list → check key and line.

Useful for lore / history RAG; chat-only is fine.


SiliconFlow vs OpenRouter

Both use Chat Completion in ST. OpenRouter is multi-provider routing; SiliconFlow is a platform with explicit CN/Global plus built-in embedding in ST 1.17. Pick by model list, price, and latency.


“Uncensor” / “unfiltered” labels

Colloquial; not an official store category. Read each model’s policy and local laws.



About the author

花

花(Hana)

AI工具評価の専門家。東京・新宿三丁目周辺で活動し、最新のAIアプリケーションやツールを実際に使用してレビューを提供しています。


FAQ

Why no SiliconFlow under Text Completion?

It is wired to Chat Completion; local GGUF stacks usually use Text Completion + Ollama/KoboldCpp.

Connected but embedding list is empty?

Check key permissions, endpoint region, network, ST/browser logs.

Works with presets and lore?

Yes; streaming and parameters depend on the model.

Below 1.17?

Integration landed in 1.17.0; upgrade ST and Node.


Published: March 31, 2026
Updated: March 31, 2026


Updated: