Skip to content

AI Settings

These settings control which AI model powers your experience and how it behaves. They apply globally across all worlds. You'll find them in Settings > AI Configuration.

Choosing a model

Yumina offers a curated lineup of models across four cost tiers:

TierExamplesNotes
BudgetYumina Free, Gemini 2.5 Flash Lite, Grok 4.1 Fast, DeepSeek V3.2Good for casual play. Grok 4.1 Fast is the platform default.
StandardGemini 2.5 Flash, Gemini 3 Flash, DeepSeek V4 ProBetter writing quality and instruction following.
PremiumClaude Haiku 4.5, Grok 4.20, Gemini 3.1 ProNoticeably better characterization and narrative coherence. Requires Go plan or above.
UltraClaude Sonnet 4.6, Claude Opus 4.7Best writing quality available. Requires Plus plan or above.

Higher tiers cost more credits per response but produce better writing. If you're unsure, start with the default (Grok 4.1 Fast) and experiment from there.

Pinned models

You can pin up to 8 models for quick access in the model picker. Four are pinned by default. Go to Settings > AI Configuration > Your Models to manage pins. Click any pinned model to set it as your default.

Recently used

Models you've used recently appear below your pinned list (if they aren't already pinned). You can pin them from there.

Context size

What it is: How much conversation history the AI can "see" when generating a response, measured in tokens. More context means the AI remembers more of what happened earlier in your session.

Default: 64,000 tokens (Free plan) / 96,000 tokens (Gold) / up to 2M with BYOK.

Recommendation: For most play, 42k-62k is the sweet spot -- enough context for the AI to maintain narrative consistency without unnecessary cost. Going above 96k rarely improves the experience unless you're in a very long session with complex state. The setting is in Settings > AI Configuration > Context Size.

Free plan users have a cap on context size. Upgrading your plan or using BYOK removes the cap.

Creativity (temperature)

The temperature slider controls how random/creative the AI's responses are:

  • Lower (toward 0.5): More predictable, focused, consistent. Good for strategy games or worlds where precision matters.
  • Higher (toward 1.5): More creative, varied, surprising. Good for creative writing and exploration.
  • Default: 1.0 -- balanced for most use cases.

The slider in Settings runs from 0.5 to 1.5. Don't go past 1.3 unless you want the AI to get noticeably more unpredictable.

Response length (max tokens)

Controls the maximum length of a single AI response. Default is 12,000 tokens. Increase for longer, more detailed responses; decrease for snappier, more concise ones. Range: 256 to 32,768.

Reasoning effort

For models that support reasoning (Claude, GPT-5), this controls how much "thinking" the AI does before responding:

LevelEffect
MinimalLeast thinking, fastest responses, lowest cost
LowLight reasoning (default)
MediumMore careful responses
HighMost thorough, slowest, highest cost

For most roleplay and interactive fiction, Low is fine. Bump it up if the AI is making logical errors or forgetting constraints.

Streaming

When on (default), AI responses appear token by token as they're generated. When off, the full response appears at once after generation completes. Keep this on unless your connection is unstable.

Advanced sampling parameters

Under the Advanced Parameters toggle in AI Configuration:

ParameterDefaultWhat it does
Top P1.0Nucleus sampling -- limits the candidate pool to the top P% of likely tokens. Lower = more focused.
Frequency Penalty0.0Reduces word repetition. Try 0.3-0.5 if the AI keeps repeating itself.
Presence Penalty0.0Encourages new topics. Try 0.2-0.3 if the AI keeps circling the same ideas.
Top K0 (off)Hard limit on candidate tokens. Usually not needed alongside Top P.
Min P0 (off)Minimum probability threshold. Smarter alternative to Top K.

Rule of thumb: Adjust temperature first. Only touch these if temperature alone doesn't solve your problem, and change one at a time.

Bring Your Own Key (BYOK)

You can use your own API key instead of Yumina credits. Go to Settings > AI Configuration and switch to Private Key mode.

Supported providers:

ProviderWhere to get a key
OpenRouteropenrouter.ai/keys -- one key unlocks hundreds of models
Anthropicconsole.anthropic.com
OpenAIplatform.openai.com
Googleaistudio.google.dev
Ollamaollama.com -- run models locally

Setup:

  1. Switch the provider toggle from Yumina API to Private Key
  2. Select your provider and enter your key
  3. Click verify to test the key

Your key is encrypted at rest (AES-256-GCM). The raw key is never returned from the server after storage -- only metadata (provider, label, masked suffix).

With BYOK, you have no context size cap and access to whatever models your provider offers. Costs go directly to your API provider instead of Yumina credits.

Custom prompts

An advanced feature for tuning AI behavior across all worlds. Found in Settings > AI Configuration at the bottom.

You can inject your own prompts at three positions:

  • System -- into the system prompt (strongest effect)
  • In-Chat -- into the middle of the chat history
  • Final -- at the very end, right before the AI responds

Use this if the AI consistently misbehaves in a specific way (always forgetting a rule, always responding in the wrong language, etc.). Most players won't need this.

Prompt presets

Every world's creator sets up default prompt presets. You can choose:

  • Use Creator's -- use what the creator intended (recommended)
  • Use My Own -- override with your own configuration

Unless you understand the prompt architecture, leave this on Creator's. Changing presets can break worlds in subtle ways.