Audio Design Guide

This isn't a single recipe — it's a collection of techniques for using Yumina's audio system to craft a complete soundscape: BGM that follows scene changes, keyword-triggered sound effects, looping ambient sounds, silky crossfades, and every way to control music from custom UI and AI narration.


Overview

Yumina's audio system has multiple entry points for controlling sound at different levels:

Control Method | Where to Set It | Characteristics
Playlist autoplay | Audio tab | Simplest: background music starts as soon as the player enters the world
Conditional BGM | Audio tab | Automatically switches tracks when variable/keyword/turn-count conditions are met, no behaviors needed
Behavior + Play Audio action | Behaviors tab | Crossfade on scene transitions, combine with variable changes and entry toggling
AI audio directives | AI writes [audio: ...] in its reply | AI decides what to play and when; most flexible, also least predictable
Message renderer API | Message Renderer tab | Trigger from custom buttons, great for jukebox-style interactive UI

This guide covers 7 patterns one by one. Pick what you need, or combine them.


Pattern 1: Basic BGM Setup

What you'll build

Background music that starts playing the moment a player enters your world — looping, no extra setup required.

Step by step

Step 1: Upload audio tracks

Editor → Audio tab → click "Add Track"

Field | Value | Why
Display Name | Main Theme | For your own reference
ID | main_theme | All references use this ID
Type | BGM | Background music
Audio File | Upload your .mp3 or .ogg file | Common audio formats supported
Loop | On | BGM usually needs to loop
Volume | 0.7 | Don't go too loud; leave room for SFX and ambient
Fade In | 2 seconds | Gradually fades in so it's not jarring

Want more than one track? Repeat the steps above to create additional tracks — e.g. explore_bgm (exploration), battle_bgm (combat), town_bgm (town).

Step 2: Set up the playlist

Still in the Audio tab, find the "BGM Playlist" section:

Field | Value | Why
Track List | Select main_theme (select all if you have multiple) | Tracks in the list play in order
Play Mode | loop or shuffle | loop = play in order, repeat; shuffle = randomize
Autoplay | On | Music starts when the player enters the world
Wait for First Message | Off (or On, depending on your preference) | If On, music waits until the player sends their first message
Gap Seconds | 0 (or 2) | Pause between tracks; 0 = seamless transition

Result

Player enters the world → music starts automatically (with a 2-second fade-in) → when one track ends, the next one starts → after the last track, it loops back to the beginning.
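
The loop-mode behavior in that flow can be sketched in a few lines. This is only an illustrative model, not Yumina's actual implementation: PlaylistSketch and nextInLoop are hypothetical names, and shuffle mode and Gap Seconds are left out.

```typescript
// Hypothetical model of a playlist in "loop" mode (not engine code).
interface PlaylistSketch {
  trackIds: string[];
}

// Returns the track that follows `currentId`, wrapping from the last
// track back to the first, as described for loop mode.
function nextInLoop(playlist: PlaylistSketch, currentId: string): string {
  const index = playlist.trackIds.indexOf(currentId);
  if (index === -1) {
    return playlist.trackIds[0]; // unknown track: start from the top
  }
  return playlist.trackIds[(index + 1) % playlist.trackIds.length];
}

const demoPlaylist: PlaylistSketch = {
  trackIds: ["main_theme", "explore_bgm", "town_bgm"],
};
```

With a single-track playlist, nextInLoop always returns that same track.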

Only have one BGM track?

If you've only got one song, just turn on loop for that track and put it alone in the playlist. You can even skip the playlist entirely and control it via Conditional BGM or AI directives instead.


Pattern 2: Crossfading BGM on Scene Transitions

What you'll build

When the player moves from "Village" to "Dungeon", the village's gentle music gradually fades out while the dungeon's ominous music fades in — both tracks play simultaneously for a brief overlap, making the transition smooth and cinematic.

How it works

Player clicks "Go to Dungeon" button
  → Behavior fires: sets variable location = "dungeon"
  → Same behavior's Play Audio action: crossfade to dungeon_bgm
  → Old track fades out + new track fades in, transition duration 2 seconds

Step by step

Step 1: Prepare audio tracks

In the Audio tab, create two (or more) BGM tracks:

  • village_bgm — village music, type BGM, loop on
  • dungeon_bgm — dungeon music, type BGM, loop on

Put village_bgm in the playlist as the default track, with autoplay on.

Step 2: Create a variable

Editor → Variables tab → Add Variable

Field | Value
Display Name | Current Location
ID | location
Type | String
Default Value | village

Step 3: Create a behavior

Editor → Behaviors tab → Add Behavior

Behavior name: Go to Dungeon

Trigger: Action → Action ID: go-dungeon

Actions (in order):

# | Action Type | Setting | Purpose
1 | Set Variable | location set to dungeon | Record that the player went to the dungeon
2 | Play Audio | Track dungeon_bgm, operation: crossfade, fade duration 2 seconds | Silky-smooth track switch
3 | Enable Entry | Dungeon Atmosphere | Turn on the dungeon lore
4 | Disable Entry | Village Atmosphere | Turn off the village lore

Create a matching "Return to Village" behavior with action ID go-village that does the reverse (crossfade to village_bgm, swap entry toggles).

Step 4: Trigger from the message renderer

In the Message Renderer TSX code, call executeAction on button click:

tsx
<button onClick={() => api.executeAction("go-dungeon")}>
  Go to Dungeon
</button>

The behavior executes all its actions in sequence — changes the variable, switches the music, toggles entries — all from one button click.
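
To make "executes all its actions in sequence" concrete, here is a rough model of the Go to Dungeon behavior as a list of steps applied to a small state object. Everything in it (BehaviorStep, WorldSketch, runBehavior, the field names) is hypothetical and only mirrors the table above; the real engine's data shapes may differ.

```typescript
// Hypothetical step shapes; the field names are assumptions for illustration.
type BehaviorStep =
  | { kind: "set-variable"; id: string; value: string }
  | { kind: "play-audio"; trackId: string; operation: "play" | "crossfade"; fadeDuration?: number }
  | { kind: "set-entry"; entry: string; enabled: boolean };

interface WorldSketch {
  variables: Record<string, string>;
  currentBgm: string;
  entries: Record<string, boolean>;
}

// Applies each step in list order and returns the updated state.
function runBehavior(world: WorldSketch, steps: BehaviorStep[]): WorldSketch {
  const next: WorldSketch = {
    variables: { ...world.variables },
    currentBgm: world.currentBgm,
    entries: { ...world.entries },
  };
  for (const step of steps) {
    switch (step.kind) {
      case "set-variable":
        next.variables[step.id] = step.value;
        break;
      case "play-audio":
        next.currentBgm = step.trackId; // fade timing is ignored in this model
        break;
      case "set-entry":
        next.entries[step.entry] = step.enabled;
        break;
    }
  }
  return next;
}

const goDungeon: BehaviorStep[] = [
  { kind: "set-variable", id: "location", value: "dungeon" },
  { kind: "play-audio", trackId: "dungeon_bgm", operation: "crossfade", fadeDuration: 2 },
  { kind: "set-entry", entry: "Dungeon Atmosphere", enabled: true },
  { kind: "set-entry", entry: "Village Atmosphere", enabled: false },
];

const villageWorld: WorldSketch = {
  variables: { location: "village" },
  currentBgm: "village_bgm",
  entries: { "Dungeon Atmosphere": false, "Village Atmosphere": true },
};
const dungeonWorld = runBehavior(villageWorld, goDungeon);
```

The "Return to Village" behavior would be the same list with the values reversed.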

What is crossfade? The old track gradually gets quieter while the new track gradually gets louder. Both tracks play simultaneously for a brief period, so the change sounds like a cinematic scene transition instead of an abrupt cut. A fade duration of 2-3 seconds works well.
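
As a worked illustration of the overlap, the sketch below computes the volume multipliers of both tracks at a given moment of the fade. It assumes a simple linear curve, which is an assumption for illustration only; the engine may use a different curve, and crossfadeGains is a hypothetical helper.

```typescript
interface CrossfadeGains {
  outgoing: number; // volume multiplier of the old track (1 = full, 0 = silent)
  incoming: number; // volume multiplier of the new track
}

// Linear crossfade: `elapsed` seconds into a fade lasting `duration` seconds.
function crossfadeGains(elapsed: number, duration: number): CrossfadeGains {
  if (duration <= 0) {
    return { outgoing: 0, incoming: 1 }; // no fade time: hard cut
  }
  // Clamp progress to the 0..1 range so times outside the fade are safe.
  const progress = Math.min(1, Math.max(0, elapsed / duration));
  return { outgoing: 1 - progress, incoming: progress };
}

// Halfway through a 2-second fade, both tracks sit at half volume.
const midpoint = crossfadeGains(1, 2);
```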


Pattern 3: Keyword-Triggered Sound Effects

What you'll build

An explosion sound plays automatically when the AI writes "explosion". A creaky door sound plays when the player says "open the door". No behaviors needed — you configure this directly in the Audio tab's Conditional BGM section.

How it works

Conditional BGM has trigger types called ai-keyword (AI keyword) and keyword (player keyword). The engine scans each message's text and plays the corresponding track when a keyword matches. Despite the name "Conditional BGM", it can point to any track type — including SFX.
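
A minimal sketch of what such a scan could look like, assuming a case-insensitive substring match. The exact matching rules (case sensitivity, word boundaries) are not specified in this guide, so treat matchesAnyKeyword as a hypothetical stand-in rather than the engine's algorithm.

```typescript
// Returns true if the message contains at least one of the keywords.
// Assumption: plain case-insensitive substring matching.
function matchesAnyKeyword(message: string, keywords: string[]): boolean {
  const haystack = message.toLowerCase();
  return keywords.some((keyword) => {
    const needle = keyword.trim().toLowerCase();
    // Skip empty keywords, which would otherwise match every message.
    return needle.length > 0 && haystack.indexOf(needle) !== -1;
  });
}

const doorKeywords = ["open the door", "push the door", "open door"];
const hit = matchesAnyKeyword("I slowly open the door.", doorKeywords);
const miss = matchesAnyKeyword("I walk up and open it.", doorKeywords);
```

Under this assumption, a phrase keyword only fires when the exact phrase appears, so hit is true and miss is false. Adding short synonyms makes a rule more forgiving.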

Step by step

Step 1: Create SFX tracks

In the Audio tab, create sound effect tracks:

Explosion SFX:

Field | Value
Display Name | Explosion
ID | explosion_sfx
Type | SFX
Loop | Off (sound effects usually play once)
Volume | 0.9

Door Open SFX:

Field | Value
Display Name | Door Open
ID | door_open_sfx
Type | SFX
Loop | Off
Volume | 0.8

Step 2: Create Conditional BGM rules

Still in the Audio tab, find the "Conditional BGM" section → click "Add Rule"

Rule 1: Play SFX when AI says "explosion"

Field | Value | Why
Name | AI Explosion SFX | For your own reference
Trigger Type | AI Keyword (ai-keyword) | Fires when the AI's reply contains the specified keyword
Keywords | explosion, blast, detonate | You can add multiple synonyms; matching any one triggers it
Target Track | explosion_sfx | Play the explosion sound
Stop Current BGM | Off | SFX layers on top of the BGM; don't stop the music

Rule 2: Play SFX when player says "open the door"

Field | Value | Why
Name | Player Door SFX | For your own reference
Trigger Type | Player Keyword (keyword) | Fires when the player's message contains the specified keyword
Keywords | open the door, push the door, open door | Multiple synonyms
Target Track | door_open_sfx | Play the door sound
Stop Current BGM | Off | Same as above

Result

AI writes: "BOOM — a deafening explosion rocks the hillside, fire lighting up the entire sky."
  → Engine scans and matches "explosion" → auto-plays explosion_sfx
  → Player hears the explosion sound while BGM keeps playing

Player types: "I walk over and open the door."
  → Engine scans and matches "open the door" → auto-plays door_open_sfx

SFX vs BGM

SFX (sound effects) play once and stop. BGM (background music) loops or continues per the playlist. When a Conditional BGM rule targets an SFX-type track, the track plays once and doesn't replace the current background music. If "Stop Current BGM" (stopPreviousBGM) is turned on, though, the engine stops the current BGM before playing the track; SFX rules usually don't need this.


Pattern 4: Conditional BGM — Variable-Driven Auto-Switching

What you'll build

No behaviors needed: configure a rule directly in the Audio tab. When hp drops below 20, the music automatically switches to tense crisis music; when hp returns to 20 or above, it switches back to the default track.

How it works

Conditional BGM's variable trigger type checks automatically after every variable change. Condition met → switch to the target track; condition no longer met → fall back based on the fallback setting (return to the playlist's default track, or to the previously playing track).

Step by step

Step 1: Prepare audio tracks

Make sure the Audio tab has:

  • explore_bgm — default exploration music (in the playlist)
  • crisis_bgm — crisis music (only plays when the condition triggers; doesn't need to be in the playlist)

Step 2: Create a Conditional BGM rule

Audio tab → Conditional BGM → Add Rule

Field | Value | Why
Name | Low HP Crisis Music | For your own reference
Trigger Type | Variable (variable) | Decides based on variable values
Condition | hp < 20 | Triggers when HP is below 20
Condition Logic | All (all) | Only one condition here, so all and any work the same
Target Track | crisis_bgm | Switch to crisis music
Priority | 10 | If multiple rules match simultaneously, higher priority wins
Fade In Duration | 1 second | New track gradually fades in
Fade Out Duration | 1 second | Old track gradually fades out
Stop Current BGM | On | Stop the exploration music before playing crisis music
Fallback | default | When the condition is no longer met (HP returns to 20 or above), automatically return to the playlist's default track

Result

Player is exploring, BGM is explore_bgm
  → AI replies: [hp: -15] (hp drops from 30 to 15)
  → Engine detects hp < 20, condition met
  → explore_bgm fades out over 1 second, crisis_bgm fades in over 1 second
  → Atmosphere instantly gets tense

Player uses a healing potion
  → AI replies: [hp: +20] (hp goes from 15 back to 35)
  → Engine detects hp is no longer < 20, condition not met
  → fallback: "default" → automatically switches back to explore_bgm

Multi-condition combos

You can add multiple conditions to a single rule. For example: hp < 20 AND location == "dungeon" → crisis music only plays when you're in the dungeon with low HP. Set the condition logic to all (all must match).
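
The interplay of conditions, all/any logic, priority, and fallback can be modeled roughly as follows. This is a sketch under assumptions: the type names, the supported operators, and the pickTrack function are hypothetical, and only the "default" fallback is modeled.

```typescript
type VarValue = number | string;

interface ConditionSketch {
  variable: string;
  op: "<" | ">" | "==";
  value: VarValue;
}

interface BgmRuleSketch {
  conditions: ConditionSketch[];
  logic: "all" | "any";
  targetTrack: string;
  priority: number;
}

function conditionHolds(c: ConditionSketch, vars: Record<string, VarValue>): boolean {
  const actual = vars[c.variable];
  if (c.op === "==") return actual === c.value;
  // Ordering comparisons are only evaluated when both sides are numbers.
  if (typeof actual !== "number" || typeof c.value !== "number") return false;
  return c.op === "<" ? actual < c.value : actual > c.value;
}

function ruleMatches(rule: BgmRuleSketch, vars: Record<string, VarValue>): boolean {
  const results = rule.conditions.map((c) => conditionHolds(c, vars));
  return rule.logic === "all" ? results.every(Boolean) : results.some(Boolean);
}

// The highest-priority matching rule wins; with no match, fall back to the default track.
function pickTrack(rules: BgmRuleSketch[], vars: Record<string, VarValue>, defaultTrack: string): string {
  let winner: BgmRuleSketch | null = null;
  for (const rule of rules) {
    if (ruleMatches(rule, vars) && (winner === null || rule.priority > winner.priority)) {
      winner = rule;
    }
  }
  return winner === null ? defaultTrack : winner.targetTrack;
}

const demoRules: BgmRuleSketch[] = [
  {
    conditions: [
      { variable: "hp", op: "<", value: 20 },
      { variable: "location", op: "==", value: "dungeon" },
    ],
    logic: "all",
    targetTrack: "crisis_bgm",
    priority: 10,
  },
  {
    conditions: [{ variable: "location", op: "==", value: "dungeon" }],
    logic: "all",
    targetTrack: "dungeon_bgm",
    priority: 5,
  },
];
```

In this model, low HP inside the dungeon selects crisis_bgm because its rule has the higher priority, healthy HP in the dungeon selects dungeon_bgm, and anywhere else the default track plays.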


Pattern 5: Ambient Sound Loops

What you'll build

Continuously playing ambient sounds in the scene background — rain, wind, tavern chatter — layered on top of the BGM to deepen immersion.

How it works

Ambient is the third track type. It plays independently of BGM — you can have a BGM track + an ambient track playing simultaneously. Ambient is usually set to loop at low volume, serving as a constant atmospheric backdrop.

Step by step

Step 1: Create Ambient tracks

Audio tab → Add Track

Field | Value
Display Name | Rain
ID | rain_ambient
Type | Ambient
Loop | On
Volume | 0.3 (ambient should be quieter than BGM; it's the backdrop)
Fade In | 3 seconds (appears gradually, not jarring)
Fade Out | 3 seconds

Create more as needed: wind_ambient (wind), tavern_ambient (tavern chatter), forest_ambient (birdsong and insects).

Step 2: Control ambient via Conditional BGM

Same as Pattern 4 — use a Conditional BGM rule to control when ambient plays.

Rule: Play forest ambient when in the forest

Field | Value
Name | Forest Ambient
Trigger Type | Variable (variable)
Condition | location == forest
Target Track | forest_ambient
Stop Current BGM | Off
Fallback | default

Key: stopPreviousBGM must be Off. Ambient layers on top of BGM — it shouldn't stop the background music. If you turn it on, switching ambient tracks will also kill whatever BGM is currently playing.
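
The effect of that switch can be pictured with a tiny two-layer model. It is a simplification for illustration: LayerState and startAmbient are hypothetical, and the model assumes one BGM slot and one ambient slot.

```typescript
// Hypothetical snapshot of what is audible on each layer.
interface LayerState {
  bgm: string | null;
  ambient: string | null;
}

// Starts an ambient track; optionally silences the BGM layer first,
// which is what turning stopPreviousBGM on would do in this model.
function startAmbient(state: LayerState, ambientId: string, stopPreviousBGM: boolean): LayerState {
  return {
    bgm: stopPreviousBGM ? null : state.bgm,
    ambient: ambientId,
  };
}

const inForest: LayerState = { bgm: "forest_bgm", ambient: null };
const layered = startAmbient(inForest, "forest_ambient", false); // BGM keeps playing
const bgmKilled = startAmbient(inForest, "forest_ambient", true); // BGM is lost
```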

You can also control ambient via behaviors

If you already have scene-switching behaviors (like Pattern 2), just add a "Play Audio" action to the behavior's action list, targeting the ambient track:

# | Action Type | Setting | Purpose
1 | Set Variable | location set to forest | Record the location
2 | Play Audio | forest_bgm, operation: crossfade, fade 2s | Switch the BGM
3 | Play Audio | forest_ambient, operation: play, fade in 3s | Layer in the ambient sound
4 | Play Audio | tavern_ambient, operation: stop, fade out 3s | Stop the old ambient sound

This way, a single behavior handles both the BGM switch and the ambient swap.

Volume recommendations

BGM: typically 0.5-0.7. Ambient: 0.2-0.4. SFX: 0.7-1.0. With these three layers at different levels, they won't fight each other.


Pattern 6: Controlling Audio from Custom Components

What you'll build

A "jukebox" in the message renderer — a few buttons that each play a different track, plus a "Stop" button. This is pure UI control — no behaviors or conditional rules needed.

How it works

useYumina() provides two audio APIs:

  • api.playAudio?.(trackId, opts) — play the specified track
  • api.stopAudio?.(trackId?) — stop the specified track (omit the ID to stop everything)

Both methods can be called directly in message renderer TSX code.
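
The ?. in api.playAudio?.(...) is TypeScript's optional call: if the method is missing, the call is skipped instead of throwing. The sketch below shows that behavior with a hand-written stand-in for the API object. AudioApiSketch and playIfSupported are hypothetical, and the guide does not say under which circumstances the real methods are absent.

```typescript
// Hypothetical stand-in for the audio part of the object useYumina() returns.
interface AudioApiSketch {
  playAudio?: (trackId: string, opts?: { fadeDuration?: number }) => void;
  stopAudio?: (trackId?: string) => void;
}

// Explicit version of the optional call that also reports whether it ran.
function playIfSupported(api: AudioApiSketch, trackId: string): boolean {
  if (typeof api.playAudio !== "function") {
    return false; // audio API not available: nothing to do
  }
  api.playAudio(trackId, { fadeDuration: 1.5 });
  return true;
}

const playedIds: string[] = [];
const withAudio: AudioApiSketch = {
  playAudio: (id) => {
    playedIds.push(id); // record the call instead of playing sound
  },
};
const withoutAudio: AudioApiSketch = {};

const ranWithAudio = playIfSupported(withAudio, "jazz_bgm");
const ranWithoutAudio = playIfSupported(withoutAudio, "jazz_bgm");
withoutAudio.playAudio?.("jazz_bgm"); // optional call: skipped, no error
```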

Step by step

Step 1: Prepare audio tracks

Make sure the Audio tab has the tracks you want to play (create them as in Pattern 1). Let's say you have:

  • jazz_bgm — jazz
  • rock_bgm — rock
  • classical_bgm — classical

Step 2: Write the message renderer code

Editor → Message Renderer tab → add the jukebox UI to your renderer code:

tsx
export default function Renderer({ content, renderMarkdown, messageIndex }) {
  const api = useYumina();
  const msgs = api.messages || [];
  const isLastMsg = messageIndex === msgs.length - 1;

  const tracks = [
    { id: "jazz_bgm", label: "Jazz", color: "#7c3aed" },
    { id: "rock_bgm", label: "Rock", color: "#dc2626" },
    { id: "classical_bgm", label: "Classical", color: "#0891b2" },
  ];

  return (
    <div>
      <div
        style={{ color: "#e2e8f0", lineHeight: 1.7 }}
        dangerouslySetInnerHTML={{ __html: renderMarkdown(content) }}
      />

      {isLastMsg && (
        <div style={{
          marginTop: "12px",
          padding: "12px",
          background: "rgba(30,41,59,0.5)",
          borderRadius: "8px",
          border: "1px solid #334155",
        }}>
          <div style={{ fontSize: "12px", color: "#94a3b8", marginBottom: "8px" }}>
            Jukebox
          </div>
          <div style={{ display: "flex", gap: "8px", flexWrap: "wrap" }}>
            {tracks.map((t) => (
              <button
                key={t.id}
                onClick={() => api.playAudio?.(t.id, { fadeDuration: 1.5 })}
                style={{
                  padding: "8px 16px",
                  background: t.color,
                  border: "none",
                  borderRadius: "6px",
                  color: "#fff",
                  fontSize: "13px",
                  cursor: "pointer",
                }}
              >
                {t.label}
              </button>
            ))}
            <button
              onClick={() => api.stopAudio?.()}
              style={{
                padding: "8px 16px",
                background: "#475569",
                border: "none",
                borderRadius: "6px",
                color: "#e2e8f0",
                fontSize: "13px",
                cursor: "pointer",
              }}
            >
              Stop
            </button>
          </div>
        </div>
      )}
    </div>
  );
}

Line-by-line breakdown:

  • api.playAudio?.(t.id, { fadeDuration: 1.5 }) — plays the specified track with a 1.5-second fade-in. If another track is currently playing, it automatically stops it first
  • api.stopAudio?.() — called with no arguments = stops all currently playing audio
  • isLastMsg — only shows the jukebox on the last message, so it doesn't repeat on every message

More advanced usage

You can read variables to control UI state. For example, use a now_playing variable to track the current track ID, then show a "Now Playing" indicator on the button:

tsx
// Inside Renderer, after `const api = useYumina();`
const nowPlaying = String(api.variables.now_playing || "");

// Inside tracks.map(...): update the variable alongside playback
// and show the status in the button label
<button
  key={t.id}
  onClick={() => {
    api.playAudio?.(t.id, { fadeDuration: 1.5 });
    api.setVariable("now_playing", t.id);
  }}
>
  {nowPlaying === t.id ? "♪ " + t.label : t.label}
</button>


Pattern 7: AI-Driven Audio Control

What you'll build

Let the AI naturally control music during narration — play a lively accordion tune when describing entering a tavern, switch to intense battle BGM when a fight breaks out, play a pain sound effect when a character gets hurt.

How it works

The AI can embed [audio: trackId action] directives in its replies. The engine automatically recognizes and executes these directives while stripping them from the text the player sees — like stage directions in a screenplay that the audience never reads, but the crew follows.
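
To illustrate "recognize, execute, strip", here is a minimal parser sketch for the directive formats listed in this guide. The regular expression, the ParsedDirective shape, and extractDirectives are assumptions made for illustration; the engine's real parser may accept more or fewer variations.

```typescript
type AudioOp = "play" | "stop" | "crossfade" | "volume";

interface ParsedDirective {
  trackId: string;
  op: AudioOp;
  value?: number;   // seconds for play/stop/crossfade, level for volume
  chainTo?: string; // track to start after this one finishes
}

// Pulls every [audio: ...] directive out of a reply and returns the cleaned text.
function extractDirectives(reply: string): { text: string; directives: ParsedDirective[] } {
  const pattern = /\[audio:\s*([\w-]+)\s+(play|stop|crossfade|volume)(?:\s+([^\]\s]+))?\s*\]/g;
  const directives: ParsedDirective[] = [];
  const stripped = reply.replace(pattern, (_match: string, trackId: string, op: string, arg?: string) => {
    const directive: ParsedDirective = { trackId, op: op as AudioOp };
    if (arg !== undefined) {
      if (arg.indexOf("chain:") === 0) {
        directive.chainTo = arg.slice("chain:".length);
      } else if (!isNaN(Number(arg))) {
        directive.value = Number(arg);
      }
    }
    directives.push(directive);
    return ""; // remove the directive from the visible text
  });
  // Collapse the double spaces left behind where a directive was removed.
  const text = stripped.replace(/[ \t]{2,}/g, " ").trim();
  return { text, directives };
}

const parsed = extractDirectives(
  "You push open the door. [audio: tavern_bgm crossfade 2.0] A blade flashes! [audio: sword_clash_sfx play]"
);
```

Here parsed.text is the narration without the bracketed directives, and parsed.directives holds two entries: a 2-second crossfade to tavern_bgm and a plain play of sword_clash_sfx.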

Step by step

Step 1: Register all tracks the AI might use

In the Audio tab, create all the tracks you want the AI to control:

  • tavern_bgm — tavern music
  • battle_bgm — battle music
  • sword_clash_sfx — sword clash sound effect
  • pain_sfx — pain/injury sound effect
  • rain_ambient — rain ambient

Step 2: Tell the AI what tracks are available via the system prompt

The AI won't automatically know which tracks you've registered. You need to create an entry in the Entries tab listing the available tracks and usage rules:

Entry name: Audio Directive Reference

Section: System Presets

Content:

[Audio Control System]
You can use the following audio directives in your replies to control music and sound effects. Directives are automatically executed and stripped from the text the player sees.

Available directive formats:
- [audio: trackId play] — play
- [audio: trackId play 2.0] — play with a 2-second fade-in
- [audio: trackId stop] — stop
- [audio: trackId stop 1.5] — stop with a 1.5-second fade-out
- [audio: trackId crossfade 2.0] — crossfade transition, 2-second overlap
- [audio: trackId volume 0.5] — adjust volume to 0.5
- [audio: trackId play chain:nextTrackId] — after this track finishes, automatically start the next one

Available tracks:
- tavern_bgm — lively tavern accordion music (good for social scenes, shopping)
- battle_bgm — intense battle music (good for combat, chase scenes)
- sword_clash_sfx — sword clash sound effect (good for melee action descriptions)
- pain_sfx — pain/injury sound effect (good for when a character gets hurt)
- rain_ambient — rain ambient sound (good for rainy scenes)

Usage guidelines:
- Insert audio directives at natural narrative points
- Use crossfade for scene transitions, with a duration of 1.5-2.5 seconds
- Pair sound effects with action descriptions, placing them near the corresponding text
- Don't overdo it — 2-3 audio directives per response at most

Step 3: Example AI response

After telling the AI these rules, its replies might look like this:

You push open the tavern's heavy wooden door, and a rush of warm air hits your face. [audio: tavern_bgm crossfade 2.0]

The tavern is buzzing — someone's playing accordion in the corner, and a dwarf at the bar is shouting over a dice game. You've barely found a seat when a masked figure suddenly draws a blade and lunges at you!

[audio: battle_bgm crossfade 0.5] [audio: sword_clash_sfx play]

You throw yourself sideways on instinct. The table splits in two behind you.

The player sees clean narrative text while hearing: tavern music fading in → abrupt switch to battle music + sword clash sound effect.

The chain directive — special usage

chain lets one track automatically start another when it finishes:

The sound of war horns echoes through the valley — the battle is about to begin! [audio: war_horn_sfx play chain:battle_bgm]

After the war_horn_sfx horn blast finishes playing, battle_bgm starts automatically — an intro into the main track, more ceremonial than a direct switch.
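
The hand-off can be modeled as "remember the next track, start it when the first one ends". The class below is a hypothetical simulation, not engine code: onEnded stands in for whatever end-of-track event the engine uses. In this model a looping track never ends, so a chained intro would need Loop off.

```typescript
// Hypothetical simulation of chained playback.
class ChainPlayerSketch {
  private pendingNext: Record<string, string> = {};
  started: string[] = []; // every track started, in order

  play(trackId: string, chainTo?: string): void {
    this.started.push(trackId);
    if (chainTo !== undefined) {
      this.pendingNext[trackId] = chainTo; // remember what follows this track
    }
  }

  // Call when a track reaches its natural end.
  onEnded(trackId: string): void {
    const next = this.pendingNext[trackId];
    if (next !== undefined) {
      delete this.pendingNext[trackId];
      this.play(next);
    }
  }
}

const chainDemo = new ChainPlayerSketch();
chainDemo.play("war_horn_sfx", "battle_bgm"); // [audio: war_horn_sfx play chain:battle_bgm]
chainDemo.onEnded("war_horn_sfx");            // horn finishes, battle_bgm starts
```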

The AI might forget to use directives

The AI won't always remember to insert audio directives, especially in long conversations. For critical scene BGM changes (like entering a combat zone), set up Conditional BGM rules (Pattern 4) as a fallback. AI directives are the icing on the cake; Conditional BGM is the safety net.


Comprehensive Quick Reference

Track types

Type | Purpose | Typical Settings
BGM | Background music | Loop on, volume 0.5-0.7
SFX | One-shot sound effects | Loop off, volume 0.7-1.0
Ambient | Looping ambient sounds | Loop on, volume 0.2-0.4

5 ways to control audio

What you want to do | Which method | Where to set it up
Auto-play BGM when entering the world | Playlist + autoplay | Audio tab → BGM Playlist
Auto-switch track when variable conditions are met | Conditional BGM (variable trigger) | Audio tab → Conditional BGM
Play SFX when AI reply contains a keyword | Conditional BGM (ai-keyword trigger) | Audio tab → Conditional BGM
Play SFX when player message contains a keyword | Conditional BGM (keyword trigger) | Audio tab → Conditional BGM
Switch track at a specific turn number | Conditional BGM (turn-count trigger) | Audio tab → Conditional BGM
Crossfade on scene transition | Behavior + Play Audio action | Behaviors tab
Play/stop from a button click | Message renderer api.playAudio?.() / api.stopAudio?.() | Message Renderer tab
AI triggers audio during narration | AI audio directives [audio: trackId action] | Entries tab (tell the AI the rules)

AI audio directive reference

Directive | Effect
[audio: trackId play] | Play
[audio: trackId play 2.0] | Play with 2-second fade-in
[audio: trackId stop] | Stop
[audio: trackId stop 1.5] | Stop with 1.5-second fade-out
[audio: trackId crossfade 2.0] | Crossfade transition, 2-second overlap
[audio: trackId volume 0.5] | Adjust volume
[audio: trackId play chain:nextId] | After finishing, automatically start next track

Conditional BGM trigger types

Trigger Type | When It Fires | Typical Use
variable | When variable conditions are met | hp < 20 plays crisis music
ai-keyword | When AI reply contains keyword | AI writes "explosion" plays explosion SFX
keyword | When player message contains keyword | Player says "perform" plays music
turn-count | When a specific turn is reached | Turn 10 plays countdown music
session-start | When the session starts | Fixed opening track

Behavior play-audio action parameters

Parameter | Description
Track ID (trackId) | Matches the track ID registered in the Audio tab
Operation (action) | play, stop, crossfade (crossfade switch), or volume (adjust volume)
Volume (volume) | 0-1, optional
Fade Duration (fadeDuration) | Seconds, optional; 1.5-3 seconds recommended for crossfade

Message renderer audio API

Method | Description
api.playAudio?.(trackId, opts) | Play a track. opts can include fadeDuration, etc.
api.stopAudio?.(trackId?) | Stop a track. Omit the ID to stop everything

Common Issues

Symptom | Likely Cause | Fix
No sound at all | Browser blocks autoplay | Modern browsers require user interaction (click, type) before allowing audio playback. Have the player send a message first, or turn on "Wait for First Message" in the playlist
BGM transition sounds choppy | Not using crossfade | Make sure the behavior's "Play Audio" operation is set to crossfade with a fade duration of at least 1.5 seconds
SFX and BGM interrupt each other | stopPreviousBGM is set to true | SFX-type Conditional BGM rules should have "Stop Current BGM" turned off
AI doesn't use audio directives | Entry not telling the AI about them | Create a System Presets entry listing all available track IDs and directive formats (see Pattern 7)
Ambient too loud | Volume too high | Ambient should be 0.2-0.4, with BGM at 0.5-0.7 to maintain separation
Conditional BGM not triggering | Variable value type mismatch | Make sure the condition's value type matches the variable type (e.g. numeric variables need numeric comparisons, not string comparisons)
Multiple rules conflicting | Same priority | Give different Conditional BGM rules different priority values; higher numbers take precedence
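
The type-mismatch row is easy to underestimate, so here is a small demonstration of why comparing numbers as strings goes wrong. It shows standard JavaScript/TypeScript comparison semantics only; whether Yumina compares values exactly this way is an assumption, and both helper functions are hypothetical.

```typescript
// Strings compare character by character; numbers compare by magnitude.
function lessThanAsStrings(a: string, b: string): boolean {
  return a < b;
}

function lessThanAsNumbers(a: number, b: number): boolean {
  return a < b;
}

// hp = 9 against a threshold of 20:
const numericCheck = lessThanAsNumbers(9, 20);    // true: 9 is below 20
const stringCheck = lessThanAsStrings("9", "20"); // false: "9" sorts after "2"

// hp = 100 against a threshold of 20:
const stringFalsePositive = lessThanAsStrings("100", "20"); // true: "1" sorts before "2"
```

If hp is stored as a string, a rule like hp < 20 can therefore stay silent at 9 HP and fire at 100 HP.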

This is Recipe #14: Audio Design Guide

The audio system's design philosophy is: simple things just work (playlist + autoplay), and complexity unlocks in layers (Conditional BGM → behavior control → AI directives → custom API). You don't need to learn every pattern at once — start with Pattern 1, and come back for more when you need finer control.