market/skills/minimax-music-gen/SKILL.md
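Yige bffb9d24b4 fix: address 7 content issues raised by the Copilot review of desirecore PR #533 (#2)
[desirecore PR #533](https://github.com/desirecore/desirecore/pull/533) synced the
market global-skills snapshot into the main repository, and the Copilot automated
review flagged 7 inconsistencies between documentation and code. All of them are
rooted in the market skill docs; this PR fixes them at the source so that the next
sync-global-skills run carries the fixes over.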

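Fixes:

1. Inverted description of disable-model-invocation semantics (3 files × 2 languages = 5 edits)
   - skill-creator/SKILL.md (en-US)
   - skill-creator/SKILL.zh-CN.md
   - manage-skills/SKILL.md (en-US)
   - manage-skills/SKILL.zh-CN.md
   - Note: references/desirecore-format.md was already fixed while resolving conflicts in PR #1

   Actual code logic (lib/agent-service/skills/parser.ts): only an explicit
   `disable-model-invocation: false` adds the skill to the system prompt auto-load list;
   `true` or an omitted value skips auto-injection and requires an explicit Skill tool call.
   The docs had swapped the semantics of the two values and wrongly claimed a
   "layered loading mechanism" of L0/L1 vs L0+L1+L2
   (the runtime does not distinguish these three levels; loading always means the whole SKILL.md).

2. dev-environment-setup/references/probe-snapshot.md: protocol field type / timeout promise
   - desirecore_port_file: string → boolean (probe.sh emits ${PORT_FILE_EXISTS} as a
     native bool; probe.ps1 emits a PowerShell bool; both serialize to true/false in JSON)
   - "CLI calls take at most 5s" → "CLI calls depend on each tool's own implementation, with
     no explicit timeout wrapper; they normally finish in <5s" (the HTTP probe does have a
     0.5s/1s timeout, but CLI calls such as --version are not wrapped in a 5s timeout, so the
     original text promised more than the implementation delivers)

3. minimax-music-gen used the outdated provider field (should be providerId)
   - skills/minimax-music-gen/SKILL.md (3 places)
   - skills/minimax-music-gen/SKILL.zh-CN.md (3 places)
   - Aligned with siblings minimax-tts/image-gen/video-gen by using
     `"providerId": "provider-minimax-media-001"`, so media-proxy does not route to a
     same-named provider such as the coding/token plan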

Versions and dates:

- skill-creator: 1.0.1 → 1.0.2
- manage-skills: 1.0.2 → 1.0.3
- dev-environment-setup: 2.0.1 → 2.0.2
- minimax-music-gen: 1.1.1 → 1.1.2
- metadata.updated_at in the 4 SKILL.md files above and manifest.json#stats.lastUpdated
  are all set to 2026-05-05

i18n handling:

Following the fix pattern from PR #1 (commit 2a21e8e), the English source
(SKILL.md = en-US default) and the Chinese translation (SKILL.zh-CN.md = source) are
edited together; the metadata.i18n.<locale>.source_hash / translated_at fields are left
untouched (maintained by the CI translate.py).
2026-05-05 01:08:46 +08:00


name: minimax-music-gen
description: Use this skill when the user wants to generate music using MiniMax's Music Generation API. Supports text-to-music with lyrics, instrumental generation, and music cover. Use when 用户提到 生成音乐、文生音乐、 AI 作曲、创作歌曲、写一首歌、音乐生成、AI 音乐、MiniMax 音乐、 作词作曲、纯音乐、伴奏、翻唱、cover。
license: Complete terms in LICENSE.txt
version: 1.1.2
type: procedural
risk_level: low
status: enabled
disable-model-invocation: true
provider: minimax
tags:
  - media
  - audio
  - music
  - generation
  - minimax
requires:
  tools:
    - Bash
metadata:
  author: desirecore
  updated_at: 2026-05-05
  i18n:
    default_locale: en-US
    source_locale: zh-CN
    locales:
      - zh-CN
      - en-US
    zh-CN:
      name: MiniMax 音乐生成
      short_desc: 基于 MiniMax Music 2.6 的文本生成音乐技能
      description: Use this skill when the user wants to generate music using MiniMax's Music Generation API. Supports text-to-music with lyrics, instrumental generation, and music cover. Use when 用户提到 生成音乐、文生音乐、 AI 作曲、创作歌曲、写一首歌、音乐生成、AI 音乐、MiniMax 音乐、 作词作曲、纯音乐、伴奏、翻唱、cover。
      body: ./SKILL.zh-CN.md
      source_hash: sha256:403153a9c1da2ad9
      translated_by: human
    en-US:
      name: MiniMax Music Generation
      short_desc: Text-to-music skill powered by MiniMax Music 2.6
      description: Use this skill when the user wants to generate music using MiniMax's Music Generation API. Supports text-to-music with lyrics, instrumental generation, and music cover. Use when the user mentions generating music, text-to-music, AI composing, creating songs, writing a song, music generation, AI music, MiniMax music, songwriting, instrumental music, accompaniment, cover, or remake.
      body: ./SKILL.md
      source_hash: sha256:403153a9c1da2ad9
      translated_by: ai:claude-opus-4-7
      translated_at: 2026-05-04
market:
  icon: <svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none"><rect x="3" y="3" width="18" height="18" rx="3" stroke="#AF52DE" stroke-width="1.5" fill="#AF52DE" fill-opacity="0.1"/><path d="M9 18V6l10-2v12" stroke="#AF52DE" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round"/><circle cx="6.5" cy="18" r="2.5" fill="#AF52DE" fill-opacity="0.6"/><circle cx="16.5" cy="16" r="2.5" fill="#AF52DE" fill-opacity="0.6"/></svg>
  category: media
  maintainer:
    name: DesireCore Official
    verified: true
  channel: latest

minimax-music-gen Skill

Mandatory Rules (violations will cause functionality to fail)

  1. Access agent-service over HTTPS at https://127.0.0.1:${PORT}, and pass -k to curl to skip certificate verification
  2. Use Bash curl throughout; do not use the HttpRequest tool or Python
  3. Do not use output_format: "url". URL downloads return empty files in scenarios such as Token Plan because CDN authentication fails. Always use the default hex format, which returns the audio data directly in the API response

Full Execution Flow

Prerequisites

  • The user has configured the MiniMax Provider (regular API or Token Plan) in Resource Manager → Compute and filled in the API Key
  • agent-service is running

Core Concepts

MiniMax Music Generation is a synchronous API (not an asynchronous task model); it returns audio data directly when called. Three modes are supported:

| Mode | model | Description |
| --- | --- | --- |
| Song generation | music-2.6 | Provide prompt + lyrics to generate a song with vocals |
| Pure instrumental | music-2.6 | Set is_instrumental: true; only a prompt is needed |
| Cover | music-cover | Provide a reference audio + prompt; rearrange based on the melodic skeleton |

Lyrics Structure Tags

The lyrics field supports the following structure tags to organize song sections:

| Tag | Meaning |
| --- | --- |
| [verse] | Verse |
| [chorus] | Chorus |
| [bridge] | Bridge |
| [intro] | Intro |
| [outro] | Outro |
| [interlude] | Interlude |

Example lyrics format:

[verse]
夜晚的城市灯火阑珊
我独自走在回家的路上

[chorus]
这一刻时间仿佛停止
所有的喧嚣都已远去

Generate a Song (with Vocals)

Note: Do not pass the output_format parameter; use the default hex format.

PORT=$(cat ~/.desirecore/agent-service.port)
curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
  -H "Content-Type: application/json" \
  -d '{
    "providerId": "provider-minimax-media-001",
    "endpoint": "/music_generation",
    "body": {
      "model": "music-2.6",
      "prompt": "独立民谣,温暖,治愈,吉他伴奏",
      "lyrics": "[verse]\n歌词内容\n\n[chorus]\n副歌内容",
      "audio_setting": {
        "format": "mp3",
        "sample_rate": 44100,
        "bitrate": 256000
      }
    },
    "responseType": "json"
  }'
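
Escaping newlines in the lyrics string by hand, as in the request above, is easy to get wrong. As a minimal sketch, assuming jq 1.6 or later is installed (needed for --rawfile), the request body could instead be built from a plain-text lyrics file. The helper name build_song_payload, the placeholder lyrics, and the output path /tmp/minimax-music-req.json are hypothetical and not part of the skill.

```shell
# Hypothetical helper (not part of the skill): build a /music_generation
# request body from a prompt string and a plain-text lyrics file.
# jq performs all JSON escaping, including newlines and quotes in the lyrics.
build_song_payload() {
  local prompt="$1" lyrics_file="$2"
  jq -n --arg prompt "$prompt" --rawfile lyrics "$lyrics_file" '{
    providerId: "provider-minimax-media-001",
    endpoint: "/music_generation",
    body: {
      model: "music-2.6",
      prompt: $prompt,
      lyrics: $lyrics,
      audio_setting: { format: "mp3", sample_rate: 44100, bitrate: 256000 }
    },
    responseType: "json"
  }'
}

# Example: write tagged placeholder lyrics to a temp file and build the payload.
LYRICS_FILE=$(mktemp)
printf '[verse]\nFirst verse line\n\n[chorus]\nChorus line\n' > "$LYRICS_FILE"
build_song_payload "indie folk, warm, acoustic guitar" "$LYRICS_FILE" > /tmp/minimax-music-req.json
```

The resulting file could then be passed to curl with -d @/tmp/minimax-music-req.json in place of the inline JSON.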

Generate Pure Instrumental

PORT=$(cat ~/.desirecore/agent-service.port)
curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
  -H "Content-Type: application/json" \
  -d '{
    "providerId": "provider-minimax-media-001",
    "endpoint": "/music_generation",
    "body": {
      "model": "music-2.6",
      "prompt": "电子音乐,氛围感,空灵,合成器铺底",
      "is_instrumental": true,
      "audio_setting": {
        "format": "mp3",
        "sample_rate": 44100,
        "bitrate": 256000
      }
    },
    "responseType": "json"
  }'

Response Handling and Saving

The API returns JSON; audio data is hex-encoded and stored in the data.data.audio.data field.

Response structure:

{
  "success": true,
  "data": {
    "data": {
      "audio": {
        "data": "hex-encoded audio data...",
        "status": 2
      }
    },
    "extra_info": {
      "music_duration": 180000,
      "music_sample_rate": 44100,
      "music_channel": 2,
      "bitrate": 256000,
      "music_size": 1234567
    },
    "base_resp": { "status_code": 0, "status_msg": "success" }
  },
  "statusCode": 200
}

Note: The status field means 1 = synthesizing (streaming scenario), 2 = synthesis complete. In non-streaming mode, the returned status is 2.
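
Before decoding anything, it may help to confirm that a saved response actually reports a completed result. The sketch below checks the fields shown in the response structure above with jq. The helper name check_music_response and the tiny sample file are hypothetical, and it assumes the response was written to a file (for example with curl's -o option).

```shell
# Hypothetical helper: succeed only if a saved media-proxy response looks
# like a completed, non-empty music generation result.
check_music_response() {
  jq -e '
    .success == true
    and .data.base_resp.status_code == 0
    and .data.data.audio.status == 2
    and ((.data.data.audio.data // "") | length > 0)
  ' "$1" > /dev/null
}

# Demo with a tiny hand-written response (494433 is the hex for "ID3").
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
{"success": true,
 "data": {"data": {"audio": {"data": "494433", "status": 2}},
          "base_resp": {"status_code": 0, "status_msg": "success"}},
 "statusCode": 200}
EOF
if check_music_response "$SAMPLE"; then
  echo "response OK"
else
  echo "response rejected" >&2
fi
```

jq -e sets a non-zero exit status when the expression evaluates to false or null, so the function can be used directly in an if condition.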

Save the hex Audio Data to media-store

Extract the hex string from the data.data.audio.data field of the response JSON, convert it to binary, and upload:

PORT=$(cat ~/.desirecore/agent-service.port)
# Save the API response to a temporary file (avoid letting large hex data overflow shell variables)
# Assume the curl output of the previous step has been saved to /tmp/minimax-music-resp.json

# Extract hex data and convert to binary (uses jq and xxd, no Python dependency)
jq -r '.data.data.audio.data' /tmp/minimax-music-resp.json | xxd -r -p > /tmp/minimax-music.mp3

# Verify the file is plausibly valid (at least 1 KB); stat -f%z is the BSD/macOS form, stat -c%s the GNU form
FILE_SIZE=$(stat -f%z /tmp/minimax-music.mp3 2>/dev/null || stat -c%s /tmp/minimax-music.mp3 2>/dev/null)
FILE_SIZE=${FILE_SIZE:-0}  # treat a failed stat as an empty file
if [ "$FILE_SIZE" -lt 1024 ]; then
  echo "ERROR: audio file looks invalid (${FILE_SIZE} bytes); generation may have failed"
  exit 1
fi

# Upload to media-store
curl -sk -X POST "https://127.0.0.1:${PORT}/api/media/upload" \
  -F "file=@/tmp/minimax-music.mp3;type=audio/mpeg"

Extract the mediaId field from the upload response JSON.

Display the Result

In the reply, use a dc-media protocol reference (the frontend will automatically detect the audio extension and render a player):

![Music generation result](dc-media://REPLACE_WITH_MEDIA_ID)
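
The last two steps (reading mediaId from the upload response and building the dc-media reference) could be sketched as follows. The exact nesting of mediaId in the upload response is not specified here, so this sketch searches the whole JSON for the first mediaId key; the helper name extract_media_id and the sample response are hypothetical.

```shell
# Hypothetical helper: print the first mediaId found anywhere in the upload
# response, since the exact nesting of the field is an assumption here.
extract_media_id() {
  jq -r 'first(.. | .mediaId? // empty)' "$1"
}

# Demo with a made-up upload response; the real response shape may differ.
UPLOAD_RESP=$(mktemp)
printf '{"success": true, "data": {"mediaId": "example-media-id"}}' > "$UPLOAD_RESP"
MEDIA_ID=$(extract_media_id "$UPLOAD_RESP")
if [ -z "$MEDIA_ID" ]; then
  echo "ERROR: no mediaId in upload response" >&2
else
  printf '![Generated music](dc-media://%s)\n' "$MEDIA_ID"
fi
```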

Parameter Descriptions

| Parameter | Description | Required | Default |
| --- | --- | --- | --- |
| model | Model name | Yes | "music-2.6" |
| prompt | Music style/mood description | Optional when lyrics are present; required for pure instrumental/cover | - |
| lyrics | Lyrics (structure tags supported) | Required when not in pure instrumental mode | - |
| is_instrumental | Whether to generate pure instrumental | No | false |
| lyrics_optimizer | Auto-generate lyrics from the prompt | No | false |
| audio_setting.format | Audio format: mp3/wav/pcm | No | "mp3" |
| audio_setting.sample_rate | Sample rate: 16000/24000/32000/44100 | No | 32000 |
| audio_setting.bitrate | Bitrate: 32000/64000/128000/256000 | No | 128000 |

Tips for Writing Prompts

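The prompt describes the music's style, mood, and instrumentation; separate keywords with commas:

  • Style: 独立民谣 (indie folk), 电子舞曲 (electronic dance), 古典钢琴 (classical piano), 摇滚 (rock), R&B, 爵士 (jazz), 嘻哈 (hip-hop)
  • Mood: 温暖 (warm), 忧郁 (melancholic), 欢快 (cheerful), 史诗感 (epic), 空灵 (ethereal), 治愈 (healing)
  • Instruments: 吉他伴奏 (guitar accompaniment), 钢琴独奏 (solo piano), 弦乐铺底 (string pad), 合成器 (synthesizer), 鼓点强劲 (punchy drums)
  • Structure: 渐进式编曲 (progressive arrangement), 开场留白渐入高潮 (sparse opening building to a climax), 轻柔开头爆发副歌 (soft opening with an explosive chorus)

Example: "独立民谣,温暖治愈,木吉他为主,轻柔的鼓点,渐进式编曲" (indie folk, warm and healing, led by acoustic guitar, gentle drums, progressive arrangement)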

Auto-generated Lyrics Mode

If the user only describes the desired music style without providing lyrics, set lyrics_optimizer: true and the model will auto-generate lyrics from the prompt:

PORT=$(cat ~/.desirecore/agent-service.port)
curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
  -H "Content-Type: application/json" \
  -d '{
    "providerId": "provider-minimax-media-001",
    "endpoint": "/music_generation",
    "body": {
      "model": "music-2.6",
      "prompt": "一首关于夏日海边回忆的歌,独立民谣,温暖,吉他",
      "lyrics_optimizer": true,
      "audio_setting": {
        "format": "mp3",
        "sample_rate": 44100,
        "bitrate": 256000
      }
    },
    "responseType": "json"
  }'

Error Handling

  • base_resp.status_code: 1002: rate limit reached, retry later
  • base_resp.status_code: 1004: API Key authentication failed
  • base_resp.status_code: 1008: insufficient balance
  • base_resp.status_code: 1026: content sensitive, modify the lyrics or prompt and retry
  • base_resp.status_code: 2013: parameter error, check required fields
  • success: false + error: "未找到匹配的供应商" ("no matching provider found"): MiniMax Provider not configured
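
The status codes above can be turned into a small lookup so that a script reports the matching guidance. This is only a sketch: the helper name describe_status and the sample response are hypothetical, and code 0 is treated as success based on the response structure shown earlier.

```shell
# Hypothetical helper: map base_resp.status_code to the guidance listed above.
describe_status() {
  case "$1" in
    0)    echo "success" ;;
    1002) echo "rate limit reached, retry later" ;;
    1004) echo "API Key authentication failed" ;;
    1008) echo "insufficient balance" ;;
    1026) echo "content sensitive, modify the lyrics or prompt and retry" ;;
    2013) echo "parameter error, check required fields" ;;
    *)    echo "unrecognized status code: $1" ;;
  esac
}

# Demo: read the code from a made-up saved response and report it.
RESP=$(mktemp)
printf '{"success":true,"data":{"base_resp":{"status_code":1008,"status_msg":"insufficient balance"}}}' > "$RESP"
CODE=$(jq -r '.data.base_resp.status_code // empty' "$RESP")
echo "status ${CODE}: $(describe_status "$CODE")"
```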

Notes

  • The prompt length limit is 1-2000 characters; the lyrics length limit is 1-3500 characters
  • Token Plan users: all plans use music-2.6 for free (100 tracks/day, each track ≤5 minutes)
  • Unless the user specifies otherwise, default to music-2.6 + mp3 format + 44100 sample rate
  • If the user only gives a theme without lyrics, use lyrics_optimizer: true to auto-generate lyrics
  • If the user requests pure music/accompaniment, set is_instrumental: true
  • Music generation takes a relatively long time (typically 30-90 seconds); please be patient
  • The hex data volume is large (several MB); always use a temporary file as intermediary, do not store it in shell variables
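
The length limits in the first note can be pre-checked locally before a request is sent. The sketch below is a rough guard only: the helper name check_length and the placeholder values are hypothetical, and how the API itself counts characters is not stated here.

```shell
# Hypothetical helper: fail if a field is empty or longer than its limit.
# In bash, ${#value} counts characters in a UTF-8 locale but bytes in the
# C locale, so treat this as an approximate pre-check.
check_length() {
  local name="$1" value="$2" max="$3"
  local len=${#value}
  if [ "$len" -lt 1 ] || [ "$len" -gt "$max" ]; then
    echo "ERROR: ${name} length ${len} is outside 1-${max}" >&2
    return 1
  fi
}

# Demo with placeholder values.
PROMPT="indie folk, warm, acoustic guitar"
LYRICS=$(printf '[verse]\nFirst verse line\n\n[chorus]\nChorus line')
check_length prompt "$PROMPT" 2000 && check_length lyrics "$LYRICS" 3500 \
  && echo "lengths OK"
```

In pure instrumental mode only the prompt would need checking, since no lyrics are sent.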