fix: complete i18n CI validation for dashscope-image-gen and xiaomi-tts (#4)

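## Summary

Fixes i18n CI validation for dashscope-image-gen and xiaomi-tts, completes their English translations, and also fixes `source_hash` drift on other stale skills.

### dashscope-image-gen / xiaomi-tts (main line of this PR)

- `name` field changed from Chinese to the directory name (CI rule-1 requires lowercase ASCII + hyphens).
- Completed the `metadata.i18n` block: `locales`, `zh-CN` (with body pointing to SKILL.zh-CN.md), and `en-US` (with description / body=./SKILL.md).
- Added `SKILL.zh-CN.md` (the zh-CN body file).
- **Root SKILL.md rewritten as the English body** (matching the content of SKILL.zh-CN.md), translated by hand in this PR; `default_locale=en-US` and `source_locale=zh-CN`, consistent with the docs/I18N.md convention: root SKILL.md = default_locale body (en-US), SKILL.zh-CN.md = source_locale body (zh-CN).
- Both locales are locked with `translated_by: human` and a correct `source_hash`.
- Content quality fixes: the flow heading "严格按此两步执行" ("strictly follow these two steps") is changed to "严格按此三步执行" ("strictly follow these three steps"); the wording of mandatory rule 2 is made precise (/tmp is only a transit location); in the xiaomi-tts user intent mapping table, `response_format` is changed to `audio.format` to match the request body parameter table; zh-CN.description is changed to pure Chinese.
- The locale header is corrected from the shell-escaping residue `<\!--` to the standard `<!--`.

### Collateral: 6 skills already stale on main (to keep the translate workflow from failing)

- `manage-skills` / `minimax-music-gen` / `minimax-video-gen` / `skill-creator` / `web-access`: `en-US.source_hash` is recomputed as the actual hash of the current zh-CN source; `translated_by` is changed from `ai:claude-opus-4-7` to `human` to lock the existing translation against being overwritten by automatic re-translation.
- `markdown`: corrected `en-US.source_hash` (previously the placeholder `sha256:0000000000000000`).
- The `en-US` translation content of these skills is unchanged; only the metadata is corrected.

### scripts/i18n/translate.py fault tolerance

- No more retries on 413 Payload Too Large (the payload will not get smaller, so retrying only wastes time).
- The main loop catches RuntimeError, records a single skill's failure in `plan["errors"]`, and moves on to the next skill, so one oversized file no longer fails the whole workflow.
- In `--check` mode, plans that contain errors now also exit 1 (previously only `needs_translation` was checked, and a broad except swallowed exceptions, producing a false pass).

## Test plan

- [x] `i18n-validate` passes
- [x] `i18n-translate --check` reports every skill as `up-to-date` or `human-locked, skipping`
- [x] `validate` / `translate` / `wait-for-copilot-review` are all green on CI
- [ ] All Copilot review conversations resolved
- [ ] Squash merge

---------

Co-authored-by: yi-ge <a@wyr.me>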
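
The three fault-tolerance changes to `scripts/i18n/translate.py` described above can be illustrated with a minimal, standalone sketch. It is not the project's code: `post_with_retries`, `run_all`, `check_exit_code`, and the `send` callable are hypothetical names, and `send` stands in for the real HTTP request by simply returning a status code. The sketch only shows the control flow: fail fast on 413, retry on 429/5xx, record per-skill failures instead of aborting, and treat recorded errors as a failing check.

```python
import time
from typing import Callable


def post_with_retries(send: Callable[[], int], *, max_attempts: int = 3,
                      backoff: float = 0.0) -> int:
    """Call send() until it yields a success status or retries run out.

    `send` is a hypothetical stand-in for the HTTP call; it returns a status code.
    """
    last_status = 0
    for attempt in range(max_attempts):
        last_status = send()
        # 413 is not transient: the same payload will be rejected again.
        if last_status == 413:
            raise RuntimeError("413 Payload Too Large, not retrying")
        # 429 and 5xx are treated as transient, so back off and try again.
        if last_status == 429 or last_status >= 500:
            time.sleep(backoff * (2 ** attempt))
            continue
        # Any other 4xx is a hard failure.
        if last_status >= 400:
            raise RuntimeError(f"HTTP {last_status}")
        return last_status
    raise RuntimeError(
        f"giving up after {max_attempts} attempts (last status {last_status})"
    )


def run_all(skills: dict[str, Callable[[], int]]) -> list[dict]:
    """Process every skill; a failure is recorded, not propagated."""
    plans = []
    for name, send in skills.items():
        try:
            post_with_retries(send)
            plans.append({"skill": name, "errors": []})
        except Exception as e:  # one bad skill must not abort the whole run
            plans.append({"skill": name, "errors": [f"unhandled exception: {e}"]})
    return plans


def check_exit_code(plans: list[dict]) -> int:
    """Exit 1 when anything needs translation or any plan recorded an error."""
    needs = [p for p in plans if p.get("needs_translation")]
    errs = [p for p in plans if p.get("errors")]
    return 1 if (needs or errs) else 0
```

Collecting errors as data on each plan, rather than letting the exception escape, is what allows the `--check` exit code to account for them afterwards.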
2026-07-23 03:23:41 +08:00 · 2026-05-13 12:57:25 +08:00
parent b8101406fb
commit 0cb3758669
11 changed files with 562 additions and 163 deletions
--- a/scripts/i18n/translate.py
+++ b/scripts/i18n/translate.py
@@ -248,6 +248,13 @@ def _post_with_retries(url: str, headers: dict, payload: dict, *, extract) -> st
        try:
            with httpx.Client(timeout=HTTP_TIMEOUT) as client:
                resp = client.post(url, headers=headers, json=payload)
+            # Don't retry on 413: payload won't get smaller on next attempt.
+            if resp.status_code == 413:
+                raise RuntimeError(
+                    f"413 Payload Too Large from {url} — skill body too big for this backend. "
+                    f"Switch backend (TRANSLATE_BACKEND=anthropic), use a model with larger input budget, "
+                    f"or set translated_by: human to lock the locale."
+                )
            if resp.status_code == 429 or resp.status_code >= 500:
                raise httpx.HTTPStatusError(f"{resp.status_code}", request=resp.request, response=resp)
            resp.raise_for_status()
@@ -499,11 +506,19 @@ def main(argv: list[str]) -> int:
        if not (skill_dir.is_dir() and (skill_dir / "SKILL.md").is_file()):
            continue
        for tl in target_locales:
-            plans.append(translate_skill(
-                skill_dir, tl,
-                check_only=args.check, mark_human=args.human,
-                backend=backend, model=model, endpoint=endpoint,
-            ))
+            try:
+                plans.append(translate_skill(
+                    skill_dir, tl,
+                    check_only=args.check, mark_human=args.human,
+                    backend=backend, model=model, endpoint=endpoint,
+                ))
+            except Exception as e:  # don't let one bad skill abort the entire run
+                plans.append({
+                    "skill": skill_dir.name,
+                    "target": tl,
+                    "actions": [],
+                    "errors": [f"unhandled exception: {e}"],
+                })

    needs = [p for p in plans if p.get("needs_translation")]
    errs = [p for p in plans if p.get("errors")]
@@ -514,7 +529,7 @@ def main(argv: list[str]) -> int:
        for p in errs:
            for e in p["errors"]:
                print(f"  ERROR [{p['skill']}/{p['target']}]: {e}")
-        return 1 if needs else 0
+        return 1 if (needs or errs) else 0

    print(f"Backend: {backend}  Model: {model}  Endpoint: {endpoint}\n")
    for p in plans:
--- a/skills/dashscope-image-gen/SKILL.md
+++ b/skills/dashscope-image-gen/SKILL.md
@@ -29,6 +29,24 @@ metadata:
  i18n:
    default_locale: en-US
    source_locale: zh-CN
+    locales:
+      - zh-CN
+      - en-US
+    zh-CN:
+      name: 阿里云 文生图
+      short_desc: 基于阿里云通义万相的文本生成图片技能
+      description: >-
+        当用户希望使用阿里云 DashScope 的通义万相系列模型生成图片时使用此技能。支持多种模型层级（wan2.7-image-pro / wan2.7-image）的文生图，通过 OpenAI 兼容的 chat/completions API 同步生成图片。用户提到 生成图片、画图、文生图、创建图片、AI 绘画、生成插图、画一张、帮我画、设计图片、通义万相、万相、阿里云画图、dashscope 画图。
+      body: ./SKILL.zh-CN.md
+      source_hash: sha256:d24415cd18ebf5d2
+      translated_by: human
+    en-US:
+      name: DashScope Image Generation
+      short_desc: Text-to-image generation using Alibaba Cloud Wan (通义万相) models
+      description: "Use this skill when the user wants to generate images using Alibaba Cloud DashScope's Wan (通义万相) series models. Supports text-to-image with multiple model tiers (wan2.7-image-pro, wan2.7-image) via the OpenAI-compatible chat/completions API. Trigger keywords: generate image, draw, text-to-image, create image, AI painting, illustration, design picture, Wan, Tongyi Wanxiang, DashScope."
+      body: ./SKILL.md
+      source_hash: sha256:d24415cd18ebf5d2
+      translated_by: human
 market:
  icon: >-
    <svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0
@@ -46,35 +64,35 @@ market:
  channel: latest
 ---

-# dashscope-image-gen 技能
+# dashscope-image-gen Skill

-## 强制规则（违反将导致功能失败）
+## Mandatory Rules (violations cause failure)

-1. **必须用 HTTPS 访问 agent-service** — `https://127.0.0.1:${PORT}` 加 `-k` 跳过证书验证
-2. **必须通过 `/api/media/upload` 上传到 media-store** — 禁止保存到本地路径
-3. **必须使用 `dc-media://` 协议展示图片** — 唯一能让前端正确渲染的方式
-4. **全程使用 Bash curl** — 不要使用 HttpRequest 工具或 Python
-5. **使用 compatible-mode（/chat/completions）** — 同步调用，响应直接包含图片 URL
+1. **Must access agent-service over HTTPS** — use `https://127.0.0.1:${PORT}` with `-k` to skip certificate verification
+2. **Must upload to media-store via `/api/media/upload`** — `/tmp` is only a transient download/decode location, never use a local path as the final output
+3. **Must use the `dc-media://` protocol to display images** — the only form the frontend can render correctly
+4. **Use Bash curl throughout** — do not use the HttpRequest tool or Python
+5. **Use compatible-mode (`/chat/completions`)** — synchronous call; the response contains the image URL directly

-## 模型选择指南
+## Model Selection

-| 模型 | 特点 | 适用场景 |
+| Model | Characteristics | When to use |
 |------|------|---------|
-| wan2.7-image-pro | 旗舰，4K 分辨率，thinking_mode | 用户要求最高画质、4K、细节丰富 |
-| wan2.7-image | 标准高画质，thinking_mode | **默认首选**，无特殊要求时使用 |
+| wan2.7-image-pro | Flagship, 4K resolution, thinking_mode | User asks for top quality, 4K, or rich detail |
+| wan2.7-image | Standard high quality, thinking_mode | **Default**, for unspecified requests |

-**默认规则**：用户未指定模型时，使用 `wan2.7-image`。
+**Default rule**: if the user does not specify a model, use `wan2.7-image`.

-## 完整执行流程（严格按此两步执行）
+## Full Execution Flow (strictly three steps)

-### 前置条件
+### Prerequisites

- 用户已在资源管理器-算力中配置阿里云 DashScope Provider 并填写 API Key
- agent-service 正在运行
+- The user has configured an Alibaba Cloud DashScope provider in Resource Manager → Compute and filled in an API Key
+- agent-service is running

-### 第一步：调用文生图 API（同步）
+### Step 1: Call the text-to-image API (synchronous)

-通过 media-proxy 的 compatible-mode 端点生成图片，响应直接包含图片 URL：
+Generate the image via media-proxy's compatible-mode endpoint; the response includes the image URL directly:

 ```bash
 PORT=$(cat ~/.desirecore/agent-service.port)
@@ -90,7 +108,7 @@ curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
        {
          "role": "user",
          "content": [
-            {"type": "text", "text": "这里替换为图片描述（建议英文效果更好）"}
+            {"type": "text", "text": "Replace this with the image description (English usually gives better results)"}
          ]
        }
      ]
@@ -99,7 +117,7 @@ curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
  }'
 ```

-**响应示例**：
+**Example response**:
 ```json
 {
  "success": true,
@@ -126,39 +144,39 @@ curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
 }
 ```

-从 `data.output.choices[0].message.content` 中找到 `type: "image"` 的项，提取其 `image` URL。
+Locate the item with `type: "image"` inside `data.output.choices[0].message.content` and extract its `image` URL.

-### 第二步：下载并上传到 media-store
+### Step 2: Download and upload to media-store

-图片 URL 有时效，必须立即下载并保存到本地 media-store：
+The image URL is time-limited; download and persist it to the local media-store immediately:

 ```bash
 PORT=$(cat ~/.desirecore/agent-service.port)
-IMAGE_URL="第一步响应中的 image URL"
+IMAGE_URL="image URL from step 1's response"
 curl -sL "$IMAGE_URL" -o /tmp/dashscope-gen.png && \
 curl -sk -X POST "https://127.0.0.1:${PORT}/api/media/upload" \
  -F "file=@/tmp/dashscope-gen.png;type=image/png"
 ```

-从 JSON 响应中提取 `mediaId` 字段（格式如 `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.png`）。
+Pick the `mediaId` field from the JSON response (format `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.png`).

-### 第三步：用 dc-media 协议展示图片
+### Step 3: Render the image via the dc-media protocol

-在你的回复文本中直接写 Markdown 图片语法：
+In your reply text, write Markdown image syntax directly:

 ```
-![图片描述](dc-media://这里替换为mediaId)
+![Image description](dc-media://replace-with-mediaId)
 ```

-例如：`![森林中的白色狐狸](dc-media://a1b2c3d4-e5f6-47a8-b9c0-d1e2f3a4b5c6.png)`
+For example: `![White fox in a forest](dc-media://a1b2c3d4-e5f6-47a8-b9c0-d1e2f3a4b5c6.png)`

-前端会自动将 `dc-media://` 转为可访问的图片 URL 并渲染出来。
+The frontend will translate `dc-media://` into a reachable image URL and render it.

-## 参数映射
+## Parameter Mapping

-### 尺寸选择
+### Size selection

-通义万相通过 compatible-mode 调用时，尺寸通过 `size` 参数传入（放在请求体顶层）：
+When calling Wan via compatible-mode, the size is passed as the top-level `size` parameter:

 ```json
 {
@@ -168,40 +186,40 @@ curl -sk -X POST "https://127.0.0.1:${PORT}/api/media/upload" \
 }
 ```

-| 用户意图 | size 参数 |
+| User intent | size value |
 |---------|-----------|
-| 正方形/头像/默认 | "1024x1024" |
-| 横版/风景/壁纸 | "1792x1024" |
-| 竖版/手机/海报 | "1024x1792" |
+| Square / avatar / default | "1024x1024" |
+| Landscape / scenery / wallpaper | "1792x1024" |
+| Portrait / mobile / poster | "1024x1792" |

-### 可选参数（加入请求体顶层）
+### Optional parameters (top-level body fields)

-| 参数 | 说明 |
+| Parameter | Description |
 |------|------|
-| `n` | 生成数量 1-4，默认 1 |
-| `size` | 图片尺寸，如 "1024x1024" |
+| `n` | Number of images, 1–4, default 1 |
+| `size` | Image size, e.g. "1024x1024" |

-## 多图生成
+## Multiple Image Generation

-当 `n > 1` 时，`choices` 数组会有多个元素，每个 `message.content` 中都有一张图片。需要为每张图片执行下载+上传，然后逐一展示：
+When `n > 1`, the `choices` array contains multiple entries, each with an image inside `message.content`. Download and upload each image, then render them one by one:

 ```
-![图片1描述](dc-media://mediaId1)
-![图片2描述](dc-media://mediaId2)
+![Image 1 description](dc-media://mediaId1)
+![Image 2 description](dc-media://mediaId2)
 ```

-## 错误处理
+## Error Handling

- `success: false` + `error: "未找到匹配的供应商"`：未配置 DashScope Provider 或未启用
- `success: false` + `error: "未配置 API Key"`：未填写 API Key
- `statusCode: 401`：API Key 无效或已过期
- `statusCode: 429`：频率限制，稍后重试
- `statusCode: 400` + `InvalidParameter`：参数错误（如尺寸不支持）
- `statusCode: 403` + `AccessDenied.Unpurchased`：模型未开通，需要在阿里云控制台开通
+- `success: false` + `error: "No matching provider"`: DashScope provider not configured or disabled
+- `success: false` + `error: "API Key not configured"`: API Key missing
+- `statusCode: 401`: API Key invalid or expired
+- `statusCode: 429`: rate limited, retry later
+- `statusCode: 400` + `InvalidParameter`: bad parameters (e.g. unsupported size)
+- `statusCode: 403` + `AccessDenied.Unpurchased`: model not activated; enable it in the Alibaba Cloud console

-## 注意事项
+## Notes

- 通过 compatible-mode 调用是同步的，通常 10-60 秒返回（wan2.7-image-pro 可能更长）
- 结果图片 URL 有时效，必须及时下载
- 提示词建议用英文以获得最佳效果，中文也支持
- 如果用户未明确要求模型/尺寸，默认使用 `wan2.7-image` + `1024x1024`
+- compatible-mode calls are synchronous and typically return in 10–60 seconds (wan2.7-image-pro can take longer)
+- Image URLs expire; download promptly
+- English prompts usually produce the best results; Chinese is also supported
+- When the user does not specify a model or size, default to `wan2.7-image` + `1024x1024`
--- a/skills/dashscope-image-gen/SKILL.zh-CN.md
+++ b/skills/dashscope-image-gen/SKILL.zh-CN.md
@@ -0,0 +1,161 @@
+<!-- locale: zh-CN -->
+
+# dashscope-image-gen 技能
+
+## 强制规则（违反将导致功能失败）
+
+1. **必须用 HTTPS 访问 agent-service** — `https://127.0.0.1:${PORT}` 加 `-k` 跳过证书验证
+2. **必须通过 `/api/media/upload` 上传到 media-store** — /tmp 仅作下载/解码中转，不可直接以本地路径作为最终输出
+3. **必须使用 `dc-media://` 协议展示图片** — 唯一能让前端正确渲染的方式
+4. **全程使用 Bash curl** — 不要使用 HttpRequest 工具或 Python
+5. **使用 compatible-mode（/chat/completions）** — 同步调用，响应直接包含图片 URL
+
+## 模型选择指南
+
+| 模型 | 特点 | 适用场景 |
+|------|------|---------|
+| wan2.7-image-pro | 旗舰，4K 分辨率，thinking_mode | 用户要求最高画质、4K、细节丰富 |
+| wan2.7-image | 标准高画质，thinking_mode | **默认首选**，无特殊要求时使用 |
+
+**默认规则**：用户未指定模型时，使用 `wan2.7-image`。
+
+## 完整执行流程（严格按此三步执行）
+
+### 前置条件
+
+- 用户已在资源管理器-算力中配置阿里云 DashScope Provider 并填写 API Key
+- agent-service 正在运行
+
+### 第一步：调用文生图 API（同步）
+
+通过 media-proxy 的 compatible-mode 端点生成图片，响应直接包含图片 URL：
+
+```bash
+PORT=$(cat ~/.desirecore/agent-service.port)
+curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "provider": "dashscope",
+    "serviceType": "image_gen",
+    "endpoint": "/chat/completions",
+    "body": {
+      "model": "wan2.7-image",
+      "messages": [
+        {
+          "role": "user",
+          "content": [
+            {"type": "text", "text": "这里替换为图片描述（建议英文效果更好）"}
+          ]
+        }
+      ]
+    },
+    "responseType": "json"
+  }'
+```
+
+**响应示例**：
+```json
+{
+  "success": true,
+  "data": {
+    "request_id": "...",
+    "output": {
+      "choices": [
+        {
+          "message": {
+            "role": "assistant",
+            "content": [
+              {
+                "type": "image",
+                "image": "https://dashscope-result.oss.aliyuncs.com/..."
+              }
+            ]
+          },
+          "finish_reason": "stop"
+        }
+      ]
+    }
+  },
+  "statusCode": 200
+}
+```
+
+从 `data.output.choices[0].message.content` 中找到 `type: "image"` 的项，提取其 `image` URL。
+
+### 第二步：下载并上传到 media-store
+
+图片 URL 有时效，必须立即下载并保存到本地 media-store：
+
+```bash
+PORT=$(cat ~/.desirecore/agent-service.port)
+IMAGE_URL="第一步响应中的 image URL"
+curl -sL "$IMAGE_URL" -o /tmp/dashscope-gen.png && \
+curl -sk -X POST "https://127.0.0.1:${PORT}/api/media/upload" \
+  -F "file=@/tmp/dashscope-gen.png;type=image/png"
+```
+
+从 JSON 响应中提取 `mediaId` 字段（格式如 `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.png`）。
+
+### 第三步：用 dc-media 协议展示图片
+
+在你的回复文本中直接写 Markdown 图片语法：
+
+```
+![图片描述](dc-media://这里替换为mediaId)
+```
+
+例如：`![森林中的白色狐狸](dc-media://a1b2c3d4-e5f6-47a8-b9c0-d1e2f3a4b5c6.png)`
+
+前端会自动将 `dc-media://` 转为可访问的图片 URL 并渲染出来。
+
+## 参数映射
+
+### 尺寸选择
+
+通义万相通过 compatible-mode 调用时，尺寸通过 `size` 参数传入（放在请求体顶层）：
+
+```json
+{
+  "model": "wan2.7-image",
+  "size": "1024x1024",
+  "messages": [...]
+}
+```
+
+| 用户意图 | size 参数 |
+|---------|-----------|
+| 正方形/头像/默认 | "1024x1024" |
+| 横版/风景/壁纸 | "1792x1024" |
+| 竖版/手机/海报 | "1024x1792" |
+
+### 可选参数（加入请求体顶层）
+
+| 参数 | 说明 |
+|------|------|
+| `n` | 生成数量 1-4，默认 1 |
+| `size` | 图片尺寸，如 "1024x1024" |
+
+## 多图生成
+
+当 `n > 1` 时，`choices` 数组会有多个元素，每个 `message.content` 中都有一张图片。需要为每张图片执行下载+上传，然后逐一展示：
+
+```
+![图片1描述](dc-media://mediaId1)
+![图片2描述](dc-media://mediaId2)
+```
+
+## 错误处理
+
+- `success: false` + `error: "未找到匹配的供应商"`：未配置 DashScope Provider 或未启用
+- `success: false` + `error: "未配置 API Key"`：未填写 API Key
+- `statusCode: 401`：API Key 无效或已过期
+- `statusCode: 429`：频率限制，稍后重试
+- `statusCode: 400` + `InvalidParameter`：参数错误（如尺寸不支持）
+- `statusCode: 403` + `AccessDenied.Unpurchased`：模型未开通，需要在阿里云控制台开通
+
+## 注意事项
+
+- 通过 compatible-mode 调用是同步的，通常 10-60 秒返回（wan2.7-image-pro 可能更长）
+- 结果图片 URL 有时效，必须及时下载
+- 提示词建议用英文以获得最佳效果，中文也支持
+- 如果用户未明确要求模型/尺寸，默认使用 `wan2.7-image` + `1024x1024`
--- a/skills/manage-skills/SKILL.md
+++ b/skills/manage-skills/SKILL.md
@@ -38,9 +38,8 @@ metadata:
      description: >-
        Manage the Skill lifecycle of an Agent: import, install, update, and delete Skills via HTTP API, or directly author standards-compliant SKILL.md files via the AgentFS filesystem. Use when the user requests to install Skills, import Skills from URL/Git, author new Skills, or manage existing Skills.
      body: ./SKILL.md
-      source_hash: sha256:7f116cc5de352822
-      translated_by: ai:claude-opus-4-7
-      translated_at: '2026-05-03'
+      source_hash: sha256:e67016840ba430ae
+      translated_by: human
 market:
  icon: >-
    <svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0
--- a/skills/markdown/SKILL.md
+++ b/skills/markdown/SKILL.md
@@ -52,7 +52,7 @@ metadata:
        particular format. Ensures files are written via Write tool, absolute
        path is reported, and attachment is sent via SendUserMessage.
      body: ./SKILL.md
-      source_hash: sha256:0000000000000000
+      source_hash: sha256:2434b01b42d751c0
      translated_by: human
 market:
  icon: >-
--- a/skills/minimax-music-gen/SKILL.md
+++ b/skills/minimax-music-gen/SKILL.md
@@ -45,9 +45,8 @@ metadata:
      description: >-
        Use this skill when the user wants to generate music using MiniMax's Music Generation API. Supports text-to-music with lyrics, instrumental generation, and music cover. Use when the user mentions generating music, text-to-music, AI composing, creating songs, writing a song, music generation, AI music, MiniMax music, songwriting, instrumental music, accompaniment, cover, or remake.
      body: ./SKILL.md
-      source_hash: sha256:403153a9c1da2ad9
-      translated_by: ai:claude-opus-4-7
-      translated_at: '2026-05-04'
+      source_hash: sha256:f3785e1da2fc5a11
+      translated_by: human
 market:
  icon: >-
    <svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0
--- a/skills/minimax-video-gen/SKILL.md
+++ b/skills/minimax-video-gen/SKILL.md
@@ -45,9 +45,8 @@ metadata:
      description: >-
        Use this skill when the user wants to generate videos using MiniMax's Hailuo model. Supports text-to-video, image-to-video, and subject reference. The API is asynchronous — submit a task, poll for status, then download. Use when the user mentions generating videos, text-to-video, AI video, creating videos, video generation, animation generation, MiniMax video, Hailuo, image-to-video.
      body: ./SKILL.md
-      source_hash: sha256:57314c8d07d63585
-      translated_by: ai:claude-opus-4-7
-      translated_at: '2026-05-03'
+      source_hash: sha256:3b2855b9ff2d0ef1
+      translated_by: human
 market:
  icon: >-
    <svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0
--- a/skills/skill-creator/SKILL.md
+++ b/skills/skill-creator/SKILL.md
@@ -39,9 +39,8 @@ metadata:
      description: >-
        Guides users to create and edit standards-compliant SKILL.md skill packages. Supports the DesireCore full format (frontmatter metadata + L0/L1/L2 layered content + scripts/references/assets) and the Claude Code basic format. Use when the user requests to create a new Skill, update an existing Skill, or package experience into a reusable Skill bundle.
      body: ./SKILL.md
-      source_hash: sha256:fa0f3136371f236c
-      translated_by: ai:claude-opus-4-7
-      translated_at: '2026-05-03'
+      source_hash: sha256:2e8b886dc0b77dd1
+      translated_by: human
 market:
  icon: >-
    <svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0
--- a/skills/web-access/SKILL.md
+++ b/skills/web-access/SKILL.md
@@ -61,9 +61,8 @@ metadata:
      short_desc: Web search, page fetching, logged-in browser access via CDP, research workflows
      description: A three-layer web-access toolkit — search public pages, fetch heavy pages via Jina Reader, and reach logged-in sites via Chrome CDP.
      body: ./SKILL.md
-      source_hash: sha256:0ba170b3126a0823
-      translated_by: ai:claude-opus-4-7
-      translated_at: '2026-05-03'
+      source_hash: sha256:1d044824f5ab31bc
+      translated_by: human
 market:
  icon: >-
    <svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0
--- a/skills/xiaomi-tts/SKILL.md
+++ b/skills/xiaomi-tts/SKILL.md
@@ -30,6 +30,24 @@ metadata:
  i18n:
    default_locale: en-US
    source_locale: zh-CN
+    locales:
+      - zh-CN
+      - en-US
+    zh-CN:
+      name: 小米 MiMo 语音合成
+      short_desc: 基于小米 MiMo 的文本转语音技能
+      description: >-
+        当用户希望使用小米 MiMo 的 TTS 模型（mimo-v2.5-tts）将文本转为语音时使用此技能。基于 OpenAI 兼容的 chat/completions API，响应中携带音频。支持多种预置音色和自定义音色设计。用户提到 语音合成、文字转语音、TTS、朗读、读出来、生成语音、生成音频、文本转音频、配音、念出来、小米语音、MiMo 语音、小米 TTS。
+      body: ./SKILL.zh-CN.md
+      source_hash: sha256:2dd06b13152349e5
+      translated_by: human
+    en-US:
+      name: Xiaomi MiMo TTS
+      short_desc: Text-to-speech synthesis using Xiaomi MiMo models
+      description: "Use this skill when the user wants to convert text to speech using Xiaomi MiMo's TTS models (mimo-v2.5-tts). Built on the OpenAI-compatible chat/completions API with audio response, supporting multiple preset voices and custom voice design. Trigger keywords: text-to-speech, TTS, read aloud, narrate, generate audio, voice synthesis, MiMo voice, Xiaomi TTS."
+      body: ./SKILL.md
+      source_hash: sha256:2dd06b13152349e5
+      translated_by: human
 market:
  icon: >-
    <svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0
@@ -46,58 +64,58 @@ market:
  channel: latest
 ---

-# xiaomi-tts 技能
+# xiaomi-tts Skill

-## 强制规则（违反将导致功能失败）
+## Mandatory Rules (violations cause failure)

-1. **必须用 HTTPS 访问 agent-service** — `https://127.0.0.1:${PORT}` 加 `-k` 跳过证书验证
-2. **必须通过 `/api/media/upload` 上传到 media-store** — 禁止保存到本地路径
-3. **必须使用 `dc-media://` 协议展示音频** — 唯一能让前端正确渲染的方式
-4. **全程使用 Bash curl** — 不要使用 HttpRequest 工具或 Python
-5. **使用 /chat/completions 端点** — 小米 MiMo TTS 使用 OpenAI 兼容格式
+1. **Must access agent-service over HTTPS** — use `https://127.0.0.1:${PORT}` with `-k` to skip certificate verification
+2. **Must upload to media-store via `/api/media/upload`** — `/tmp` is only a transient download/decode location, never use a local path as the final output
+3. **Must use the `dc-media://` protocol to display audio** — the only form the frontend can render correctly
+4. **Use Bash curl throughout** — do not use the HttpRequest tool or Python
+5. **Use the `/chat/completions` endpoint** — Xiaomi MiMo TTS speaks OpenAI-compatible chat format

-## 模型选择指南
+## Model Selection

-| 模型 | 特点 | 适用场景 |
+| Model | Characteristics | When to use |
 |------|------|---------|
-| mimo-v2.5-tts | 标准 TTS，多种预置音色 | **默认首选**，常规语音合成 |
-| mimo-v2.5-tts-voicedesign | 自定义音色设计 | 需要特定音色描述生成 |
-| mimo-v2.5-tts-voiceclone | 声音克隆 | 需要克隆特定人声（需上传参考音频） |
+| mimo-v2.5-tts | Standard TTS, multiple preset voices | **Default**, regular speech synthesis |
+| mimo-v2.5-tts-voicedesign | Custom voice design | When you need a voice generated from a description |
+| mimo-v2.5-tts-voiceclone | Voice cloning | When you need to clone a specific voice (reference audio required) |

-**默认规则**：用户未指定模型时，使用 `mimo-v2.5-tts`。
+**Default rule**: if the user does not specify a model, use `mimo-v2.5-tts`.

-## 音色选择指南
+## Voice Selection

-### 预置音色
+### Preset Voices

-| voice_id | 名称 | 特点 |
+| voice_id | Name | Characteristics |
 |----------|------|------|
-| default_zh | 默认中文 | 中文通用女声 |
-| default_en | 默认英文 | 英文通用女声 |
-| mimo_default | MiMo 默认 | MiMo 特色音色 |
-| Bingtang | 冰糖 | 甜美女声 |
-| Moli | 茉莉 | 温柔女声 |
-| Suda | 苏打 | 年轻男声 |
-| Baihua | 白桦 | 成熟男声 |
-| Mia | Mia | 英文女声 |
-| Chloe | Chloe | 英文女声 |
-| Milo | Milo | 英文男声 |
-| Dean | Dean | 英文男声 |
+| default_zh | Default Chinese | General-purpose Chinese female voice |
+| default_en | Default English | General-purpose English female voice |
+| mimo_default | MiMo Default | MiMo's signature voice |
+| Bingtang | Bingtang | Sweet female voice |
+| Moli | Moli | Soft, gentle female voice |
+| Suda | Suda | Young male voice |
+| Baihua | Baihua | Mature male voice |
+| Mia | Mia | English female voice |
+| Chloe | Chloe | English female voice |
+| Milo | Milo | English male voice |
+| Dean | Dean | English male voice |

-**默认规则**：中文内容用 `Bingtang`，英文内容用 `Mia`，用户未指定时按内容语言自动选择。
+**Default rule**: use `Bingtang` for Chinese text and `Mia` for English text; if the user doesn't specify, pick automatically by content language.

-## 完整执行流程（严格按此三步执行）
+## Full Execution Flow (strictly three steps)

-### 前置条件
+### Prerequisites

- 用户已在资源管理器-算力中配置小米 MiMo Provider 并填写 API Key
- agent-service 正在运行
+- The user has configured a Xiaomi MiMo provider in Resource Manager → Compute and filled in an API Key
+- agent-service is running

-### 第一步：调用 TTS API
+### Step 1: Call the TTS API

-通过 media-proxy 的 /chat/completions 端点生成语音。
+Generate speech via media-proxy's `/chat/completions` endpoint.

-**重要**：messages 必须使用 `assistant` role（不是 user），要合成的文本放在 assistant 消息的 content 中。
+**Important**: `messages` must use the `assistant` role (not `user`); the text to synthesize goes in the assistant message's content.

 ```bash
 PORT=$(cat ~/.desirecore/agent-service.port)
@@ -112,7 +130,7 @@ curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
      "messages": [
        {
          "role": "assistant",
-          "content": "这里替换为要合成的文本内容"
+          "content": "Replace this with the text to synthesize"
        }
      ],
      "voice": "Bingtang",
@@ -122,7 +140,7 @@ curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
  }'
 ```

-**响应示例**：
+**Example response**:
 ```json
 {
  "success": true,
@@ -134,7 +152,7 @@ curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
        "message": {
          "role": "assistant",
          "audio": {
-            "data": "base64编码的音频数据...",
+            "data": "base64-encoded audio data...",
            "format": "mp3"
          }
        },
@@ -146,17 +164,17 @@ curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
 }
 ```

-从 `data.choices[0].message.audio.data` 提取 base64 编码的音频数据。
+Pull the base64-encoded audio data from `data.choices[0].message.audio.data`.

-### 第二步：解码并上传到 media-store
+### Step 2: Decode and upload to media-store

-音频以 base64 返回，需要解码后保存到本地 media-store。
+The audio comes back as base64; decode it and save to the local media-store.

-**推荐方式**（先保存完整响应到文件，避免 shell 参数过长）：
+**Recommended approach** (write the full response to a file first to avoid overlong shell arguments):

 ```bash
 PORT=$(cat ~/.desirecore/agent-service.port)
-# 将完整请求和响应保存到文件
+# Save the full request and response to a file
 curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
  -H "Content-Type: application/json" \
  -d '{
@@ -165,74 +183,74 @@ curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
    "endpoint": "/chat/completions",
    "body": {
      "model": "mimo-v2.5-tts",
-      "messages": [{"role": "assistant", "content": "要合成的文本"}],
+      "messages": [{"role": "assistant", "content": "Text to synthesize"}],
      "voice": "Bingtang",
      "audio": {"format": "mp3"}
    },
    "responseType": "json"
  }' > /tmp/xiaomi-tts-response.json

-# 提取 base64 音频数据并解码
+# Extract and decode the base64 audio data
 cat /tmp/xiaomi-tts-response.json | jq -r '.data.choices[0].message.audio.data' | base64 -d > /tmp/xiaomi-tts.mp3

-# 上传到 media-store
+# Upload to media-store
 curl -sk -X POST "https://127.0.0.1:${PORT}/api/media/upload" \
  -F "file=@/tmp/xiaomi-tts.mp3;type=audio/mpeg"
 ```

-从 JSON 响应中提取 `mediaId` 字段（格式如 `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.mp3`）。
+Pick the `mediaId` field from the JSON response (format `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.mp3`).

-### 第三步：用 dc-media 协议展示音频
+### Step 3: Render the audio via the dc-media protocol

-在你的回复文本中直接写 Markdown 语法：
+In your reply text, write Markdown syntax directly:

 ```
-![语音合成结果](dc-media://这里替换为mediaId)
+![TTS result](dc-media://replace-with-mediaId)
 ```

-例如：`![TTS: 你好世界](dc-media://a1b2c3d4-e5f6-47a8-b9c0-d1e2f3a4b5c6.mp3)`
+For example: `![TTS: Hello world](dc-media://a1b2c3d4-e5f6-47a8-b9c0-d1e2f3a4b5c6.mp3)`

-前端会自动检测 `.mp3` 扩展名并渲染为音频播放器。
+The frontend detects the `.mp3` extension and renders an audio player.

-## 参数映射
+## Parameter Mapping

-### 请求体参数（放在 body 中）
+### Request body parameters (inside `body`)

-| 参数 | 说明 | 默认值 |
+| Parameter | Description | Default |
 |------|------|--------|
-| `model` | 模型名称 | "mimo-v2.5-tts" |
-| `messages[0].role` | **必须为 "assistant"** | "assistant"（固定） |
-| `messages[0].content` | 要合成的文本 | 必填 |
-| `voice` | 音色 ID | "Bingtang"（中文）/ "Mia"（英文） |
-| `audio.format` | 音频格式 | "mp3"（可选 "wav"） |
+| `model` | Model name | "mimo-v2.5-tts" |
+| `messages[0].role` | **Must be "assistant"** | "assistant" (fixed) |
+| `messages[0].content` | Text to synthesize | required |
+| `voice` | Voice ID | "Bingtang" (Chinese) / "Mia" (English) |
+| `audio.format` | Audio format | "mp3" (also accepts "wav") |

-### 用户意图映射
+### User intent mapping

-| 用户意图 | 参数选择 |
+| User intent | Parameter |
 |---------|---------|
-| 甜美/可爱 | voice: "Bingtang" |
-| 温柔/知性 | voice: "Moli" |
-| 年轻男声 | voice: "Suda" |
-| 成熟男声 | voice: "Baihua" |
-| 英文女声 | voice: "Mia" 或 "Chloe" |
-| 英文男声 | voice: "Milo" 或 "Dean" |
-| 高音质/无损 | response_format: "wav" |
+| Sweet / cute | voice: "Bingtang" |
+| Gentle / refined | voice: "Moli" |
+| Young male | voice: "Suda" |
+| Mature male | voice: "Baihua" |
+| English female | voice: "Mia" or "Chloe" |
+| English male | voice: "Milo" or "Dean" |
+| High fidelity / lossless | audio.format: "wav" |

-## 错误处理
+## Error Handling

- `success: false` + `error: "未找到匹配的供应商"`：未配置小米 MiMo Provider 或未启用
- `success: false` + `error: "未配置 API Key"`：未填写 API Key
- `statusCode: 401`：API Key 无效或已过期
- `statusCode: 429`：频率限制，稍后重试
- `statusCode: 400`：参数错误（如 voice 不存在、文本为空）
- `statusCode: 403`：模型未开通或权限不足
+- `success: false` + `error: "No matching provider"`: Xiaomi MiMo provider not configured or disabled
+- `success: false` + `error: "API Key not configured"`: API Key missing
+- `statusCode: 401`: API Key invalid or expired
+- `statusCode: 429`: rate limited, retry later
+- `statusCode: 400`: bad parameters (e.g. unknown voice, empty text)
+- `statusCode: 403`: model not activated or insufficient permission

-## 注意事项
+## Notes

- 调用是同步的，通常 3-15 秒返回（视文本长度而定）
- 音频以 base64 返回，无外部 URL 时效问题，但数据量较大时注意 shell 参数长度限制
- 长文本建议分段合成（每段不超过 500 字），然后逐段上传展示
- 如果用户未明确要求音色/格式，默认使用 `mimo-v2.5-tts` + 按语言选音色 + `mp3`
- Token Plan 密钥（tp- 前缀）使用 `https://token-plan-cn.xiaomimimo.com/v1` 端点
- 按量付费密钥使用 `https://api.xiaomimimo.com/v1` 端点
- media-proxy 会自动根据配置选择正确的端点，技能无需区分
+- Calls are synchronous, typically 3–15 seconds depending on text length
+- Audio is returned as base64, so URL expiry is not a concern, but watch shell argument length on long responses
+- For long text, split into segments (no more than ~500 chars each), then upload and render each segment
+- When the user doesn't specify, default to `mimo-v2.5-tts` + auto-selected voice by language + `mp3`
+- Token Plan keys (prefix `tp-`) use the `https://token-plan-cn.xiaomimimo.com/v1` endpoint
+- Pay-as-you-go keys use the `https://api.xiaomimimo.com/v1` endpoint
+- media-proxy picks the correct endpoint based on configuration; the skill does not need to differentiate
--- a/skills/xiaomi-tts/SKILL.zh-CN.md
+++ b/skills/xiaomi-tts/SKILL.zh-CN.md
@@ -0,0 +1,192 @@
+<!-- locale: zh-CN -->
+
+# xiaomi-tts 技能
+
+## 强制规则（违反将导致功能失败）
+
+1. **必须用 HTTPS 访问 agent-service** — `https://127.0.0.1:${PORT}` 加 `-k` 跳过证书验证
+2. **必须通过 `/api/media/upload` 上传到 media-store** — /tmp 仅作下载/解码中转，不可直接以本地路径作为最终输出
+3. **必须使用 `dc-media://` 协议展示音频** — 唯一能让前端正确渲染的方式
+4. **全程使用 Bash curl** — 不要使用 HttpRequest 工具或 Python
+5. **使用 /chat/completions 端点** — 小米 MiMo TTS 使用 OpenAI 兼容格式
+
+## 模型选择指南
+
+| 模型 | 特点 | 适用场景 |
+|------|------|---------|
+| mimo-v2.5-tts | 标准 TTS，多种预置音色 | **默认首选**，常规语音合成 |
+| mimo-v2.5-tts-voicedesign | 自定义音色设计 | 需要特定音色描述生成 |
+| mimo-v2.5-tts-voiceclone | 声音克隆 | 需要克隆特定人声（需上传参考音频） |
+
+**默认规则**：用户未指定模型时，使用 `mimo-v2.5-tts`。
+
+## 音色选择指南
+
+### 预置音色
+
+| voice_id | 名称 | 特点 |
+|----------|------|------|
+| default_zh | 默认中文 | 中文通用女声 |
+| default_en | 默认英文 | 英文通用女声 |
+| mimo_default | MiMo 默认 | MiMo 特色音色 |
+| Bingtang | 冰糖 | 甜美女声 |
+| Moli | 茉莉 | 温柔女声 |
+| Suda | 苏打 | 年轻男声 |
+| Baihua | 白桦 | 成熟男声 |
+| Mia | Mia | 英文女声 |
+| Chloe | Chloe | 英文女声 |
+| Milo | Milo | 英文男声 |
+| Dean | Dean | 英文男声 |
+
+**默认规则**：中文内容用 `Bingtang`，英文内容用 `Mia`，用户未指定时按内容语言自动选择。
+
+## 完整执行流程（严格按此三步执行）
+
+### 前置条件
+
+- 用户已在资源管理器-算力中配置小米 MiMo Provider 并填写 API Key
+- agent-service 正在运行
+
+### 第一步：调用 TTS API
+
+通过 media-proxy 的 /chat/completions 端点生成语音。
+
+**重要**：messages 必须使用 `assistant` role（不是 user），要合成的文本放在 assistant 消息的 content 中。
+
+```bash
+PORT=$(cat ~/.desirecore/agent-service.port)
+curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "provider": "xiaomi",
+    "serviceType": "tts",
+    "endpoint": "/chat/completions",
+    "body": {
+      "model": "mimo-v2.5-tts",
+      "messages": [
+        {
+          "role": "assistant",
+          "content": "这里替换为要合成的文本内容"
+        }
+      ],
+      "voice": "Bingtang",
+      "audio": {"format": "mp3"}
+    },
+    "responseType": "json"
+  }'
+```
+
+**响应示例**：
+```json
+{
+  "success": true,
+  "data": {
+    "id": "chatcmpl-...",
+    "choices": [
+      {
+        "index": 0,
+        "message": {
+          "role": "assistant",
+          "audio": {
+            "data": "base64编码的音频数据...",
+            "format": "mp3"
+          }
+        },
+        "finish_reason": "stop"
+      }
+    ]
+  },
+  "statusCode": 200
+}
+```
+
+从 `data.choices[0].message.audio.data` 提取 base64 编码的音频数据。
+
+### 第二步：解码并上传到 media-store
+
+音频以 base64 返回，需要解码后保存到本地 media-store。
+
+**推荐方式**（先保存完整响应到文件，避免 shell 参数过长）：
+
+```bash
+PORT=$(cat ~/.desirecore/agent-service.port)
+# 发起请求并将完整响应保存到文件
+curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "provider": "xiaomi",
+    "serviceType": "tts",
+    "endpoint": "/chat/completions",
+    "body": {
+      "model": "mimo-v2.5-tts",
+      "messages": [{"role": "assistant", "content": "要合成的文本"}],
+      "voice": "Bingtang",
+      "audio": {"format": "mp3"}
+    },
+    "responseType": "json"
+  }' > /tmp/xiaomi-tts-response.json
+
+# 提取 base64 音频数据并解码
+jq -r '.data.choices[0].message.audio.data' /tmp/xiaomi-tts-response.json | base64 -d > /tmp/xiaomi-tts.mp3
+
+# 上传到 media-store
+curl -sk -X POST "https://127.0.0.1:${PORT}/api/media/upload" \
+  -F "file=@/tmp/xiaomi-tts.mp3;type=audio/mpeg"
+```
+
+从 JSON 响应中提取 `mediaId` 字段（格式如 `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.mp3`）。
+
+### 第三步：用 dc-media 协议展示音频
+
+在你的回复文本中直接写 Markdown 语法：
+
+```
+![语音合成结果](dc-media://这里替换为mediaId)
+```
+
+例如：`![TTS: 你好世界](dc-media://a1b2c3d4-e5f6-47a8-b9c0-d1e2f3a4b5c6.mp3)`
+
+前端会自动检测 `.mp3` 扩展名并渲染为音频播放器。
+
+## 参数映射
+
+### 请求体参数（放在 body 中）
+
+| 参数 | 说明 | 默认值 |
+|------|------|--------|
+| `model` | 模型名称 | "mimo-v2.5-tts" |
+| `messages[0].role` | **必须为 "assistant"** | "assistant"（固定） |
+| `messages[0].content` | 要合成的文本 | 必填 |
+| `voice` | 音色 ID | "Bingtang"（中文）/ "Mia"（英文） |
+| `audio.format` | 音频格式 | "mp3"（可选 "wav"） |
+
+### 用户意图映射
+
+| 用户意图 | 参数选择 |
+|---------|---------|
+| 甜美/可爱 | voice: "Bingtang" |
+| 温柔/知性 | voice: "Moli" |
+| 年轻男声 | voice: "Suda" |
+| 成熟男声 | voice: "Baihua" |
+| 英文女声 | voice: "Mia" 或 "Chloe" |
+| 英文男声 | voice: "Milo" 或 "Dean" |
+| 高音质/无损 | audio.format: "wav" |
+
+## 错误处理
+
+- `success: false` + `error: "未找到匹配的供应商"`：未配置小米 MiMo Provider 或未启用
+- `success: false` + `error: "未配置 API Key"`：未填写 API Key
+- `statusCode: 401`：API Key 无效或已过期
+- `statusCode: 429`：频率限制，稍后重试
+- `statusCode: 400`：参数错误（如 voice 不存在、文本为空）
+- `statusCode: 403`：模型未开通或权限不足
+
+## 注意事项
+
+- 调用是同步的，通常 3-15 秒返回（视文本长度而定）
+- 音频以 base64 返回，无外部 URL 时效问题，但数据量较大时注意 shell 参数长度限制
+- 长文本建议分段合成（每段不超过 500 字），然后逐段上传展示
+- 如果用户未明确要求音色/格式，默认使用 `mimo-v2.5-tts` + 按语言选音色 + `mp3`
+- Token Plan 密钥（tp- 前缀）使用 `https://token-plan-cn.xiaomimimo.com/v1` 端点
+- 按量付费密钥使用 `https://api.xiaomimimo.com/v1` 端点
+- media-proxy 会自动根据配置选择正确的端点，技能无需区分
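
The status codes listed under 错误处理 above could be turned into a small lookup before deciding whether to retry. The following is only a sketch under stated assumptions: `explain_status` is a hypothetical helper name, the hint texts paraphrase the list above, and the fallback branch for unlisted codes is an addition that the skill itself does not define.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): map a statusCode to a short hint.
explain_status() {
  case "$1" in
    200) echo "ok" ;;
    400) echo "bad parameters (e.g. unknown voice, empty text)" ;;
    401) echo "API Key invalid or expired" ;;
    403) echo "model not activated or insufficient permission" ;;
    429) echo "rate limited, retry later" ;;
    *)   echo "unexpected status: $1" ;;   # assumption: codes not listed above
  esac
}

# Example call
explain_status 429
```

With the response saved as in step 2, the code could be read with `jq -r '.statusCode' /tmp/xiaomi-tts-response.json` and passed to the helper. Of the listed codes, only 429 is described as worth retrying.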