---
name: dashscope-image-gen
description: >-
  Use this skill when the user wants to generate images using Alibaba Cloud
  DashScope's Wan (通义万相) series models. Supports text-to-image with
  multiple model tiers (wan2.7-image-pro, wan2.7-image). Uses OpenAI-compatible
  chat/completions API for synchronous image generation. Use when the user mentions
  生成图片、画图、文生图、创建图片、AI 绘画、
  生成插图、画一张、帮我画、设计图片、通义万相、万相、阿里云画图、dashscope 画图。
license: Complete terms in LICENSE.txt
version: 1.2.0
type: procedural
risk_level: low
status: enabled
disable-model-invocation: false
provider: auto
tags:
  - media
  - image
  - generation
  - dashscope
  - alibaba
requires:
  tools:
    - Bash
metadata:
  author: desirecore
  updated_at: '2026-05-08'
i18n:
  default_locale: en-US
  source_locale: zh-CN
  locales:
    - zh-CN
    - en-US
  zh-CN:
    name: 阿里云 文生图
    short_desc: 基于阿里云通义万相的文本生成图片技能
    description: >-
      当用户希望使用阿里云 DashScope 的通义万相系列模型生成图片时使用此技能。支持多种模型层级(wan2.7-image-pro / wan2.7-image)的文生图,通过
      OpenAI 兼容的 chat/completions API 同步生成图片。用户提到
      生成图片、画图、文生图、创建图片、AI 绘画、生成插图、画一张、帮我画、设计图片、通义万相、万相、阿里云画图、dashscope 画图。
    body: ./SKILL.zh-CN.md
    source_hash: sha256:135b99cdd33441fb
    translated_by: human
  en-US:
    name: DashScope Image Generation
    short_desc: Text-to-image generation using Alibaba Cloud Wan (通义万相) models
    description: "Use this skill when the user wants to generate images using Alibaba Cloud DashScope's Wan (通义万相) series models. Supports text-to-image with multiple model tiers (wan2.7-image-pro, wan2.7-image) via the OpenAI-compatible chat/completions API. Trigger keywords: generate image, draw, text-to-image, create image, AI painting, illustration, design picture, Wan, Tongyi Wanxiang, DashScope."
    body: ./SKILL.md
    source_hash: sha256:135b99cdd33441fb
    translated_by: human
market:
  icon: >-
  short_desc: 基于阿里云通义万相的文本生成图片技能
  category: media
  maintainer:
    name: DesireCore Official
    verified: true
  channel: latest
---

# dashscope-image-gen Skill

## Mandatory Rules (violations cause failure)

1. **Must access agent-service over HTTPS**: use `https://127.0.0.1:${PORT}` with `-k` to skip certificate verification
2. **Must upload to media-store via `/api/media/upload`**: `/tmp` is only a transient download/decode location; never use a local path as the final output
3. **Must use the `dc-media://` protocol to display images**: it is the only form the frontend can render correctly
4. **Use Bash curl throughout**: do not use the HttpRequest tool or Python
5. **Use compatible-mode (`/chat/completions`)**: the call is synchronous and the response contains the image URL directly

## Model Selection

| Model | Characteristics | When to use |
|------|------|---------|
| wan2.7-image-pro | Flagship, 4K resolution, thinking_mode | User asks for top quality, 4K, or rich detail |
| wan2.7-image | Standard high quality, thinking_mode | **Default**, for unspecified requests |

**Default rule**: if the user does not specify a model, use `wan2.7-image`.

## Full Execution Flow (strictly three steps)

### Prerequisites

- At least one enabled compute provider supports `image_gen` service type (e.g.
  DashScope, or the default DesireCore Cloud provider)
- agent-service is running

### Step 1: Call the text-to-image API (synchronous)

Generate the image via media-proxy's compatible-mode endpoint; the response includes the image URL directly:

```bash
# Read the agent-service port (path quoted so a DESIRECORE_ROOT with spaces still works)
PORT=$(cat "${DESIRECORE_ROOT}/agent-service.port")

# -k skips certificate verification for the local HTTPS endpoint
curl -sk -X POST "https://127.0.0.1:${PORT}/api/media-proxy" \
  -H "Content-Type: application/json" \
  -d '{
    "serviceType": "image_gen",
    "endpoint": "/chat/completions",
    "body": {
      "model": "wan2.7-image",
      "messages": [
        {
          "role": "user",
          "content": [
            {"type": "text", "text": "Replace this with the image description (English usually gives better results)"}
          ]
        }
      ]
    },
    "responseType": "json"
  }'
```

**Example response**:

```json
{
  "success": true,
  "data": {
    "request_id": "...",
    "output": {
      "choices": [
        {
          "message": {
            "role": "assistant",
            "content": [
              {
                "type": "image",
                "image": "https://dashscope-result.oss.aliyuncs.com/..."
              }
            ]
          },
          "finish_reason": "stop"
        }
      ]
    }
  },
  "statusCode": 200
}
```

Locate the item with `type: "image"` inside `data.output.choices[0].message.content` and extract its `image` URL.

### Step 2: Download and upload to media-store

The image URL is time-limited; download and persist it to the local media-store immediately:

```bash
PORT=$(cat "${DESIRECORE_ROOT}/agent-service.port")
IMAGE_URL="image URL from step 1's response"

# -f makes curl fail on HTTP errors (e.g. an expired URL), so the upload
# only runs when a real image was downloaded
curl -fsL "$IMAGE_URL" -o /tmp/dashscope-gen.png && \
curl -sk -X POST "https://127.0.0.1:${PORT}/api/media/upload" \
  -F "file=@/tmp/dashscope-gen.png;type=image/png"
```

Read the `mediaId` field from the JSON response (format `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.png`).

### Step 3: Render the image via the dc-media protocol

In your reply text, write Markdown image syntax directly:

```
![Image description](dc-media://replace-with-mediaId)
```

For example: `![White fox in a forest](dc-media://a1b2c3d4-e5f6-47a8-b9c0-d1e2f3a4b5c6.png)`

The frontend will translate `dc-media://` into a reachable image URL and render it.
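
One way to pull the image URL(s) out of the Step 1 response is sketched below. It assumes the response has the shape shown in the example response. `extract_image_urls` is a hypothetical helper name, and `jq` is an assumption (it is not listed among this skill's requirements), so the sketch falls back to a rougher `grep`/`sed` match when `jq` is not installed.

```bash
# Hypothetical helper (not part of agent-service): reads a Step 1 response
# on stdin and prints one image URL per line.
extract_image_urls() {
  if command -v jq >/dev/null 2>&1; then
    # Assumption: jq is installed. Walk every choice and keep only image items.
    jq -r '.data.output.choices[]?.message.content[]? | select(.type? == "image") | .image'
  else
    # Rough fallback: match "image": "<url>" pairs textually. It does not
    # decode JSON escapes, so prefer jq when it is available.
    grep -o '"image"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
  fi
}

# Usage with a trimmed-down sample response (placeholder URL):
SAMPLE='{"success":true,"data":{"output":{"choices":[{"message":{"content":[{"type":"image","image":"https://example.invalid/a.png"}]}}]}}}'
IMAGE_URL=$(printf '%s' "$SAMPLE" | extract_image_urls | head -n 1)
echo "$IMAGE_URL"
```

In practice the Step 1 `curl` output would be piped into the helper instead of `SAMPLE`, and the resulting `IMAGE_URL` fed to Step 2.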

## Parameter Mapping

### Size selection

When calling Wan via compatible-mode, the size is passed as the top-level `size` parameter:

```json
{
  "model": "wan2.7-image",
  "size": "1024x1024",
  "messages": [...]
}
```

| User intent | size value |
|---------|-----------|
| Square / avatar / default | "1024x1024" |
| Landscape / scenery / wallpaper | "1792x1024" |
| Portrait / mobile / poster | "1024x1792" |

### Optional parameters (top-level body fields)

| Parameter | Description |
|------|------|
| `n` | Number of images, 1–4, default 1 |
| `size` | Image size, e.g. "1024x1024" |

## Multiple Image Generation

When `n > 1`, the `choices` array contains multiple entries, each with an image inside `message.content`. Download and upload each image, then render them one by one:

```
![Image 1 description](dc-media://mediaId1)
![Image 2 description](dc-media://mediaId2)
```

## Error Handling

- `success: false` + `error: "No matching provider"`: No enabled provider supports `image_gen` service type
- `success: false` + `error: "API Key not configured"`: API Key missing
- `statusCode: 401`: API Key invalid or expired
- `statusCode: 429`: rate limited, retry later
- `statusCode: 400` + `InvalidParameter`: bad parameters (e.g. unsupported size)
- `statusCode: 403` + `AccessDenied.Unpurchased`: model not activated; enable it in the Alibaba Cloud console

## Notes

- compatible-mode calls are synchronous and typically return in 10–60 seconds (wan2.7-image-pro can take longer)
- Image URLs expire; download promptly
- English prompts usually produce the best results; Chinese is also supported
- When the user does not specify a model or size, default to `wan2.7-image` + `1024x1024`
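
The size table, the model default, and the multi-image rendering described above could be wrapped in small shell helpers, roughly as follows. All three function names and the intent keywords they accept (`landscape`, `poster`, `pro`, and so on) are illustrative assumptions, not something agent-service or DashScope defines; only the size strings and model names come from this document.

```bash
# Hypothetical helper: map a coarse layout intent to a size from the table above.
size_for_intent() {
  case "$1" in
    landscape|scenery|wallpaper) echo "1792x1024" ;;
    portrait|mobile|poster)      echo "1024x1792" ;;
    *)                           echo "1024x1024" ;;  # square, avatar, or unspecified
  esac
}

# Hypothetical helper: pick the model tier; anything not recognized as a
# top-quality request falls back to the default wan2.7-image.
model_for_quality() {
  case "$1" in
    pro|4k|top) echo "wan2.7-image-pro" ;;
    *)          echo "wan2.7-image" ;;
  esac
}

# Hypothetical helper: print one dc-media Markdown image line per mediaId.
render_images() {
  local desc="$1" i=1 id
  shift
  for id in "$@"; do
    printf '![%s %d](dc-media://%s)\n' "$desc" "$i" "$id"
    i=$((i + 1))
  done
}

# Usage (mediaId values are placeholders):
echo "model=$(model_for_quality "") size=$(size_for_intent wallpaper)"
render_images "White fox" mediaId1 mediaId2
```

Keeping the defaults in the `*)` branches means an unrecognized or empty intent always resolves to `wan2.7-image` and `1024x1024`, matching the default rule in the Notes.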