feat: 新增 web-access 和 frontend-design 两个内置技能

根据 docs 推荐补齐 5 个内置技能中的 c) 和 e):

web-access v1.1.0:
- 三层架构:L1 WebSearch/WebFetch + L2 Jina Reader + L3 CDP Browser
- 添加 Chrome CDP 前置条件(macOS/Linux/Windows 启动命令)
- 支持登录态访问 小红书/B站/微博/知乎/飞书/Twitter/公众号
- Jina Reader 重新定位为默认 token 优化层(非兜底)
- 新增 references/cdp-browser.md(Python Playwright 详细操作手册)
- 触发词扩充:小红书、B站、微博、飞书、Twitter、推特、X、知乎、公众号

frontend-design v1.0.0:
- 从 Claude Code 官方 frontend-design 技能适配
- 保留原版 bold aesthetic 设计理念
- 新增 Project Context Override 章节:在 DesireCore 主仓库内工作时
  自动遵循 3+2 色彩体系(Green/Blue/Purple + Orange/Red)
- 添加 Output Rule 要求告知用户文件路径

builtin-skills.json: 12 → 14 skills
This commit is contained in:
张馨元
2026-04-07 15:35:59 +08:00
parent de6cfa0b23
commit 98322aa930
6 changed files with 1015 additions and 0 deletions

View File

@@ -0,0 +1,122 @@
# Jina Reader — Default Token-Optimization Layer
[Jina Reader](https://jina.ai/reader) is a free public service that renders any URL server-side and returns clean Markdown. In this skill's three-layer architecture, **Jina is Layer 2: the default extractor for heavy/JS-rendered pages**, not just a fallback.
---
## Positioning in the three-layer model
```
L1 WebFetch ── simple public static pages (docs, Wikipedia, HN)
│ WebFetch empty/truncated/garbled
L2 Jina Reader ── DEFAULT for JS-heavy SPAs, long articles, Medium, Twitter
│ Strips nav/ads automatically, saves 50-80% tokens
│ Login required, or Jina also fails
L3 CDP Browser ── user's logged-in Chrome (小红书/B站/微博/飞书/Twitter)
```
**Key insight**: Don't wait for WebFetch to fail before trying Jina. For any URL you expect to be JS-heavy (any major SPA, Medium, Dev.to, long-form articles), go straight to Jina for the token savings.
---
## Basic Usage (no API key)
```bash
curl -sL "https://r.jina.ai/https://example.com/article"
```
The original URL goes after `r.jina.ai/`. The response is plain Markdown — pipe to a file or read directly.
---
## When to use each layer
| Scenario | Primary choice | Why |
|----------|---------------|-----|
| Wikipedia, MDN, official docs | L1 WebFetch | Static clean HTML, fastest |
| GitHub README (public) | L1 WebFetch | Simple markup |
| Medium articles | **L2 Jina** | Member walls + heavy JS |
| Dev.to, Hashnode | **L2 Jina** | JS-rendered |
| Substack, Ghost blogs | **L2 Jina** | Partial JS rendering |
| News sites with lazy-load | **L2 Jina** | Scroll-triggered content |
| Twitter/X public threads | **L2 Jina** first, L3 CDP if truncated | Sometimes works |
| 公众号 (mp.weixin.qq.com) | **L2 Jina** | Clean Markdown extraction |
| LinkedIn articles | L3 CDP | Hard login wall |
| 小红书, B站, 微博, 飞书 | L3 CDP | 登录强制 |
---
## Token savings example
Raw HTML of a long Medium article: ~150 KB, ~50,000 tokens
Same article via Jina Reader: ~20 KB, ~7,000 tokens
**86% reduction**, with cleaner structure and no ads/nav cruft.
---
## Advanced Endpoints (optional)
If you need more than basic content extraction, Jina also offers:
- **Search**: `https://s.jina.ai/<query>` — returns top 5 results as Markdown
- **Embeddings**: `https://api.jina.ai/v1/embeddings` (requires free API key)
- **Reranker**: `https://api.jina.ai/v1/rerank` (requires free API key)
For DesireCore, prefer the built-in `WebSearch` tool over `s.jina.ai` for consistency.
---
## Rate Limits
- **Free tier**: ~20 requests/minute, no authentication needed
- **With free API key**: higher limits, fewer throttles
```bash
curl -sL "https://r.jina.ai/https://example.com" \
-H "Authorization: Bearer YOUR_KEY"
```
- Get a free key at [jina.ai](https://jina.ai) — stored in env var `JINA_API_KEY` if available
---
## Usage tips
### Cache your own results
Jina itself doesn't cache for you. If you call the same URL repeatedly in a session, save the Markdown to a temp file:
```bash
curl -sL "https://r.jina.ai/$URL" > /tmp/jina-cache.md
```
### Handle very long articles
Jina returns the full article in one response. For articles > 50K chars, pipe through `head` or extract specific sections with Python/awk before feeding back to the model context.
### Combine with CDP
When you use L3 CDP to fetch a login-gated page, you can pipe the resulting HTML through Jina for clean Markdown instead of parsing with BeautifulSoup:
```python
html = fetch_with_cdp(url) # from references/cdp-browser.md
# Now convert via Jina (note: Jina fetches the URL itself, not your HTML)
# So this only works if the content is already visible without login:
import subprocess
md = subprocess.run(["curl", "-sL", f"https://r.jina.ai/{url}"],
capture_output=True, text=True).stdout
```
For truly login-gated content, you must parse the HTML directly (BeautifulSoup) since Jina can't log in on your behalf.
---
## Failure Mode
If Jina Reader returns garbage or error:
1. **Hard login wall** → escalate to L3 CDP browser
2. **Geographically restricted** → tell the user, suggest VPN or manual access
3. **Cloudflare challenge** → try L3 CDP (user's browser passes challenges naturally)
4. **404 / gone** → confirm the URL is correct
In all cases, tell the user explicitly which URL failed and what you tried.