112 changes: 112 additions & 0 deletions src/content/skills-zh/yescan-ocr-qoder-zh.md
@@ -0,0 +1,112 @@
---
name: yescan-ocr-qoder
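title: Quark Scan King (夸克扫描王) OCR - Universal Text Recognition
description: Extract, recognize, and structure text from images, screenshots, photos, or scanned documents, covering 17 scenarios including handwriting, tables, math formulas, ID cards, invoices, medical reports, and business licenses. Powered by the Quark Scan King (夸克扫描王) OCR API.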
source: community
author: yescan-ai
githubUrl: https://github.com/yescan-ai/yescan-ocr-qoder
docsUrl: https://scan.quark.cn/business
category: document
tags:
- ocr
- text-recognition
- handwriting
- table
- invoice
- id-card
- medical-report
- formula
- quark-scan
roles:
- developer
- data-analyst
- finance
- legal
- hr
featured: false
popular: false
isOfficial: false
installCommand: |
  git clone https://github.com/yescan-ai/yescan-ocr-qoder
  cp -r yescan-ocr-qoder ~/.qoder/skills/
date: 2026-06-09
---

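## Use Cases

- Extract text with high accuracy from photos of handwritten notes, essays, and letters
- Recognize table data in images and structure it into machine-readable formats
- Recognize and parse ID cards, social security cards, driver's licenses, vehicle licenses, Hong Kong/Macau/Taiwan travel permits, and similar documents
- Extract key fields from VAT invoices, train tickets, and English commercial invoices
- Recognize math formulas and chemical equations, outputting results in LaTeX format
- Parse images of medical documents such as lab test reports and health check reports
- Extract text from product labels, packaging images, and business licenses
- General text extraction, used as the fallback scenario for any image containing text

## Core Capabilities

- **Handwriting OCR**: Recognizes messy handwriting with high accuracy
- **Table OCR**: Extracts structured table data from images
- **ID & Certificate Recognition**: Supports 8+ document types, including ID cards, social security cards, travel permits, and degree certificates
- **Invoice & Ticket Parsing**: VAT invoices, train tickets, and English commercial invoices
- **Formula Recognition**: Converts math formulas and chemical equations to LaTeX
- **Question OCR**: Recognizes exam questions and exercises from photos
- **Medical Report OCR**: Parses lab test reports and medical documents
- **General Text Extraction**: Fallback mode for any image containing text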

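## Supported Scenarios (17 total)

| # | Scenario | Scene ID |
|---|----------|----------|
| 1 | Handwritten document recognition | `handwritten-ocr` |
| 2 | Table recognition | `table-ocr` |
| 3 | ID card recognition | `idcard-ocr` |
| 4 | Social security card recognition | `social-security-card-ocr` |
| 5 | Hong Kong/Macau travel permit recognition | `travel-permit-ocr` |
| 6 | Degree certificate recognition | `degree-certificate-ocr` |
| 7 | VAT invoice recognition | `vat-invoice-ocr` |
| 8 | Train ticket recognition | `train-ticket-ocr` |
| 9 | Formula recognition | `formula-ocr` |
| 10 | Exam question recognition | `question-ocr` |
| 11 | Driver's license recognition | `driver-license-ocr` |
| 12 | Vehicle license recognition | `vehicle-license-ocr` |
| 13 | English invoice recognition | `commercial-invoice-ocr` |
| 14 | Medical report recognition | `medical-report-ocr` |
| 15 | Business license recognition | `business-license-ocr` |
| 16 | Product image recognition | `product-image-ocr` |
| 17 | General text extraction | `general-ocr` |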

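## Examples

```
Please recognize the table in this image.
(Attached: a screenshot of a table)
```

```
Turn this photo of my handwritten notes into text.
```

```
Recognize this VAT invoice and extract the invoice number, amount, and tax.
```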

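## Configuration

This skill requires a Quark Scan King (夸克扫描王) API Key.

1. Visit the [Quark Scan King developer console](https://scan.quark.cn/business) to register and obtain an API Key (choose the AI Agent integration type).
2. Write the key to `~/.yescan_env`:
   ```bash
   echo 'SCAN_WEBSERVICE_KEY=<your_api_key>' > ~/.yescan_env
   ```
3. Once installed, the skill reads this configuration automatically on every run; no restart is required.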
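
To check that the key file from step 2 is readable before running the skill, a small parser along these lines may help. It is a minimal sketch: the file path and the `SCAN_WEBSERVICE_KEY` name come from the steps above, but `parse_env_text` and `load_api_key` are hypothetical helpers, and the skill's own loader may parse the file differently.

```python
from pathlib import Path
from typing import Dict, Optional

ENV_FILE = Path.home() / ".yescan_env"  # path from the setup steps
KEY_NAME = "SCAN_WEBSERVICE_KEY"        # variable name from the setup steps


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        # Tolerate optional surrounding quotes (an assumption, not documented behavior).
        values[name.strip()] = value.strip().strip("'\"")
    return values


def load_api_key(env_file: Path = ENV_FILE) -> Optional[str]:
    """Return the API key from env_file, or None if the file or key is missing."""
    if not env_file.is_file():
        return None
    return parse_env_text(env_file.read_text(encoding="utf-8")).get(KEY_NAME) or None
```

Calling `load_api_key()` with no arguments reads `~/.yescan_env`; a `None` result means the file is absent or the key line is empty.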

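## Notes

- Supported image formats: jpg, jpeg, png, gif, bmp, webp, tiff, wbmp
- Image size limit: 5MB per image
- Each call processes a single image; for batch processing, call in a loop
- Requires Python 3.9+
- Images are sent to Quark Scan King servers for recognition; data is not stored permanently
- Video processing and real-time camera streams are not supported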
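
The format and size limits above can be checked locally before an image is uploaded. The following is a sketch only: `check_image` is a hypothetical helper, and it assumes that "5MB" means 5 × 1024 × 1024 bytes, which the notes do not specify.

```python
from pathlib import Path

# Extensions copied from the supported-format list in the notes.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp", ".tiff", ".wbmp"}
MAX_BYTES = 5 * 1024 * 1024  # assumption: "5MB" is interpreted as 5 MiB


def check_image(path: Path) -> list[str]:
    """Return a list of problems with the image; an empty list means it passes."""
    problems = []
    if path.suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {path.suffix or '(none)'}")
    if not path.is_file():
        problems.append("file not found")
    elif path.stat().st_size > MAX_BYTES:
        problems.append(f"file is larger than {MAX_BYTES} bytes")
    return problems
```

The check looks only at the file extension and size; it does not verify that the file content is a valid image.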
94 changes: 94 additions & 0 deletions src/content/skills-zh/yescan-office-qoder-zh.md
@@ -0,0 +1,94 @@
---
name: yescan-office-qoder
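title: Yescan Office - Image to Office/PDF Documents
description: Convert images, screenshots, or scans into editable Office documents (Word/Excel) or PDF. Supports layout restoration for complex tables, contracts, and mixed text-and-image content. Powered by the Quark Scan King (夸克扫描王) conversion API.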
source: community
author: changri
githubUrl: https://github.com/yescan-ai/yescan-office-qoder
docsUrl: https://scan.quark.cn/business
category: document
tags:
- ocr
- image-to-word
- image-to-excel
- image-to-pdf
- format-conversion
- table-restoration
- quark-scan
roles:
- developer
- data-analyst
- finance
- legal
- hr
- executive
featured: false
popular: false
isOfficial: false
installCommand: |
  git clone https://github.com/yescan-ai/yescan-office-qoder
  cp -r yescan-office-qoder ~/.qoder/skills/
date: 2026-06-10
---

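## Use Cases

- Convert screenshots of financial statements containing tables into editable Excel files
- Convert photos of handwritten inventory records into structured Excel spreadsheets
- Convert photographed meeting notes into Word documents
- Convert screenshots of product manuals to .docx format, preserving the original layout
- Convert images of handwritten class notes into PDF documents for archiving
- Process contract photos into clear PDF files for easy archiving and distribution
- Convert whiteboard sketches into cleanly laid-out PDF documents

## Core Capabilities

- **Image to Excel**: Recognizes table data in images and converts it to .xlsx files, preserving row/column structure and data relationships
- **Image to Word**: Restores the text-and-image layout of the source image and generates an editable .docx document; supports long text and multi-paragraph content
- **Image to PDF**: Converts images into PDF documents with a standardized layout, suitable for archiving and printing
- **Layout restoration**: Restores visual elements of the original image, such as layout, table borders, and font sizes, as closely as possible
- **Auto-save**: Converted files are saved automatically to a local temporary directory, and the returned file path can be used directly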

## Supported Scenarios (3 total)

| # | Scenario | Scene ID | Output format |
|---|----------|----------|---------------|
| 1 | Image to Excel | `image-to-excel` | .xlsx |
| 2 | Image to Word | `image-to-word` | .docx |
| 3 | Image to PDF | `image-to-pdf` | .pdf |
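
Because each scene ID in the table maps to exactly one output format, the scene can be derived from the file name a caller wants to produce. This is an illustrative sketch: the scene IDs and extensions are taken from the table, while `scene_for_output` is a hypothetical helper and not part of the skill.

```python
from pathlib import Path
from typing import Optional

# Scene IDs and output formats copied from the scenario table.
SCENE_BY_EXTENSION = {
    ".xlsx": "image-to-excel",
    ".docx": "image-to-word",
    ".pdf": "image-to-pdf",
}


def scene_for_output(filename: str) -> Optional[str]:
    """Return the scene ID that produces this file type, or None if unsupported."""
    return SCENE_BY_EXTENSION.get(Path(filename).suffix.lower())
```

For example, `scene_for_output("report.xlsx")` yields `image-to-excel`, and an extension outside the table yields `None`.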

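## Examples

```
Convert this financial statement screenshot into an Excel file.
(Attached: a screenshot containing a table)
```

```
Turn this photo of meeting notes into a Word document.
```

```
Please convert this photo of an equipment nameplate to PDF for archiving.
```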

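## Configuration

This skill requires a Quark Scan King (夸克扫描王) API Key.

1. Visit the [Quark Scan King developer console](https://scan.quark.cn/business) to register and obtain an API Key (choose the AI Agent integration type).
2. Write the key to `~/.yescan_env`:
   ```bash
   echo 'SCAN_WEBSERVICE_KEY=<your_api_key>' > ~/.yescan_env
   ```
3. Once installed, the skill reads this configuration automatically on every run; no restart is required.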

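## Notes

- Supported image formats: jpg, jpeg, png, gif, bmp, webp, tiff, wbmp
- Image size limit: 5MB per image
- Each call processes a single image; for batch processing, call in a loop
- Requires Python 3.9+
- Converted files are saved to the system temporary directory (e.g. `/tmp`); cleaning them up is your responsibility
- Images are sent to Quark Scan King servers for conversion; data is not stored permanently
- Video processing and real-time camera streams are not supported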
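
Since converted files land in the system temporary directory and are not cleaned up automatically, one option is to move each result to a permanent folder as soon as its path is returned. The sketch below assumes the caller already has that path; `keep_output` is a hypothetical helper, not something the skill provides.

```python
import shutil
from pathlib import Path


def keep_output(temp_file: Path, dest_dir: Path) -> Path:
    """Move a converted file out of the temp directory and return its new path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / temp_file.name
    # Avoid silently overwriting an earlier conversion with the same name.
    counter = 1
    while target.exists():
        target = dest_dir / f"{temp_file.stem}-{counter}{temp_file.suffix}"
        counter += 1
    shutil.move(str(temp_file), str(target))
    return target
```

Moving the file, instead of copying it, also takes care of the temp-directory cleanup for that file.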
112 changes: 112 additions & 0 deletions src/content/skills/yescan-ocr-qoder.md
@@ -0,0 +1,112 @@
---
name: yescan-ocr-qoder
title: Yescan OCR - Universal Text Recognition
description: Extract, recognize, and structure text from images, screenshots, photos, or scanned documents — including handwriting, tables, math formulas, ID cards, invoices, medical reports, business licenses, and more. Powered by Quark Scan King (夸克扫描王) OCR API.
source: community
author: yescan-ai
githubUrl: https://github.com/yescan-ai/yescan-ocr-qoder
docsUrl: https://scan.quark.cn/business
category: document
tags:
- ocr
- text-recognition
- handwriting
- table
- invoice
- id-card
- medical-report
- formula
- quark-scan
roles:
- developer
- data-analyst
- finance
- legal
- hr
featured: false
popular: false
isOfficial: false
installCommand: |
  git clone https://github.com/yescan-ai/yescan-ocr-qoder
  cp -r yescan-ocr-qoder ~/.qoder/skills/
date: 2026-06-09
---

## Use Cases

- Extract text from handwritten notes, essays, or letters with high accuracy
- Recognize and structure table data from images into machine-readable formats
- Identify and parse Chinese ID cards, social security cards, driver's licenses, vehicle licenses, and travel permits
- Extract key fields from VAT invoices, train tickets, and commercial invoices
- Recognize math formulas and chemical equations, outputting results in LaTeX
- Parse medical reports, lab results, and health check documents from images
- Extract text from product labels, packaging images, and business licenses
- General-purpose text extraction from any image as a fallback scenario

## Core Capabilities

- **Handwritten OCR**: Recognize cursive and messy handwriting from photos
- **Table OCR**: Extract structured table data from images
- **ID & Certificate Recognition**: Supports 8+ document types, including ID cards, social security cards, travel permits, and degree certificates
- **Invoice & Ticket Parsing**: VAT invoices, train tickets, and English commercial invoices
- **Formula Recognition**: Math formulas and chemical equations to LaTeX
- **Question OCR**: Extract exam questions and exercises from photos
- **Medical Report OCR**: Parse lab reports and medical documents
- **General Text Extraction**: Fallback mode for any image containing text

## Supported Scenarios (17 total)

| # | Scenario | Scene ID |
|---|----------|----------|
| 1 | Handwritten documents | `handwritten-ocr` |
| 2 | Tables | `table-ocr` |
| 3 | ID cards | `idcard-ocr` |
| 4 | Social security cards | `social-security-card-ocr` |
| 5 | Travel permits | `travel-permit-ocr` |
| 6 | Degree certificates | `degree-certificate-ocr` |
| 7 | VAT invoices | `vat-invoice-ocr` |
| 8 | Train tickets | `train-ticket-ocr` |
| 9 | Math formulas | `formula-ocr` |
| 10 | Exam questions | `question-ocr` |
| 11 | Driver's licenses | `driver-license-ocr` |
| 12 | Vehicle licenses | `vehicle-license-ocr` |
| 13 | Commercial invoices | `commercial-invoice-ocr` |
| 14 | Medical reports | `medical-report-ocr` |
| 15 | Business licenses | `business-license-ocr` |
| 16 | Product images | `product-image-ocr` |
| 17 | General text | `general-ocr` |
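
The table lists 17 scene IDs, with `general-ocr` described as the fallback. A caller that accepts a scene ID from user input could normalize it as sketched below. How the skill itself selects a scene is not described here, so `resolve_scene` is a hypothetical helper that only illustrates the fallback idea.

```python
from typing import Optional

# Scene IDs copied from the scenario table.
SCENE_IDS = (
    "handwritten-ocr", "table-ocr", "idcard-ocr", "social-security-card-ocr",
    "travel-permit-ocr", "degree-certificate-ocr", "vat-invoice-ocr",
    "train-ticket-ocr", "formula-ocr", "question-ocr", "driver-license-ocr",
    "vehicle-license-ocr", "commercial-invoice-ocr", "medical-report-ocr",
    "business-license-ocr", "product-image-ocr", "general-ocr",
)
FALLBACK_SCENE = "general-ocr"


def resolve_scene(requested: Optional[str]) -> str:
    """Return the requested scene ID if it is supported, else the general fallback."""
    if requested:
        candidate = requested.strip().lower()
        if candidate in SCENE_IDS:
            return candidate
    return FALLBACK_SCENE
```

An unknown or missing scene ID resolves to `general-ocr`, so a request never fails on scene selection alone.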

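## Examples

```
Please recognize the table in this image.
(Attached: a screenshot of a table)
```

```
Turn this photo of my handwritten notes into text.
```

```
Recognize this VAT invoice and extract the invoice number, amount, and tax.
```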

## Setup

This skill requires a Quark Scan King (夸克扫描王) API Key.

1. Visit the [Quark Scan King Developer Portal](https://scan.quark.cn/business) to register and obtain an API Key (choose the AI Agent type).
2. Save the key to `~/.yescan_env`:
   ```bash
   echo 'SCAN_WEBSERVICE_KEY=<your_api_key>' > ~/.yescan_env
   ```
3. Once installed, the skill reads this config automatically on each run; no restart is needed.
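
The key file holds a secret, so it may be worth creating it with owner-only permissions. The sketch below writes the same single line as the `echo` command in step 2 and, like the `>` redirect, replaces any existing content. The permission tightening is a precaution added here, not a documented requirement of the skill, and `write_key_file` is a hypothetical helper.

```python
import os
from pathlib import Path


def write_key_file(api_key: str, env_file: Path) -> None:
    """Write SCAN_WEBSERVICE_KEY=<api_key> to env_file, readable only by the owner."""
    if not api_key or any(ch in api_key for ch in "\r\n"):
        raise ValueError("API key must be a non-empty single line")
    # O_TRUNC mirrors the `>` redirect: existing content is replaced.
    fd = os.open(env_file, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(f"SCAN_WEBSERVICE_KEY={api_key}\n")
    # Also tighten permissions on a file that already existed (POSIX semantics).
    os.chmod(env_file, 0o600)
```

A typical call would be `write_key_file(key, Path.home() / ".yescan_env")`. On Windows the mode bits have limited effect.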

## Notes

- Supported image formats: jpg, jpeg, png, gif, bmp, webp, tiff, wbmp
- Image size limit: 5MB per file
- Each invocation processes a single image; for batch processing, call in a loop
- Requires Python 3.9+
- Images are sent to Quark Scan King servers for recognition; data is not permanently stored
- Video processing and real-time camera streams are not supported
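
Because each invocation handles one image, batch work reduces to a loop that isolates failures per file. The sketch below takes the single-image call as a parameter, since the skill's actual entry point is not documented here; `recognize_batch` and `recognize_one` are hypothetical names.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple


def recognize_batch(
    paths: Iterable[Path],
    recognize_one: Callable[[Path], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Call recognize_one once per image; return (results, errors) keyed by path."""
    results: Dict[str, str] = {}
    errors: Dict[str, str] = {}
    for path in paths:
        try:
            results[str(path)] = recognize_one(path)
        except Exception as exc:  # one bad image should not stop the batch
            errors[str(path)] = str(exc)
    return results, errors
```

Images are processed sequentially. Any rate limits or concurrency rules on the API side are not covered by this sketch.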