From 367affdb7ce3329f482c135b9e86d7d2962c6a43 Mon Sep 17 00:00:00 2001 From: yescan-vision Date: Tue, 9 Jun 2026 17:52:24 +0800 Subject: [PATCH 1/3] =?UTF-8?q?feat:=20=E6=8F=90=E4=BA=A4=E5=A4=B8?= =?UTF-8?q?=E5=85=8B=E6=89=AB=E6=8F=8F=E7=8E=8Bocr=20skills?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- src/content/skills-zh/yescan-ocr-qoder-zh.md | 112 +++++++++++++++++++ src/content/skills/yescan-ocr-qoder.md | 112 +++++++++++++++++++ 2 files changed, 224 insertions(+) create mode 100644 src/content/skills-zh/yescan-ocr-qoder-zh.md create mode 100644 src/content/skills/yescan-ocr-qoder.md diff --git a/src/content/skills-zh/yescan-ocr-qoder-zh.md b/src/content/skills-zh/yescan-ocr-qoder-zh.md new file mode 100644 index 0000000..7746e96 --- /dev/null +++ b/src/content/skills-zh/yescan-ocr-qoder-zh.md @@ -0,0 +1,112 @@ +--- +name: yescan-ocr-qoder +title: 夸克扫描王 OCR - 通用文字识别 +description: 从图片、截图、照片或扫描文档中提取、识别和结构化文本——支持手写体、表格、数学公式、身份证、发票、医疗报告、营业执照等 17 种场景。由夸克扫描王 OCR API 提供支持。 +source: community +author: changri +githubUrl: https://github.com/yescan-ai/yescan-ocr-qoder +docsUrl: https://scan.quark.cn/business +category: document +tags: + - ocr + - 文字识别 + - 手写体 + - 表格 + - 发票 + - 身份证 + - 医疗报告 + - 公式 + - 夸克扫描王 +roles: + - developer + - data-analyst + - finance + - legal + - hr +featured: false +popular: false +isOfficial: false +installCommand: | + git clone https://github.com/yescan-ai/yescan-ocr-qoder + cp -r yescan-ocr-qoder ~/.qoder/skills/ +date: 2026-06-09 +--- + +## 使用场景 + +- 从手写笔记、作文、信件照片中高精度提取文字 +- 将图片中的表格数据识别并结构化为可机读格式 +- 识别和解析身份证、社保卡、驾驶证、行驶证、港澳台通行证等证件 +- 从增值税发票、火车票、英文商业发票中提取关键字段 +- 识别数学公式和化学方程式,输出 LaTeX 格式结果 +- 解析化验单、体检报告等医疗文档图片 +- 提取商品标签、包装图片和营业执照中的文字 +- 通用文字提取,作为兜底场景处理任意含文字的图片 + +## 核心能力 + +- **手写体 OCR**:识别潦草手写文字,准确率高 +- **表格 OCR**:从图片中提取结构化表格数据 +- **证件识别**:支持身份证、社保卡、通行证、学位证等 8+ 种证件类型 +- **票据解析**:增值税发票、火车票、英文商业发票 +- **公式识别**:数学公式和化学方程式转 LaTeX +- **题目 OCR**:从照片中识别考题和习题 +- **医疗报告 OCR**:解析化验单和医疗文档 +- 
**通用文字提取**:兜底模式,处理任意含文字图片 + +## 支持的场景(共 17 种) + +| 序号 | 场景名称 | 场景标识 | +|------|----------|----------| +| 1 | 手写文档识别 | `handwritten-ocr` | +| 2 | 表格识别 | `table-ocr` | +| 3 | 身份证识别 | `idcard-ocr` | +| 4 | 社保卡识别 | `social-security-card-ocr` | +| 5 | 港澳通行证识别 | `travel-permit-ocr` | +| 6 | 学位证识别 | `degree-certificate-ocr` | +| 7 | 增值税发票识别 | `vat-invoice-ocr` | +| 8 | 火车票识别 | `train-ticket-ocr` | +| 9 | 公式识别 | `formula-ocr` | +| 10 | 题目识别 | `question-ocr` | +| 11 | 驾驶证识别 | `driver-license-ocr` | +| 12 | 行驶证识别 | `vehicle-license-ocr` | +| 13 | 英文发票识别 | `commercial-invoice-ocr` | +| 14 | 医疗报告单识别 | `medical-report-ocr` | +| 15 | 营业执照识别 | `business-license-ocr` | +| 16 | 商品图片识别 | `product-image-ocr` | +| 17 | 通用文字提取 | `general-ocr` | + +## 示例 + +``` +请帮我识别这张图片里的表格内容。 +(附带一张表格截图) +``` + +``` +帮我把这张手写笔记的照片转成文字。 +``` + +``` +识别这张增值税发票,提取发票号码、金额和税额。 +``` + +## 配置说明 + +本技能需要夸克扫描王 API Key。 + +1. 访问[夸克扫描王开发者后台](https://scan.quark.cn/business)注册并获取 API Key(选择 AI Agent 接入类型)。 +2. 将密钥写入 `~/.yescan_env`: +```bash +echo 'SCAN_WEBSERVICE_KEY=<你的API密钥>' > ~/.yescan_env +``` +3. 安装技能后每次运行会自动读取配置,无需重启。 + +## 注意事项 + +- 支持的图片格式:jpg、jpeg、png、gif、bmp、webp、tiff、wbmp +- 图片大小限制:单张不超过 5MB +- 每次调用仅处理单张图片,批量处理需循环调用 +- 运行环境需要 Python 3.9+ +- 图片会发送至夸克扫描王服务器进行识别,数据不会被永久保存 +- 不支持视频处理或实时摄像头流 diff --git a/src/content/skills/yescan-ocr-qoder.md b/src/content/skills/yescan-ocr-qoder.md new file mode 100644 index 0000000..a09eaa8 --- /dev/null +++ b/src/content/skills/yescan-ocr-qoder.md @@ -0,0 +1,112 @@ +--- +name: yescan-ocr-qoder +title: Yescan OCR - Universal Text Recognition +description: Extract, recognize, and structure text from images, screenshots, photos, or scanned documents — including handwriting, tables, math formulas, ID cards, invoices, medical reports, business licenses, and more. Powered by Quark Scan King (夸克扫描王) OCR API. 
+source: community +author: changri +githubUrl: https://github.com/yescan-ai/yescan-ocr-qoder +docsUrl: https://scan.quark.cn/business +category: document +tags: + - ocr + - text-recognition + - handwriting + - table + - invoice + - id-card + - medical-report + - formula + - quark-scan +roles: + - developer + - data-analyst + - finance + - legal + - hr +featured: false +popular: false +isOfficial: false +installCommand: | + git clone https://github.com/yescan-ai/yescan-ocr-qoder + cp -r yescan-ocr-qoder ~/.qoder/skills/ +date: 2026-06-09 +--- + +## Use Cases + +- Extract text from handwritten notes, essays, or letters with high accuracy +- Recognize and structure table data from images into machine-readable formats +- Identify and parse Chinese ID cards, social security cards, driver's licenses, vehicle licenses, and travel permits +- Extract key fields from VAT invoices, train tickets, and commercial invoices +- Recognize math formulas and equations, outputting LaTeX-compatible results +- Parse medical reports, lab results, and health check documents from images +- Extract text from product labels, packaging images, and business licenses +- General-purpose text extraction from any image as a fallback scenario + +## Core Capabilities + +- **Handwritten OCR**: Recognize cursive and messy handwriting from photos +- **Table OCR**: Extract structured table data from images +- **ID & Certificate Recognition**: Support 8+ document types including ID cards, social security cards, travel permits, and degree certificates +- **Invoice & Ticket Parsing**: VAT invoices, train tickets, and English commercial invoices +- **Formula Recognition**: Math formulas and chemical equations to LaTeX +- **Question OCR**: Extract exam questions and exercises from photos +- **Medical Report OCR**: Parse lab reports and medical documents +- **General Text Extraction**: Fallback mode for any image containing text + +## Supported Scenarios (17 total) + +| # | Scenario | Scene ID | 
+|---|----------|----------| +| 1 | Handwritten documents | `handwritten-ocr` | +| 2 | Tables | `table-ocr` | +| 3 | ID cards | `idcard-ocr` | +| 4 | Social security cards | `social-security-card-ocr` | +| 5 | Travel permits | `travel-permit-ocr` | +| 6 | Degree certificates | `degree-certificate-ocr` | +| 7 | VAT invoices | `vat-invoice-ocr` | +| 8 | Train tickets | `train-ticket-ocr` | +| 9 | Math formulas | `formula-ocr` | +| 10 | Exam questions | `question-ocr` | +| 11 | Driver's licenses | `driver-license-ocr` | +| 12 | Vehicle licenses | `vehicle-license-ocr` | +| 13 | Commercial invoices | `commercial-invoice-ocr` | +| 14 | Medical reports | `medical-report-ocr` | +| 15 | Business licenses | `business-license-ocr` | +| 16 | Product images | `product-image-ocr` | +| 17 | General text | `general-ocr` | + +## Example + +``` +Please recognize the table in this image. +(attach a screenshot of a table) +``` + +``` +Convert this photo of handwritten notes into text. +``` + +``` +Recognize this VAT invoice and extract the invoice number, amount, and tax amount. +``` + +## Setup + +This skill requires a Quark Scan King (夸克扫描王) API Key. + +1. Visit the [Quark Scan King Developer Portal](https://scan.quark.cn/business) to register and obtain an API Key (select the AI Agent access type). +2. Save the key to `~/.yescan_env`, replacing `YOUR_API_KEY` with your own key: +```bash +echo 'SCAN_WEBSERVICE_KEY=YOUR_API_KEY' > ~/.yescan_env +``` +3. Once installed, the skill reads this file automatically on every run; no restart is needed. 
+ +## Notes + +- Supports image formats: jpg, jpeg, png, gif, bmp, webp, tiff, wbmp +- Image size limit: 5MB per file +- Each invocation processes a single image; for batch processing, call in a loop +- Requires Python 3.9+ +- Images are sent to Quark Scan King servers for recognition; data is not permanently stored +- Not suitable for video processing or real-time camera streams From cf6e6d3b6dc532cac02b4c0832da13583502e4fa Mon Sep 17 00:00:00 2001 From: yescan-vision Date: Tue, 9 Jun 2026 18:21:45 +0800 Subject: [PATCH 2/3] feat: update author --- src/content/skills-zh/yescan-ocr-qoder-zh.md | 2 +- src/content/skills/yescan-ocr-qoder.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/content/skills-zh/yescan-ocr-qoder-zh.md b/src/content/skills-zh/yescan-ocr-qoder-zh.md index 7746e96..9660209 100644 --- a/src/content/skills-zh/yescan-ocr-qoder-zh.md +++ b/src/content/skills-zh/yescan-ocr-qoder-zh.md @@ -3,7 +3,7 @@ name: yescan-ocr-qoder title: 夸克扫描王 OCR - 通用文字识别 description: 从图片、截图、照片或扫描文档中提取、识别和结构化文本——支持手写体、表格、数学公式、身份证、发票、医疗报告、营业执照等 17 种场景。由夸克扫描王 OCR API 提供支持。 source: community -author: changri +author: yescan-ai githubUrl: https://github.com/yescan-ai/yescan-ocr-qoder docsUrl: https://scan.quark.cn/business category: document diff --git a/src/content/skills/yescan-ocr-qoder.md b/src/content/skills/yescan-ocr-qoder.md index a09eaa8..c064f38 100644 --- a/src/content/skills/yescan-ocr-qoder.md +++ b/src/content/skills/yescan-ocr-qoder.md @@ -3,7 +3,7 @@ name: yescan-ocr-qoder title: Yescan OCR - Universal Text Recognition description: Extract, recognize, and structure text from images, screenshots, photos, or scanned documents — including handwriting, tables, math formulas, ID cards, invoices, medical reports, business licenses, and more. Powered by Quark Scan King (夸克扫描王) OCR API. 
source: community -author: changri +author: yescan-ai githubUrl: https://github.com/yescan-ai/yescan-ocr-qoder docsUrl: https://scan.quark.cn/business category: document From a395ff7a95972a66a163bca43df6701b1ecc4695 Mon Sep 17 00:00:00 2001 From: yescan-vision Date: Wed, 10 Jun 2026 10:11:14 +0800 Subject: [PATCH 3/3] =?UTF-8?q?feat:=20=E6=94=AF=E6=8C=81=E6=89=AB?= =?UTF-8?q?=E6=8F=8F=E7=8E=8B=E6=96=87=E4=BB=B6=E8=BD=AC=E6=8D=A2=20skills?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- .../skills-zh/yescan-office-qoder-zh.md | 94 +++++++++++++++++++ src/content/skills/yescan-office-qoder.md | 94 +++++++++++++++++++ 2 files changed, 188 insertions(+) create mode 100644 src/content/skills-zh/yescan-office-qoder-zh.md create mode 100644 src/content/skills/yescan-office-qoder.md diff --git a/src/content/skills-zh/yescan-office-qoder-zh.md b/src/content/skills-zh/yescan-office-qoder-zh.md new file mode 100644 index 0000000..f89ffa6 --- /dev/null +++ b/src/content/skills-zh/yescan-office-qoder-zh.md @@ -0,0 +1,94 @@ +--- +name: yescan-office-qoder +title: Yescan Office - 图片转 Office/PDF 文档 +description: 将图片、截图或扫描件转换为可编辑的 Office 文档(Word/Excel)或 PDF。支持复杂表格、合同、图文混排内容的版式还原。由夸克扫描王转换 API 提供支持。 +source: community +author: changri +githubUrl: https://github.com/yescan-ai/yescan-office-qoder +docsUrl: https://scan.quark.cn/business +category: document +tags: + - ocr + - 图片转Word + - 图片转Excel + - 图片转PDF + - 格式转换 + - 表格还原 + - 夸克扫描王 +roles: + - developer + - data-analyst + - finance + - legal + - hr + - executive +featured: false +popular: false +isOfficial: false +installCommand: | + git clone https://github.com/yescan-ai/yescan-office-qoder + cp -r yescan-office-qoder ~/.qoder/skills/ +date: 2026-06-10 +--- + +## 使用场景 + +- 将包含表格的财务报表截图转换为可编辑的 Excel 文件 +- 将手写库存记录照片转换为结构化的 Excel 表格 +- 将会议记录拍照图片转换为 Word 文档 +- 将产品说明书截图转换为 .docx 格式,保留原始版式 +- 将手写课堂笔记图片转换为 PDF 文档存档 +- 将合同照片处理为清晰的 PDF 文件,便于存档和分发 +- 将白板草图转换为版式整洁的 PDF 文档 + +## 核心能力 + +- **图片转 
Excel**:识别图片中的表格数据,转换为 .xlsx 文件,保留行列结构和数据关系 +- **图片转 Word**:还原图片中的图文排版,生成可编辑的 .docx 文档,支持长文本和多段落内容 +- **图片转 PDF**:将图片转换为版式规范的 PDF 文档,适合存档和打印 +- **版式还原**:尽量还原原始图片中的排版、表格边框、字体大小等视觉元素 +- **自动保存**:转换后的文件自动保存到本地临时目录,返回文件路径可直接使用 + +## 支持的场景(共 3 种) + +| 序号 | 场景名称 | 场景标识 | 输出格式 | +|------|----------|----------|----------| +| 1 | 图片转 Excel | `image-to-excel` | .xlsx | +| 2 | 图片转 Word | `image-to-word` | .docx | +| 3 | 图片转 PDF | `image-to-pdf` | .pdf | + +## 示例 + +``` +帮我把这张财务报表截图转换成 Excel 文件。 +(附带一张含表格的截图) +``` + +``` +把这张会议记录的拍照图片转成 Word 文档。 +``` + +``` +请将这张设备铭牌照片转换为 PDF 格式存档。 +``` + +## 配置说明 + +本技能需要夸克扫描王 API Key。 + +1. 访问[夸克扫描王开发者后台](https://scan.quark.cn/business)注册并获取 API Key(选择 AI Agent 接入类型)。 +2. 将密钥写入 `~/.yescan_env`: +```bash +echo 'SCAN_WEBSERVICE_KEY=<你的API密钥>' > ~/.yescan_env +``` +3. 安装技能后每次运行会自动读取配置,无需重启。 + +## 注意事项 + +- 支持的图片格式:jpg、jpeg、png、gif、bmp、webp、tiff、wbmp +- 图片大小限制:单张不超过 5MB +- 每次调用仅处理单张图片,批量处理需循环调用 +- 运行环境需要 Python 3.9+ +- 转换后的文件保存至系统临时目录(如 `/tmp`),需自行管理清理 +- 图片会发送至夸克扫描王服务器进行转换,数据不会被永久保存 +- 不支持视频处理或实时摄像头流 diff --git a/src/content/skills/yescan-office-qoder.md b/src/content/skills/yescan-office-qoder.md new file mode 100644 index 0000000..c1891f6 --- /dev/null +++ b/src/content/skills/yescan-office-qoder.md @@ -0,0 +1,94 @@ +--- +name: yescan-office-qoder +title: Yescan Office - Image to Office/PDF Conversion +description: Convert images, screenshots, or scanned documents into editable Office documents (Word/Excel) or PDF. Preserves layout of complex tables, contracts, and mixed text-image content. Powered by Quark Scan King conversion API. 
+source: community +author: changri +githubUrl: https://github.com/yescan-ai/yescan-office-qoder +docsUrl: https://scan.quark.cn/business +category: document +tags: + - ocr + - image-to-word + - image-to-excel + - image-to-pdf + - format-conversion + - table-restoration + - quark-scan +roles: + - developer + - data-analyst + - finance + - legal + - hr + - executive +featured: false +popular: false +isOfficial: false +installCommand: | + git clone https://github.com/yescan-ai/yescan-office-qoder + cp -r yescan-office-qoder ~/.qoder/skills/ +date: 2026-06-10 +--- + +## Use Cases + +- Convert financial report screenshots containing tables into editable Excel files +- Transform handwritten inventory record photos into structured Excel spreadsheets +- Convert meeting notes photos into Word documents +- Transform product manual screenshots into .docx format with original layout preserved +- Convert handwritten classroom note images into PDF documents for archiving +- Process contract photos into clean PDF files for storage and distribution +- Convert whiteboard sketches into well-formatted PDF documents + +## Core Capabilities + +- **Image to Excel**: Recognize table data in images and convert to .xlsx files, preserving row/column structure and data relationships +- **Image to Word**: Restore text and image layout from images, generating editable .docx documents with support for long text and multi-paragraph content +- **Image to PDF**: Convert images into well-formatted PDF documents suitable for archiving and printing +- **Layout Restoration**: Preserve original visual elements including table borders, font sizes, and text formatting from the source image +- **Auto-save**: Converted files are automatically saved to the local temporary directory, returning a file path ready for immediate use + +## Supported Scenarios (3 total) + +| # | Scenario | Scene ID | Output Format | +|---|----------|----------|---------------| +| 1 | Image to Excel | `image-to-excel` | .xlsx | +| 
2 | Image to Word | `image-to-word` | .docx | +| 3 | Image to PDF | `image-to-pdf` | .pdf | + +## Example + +``` +Please convert this financial report screenshot into an Excel file. +(attach a screenshot containing a table) +``` + +``` +Convert this photo of meeting notes into a Word document. +``` + +``` +Please convert this equipment nameplate photo into PDF format for archiving. +``` + +## Setup + +This skill requires a Quark Scan King (夸克扫描王) API Key. + +1. Visit the [Quark Scan King Developer Portal](https://scan.quark.cn/business) to register and obtain an API Key (select the AI Agent access type). +2. Save the key to `~/.yescan_env`, replacing `YOUR_API_KEY` with your own key: +```bash +echo 'SCAN_WEBSERVICE_KEY=YOUR_API_KEY' > ~/.yescan_env +``` +3. Once installed, the skill reads this file automatically on every run; no restart is needed. + +## Notes + +- Supports image formats: jpg, jpeg, png, gif, bmp, webp, tiff, wbmp +- Image size limit: 5MB per file +- Each invocation processes a single image; for batch processing, call in a loop +- Requires Python 3.9+ +- Converted files are saved to the system temporary directory (e.g., `/tmp`); manage cleanup manually +- Images are sent to Quark Scan King servers for conversion; data is not permanently stored +- Not suitable for video processing or real-time camera streams
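
Review note: the Setup and Notes sections added by this series describe a key file at `~/.yescan_env`, a fixed list of image formats, and a 5MB per-image limit. A possible pre-flight check for those constraints is sketched below in Python, since both skills state they require Python 3.9+. Nothing here is taken from either skill's source: `read_api_key` and `check_image` are hypothetical helper names, the `KEY=VALUE` parsing is an assumption about how the file is read, and treating 5MB as 5 × 1024 × 1024 bytes is a guess at the exact limit.

```python
from pathlib import Path
from typing import Optional

# Limits copied from the Notes sections of the added skill pages.
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif", "bmp", "webp", "tiff", "wbmp"}
MAX_IMAGE_BYTES = 5 * 1024 * 1024  # assumption: "5MB" is read as 5 MiB


def read_api_key(env_path: Path) -> Optional[str]:
    """Return SCAN_WEBSERVICE_KEY from a KEY=VALUE file, or None if missing or empty."""
    if not env_path.is_file():
        return None
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == "SCAN_WEBSERVICE_KEY":
            value = value.strip().strip("'\"")
            return value or None
    return None


def check_image(image_path: Path) -> list[str]:
    """Return a list of problems; an empty list means the documented limits are met."""
    if not image_path.is_file():
        return [f"{image_path}: not a file"]
    problems = []
    extension = image_path.suffix.lower().lstrip(".")
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"{image_path}: unsupported format '{extension}'")
    size = image_path.stat().st_size
    if size > MAX_IMAGE_BYTES:
        problems.append(f"{image_path}: {size} bytes exceeds the 5MB limit")
    return problems


if __name__ == "__main__":
    import sys

    if read_api_key(Path.home() / ".yescan_env") is None:
        print("SCAN_WEBSERVICE_KEY is missing or empty in ~/.yescan_env")
    # One check per image, mirroring the one-image-per-invocation note.
    for argument in sys.argv[1:]:
        for problem in check_image(Path(argument)):
            print(problem)
```

Running `check_image` over each file in a loop is one way to filter a batch before invoking a skill once per image, as the Notes suggest. An empty key is reported as missing, so a key file written without a value is caught early.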