sutong 750f981c7e feat: init media-center skill

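Media center: collect resource links from multiple channels, save them to Quark Drive (夸克网盘), and organize them into an archive.
- sources/tencent-doc: reading Tencent Docs
- sources/search: cloud-drive search
- storage/quark: Quark Drive operations
- ref/: reference archive of the source skills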

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-05-16 18:28:23 +08:00

1.9 KiB

Tencent Docs: Usage

Reading document content

Extracting the file_id from a URL

URL format: https://docs.qq.com/doc/DR2xUcFdrSVhJTkZu

The DR2xUcFdrSVhJTkZu segment is the file_id.
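
The extraction can also be scripted. The following is a minimal sketch, assuming the file_id is always the last path segment of the URL, as in the example above; the helper name `extract_file_id` is hypothetical and not part of any Tencent Docs tooling.

```python
from urllib.parse import urlparse


def extract_file_id(url: str) -> str:
    """Return the last non-empty path segment of the URL (assumed to be the file_id)."""
    path = urlparse(url).path  # e.g. "/doc/DR2xUcFdrSVhJTkZu"; query string is ignored
    segments = [s for s in path.split("/") if s]
    if not segments:
        raise ValueError(f"no file_id found in URL: {url}")
    return segments[-1]


print(extract_file_id("https://docs.qq.com/doc/DR2xUcFdrSVhJTkZu"))
```

Parsing with `urlparse` rather than splitting the raw string means a trailing query string or fragment does not end up in the file_id.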

Step 1: Determine the document type

mcporter call tencent-docs smartcanvas.read file_id=<FILE_ID> size=10

Fails with the error file is tencentdoc, not smartcanvas → legacy document, go to Step 2A
Returns normal JSON → smartcanvas document, go to Step 2B
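
The branching rule above can be sketched as a small classifier over the captured command output. This assumes the error message appears verbatim somewhere in the captured text (stdout or stderr) and that a successful call prints a single JSON document; the function name `classify_read_output` and the "unknown" fallback are hypothetical additions for illustration.

```python
import json

# Error text quoted from the step above; its exact position in the output is an assumption.
TENCENTDOC_ERROR = "file is tencentdoc, not smartcanvas"


def classify_read_output(output: str) -> str:
    """Map the output of the smartcanvas.read probe to a document type."""
    if TENCENTDOC_ERROR in output:
        return "tencentdoc"  # legacy document: continue with Step 2A
    try:
        json.loads(output)
    except json.JSONDecodeError:
        return "unknown"  # neither the known error nor valid JSON
    return "smartcanvas"  # valid JSON: continue with Step 2B


print(classify_read_output("Error: file is tencentdoc, not smartcanvas"))
print(classify_read_output('{"content": []}'))
```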

Step 2A: tencentdoc type (recommended for large documents)

# Fetch the full document structure
mcporter call tencent-docs doc.resolve_document_structure file_id=<FILE_ID> > doc_raw.json

# Extract plain text
python -X utf8 -c "
import json
with open('doc_raw.json','r',encoding='utf-8') as f:
    data=json.load(f)
texts=[]
for n in data.get('nodes',[]):
    p=n.get('text_preview','')
    hl=n.get('heading_level',0)
    if p:
        texts.append(('#'*hl+' '+p) if hl>0 else p)
with open('doc_content.txt','w',encoding='utf-8') as f:
    f.write('\n'.join(texts))
print(f'Done: {len(texts)} paragraphs')
"

# Clean up the intermediate file (optional)
rm doc_raw.json

Step 2B: smartcanvas type (supports pagination)

# First read
mcporter call tencent-docs smartcanvas.read file_id=<FILE_ID> size=50

# Next page (use the next_token returned by the previous page)
mcporter call tencent-docs smartcanvas.read file_id=<FILE_ID> next_token=<TOKEN> size=50
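
The paging loop can be sketched generically. The sketch below assumes each page is a parsed JSON object with a top-level `next_token` field that is absent or empty on the last page; the actual response layout is not shown here, so treat the field location as an assumption. `iter_pages` and `fetch_page` are hypothetical names, and `fetch_page` stands in for whatever wrapper runs the smartcanvas.read call and parses its output.

```python
from typing import Callable, Iterator, Optional


def iter_pages(
    fetch_page: Callable[[Optional[str]], dict],
    max_pages: int = 1000,
) -> Iterator[dict]:
    """Yield pages until one comes back without a next_token (or max_pages is hit)."""
    token: Optional[str] = None
    for _ in range(max_pages):  # hard cap guards against a token that never ends
        page = fetch_page(token)
        yield page
        token = page.get("next_token")  # assumed top-level field
        if not token:
            return


# Stand-in fetcher with canned pages, used only to exercise the loop.
_fake_pages = {
    None: {"text": "page 1", "next_token": "t1"},
    "t1": {"text": "page 2", "next_token": "t2"},
    "t2": {"text": "page 3"},
}
pages = list(iter_pages(lambda token: _fake_pages[token]))
print([p["text"] for p in pages])
```

Passing the fetcher in as a callable keeps the loop testable without invoking the real command.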

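Searching for keywords to find resource links

# Search the exported text
grep -n "keyword" doc_content.txt

Link format reference:

[普通链接: https://pan.quark.cn/s/xxx]: a Quark share link (普通链接 means "ordinary link")
[腾讯文档链接: https://docs.qq.com/doc/...]: a link to another Tencent Docs document (腾讯文档链接 means "Tencent Docs link")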
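
Pulling the real URLs out of these wrappers can be done with a regular expression. This is a minimal sketch, assuming the two wrapper labels shown above are the only ones of interest and that URLs contain no whitespace or closing bracket; `extract_links` is a hypothetical helper name.

```python
import re

# Matches "[<label>: <url>]" for the two labels listed in the link format reference.
LINK_PATTERN = re.compile(r"\[(普通链接|腾讯文档链接):\s*(https?://[^\]\s]+)\]")


def extract_links(text: str) -> list[tuple[str, str]]:
    """Return (label, url) pairs in order of appearance."""
    return [(m.group(1), m.group(2)) for m in LINK_PATTERN.finditer(text)]


sample = (
    "Movie A [普通链接: https://pan.quark.cn/s/xxx] "
    "see also [腾讯文档链接: https://docs.qq.com/doc/abc]"
)
for label, url in extract_links(sample):
    print(label, url)
```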

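Notes

Very large documents (over 100,000 characters): do not use get_content; it will time out.
Windows encoding: documents containing emoji must be processed with python -X utf8.
Link format: extracted links appear in text_preview wrapped as [普通链接: ...]; use the real URL inside the wrapper directly.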
