One-click export of Yuque documents to Markdown
Copy the document URL, then run the following command:
python main.py markdown -url https://www.yuque.com/burpheart/phpaudit
wget https://fileshare.yoqi.me/d/dl/c/Python/crawl_yuque/crawl_yuque
chmod +x crawl_yuque
./crawl_yuque markdown -url https://www.yuque.com/burpheart/phpaudit
For private documents, configure the .env file: copy the cookie from Chrome and paste it in. Any project you can see while logged in can be exported.
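As a rough illustration of the cookie setup described above, the sketch below parses .env-style text and builds request headers from it. The key name `YUQUE_COOKIE` and the helper names `parse_env_text` and `build_headers` are hypothetical and not taken from this project; check .env.example for the actual variable name.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, since cookie strings contain "=" too.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def build_headers(env, cookie_key="YUQUE_COOKIE"):
    """Build HTTP headers, attaching the cookie when one is configured.

    cookie_key is a hypothetical name used only for this sketch.
    """
    headers = {"User-Agent": "Mozilla/5.0"}
    cookie = env.get(cookie_key)
    if cookie:
        headers["Cookie"] = cookie
    return headers
```

The resulting dictionary could then be passed as the `headers` argument of `requests.get`, so that the request is made with the same login state as the browser.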
When main.py runs, it reads the url argument, fetches the page source with requests, and looks for the following snippet in it:
<script nonce=wJM6HFxGFWlvqbg5UT1h>
(function() {
window.appData = JSON.parse(decodeURIComponent("%7B%22me%22%3A%7B%xxxx7D"));
})();
</script>
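As shown, Yuque stores the content in window.appData as a URL-encoded JSON string. URL-decoding it and parsing the result as JSON yields all of the article content.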
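The extraction step can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the function name `extract_app_data` and the regular expression are assumptions, and it assumes the payload is the double-quoted argument of `decodeURIComponent(...)` as in the snippet above.

```python
import json
import re
from urllib.parse import unquote

# Matches the URL-encoded string passed to decodeURIComponent("...").
APP_DATA_RE = re.compile(r'decodeURIComponent\("([^"]*)"\)')


def extract_app_data(html):
    """Return window.appData as a Python object, given the page source."""
    match = APP_DATA_RE.search(html)
    if match is None:
        raise ValueError("window.appData payload not found in page source")
    # unquote() mirrors JavaScript's decodeURIComponent for %XX sequences.
    return json.loads(unquote(match.group(1)))
```

In practice `html` would be the text of the response returned by requests for the document URL. The layout of the decoded object is not described here, so inspect its keys before relying on any particular field.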
Licensed under the Apache 2.0 © liuyuqi.gov@msn.cn
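Scraping tools written in other languages, such as PHP and Node, already exist. This project mainly serves my own needs: exporting Markdown files so that content can be synced across multiple platforms.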