Fixes errors in Chinese-language websites crawled with the Teleport software. http://blog.yoqi.me/?p=4081

liuyuqi 889cc9ceb3 Update 'README.md' 6 years ago
.vscode a5c056ac76 Add batch fix for JS errors 6 years ago
src a5c056ac76 Add batch fix for JS errors 6 years ago
.classpath a5c056ac76 Add batch fix for JS errors 6 years ago
.gitignore a5c056ac76 Add batch fix for JS errors 6 years ago
.project a5c056ac76 Add batch fix for JS errors 6 years ago
LICENSE 23ac3f2917 Initial commit 6 years ago
README.md 889cc9ceb3 Update 'README.md' 6 years ago
covert.php e3117066b7 init 6 years ago
js_convert.py a5c056ac76 Add batch fix for JS errors 6 years ago
pom.xml a5c056ac76 Add batch fix for JS errors 6 years ago

README.md

fix-teleport

Fixes errors in Chinese-language websites crawled with the Teleport software.

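| Find | Replace |
| --- | --- |
| `/\*tpa=.*\*/` | (empty) |
| `\btppabs="h[^"]*"` or `tppabs="h[^"]*"` | (empty) |
| `href="javascript:if\(confirm\('htt[^"]*"` | `href=www.xxx.com` |
| `href=" *javascript:if\(confirm\('(htt[^"\s]*).*?"` | `href="$1"` |
| `href="javascript:if\(confirm\([^"]*"` | `href=""` |
| CSS files: | |
| `tpa=http://[^\s]*\.gif` | (empty) |
| `/\*tpa.*?\*/` | (empty) |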
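
The find/replace rules above can also be applied from a script instead of an editor. The following is a minimal sketch in Python, not the code shipped in this repository: `HTML_RULES`, `CSS_RULES` and `apply_rules` are hypothetical names, the `href=www.xxx.com` placeholder rule is left out, and the order of the CSS rules is an assumption.

```python
import re

# (pattern, replacement) pairs transcribed from the find/replace rules.
# Python writes the captured group as \1 where an editor uses $1.
# The parentheses in the last pattern are escaped so that it compiles.
HTML_RULES = [
    (r'/\*tpa=.*\*/', ''),
    (r'\btppabs="h[^"]*"', ''),
    (r'''href=" *javascript:if\(confirm\('(htt[^"\s]*).*?"''', r'href="\1"'),
    (r'href="javascript:if\(confirm\([^"]*"', 'href=""'),
]

# Assumption: the comment rule runs first, so the .gif rule only has to
# handle tpa= fragments that sit outside a /*tpa ... */ comment.
# The dot before "gif" is escaped here to match a literal dot.
CSS_RULES = [
    (r'/\*tpa.*?\*/', ''),
    (r'tpa=http://[^\s]*\.gif', ''),
]


def apply_rules(text, rules):
    """Apply each (pattern, replacement) pair to the text, in order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text
```

Removing a `tppabs` attribute this way leaves the space that preceded it, which is harmless in HTML.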

For garbled Chinese text (mojibake), use this tool:

http://others.yoqi.me/convert.php
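
What the tool does internally is not described here. One common cause of garbled Chinese text is a page saved in GBK and later read as UTF-8. Under that assumption, which may not hold for every crawled site, a minimal Python sketch of re-encoding a file's bytes looks like this (`reencode` is a hypothetical helper, not part of this repository):

```python
def reencode(data: bytes, source: str = "gbk", target: str = "utf-8") -> bytes:
    """Decode bytes from the assumed source encoding and re-encode them.

    Raises UnicodeDecodeError if the data is not valid in `source`,
    which is a hint that the assumed encoding is wrong.
    """
    return data.decode(source).encode(target)
```

If the page declares its encoding in a `<meta>` tag, that declaration would also need to be updated to match the new encoding.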

Update, 2017-12-23

  1. An ordinary regular expression cannot run a second regex match inside the text it has already matched, so js_convert.py was added. It batch-converts links in the project's files from href="javascript:if(confirm(%27http://www.beian.gov.cn/portal/registerSystemInfo?recordcode=31011502004838 \n\nThis file was not retrieved by Teleport Pro, because it is addressed on a domain or path outside the boundaries set for its Starting Address. \n\nDo you want to open it from the server?%27))window.location=%27http://www.beian.gov.cn/portal/registerSystemInfo?recordcode=31011502004838%27" to: href="http://www.beian.gov.cn/portal/registerSystemInfo?recordcode=31011502004838"

  2. The Java project for batch regex replacement therefore needs no further development.
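
The two-stage matching described in item 1 can be sketched as follows. This is an illustration only, not the actual contents of js_convert.py: `HREF_RE`, `TARGET_RE`, `fix_href` and `convert_text` are hypothetical names. An outer pattern finds each Teleport `href`, and a callback then searches inside that match for the real target after `window.location=`.

```python
import re

# Outer pattern: a whole href attribute holding Teleport's confirm() stub.
HREF_RE = re.compile(r'href="\s*javascript:if\(confirm\([^"]*"')
# Inner pattern: the real target URL assigned to window.location.
TARGET_RE = re.compile(r"window\.location='([^']*)'")


def fix_href(match):
    """Rewrite one matched href; leave it untouched if no target is found."""
    # Teleport may write the quotes as %27. Only that escape is decoded,
    # so other percent-escapes inside the URL stay as they are.
    value = match.group(0).replace('%27', "'")
    target = TARGET_RE.search(value)
    if target is None:
        return match.group(0)
    return 'href="{}"'.format(target.group(1))


def convert_text(html):
    """Replace every Teleport confirm() link in the text with a plain link."""
    return HREF_RE.sub(fix_href, html)
```

To process a project in bulk, `convert_text` could be called on the contents of each HTML file in turn and the result written back.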