
[Text Processing] [Solved] How can a batch script extract resource links from a web page's source code?

Last edited by impk on 2019-8-4 12:51

https://www.manhuadb.com
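How can I extract the image resource links from this page's source code?
I want to filter out everything from "http" through "jpg" and write it to a .txt file.
How many ways are there to do this, and which commands are needed?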
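
As a rough illustration of the "everything from http through jpg" idea, here is a minimal JavaScript sketch. It is not code from any reply in this thread. It assumes the page source is already held in a string; the function name extractJpgLinks and the sample HTML are made up for the example.

```javascript
// Hypothetical helper: collect absolute http(s) URLs that end in .jpg/.jpeg.
function extractJpgLinks(html) {
  // [^\s"'<>]+? keeps the match inside one attribute value, so a URL that
  // does not end in .jpg cannot run on into a later one.
  const pattern = /https?:\/\/[^\s"'<>]+?\.jpe?g/gi;
  return html.match(pattern) || [];
}

// Made-up sample input, not taken from the real site.
const sampleHtml =
  '<a href="https://example.com/page">' +
  '<img src="https://example.com/img/one.jpg"></a>' +
  "<img src='http://example.com/two.JPG'>";

// One URL per line, ready to be written to a .txt file.
const linkText = extractJpgLinks(sampleHtml).join('\r\n');
console.log(linkText);
```

This only finds absolute URLs; paths such as src="/cartoon/x.jpg" would need to be resolved against the site address first.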

Last edited by zaqmlp on 2019-7-26 12:03
    @echo off
    set info=Mutual help and mutual benefit; scan the Alipay QR code on my avatar; thanks for sponsoring
    rem If you run into problems, contact QQ956535081 for prompt help
    title %info%
    cd /d "%~dp0"
    powershell -NoProfile -ExecutionPolicy bypass ^
        [System.Collections.ArrayList]$s=@();^
        $url='https://www.manhuadb.com/';^
        $web=New-Object System.Net.WebClient;^
        $web.Encoding=[System.Text.Encoding]::UTF8;^
        $html=$web.DownloadString($url);^
        $m=[regex]::matches($html,'(?^<=src=\"").+?\.jpg');^
        if($m.count -ge 1){^
            foreach($item in $m){[void]$s.Add($item.value);};^
            [IO.File]::WriteAllLines('结果.txt', $s, [Text.Encoding]::Default);^
        };
    echo;%info%
    pause
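
A side note on the pattern used in that script, sketched in JavaScript rather than PowerShell: a lazy ".+?" after the src=" lookbehind can run across attribute boundaries when the first src on the page is not a .jpg. The input string below is made up, and the "no quote" variant is only one possible tightening, not the poster's code.

```javascript
// Made-up input: the first image is a PNG, the second a JPG.
const html = '<img src="a.png"><img src="b.jpg">';

// Same shape as the pattern in the script above: lazy "any character".
const lazyAny = /(?<=src=").+?\.jpg/g;
// Variant whose match may not cross a double quote.
const noQuote = /(?<=src=")[^"]+\.jpg/g;

const lazyHits = html.match(lazyAny) || [];
const strictHits = html.match(noQuote) || [];

console.log(lazyHits);   // one hit that starts at a.png and ends at b.jpg
console.log(strictHits); // one hit: b.jpg
```

On a page where every src before the first .jpg is itself a .jpg, both patterns give the same result, which may be why the difference does not show up on the site in question.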

TOP

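Last edited by impk on 2019-7-22 17:54

Reply to 2# zaqmlp

After saving it as a .bat file and running it, I get: 'powershell' is not recognized as an internal or external command, operable program or batch file.

Does this script need an additional runtime environment installed? I'm on Windows XP (SP3).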

TOP

Reply to 3# impk


    Hardly anyone uses WinXP any more. If you don't mind the trouble, you could try installing PowerShell.
https://www.microsoft.com/zh-CN/download/details.aspx?id=16818

TOP

Last edited by WHY on 2019-7-26 11:28
    var txt = getText('https://www.manhuadb.com');
    var arr = [], m, map = {};
    // Matches src="https://...jpg" or src="/...jpg"
    var reg = /src="((?:https?:\/)?\/[^"]+\.jpe?g)"/ig;
    while ( m = reg.exec(txt) ) {
        // Turn a leading "/" into an absolute URL on the site
        var s = m[1].toLowerCase().replace(/^\//, 'https://www.manhuadb.com/');
        if ( !map[s] ) {  // skip duplicates
            arr.push(s); map[s] = 1;
        }
    }
    writeToFile(arr);
    // Fetch the page and decode the response body as UTF-8
    function getText(url) {
        var http = new ActiveXObject('Microsoft.XMLHTTP');
        http.open('GET', url, false);
        http.send();
        with ( new ActiveXObject('ADODB.Stream') ) {
            Type = 1;     // binary
            Mode = 3;     // read/write
            Open();
            Write(http.responseBody);
            Position = 0;
            Type = 2;     // text
            Charset = 'UTF-8';
            var str = ReadText(-1);
            Close();
        }
        return str;
    }
    // Write the results to a text file, one URL per line
    function writeToFile(arr) {
        var fso  = new ActiveXObject('Scripting.FileSystemObject');
        var file = fso.OpenTextFile('result.Log', 2, true);  // 2 = ForWriting, true = create if missing
        file.WriteLine(arr.join('\r\n'));
        file.Close();
    }
    WSH.Echo('Done');
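
The de-duplication and relative-path handling in the script above can also be sketched in plain JavaScript for a runtime that has the standard URL class (Node.js or a browser; Windows Script Host JScript does not provide it). This is a sketch under that assumption, not a drop-in replacement: the function name normalizeAndDedupe and the example.com paths are hypothetical, and unlike the script above it does not lowercase the URLs.

```javascript
// Hypothetical helper: resolve each src against the page URL and drop repeats.
function normalizeAndDedupe(srcs, baseUrl) {
  const seen = new Set();
  const result = [];
  for (const src of srcs) {
    // new URL(relative, base) handles "/path", "//host/path" and full URLs.
    const absolute = new URL(src, baseUrl).href;
    if (!seen.has(absolute)) {
      seen.add(absolute);
      result.push(absolute);
    }
  }
  return result;
}

// Made-up src values covering the three forms.
const found = normalizeAndDedupe(
  [
    '/cartoon/a.jpg',
    'https://www.example.com/cartoon/a.jpg',
    '//media.example.com/cartoon/b.jpg',
  ],
  'https://www.example.com/'
);
console.log(found.join('\r\n'));
```

The first two inputs resolve to the same address, so only two URLs are printed.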

TOP

Reply to 5# WHY

What kind of script is this?

TOP

Reply to 6# netdzb
vbs

TOP

OK, I've revised it once more, because some image URLs were being missed.
Save it as Test.JS

TOP

Reply to 5# WHY


    This script is meant to be saved as a .vbs file and then run, right? It fails for me with [Line 1, Char 19, Syntax error, Code 800A03EA]

TOP

Reply to 9# impk


    You've got sharp eyes: the poster clearly said to save it as .js, and you saved it as .vbs.

TOP

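Quote (xczxczxcz, posted 2019-7-28 18:34, replying to impk):
    You've got sharp eyes: the poster clearly said to save it as .js, and you saved it as .vbs.

Saving it as .js fails too, with [Line 18, Char 5, The system cannot locate the resource specified, Code 800C0005]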

TOP

    https://media.manhuadb.com/cartoon/1488_title_paniwcbr.jpg
    https://media.manhuadb.com/cartoon/6247_title_qirqiyix.jpg
    https://media.manhuadb.com/cartoon/143_title_hjecxxeh.jpg
    https://media.manhuadb.com/cartoon/1585_title_lkveswyl_720x405.jpg
    https://media.manhuadb.com/cartoon/1185_title_gktsajut.jpg
    https://media.manhuadb.com/cartoon/1167_title_vfowmmsg.jpg
    https://media.manhuadb.com/cartoon/7797_cover_qnolrmmf.jpg
    https://media.manhuadb.com/cartoon/7796_cover_razazdtf.jpg
    https://media.manhuadb.com/cartoon/7795_cover_lobmedaj.jpg
    https://media.manhuadb.com/cartoon/7794_cover_zwzldgxn.jpg
    https://media.manhuadb.com/cartoon/7793_cover_cyofbpdm.jpg
    https://media.manhuadb.com/cartoon/7792_cover_yrfkzest.jpg
    https://media.manhuadb.com/cartoon/7791_cover_yzpvzsnb.jpg
    https://media.manhuadb.com/cartoon/7790_cover_umsatbcm.jpg
    https://media.manhuadb.com/cartoon/7789_cover_gmbubydc.jpg
    https://media.manhuadb.com/cartoon/7788_cover_wvlppvip.jpg
    https://media.manhuadb.com/cartoon/7787_cover_gcjtftpt.jpg
    https://media.manhuadb.com/cartoon/7786_cover_cipstctc.jpg
    https://www.manhuadb.com/cartoon/139_title_eqymyphu.jpg
    https://www.manhuadb.com/cartoon/162_cover_glhxiyir.jpg
    https://media.manhuadb.com/cartoon/1466_cover_cawzjzvo_250x362.jpg
    https://media.manhuadb.com/cartoon/_cover_uczonnez.jpg
    https://media.manhuadb.com/cartoon/1518_title_fipqdtpt.jpg
    https://www.manhuadb.com/press/296_1_ycygyayd_thumb.jpg
    https://media.manhuadb.com/cartoon/3145_cover_jbmhtazk.jpg
    https://media.manhuadb.com/cartoon/_cover_djxwbobi.jpg
    https://www.manhuadb.com/cartoon/103_cover_dnahrshe.jpg
    https://www.manhuadb.com/cartoon/1061_title_rmzbrgjr.jpg
    https://www.manhuadb.com/cartoon/147_cover_iegknrqv.jpg
    https://www.manhuadb.com/cartoon/138_cover_pgojimpj.jpg
    https://www.manhuadb.com/cartoon/114_cover_ivqpicbz.jpg
    https://www.manhuadb.com/cartoon/236_cover_raumwyvs.jpg
    https://www.manhuadb.com/press/261_1_hberznkx_thumb.jpg
    https://media.manhuadb.com/cartoon/_cover_nhkpnyxt.jpg
    https://media.manhuadb.com/cartoon/1520_cover_ovlvzpem.jpg
    https://media.manhuadb.com/cartoon/6603_cover_wqldmvru.jpg
    https://media.manhuadb.com/cartoon/2060_cover_nodusfkj.jpg
    https://media.manhuadb.com/cartoon/2584_cover_qhsomnay.jpg
    https://media.manhuadb.com/cartoon/7746_cover_rcsbywsk.jpg
    https://media.manhuadb.com/cartoon/7666_cover_lcdqerfk.jpg
    https://media.manhuadb.com/cartoon/7165_cover_zrkpddfr.jpg
    https://media.manhuadb.com/cartoon/6474_cover_ixqeakrk.jpg
    https://media.manhuadb.com/cartoon/2971_cover_uforygug.jpg
    https://www.manhuadb.com/cartoon/1203_cover_fubjqdgw.jpg
    https://www.manhuadb.com/cartoon/181_cover_pgmtlitq.jpg
    https://media.manhuadb.com/cartoon/4248_cover_hrdninkt.jpg
    https://media.manhuadb.com/cartoon/6450_cover_vosbgtlb.jpg
    https://media.manhuadb.com/cartoon/5376_cover_sdtjnmwv.jpg
    https://media.manhuadb.com/cartoon/5983_cover_rczkutnm.jpg
    https://media.manhuadb.com/cartoon/6646_cover_ksewiaib.jpg
    https://media.manhuadb.com/cartoon/3876_cover_ucfwkywt.jpg
    https://media.manhuadb.com/cartoon/5025_cover_kghatein.jpg
    https://media.manhuadb.com/cartoon/7471_cover_xiqvvswv.jpg
    https://media.manhuadb.com/cartoon/3772_cover_hcrrfnci.jpg
    https://media.manhuadb.com/cartoon/7154_cover_soyukzbg.jpg
    https://media.manhuadb.com/cartoon/1482_cover_eavxecdn.jpg
    https://media.manhuadb.com/cartoon/1584_cover_ngubnkzy.jpg
    https://media.manhuadb.com/cartoon/1588_cover_hszsmktf.jpg
    https://media.manhuadb.com/cartoon/1635_cover_wddvozfb.jpg
    https://media.manhuadb.com/cartoon/1817_cover_qpsbuivc.jpg
    https://media.manhuadb.com/cartoon/1890_cover_manrlmkg.jpg
    https://media.manhuadb.com/cartoon/2073_cover_hzwayfnw.jpg
    https://media.manhuadb.com/cartoon/2500_cover_zksojfap.jpg
    https://media.manhuadb.com/cartoon/2515_cover_fgnebxdd.jpg

TOP

Mojolicious
    use Modern::Perl;
    use Mojo::UserAgent;
    my $ua = Mojo::UserAgent->new();
    # Fetch the page and parse it into a DOM tree
    my $dom = $ua->get("https://www.manhuadb.com/")->result->dom;
    # Only <img> elements that actually carry a src attribute
    for my $e ( $dom->find("img[src]")->each ) {
        my $src = $e->attr("src");
        say $src if $src =~ /\.jpe?g$/i;
    }

TOP

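Reply to 11# impk


    When you hit a problem like this, you should ask a search engine first. There is nothing wrong with the script, so don't pin the blame on it. If you can't sort it out yourself, don't use it.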

TOP

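Reply to 14# WHY

Quote:
    When you hit a problem like this, you should ask a search engine first. There is nothing wrong with the script, so don't pin the blame on it. If you can't sort it out yourself, don't use it.

Nonsense. Do I need you to teach me to search first? How do you know I didn't?
The script fails on my machine and I'm just reporting that honestly. What does that have to do with shifting blame?
If you're so capable, then don't bother with newbies like us. Listening to you is unpleasant enough as it is.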

TOP
