[Text Processing] [Solved] How can a batch script extract the tag values from a web page's text?
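Posted 2021-6-1 18:10:52
Hello, and thanks in advance. The text is shown below.
Could someone use a regular expression together with a bat script to find the XXX part of every "tag":"XXX" in the text below?
For the content below, for example, the resulting text file should be:

オリジナル
イラストレーション
背景
風景3000users入り
女の子
風景
……

Thank you very much!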
master/img/2019/06/17/13/29/28/75269339_p0_master1200.jpg","regular":"https://i.pximg.net/img-master/img/2019/06/17/13/29/28/75269339_p0_master1200.jpg","original":"https://i.pximg.net/img-original/img/2019/06/17/13/29/28/75269339_p0.png"},"tags":{"authorId":"30486331","isLocked":false,"tags":[{"tag":"オリジナル","locked":true,"deletable":false,"userId":"30486331","translation":{"en":"原创"},"userName":"ヨムナシ"},{"tag":"背景","locked":true,"deletable":false,"userId":"30486331","translation":{"en":"background"},"userName":"ヨムナシ"},{"tag":"風景","locked":true,"deletable":false,"userId":"30486331","translation":{"en":"风景"},"userName":"ヨムナシ"},{"tag":"イラストレーション","locked":true,"deletable":false,"userId":"30486331","translation":{"en":"illustration"},"userName":"ヨムナシ"},{"tag":"女の子","locked":true,"deletable":false,"userId":"30486331","translation":{"en":"女孩子"},"userName":"ヨムナシ"},{"tag":"ここに行きたい","locked":false,"deletable":false,"translation":{"en":"好想去这里"}},{"tag":"風景3000users入り","locked":false,"deletable":false,"translation":{"en":"scenery 3000+ bookmarks"}},{"tag":"オリジナル3000users入り","locked":false,"deletable":false,"translation":{"en":"原创3000收藏"}}],"writable":false},"alt":"#オリジナル 逆上がりの世界 - ヨムナシ的插画","storableTags":["RTJMXD26Ak","jhuUT0OJva","uusOs0ipBx","J6HRrOvKcm","Lt-oEicbBr","LpjxMAWKke","t5wuY96p0s","YRDwjaiLZn"],"userId":"30486331","userName":"ヨムナシ","userAccount":"yomunashi333","userIllusts":{"85863283":null,"85713805":null,"85713731":null,"77296333":null,"76298917":null,"75269561":null,"75269463":{"id":"75269463","title":"茄子きらいだからあげるねっ","illustType":0,"xRestrict":0,"restrict":0,"sl":2,"url":"https://i.pximg.net/c/250x250_80_a2/img-master/img/2019/06/17/13/46/39/75269463_p0_square1200.jpg","description":"","tags":["オリジナル","背景","イラストレーション","女の子","パフェ","オリジナル100users入

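Posted 2021-6-1 21:14:43
This uses sed with plain text substitutions rather than a single regular expression, and it only targets the text above. The code is simple and can be adjusted as needed.
    rem Work on a copy so that 35.txt itself is left untouched
    copy /y 35.txt 原文.txt
    rem Start a marked new line before each tag value and cut the line right after it
    sed -i "s/{\"tag\":/\nmmmmmm/g;s/,\"locked/\n/g" 原文.txt
    rem Keep only the marked lines
    findstr /i "mmmmmm" 原文.txt>>结果.txt
    rem Strip the marker and the double quotes
    sed -i "s/mmmmmm//g;s/\"//g" 结果.txt
Key syntax: s/old/new/g replaces every occurrence of old with new. A backslash makes the character right after it be treated as a literal character (as in \" for a double quote), and \n stands for a newline.
The sed used by this code can be downloaded from http://bcn.bathome.net/s/tool/index.html?down&key=sed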
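The marker-and-split idea behind the sed commands above can also be tried on a short sample without temporary files. The following is a minimal sketch for a POSIX shell, assuming GNU sed (it relies on \n in the replacement text). The function name extract_tags and the shortened sample line are hypothetical and do not come from the thread.

```shell
#!/bin/sh
# Hypothetical helper: read JSON-like text on stdin and print one tag value per line.
# Assumes GNU sed, because \n in the replacement part is a GNU extension.
extract_tags() {
    # 1) start a new line, prefixed with the marker, at every {"tag": key
    # 2) break the line again where ,"locked begins
    # 3) keep only the marked lines (the findstr step of the batch version)
    # 4) strip the marker and the double quotes
    sed -e 's/{"tag":/\nmmmmmm/g' -e 's/,"locked/\n/g' |
        sed -n '/mmmmmm/p' |
        sed -e 's/mmmmmm//g' -e 's/"//g'
}

# Shortened, made-up sample in the same shape as the text in the question.
sample='"tags":[{"tag":"背景","locked":true},{"tag":"風景","locked":false}]'
printf '%s\n' "$sample" | extract_tags
```

With the sample shown, this should print 背景 and 風景 on separate lines. Like the batch version, it depends on every tag value being followed directly by ,"locked, so it is tied to this particular input.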
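Posted 2021-6-1 21:24:01
This relies on sed for the extraction, using plain text substitutions rather than a single regular expression, and it only targets the text above. The statements are simple, so the behaviour can be modified freely.
    rem Work on a copy so that 35.txt itself is left untouched
    copy /y 35.txt 原文.txt
    rem Start a marked new line before each tag value and cut the line right after it
    sed -i "s/{\"tag\":/\nmmmmmm/g;s/,\"locked/\n/g" 原文.txt
    rem Keep only the marked lines
    findstr /i "mmmmmm" 原文.txt>>结果.txt
    rem Strip the marker and the double quotes
    sed -i "s/mmmmmm//g;s/\"//g" 结果.txt
    pause
Syntax notes: s/old/new/g performs a substitution. A backslash makes the character right after it be treated as a literal character, and \n stands for a newline.
The sed involved can be downloaded from http://bcn.bathome.net/s/tool/index.html?down&key=sed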

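Posted 2021-6-2 00:17:56
With PowerShell:
    # Collect every match in the file, then print each matched value
    $m=select-string -path file '(?<="tag":")[^"]*' -AllMatches
    foreach( $a in $m.matches )
    {
        $a.value
    }
With grep (\x22 is the PCRE escape for a double quote, which avoids nested quotes on the command line; the lookbehind also consumes the opening quote, so the values are printed without quotes):
    grep -P -o "(?<=\x22tag\x22:\x22)[^\x22]*" file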
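The lookbehind used in the commands above needs a PCRE-capable tool, and some grep builds (the one shipped with macOS, for example) do not support -P. In that case the same values can be obtained by matching the whole "tag":"..." pair and trimming it afterwards. This is a minimal sketch, assuming a POSIX shell with grep -o available; the function name extract_tags_regex and the sample line are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of every "tag":"..." pair found on stdin.
extract_tags_regex() {
    # grep -o prints each match on its own line, in the form "tag":"VALUE";
    # sed then removes the leading "tag":" and the trailing quote.
    grep -o '"tag":"[^"]*"' |
        sed -e 's/^"tag":"//' -e 's/"$//'
}

# Made-up sample: two tag objects followed by a plain "tags" array,
# which must not be matched because its key is "tags", not "tag".
sample='[{"tag":"女の子","locked":true},{"tag":"風景3000users入り","locked":false}],"tags":["オリジナル"]'
printf '%s\n' "$sample" | extract_tags_regex
```

With the sample shown, this should print 女の子 and 風景3000users入り on separate lines. Unlike the marker-based sed approach, it does not depend on which key follows the tag value.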