
[Text Processing] [Solved] Help needed: batch script to extract selected fields from an XML file into a CSV

Last edited by zhengwei007 on 2024-11-23 17:14

There is only one file, but it holds far too much content to process by hand. Here is an excerpt:
<?xml version='1.0' encoding='utf-8'?>
<list xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="xsd/ExtractableItems.xsd">
<item id="7629"> <!-- Looted Goods - White Cargo box -->
<extract id="6688" quantity="1" chance="9" /> <!-- Forgotten Blade Edge -->
<extract id="6689" quantity="1" chance="9" /> <!-- Basalt Battlehammer Head -->
<extract id="6690" quantity="1" chance="9" /> <!-- Imperial Staff Head -->
<extract id="6691" quantity="1" chance="9" /> <!-- Angel Slayer Blade -->
<extract id="6693" quantity="1" chance="9" /> <!-- Dragon Hunter Axe Blade -->
<extract id="6694" quantity="1" chance="9" /> <!-- Saint Spear Blade -->
<extract id="6695" quantity="1" chance="9" /> <!-- Demon Splinter Blade -->
<extract id="6696" quantity="1" chance="9" /> <!-- Heavens Divider Edge -->
<extract id="6697" quantity="1" chance="9" /> <!-- Arcana Mace Head -->
<extract id="7579" quantity="1" chance="9" /> <!-- Draconic Bow Shaft -->
<extract id="57" quantity="330000" chance="10" /> <!-- Adena -->
</item>
<item id="7631"> <!-- Looted Goods - Yellow Cargo box -->
<extract id="6701" quantity="1" chance="20" /> <!-- Sealed Imperial Crusader Breastplate Part -->
<extract id="6702" quantity="1" chance="20" /> <!-- Sealed Imperial Crusader Gaiters Pattern -->
<extract id="6707" quantity="1" chance="20" /> <!-- Sealed Draconic Leather Armor Part -->
<extract id="6711" quantity="1" chance="20" /> <!-- Sealed Major Arcana Robe Part -->
<extract id="57" quantity="930000" chance="20" /> <!-- Adena -->
</item>
<item id="6505"> <!-- Orange Treasure Chest -->
<extract id="6910" quantity="3" chance="58" /> <!-- Premium Fish Oil -->
<extract id="-1" quantity="0" chance="42" /> <!-- Nothing -->
</item>
</list>
I would like a batch script to turn the XML above into the output below, extracting mainly the item ID, extract ID, quantity and chance:
item id,extract id,quantity, chance
7629,6688,1,9
7629,6689,1,9
7629,6690,1,9
7629,6691,1,9
7629,6693,1,9
7629,6694,1,9
7629,6695,1,9
7629,6696,1,9
7629,6697,1,9
7629,7579,1,9
7629,57,330000,10
7631,6701,1,20
7631,6702,1,20
7631,6707,1,20
7631,6711,1,20
7631,57,930000,20
6505,6910,3,58
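6505,-1,0,42
That is the final result: the item id is repeated on every row, once for each extract id that belongs to that item.
Note: the -1 entry on the last line is not needed. If leaving it out is a hassle, I can filter it out and delete it myself.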
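
For readers who want to cross-check a script's output, the mapping described above can be sketched in Python. This is only an illustration under assumptions, not something posted in the thread: the function names `extract_rows` and `to_csv`, the `skip_nothing` flag and the shortened `SAMPLE` string are all made up here.

```python
import csv
import io
import xml.etree.ElementTree as ET


def extract_rows(xml_text, skip_nothing=False):
    """Return one (item id, extract id, quantity, chance) tuple per <extract>."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        for ext in item.findall("extract"):
            if skip_nothing and ext.get("id") == "-1":
                continue  # optional: drop the "Nothing" placeholder rows
            rows.append((item.get("id"), ext.get("id"),
                         ext.get("quantity"), ext.get("chance")))
    return rows


def to_csv(rows):
    """Render the rows as CSV text with the header the question asks for."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["item id", "extract id", "quantity", "chance"])
    writer.writerows(rows)
    return buf.getvalue()


# Shortened sample (hypothetical): only the last <item> of the excerpt.
SAMPLE = """<list>
<item id="6505"> <!-- Orange Treasure Chest -->
<extract id="6910" quantity="3" chance="58" /> <!-- Premium Fish Oil -->
<extract id="-1" quantity="0" chance="42" /> <!-- Nothing -->
</item>
</list>"""

if __name__ == "__main__":
    print(to_csv(extract_rows(SAMPLE)), end="")
```

Calling `extract_rows(SAMPLE, skip_nothing=True)` leaves out the -1 row mentioned in the note.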

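Save this as a .bat file; it should more or less work:
#ANSI&cls&powershell -NoProfile -NoLogo "gc '%~0'|out-string|iex"&pause&exit
# cmd runs the line above, which feeds this whole file to PowerShell; PowerShell treats that line as a comment.
$x = [xml](gc a.xml)                          # load a.xml as an XML document
$x.SelectNodes("//item") | %{                 # visit every <item>
    $id = $_.id                               # remember the parent item id
    $_.SelectNodes("./extract") | %{          # visit each <extract> under it
        ($id, $_.id, $_.quantity, $_.chance) -join ","   # one CSV row per extract
    }
} | Out-File -Encoding oem a.csv              # no header row is written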


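Reply to 1# zhengwei007

A pure batch version...
@echo off &setlocal enabledelayedexpansion
rem Each matching line is split on spaces, angle brackets and equals signs.
rem Token 1 is the tag name; tokens 3, 5 and 7 are the quoted id, quantity and chance values.
(echo,item id,extract id,quantity,chance
for /f "tokens=1-7 delims= <>= " %%a in (
   'findstr /i /c:"item id=" /c:"extract id=" 0.xml'
) do if /i "%%a"=="item" (set "h=%%~c") else (set/p="!h!,%%~c,%%~e,"<nul&echo,%%~g))>out.csv
endlocal&pause&exit/b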


Thank you both. Both scripts work, and just under 10,000 lines were processed in an instant.


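Last edited by aloha20200628 on 2024-11-23 17:20

The XML parsing interfaces built into VBS/JScript/PowerShell offer a "fast track" for extracting data from web pages, but they require the page source to conform strictly to the XML specification; otherwise the parser tends to choke and quit...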
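
To illustrate the point about strict parsing, here is a minimal Python sketch. Python's `xml.etree.ElementTree` merely stands in for the Windows XML interfaces mentioned above, and that substitution is an assumption made for illustration. The helper `is_well_formed` and the `GOOD`/`BAD` strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if a strict XML parser accepts the text, else False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Well-formed: every tag is closed.
GOOD = ('<list><item id="6505">'
        '<extract id="6910" quantity="3" chance="58" />'
        '</item></list>')

# Malformed: the <extract> tag is opened but never closed.
BAD = ('<list><item id="6505">'
       '<extract id="6910" quantity="3" chance="58">'
       '</item></list>')

if __name__ == "__main__":
    print(is_well_formed(GOOD), is_well_formed(BAD))
```

A single unclosed tag is enough for the strict parser to reject the whole document, whereas a line-by-line text approach would still see the attributes on that line.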


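Last edited by qixiaobin0715 on 2024-11-24 09:38

Would the code read more cleanly written like this (even though it takes a few more lines)?
@echo off
setlocal enabledelayedexpansion
rem Tokens 1, 3, 5 and 7 of each line go to %%a, %%b, %%c and %%d.
rem On an item line %%b is the item id; on an extract line they are id, quantity and chance.
(echo,item id,extract id,quantity, chance
for /f "tokens=1,3,5,7 delims= <>= " %%a in (a.xml) do (
    if "%%a"=="item" (
        set str=%%~b
    ) else if "%%a"=="extract" (
        echo,!str!,%%~b,%%~c,%%~d
    )
))>a.csv
pause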
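
For anyone unsure why tokens 1, 3, 5 and 7 are the right ones, the splitting can be imitated roughly in Python. This is a sketch under assumptions: `for_f_tokens` and `pick` are hypothetical helpers that only approximate how `for /f` handles `delims` and `tokens` (runs of delimiters count as one separator); they do not model `eol`, blank-line skipping or other cmd details.

```python
import re


def for_f_tokens(line, delims=" <>="):
    """Split a line roughly the way for /f does with the given delims:
    consecutive delimiter characters act as one separator and empty
    tokens are dropped."""
    pattern = "[" + re.escape(delims) + "]+"
    return [tok for tok in re.split(pattern, line) if tok]


def pick(tokens, positions=(1, 3, 5, 7)):
    """Select 1-based token positions, like tokens=1,3,5,7, and strip the
    surrounding double quotes, similar to the %%~b modifier."""
    return [tokens[p - 1].strip('"') for p in positions if p <= len(tokens)]


EXTRACT_LINE = '<extract id="6688" quantity="1" chance="9" /> <!-- Forgotten Blade Edge -->'
ITEM_LINE = '<item id="7629"> <!-- Looted Goods - White Cargo box -->'

if __name__ == "__main__":
    print(pick(for_f_tokens(EXTRACT_LINE)))  # ['extract', '6688', '1', '9']
    print(pick(for_f_tokens(ITEM_LINE))[:2])  # ['item', '7629']
```

On an item line, tokens 5 and 7 fall inside the trailing comment, which is why the script only reads the second variable there.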

