Posted on 2017-1-18 21:10:13
This post was last edited by 523066680 on 2017-1-18 21:21
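The script now roughly meets the goal of filtering out redundant results.
I ran it once over the backup data on drive E: and am posting part of the output; the code still needs polishing.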
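Approach:
The whole directory tree is rebuilt as a Perl hash (key-value) data structure. The structure under each directory is dumped to a string,
and the md5 function turns that string into an MD5 checksum that serves as the directory's tag, which avoids the comparison cost and memory overhead of handling many long strings.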
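A minimal sketch of the approach described above, not the actual script. It assumes the core modules Data::Dumper and Digest::MD5; the sub names build_tree, tree_digest, collect_digests and duplicate_groups are made up for illustration. Like the dump in this post, it compares names only, so file contents and sizes are not checked, and a file and an empty directory both appear as {}.

```perl
use strict;
use warnings;
use Data::Dumper;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# Build a nested hash for $dir: subdirectories become nested hashes,
# plain files become empty hashes (same shape as the dump in the post).
sub build_tree {
    my ($dir) = @_;
    my %node;
    opendir(my $dh, $dir) or return \%node;   # unreadable: treat as empty
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    for my $name (@names) {
        my $path = "$dir/$name";
        # Do not follow symlinks, to avoid loops.
        $node{$name} = (-d $path && !-l $path) ? build_tree($path) : {};
    }
    return \%node;
}

# Dump a subtree with sorted keys so equal trees give equal strings,
# then reduce the string to an MD5 hex digest.
sub tree_digest {
    my ($tree) = @_;
    local $Data::Dumper::Sortkeys = 1;
    local $Data::Dumper::Indent   = 1;
    local $Data::Dumper::Terse    = 1;
    return md5_hex(encode_utf8(Dumper($tree)));
}

# Record every non-empty directory path under the digest of its subtree.
sub collect_digests {
    my ($tree, $path, $groups) = @_;
    return unless %$tree;                     # skip files and empty directories
    push @{ $groups->{ tree_digest($tree) } }, $path;
    for my $name (sort keys %$tree) {
        collect_digests($tree->{$name}, "$path/$name", $groups);
    }
}

# Return the groups of paths that share a digest (suspected duplicates).
sub duplicate_groups {
    my ($root) = @_;
    my %groups;
    collect_digests(build_tree($root), $root, \%groups);
    return [ grep { @$_ > 1 } values %groups ];
}
```

Calling duplicate_groups('E:/') would return an array ref whose elements are lists of directories with identical name structure. This sketch re-dumps each subtree at every level, which is simple but not the cheapest way to do it.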
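Removing redundant results:
The first pass yields a list of suspected duplicate directories, but many of those entries are redundant.
Cleanup: if one group of directories consists of subdirectories of another group, it is dropped. (This part was rather fiddly.)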
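The cleanup rule could be sketched roughly like this. It is one possible reading of the rule, not the code actually used: a group is dropped when every path in it lies strictly below some path of one other group. The names is_descendant and prune_nested_groups are hypothetical, and groups are assumed to be array refs of path strings.

```perl
use strict;
use warnings;

# True if $child lies strictly below $parent.
# Accepts both "/" and "\" as the path separator.
sub is_descendant {
    my ($child, $parent) = @_;
    return $child =~ m{^\Q$parent\E[\\/]} ? 1 : 0;
}

# Keep only the groups that are not wholly contained in another group.
# $groups is an array ref of array refs of directory paths.
sub prune_nested_groups {
    my ($groups) = @_;
    my @kept;
    GROUP: for my $i (0 .. $#$groups) {
        for my $j (0 .. $#$groups) {
            next if $i == $j;
            my $covered = 1;
            for my $path (@{ $groups->[$i] }) {
                # Is this path below any directory of group $j?
                unless (grep { is_descendant($path, $_) } @{ $groups->[$j] }) {
                    $covered = 0;
                    last;
                }
            }
            next GROUP if $covered;           # group $i is redundant
        }
        push @kept, $groups->[$i];
    }
    return \@kept;
}
```

The pairwise loop is quadratic in the number of groups, which is acceptable for a sketch but may need an index keyed by parent path on a large drive.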
Possible optimization:
Use the cumulative file count / cumulative file size as a threshold, and only report directory trees that are both complex enough and identical in content (trivial ones are skipped for now or written to a separate output).

E:\售后\以色列升级工具\USB Debug Tool\USB_debug_tool_driver\tool\MProg 2.3
E:\S烧录程序_软件\升级工具MST712齐总\USB_debug_tool_driver\tool\MProg 2.3
E:\售后\DVD升级工具 for 印尼谢总\USB Debug Tool\USB_debug_tool_driver\tool\MProg 2.3
{
  "Drivers" => {
    "D2XX Release Info.txt" => {},
    "FT8U2XX.RES" => {},
    "Ftccomms.vxd" => {},
    "FTCSENUM.SYS" => {},
    "FTCSENUM.VXD" => {},
    "FTCSER.INF" => {},
    "FTCSER2K.SYS" => {},
    "ftcser98.sys" => {},
    "FTCSMOU.INF" => {},
    "FTCSMOU.VXD" => {},
    "FTCSUI.DLL" => {},
    "FTCSUI2.DLL" => {},
    "ftcun2k.ini" => {},
    "ftcun98.ini" => {},
    "ftcunin.exe" => {},
    "FTCUSB.INF" => {},
    "FTCUSB.SYS" => {},
    "FTD2XX.DLL" => {},
    "Ftd2xx.h" => {},
    "FTD2XX.INF" => {},
    "FTD2XX.LIB" => {},
    "FTD2XX.SYS" => {},
    "Ftd2xxpg.rtf" => {},
    "FTD2XXUN.EXE" => {},
    "FTD2XXUN.INI" => {},
    "PG_Extra.doc" => {},
    "ReleaseNotes.doc" => {},
  },
  "FTD2XX.DLL" => {},
  "FTUninstall.bat" => {},
  "Help" => { "MProg.chm" => {} },
  "license.txt" => {},
  "MProg.exe" => {},
  "Templates" => {
    "com2p.ept" => {},
    "default.ept" => {},
    "default1.ept" => {},
    "MSTAR.ept" => {},
  },
  "uninstall.exe" => {},
  "uninstall.ini" => {},
}
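The threshold idea mentioned under "Possible optimization" might look something like the sketch below. It assumes the core module File::Find; tree_stats and is_significant are hypothetical names, the threshold values are left to the caller, and treating the two limits as "either one is enough" is an assumption rather than something decided in this post.

```perl
use strict;
use warnings;
use File::Find;

# Count regular files and sum their sizes (in bytes) under $dir.
sub tree_stats {
    my ($dir) = @_;
    my ($files, $bytes) = (0, 0);
    find({
        no_chdir => 1,                        # $_ holds the full path
        wanted   => sub {
            return unless -f $_;
            $files++;
            $bytes += (-s _) || 0;            # reuse the stat from -f
        },
    }, $dir);
    return ($files, $bytes);
}

# True if the tree reaches either threshold and is worth reporting.
sub is_significant {
    my ($dir, $min_files, $min_bytes) = @_;
    my ($files, $bytes) = tree_stats($dir);
    return ($files >= $min_files || $bytes >= $min_bytes) ? 1 : 0;
}
```

Since all directories in a duplicate group have the same structure, checking one representative per group should be enough when only the file count matters; sizes can differ because the digest in this post covers names only.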