标题: [文本处理] 批处理怎样用DocToText批量转多文件夹内DOC为TXT,再合并TXT [打印本页]
作者: wyx567 时间: 2014-9-1 12:15 标题: 批处理怎样用DocToText批量转多文件夹内DOC为TXT,再合并TXT
本帖最后由 wyx567 于 2014-9-1 12:59 编辑
在论坛里仔细搜索还是没发现好办法,求助!
怎么样用类似DocToText工具批量转换多个文件夹内(包括子目录)的DOC为TXT,文件夹内子文件夹内可能有多个doc,然后合并TXT再以各自文件夹名称重命名。
比如:
c:\文件夹1\文件.doc
c:\文件夹2\3\文件1.doc
c:\文件夹3\4\5\文件2.doc
c:\文件夹4\5\6\7\文件3.doc
……
统一转换成
c:\文件夹1\文件夹1.txt
c:\文件夹2\文件夹2.txt
c:\文件夹3\文件夹3.txt
c:\文件夹4\文件夹4.txt
……
请问这样可以吗?
上万个文件夹,实在是没办法。
tmplinshi 大大在 2012-11-8 发过
要把中文转换成 utf-8。
把附件中的文件解压到 doctotext.exe 目录,之后使用方法如下:
doctotext_ *.doc
复制代码
doctotext_ 中文测试1.doc "中文 测试2.doc"
复制代码
文件搜索使用的是 dir 的语法。所以例如要搜索子目录的文件,则加 /s 参数:
doctotext_ /s *.doc
复制代码
doctotext_.bat:
@echo off
setlocal
set n=0
set skip=
dir /b %* | win_iconv -f gbk -t utf-8 >"%~f0.tmp"
for /f "delims=" %%a in (' dir /b %* ') do (
call :doctotext "%%a"
)
exit /b
:doctotext
if %n% neq 0 (
set skip=skip=%n%
)
for /f "usebackq %skip% eol=< delims=" %%a in ("%~f0.tmp") do (
echo convert %1...
doctotext "%%a" >"%~1.txt"
set /a n += 1
goto :eof
)
最新版本下载:
http://sourceforge.net/projects/doctotext/?source=typ_redirect
能不能在此基础上修改呢?
多谢各位!!
作者: yu2n 时间: 2014-9-1 22:44
本帖最后由 yu2n 于 2014-9-1 22:46 编辑
沙发~
我用VBS写了一个,使用Word的另存为功能,直接保存为TXT。
请将原文件夹备份后试试,很期待知道Word转换1W个DOC文档需要多长时间……- 'doc2txt.vbs By yu2n, 2014.09.01
- CommandMode "批量转多文件夹内DOC为TXT,再合并TXT"
-
- Main
- Sub Main()
- On Error Resume Next
- ' 选择文件夹
- Dim strFolder, arrPath, strPath, nFileCount, i
- WScript.Echo "请选择 Word 文件路径:"
- strFolder = BrowseForFolder("请选择 Word 文件路径:")
- If strFolder = "" Then Exit Sub
- arrPath = ScanFolder(strFolder)
- ' 统计个数,用于显示进度
- For Each strPath In arrPath
- If LCase(Right(strPath,4))=".doc" Or LCase(Right(strPath,5))=".docx" Then
- nFileCount = nFileCount + 1
- End If
- Next
- ' 执行转换
- Set objWord = Word_Init()
- For Each strPath In arrPath
- If LCase(Right(strPath,4))=".doc" Or LCase(Right(strPath,5))=".docx" Then
- i = i + 1
- ' 显示进度
- WScript.Echo "[" & i & "/" & nFileCount & "]" & strPath
- ' 执行转换
- Doc2Txt objWord, strPath
- ' 追加TXT
- CreatTxtFile strFolder, strPath
- End If
- Next
- ' 退出
- objWord.Quit
- Msgbox "完成!"
- End Sub
-
-
- ' 打开DOC,另存为
- Function Doc2Txt(objWord, FilePath)
- On Error Resume Next
- Set fso = CreateObject("Scripting.Filesystemobject")
- If Not fso.FileExists(FilePath) Then Exit Function
-
- Const wdFormatText = 2
- Const Encoding = 1200
- Const wdCRLF = 0
- Set objDoc = objWord.Documents.Open(FilePath)
- objWord.ActiveDocument.SaveAs FilePath & ".txt", wdFormatText, False, "", False, _
- "", False, False, False, False, False, Encoding, False, False, wdCRLF
- objDoc.Close
- If Not Err.Number = 0 Then Doc2Txt = True
- End Function
-
-
- ' 浏览文件夹
- Function BrowseForFolder(ByVal strTips)
- Dim objFolder
- Set objFolder = CreateObject("Shell.Application").BrowseForFolder (&H0, strTips, &H0010 + &H0001)
- If (Not objFolder Is Nothing) Then BrowseForFolder = objFolder.Self.Path 'objFolder.Items().Item().Path
- End Function
-
-
- ' 创建 Word 对象
- Function Word_Init()
- Set objWord = CreateObject("Word.Application")
- If Not Err.Number = 0 Then
- Msgbox "错误:无法创建 Word 对象,你可能没有安装 Office 。"
- WScript.Quit(999)
- End If
- If Not objWord.Application.Version >= 12.0 Then
- Msgbox "警告:请使用 Office 2007 以上版本。"
- End If
- ' 隐藏运行,屏蔽提示
- objWord.Visible = False
- objWord.DisplayAlerts = False
- Set Word_Init = objWord
- End Function
-
-
- '将转换后的TXT追加到指定文件
- Function CreatTxtFile(ByVal strFolderPath, ByVal strFilePath)
- Set fso = CreateObject("Scripting.FileSystemObject")
- If Not fso.FileExists(strFilePath & ".txt") Then Exit Function
- ' 整理路径
- strSubFolderName = Mid(strFilePath, Len(strFolderPath) + 2)
- strSubFolderName = Left(strSubFolderName, InStr(strSubFolderName, "\") - 1)
- strTxtFile = strFolderPath & "\" & strSubFolderName & "\" & strSubFolderName & ".txt"
-
- ' 打开转换后的TXT文件
- Set rTxt = fso.OpenTextFile(strFilePath & ".txt", 1, False, -1)
- strText = rTxt.ReadAll()
- rTxt.Close
- ' 删除转换后的文件
- fso.DeleteFile strFilePath & ".txt", True
- ' 将转换后的TXT追加到指定文件
- Set wTxt = fso.OpenTextFile(strTxtFile, 8, True, -1)
- wTxt.Write strText
- wTxt.Close
- End Function
-
-
- ' 获取文件夹所有文件夹、文件列表(数组)
- Function ScanFolder(ByVal strPath)
- Dim arr()
- ReDim Preserve arr(0)
- Call SCAN_FOLDER(arr, strPath)
- ReDim Preserve arr(UBound(arr) - 1)
- ScanFolder = arr
- End Function
- Function SCAN_FOLDER(ByRef arr, ByVal folderSpec)
- On Error Resume Next
- Dim fso, objItems, objFile, objFolder
- Set fso = CreateObject("Scripting.FileSystemObject")
- Set objItems = fso.GetFolder(folderSpec)
- If Right(folderSpec, 1) <> "\" Then folderSpec = folderSpec & "\"
- If (Not fso.FolderExists(folderSpec)) Then Exit Function
- For Each objFile In objItems.Files
- arr(UBound(arr)) = objFile.Path
- ReDim Preserve arr(UBound(arr) + 1)
- Next
- For Each objFolder In objItems.subfolders
- Call SCAN_FOLDER(arr, objFolder.Path)
- Next
- arr(UBound(arr)) = folderSpec
- ReDim Preserve arr(UBound(arr) + 1)
- End Function
-
-
- ' 以命令提示符环境运行(保留参数)
- Sub CommandMode(ByVal sTitle)
- If (LCase(Right(WScript.FullName,11)) = "wscript.exe") Then
- Dim i, sArgs
- For i = 1 To WScript.Arguments.Count
- sArgs = sArgs & " " & """" & WScript.Arguments(i-1) & """"
- Next
- CreateObject("WScript.Shell").Run( _
- "cmd /c Title " & sTitle & " &Cscript.exe //NoLogo """ & _
- WScript.ScriptFullName & """ " & sArgs & " &pause"),3
- Wscript.Quit
- End If
- End Sub
复制代码
作者: wyx567 时间: 2014-9-5 16:27
我来测试下,多谢楼上兄弟!
作者: wyx567 时间: 2014-9-5 18:27
回复 2# yu2n
效果很好!
不过在转换中,会出现:
选择关闭后,WORD就在前台显示,文件一个一个打开,关闭。
不过转换合并还在正常继续
出来一个另存为画面,到这个提示后,不管怎么选择,确定后WORD关闭,vbs快速跳过没转换的DOC列表,提示关闭。
我想能不能关闭WORD的查错功能,应该就没问题了
作者: wyx567 时间: 2014-9-5 18:47
出问题的都是在DOCX上,我下载个office2010试试,现在用的2007 sp3
作者: yu2n 时间: 2014-9-5 19:05
回复 5# wyx567
这个问题我没有碰到过,可以把出错的Word文档打包几个传上来看看?
作者: wyx567 时间: 2014-9-5 19:46
回复 6# yu2n
换了office2010,转换速度明显提高
但这个坎还是过不去。。
提示另存为,直接确定,或者另存为doc格式都提示word错误重启。
链接:http://pan.baidu.com/s/1jG1euFW 密码:ryaz
我把文件传到百度云了,有空您看看
多谢!
作者: yu2n 时间: 2014-9-5 21:47
回复 7# wyx567
这个文档格式可能有问题,该文档直接手动另存为会出错,暂时不知道出错原因。
既然如此,就不使用另存为功能。换个方法,直接获取Doc文本内容,写入txt文件。- CommandMode "VBS 批量转多文件夹内DOC为TXT,再合并TXT By Yu2n@qq.com"
-
- Main
- Sub Main()
- On Error Resume Next
- ' 选择文件夹
- Dim strFolder, arrPath, strPath, nFileCount, i
- WScript.Echo "请选择 Word 文件路径:"
- strFolder = BrowseForFolder("请选择 Word 文件路径:")
- If strFolder = "" Then Exit Sub
- arrPath = ScanFolder(strFolder)
- ' 统计个数,用于显示进度
- For Each strPath In arrPath
- If LCase(Right(strPath,4))=".doc" Or LCase(Right(strPath,5))=".docx" Then
- nFileCount = nFileCount + 1
- End If
- Next
- ' 执行转换
- Set objWord = Word_Init()
- For Each strPath In arrPath
- If LCase(Right(strPath,4))=".doc" Or LCase(Right(strPath,5))=".docx" Then
- i = i + 1
- ' 显示进度
- WScript.Echo "[" & i & "/" & nFileCount & "]" & strPath
- ' 执行转换
- Doc2Txt objWord, strPath
- ' 追加TXT
- CreatTxtFile strFolder, strPath
- End If
- Next
- ' 退出
- objWord.Quit
- Msgbox "完成!"
- End Sub
-
-
- ' 打开DOC,另存为
- Function Doc2Txt(objWord, FilePath)
- On Error Resume Next
- Set fso = CreateObject("Scripting.Filesystemobject")
- If Not fso.FileExists(FilePath) Then Exit Function
-
- Const wdFormatText = 2
- Const Encoding = 1200
- Const wdCRLF = 0
- Set objDoc = objWord.Documents.Open(FilePath)
-
- ' 另存为,wyx567反映个别Docx出错
- ' objWord.ActiveDocument.SaveAs FilePath & ".txt", wdFormatText, False, "", False, _
- ' "", False, False, False, False, False, Encoding, False, False, wdCRLF
-
- ' 获取Doc文本内容
- strContent = objDoc.Content
- strContent = Replace(strContent, vbCr, vbCrLf)
- objDoc.Close False
-
- ' 保存Doc文本内容到txt文件
- Set wTxt = fso.OpenTextFile(FilePath & ".txt", 8, True, -1)
- wTxt.Write strContent
- wTxt.Close
- If Not Err.Number = 0 Then Doc2Txt = True
- End Function
-
-
- ' 浏览文件夹
- Function BrowseForFolder(ByVal strTips)
- Dim objFolder
- Set objFolder = CreateObject("Shell.Application").BrowseForFolder (&H0, strTips, &H0010 + &H0001)
- If (Not objFolder Is Nothing) Then BrowseForFolder = objFolder.Self.Path 'objFolder.Items().Item().Path
- End Function
-
-
- ' 创建 Word 对象
- Function Word_Init()
- Set objWord = CreateObject("Word.Application")
- If Not Err.Number = 0 Then
- Msgbox "错误:无法创建 Word 对象,你可能没有安装 Office 。"
- WScript.Quit(999)
- End If
- If Not objWord.Application.Version >= 12.0 Then
- Msgbox "警告:请使用 Office 2007 以上版本。"
- End If
- ' 隐藏运行,屏蔽提示
- objWord.Visible = False
- objWord.DisplayAlerts = False
- Set Word_Init = objWord
- End Function
-
-
- '将转换后的TXT追加到指定文件
- Function CreatTxtFile(ByVal strFolderPath, ByVal strFilePath)
- Set fso = CreateObject("Scripting.FileSystemObject")
- If Not fso.FileExists(strFilePath & ".txt") Then Exit Function
- ' 整理路径
- strSubFolderName = Mid(strFilePath, Len(strFolderPath) + 2)
- strSubFolderName = Left(strSubFolderName, InStr(strSubFolderName, "\") - 1)
- strTxtFile = strFolderPath & "\" & strSubFolderName & "\" & strSubFolderName & ".txt"
-
- ' 打开转换后的TXT文件
- Set rTxt = fso.OpenTextFile(strFilePath & ".txt", 1, False, -1)
- strText = rTxt.ReadAll()
- rTxt.Close
- ' 删除转换后的文件
- fso.DeleteFile strFilePath & ".txt", True
- ' 将转换后的TXT追加到指定文件
- Set wTxt = fso.OpenTextFile(strTxtFile, 8, True, -1)
- wTxt.Write strText
- wTxt.Close
- End Function
-
-
- ' 获取文件夹所有文件夹、文件列表(数组)
- Function ScanFolder(ByVal strPath)
- Dim arr()
- ReDim Preserve arr(0)
- Call SCAN_FOLDER(arr, strPath)
- ReDim Preserve arr(UBound(arr) - 1)
- ScanFolder = arr
- End Function
- Function SCAN_FOLDER(ByRef arr, ByVal folderSpec)
- On Error Resume Next
- Dim fso, objItems, objFile, objFolder
- Set fso = CreateObject("Scripting.FileSystemObject")
- Set objItems = fso.GetFolder(folderSpec)
- If Right(folderSpec, 1) <> "\" Then folderSpec = folderSpec & "\"
- If (Not fso.FolderExists(folderSpec)) Then Exit Function
- For Each objFile In objItems.Files
- arr(UBound(arr)) = objFile.Path
- ReDim Preserve arr(UBound(arr) + 1)
- Next
- For Each objFolder In objItems.subfolders
- Call SCAN_FOLDER(arr, objFolder.Path)
- Next
- arr(UBound(arr)) = folderSpec
- ReDim Preserve arr(UBound(arr) + 1)
- End Function
-
-
- ' 以命令提示符环境运行(保留参数)
- Sub CommandMode(ByVal sTitle)
- If (LCase(Right(WScript.FullName,11)) = "wscript.exe") Then
- Dim i, sArgs
- For i = 1 To WScript.Arguments.Count
- sArgs = sArgs & " " & """" & WScript.Arguments(i-1) & """"
- Next
- CreateObject("WScript.Shell").Run( _
- "cmd /c Title " & sTitle & " &Cscript.exe //NoLogo """ & _
- WScript.ScriptFullName & """ " & sArgs & " &pause"),3
- Wscript.Quit
- End If
- End Sub
复制代码
作者: wyx567 时间: 2014-9-5 23:09
我再试试!楼上牛逼!
感谢感谢!!
作者: wyx567 时间: 2014-9-10 18:13
这次批量转了1000个文件夹,这次没有docx了 有时会出现个这个框
上一次显示修复的提示框还是会出现
已经很好了,感谢感谢!!!
作者: wyx567 时间: 2014-9-10 19:36
回复 6# yu2n
您好,
大量文档中有很多这样的空白回车,怎么样能批量消除或者把多行换行回车换成1行换行回车呢。
[attach]7664[/attach]
还有就是怎么能批量保留TXT文件的前面,比如300行,300行以后的删除,能不能实现这样的效果呢
多谢!
作者: yu2n 时间: 2014-9-10 20:37
回复 11# wyx567
新增格式化字符串函数 FormatText :- CommandMode "VBS 批量转多文件夹内DOC为TXT,再合并TXT By Yu2n@qq.com"
-
- Main
- Sub Main()
- On Error Resume Next
- ' 选择文件夹
- Dim strFolder, arrPath, strPath, nFileCount, i
- WScript.Echo "请选择 Word 文件路径:"
- strFolder = BrowseForFolder("请选择 Word 文件路径:")
- If strFolder = "" Then Exit Sub
- arrPath = ScanFolder(strFolder)
- ' 统计个数,用于显示进度
- For Each strPath In arrPath
- If LCase(Right(strPath,4))=".doc" Or LCase(Right(strPath,5))=".docx" Then
- nFileCount = nFileCount + 1
- End If
- Next
- ' 执行转换
- Set objWord = Word_Init()
- For Each strPath In arrPath
- If LCase(Right(strPath,4))=".doc" Or LCase(Right(strPath,5))=".docx" Then
- i = i + 1
- ' 显示进度
- WScript.Echo "[" & i & "/" & nFileCount & "]" & strPath
- ' 执行转换
- Doc2Txt objWord, strPath
- ' 追加TXT
- CreatTxtFile strFolder, strPath
- End If
- Next
- ' 退出
- objWord.Quit
- Msgbox "完成!"
- End Sub
-
-
- ' 打开DOC,另存为
- Function Doc2Txt(objWord, FilePath)
- On Error Resume Next
- Set fso = CreateObject("Scripting.Filesystemobject")
- If Not fso.FileExists(FilePath) Then Exit Function
-
- Const wdFormatText = 2
- Const Encoding = 1200
- Const wdCRLF = 0
- Set objDoc = objWord.Documents.Open(FilePath)
-
- ' 另存为,wyx567反映个别Docx出错
- ' objWord.ActiveDocument.SaveAs FilePath & ".txt", wdFormatText, False, "", False, _
- ' "", False, False, False, False, False, Encoding, False, False, wdCRLF
-
- ' 获取Doc文本内容
- strContent = objDoc.Content
- objDoc.Close False
-
- ' 保存Doc文本内容到txt文件
- Set wTxt = fso.OpenTextFile(FilePath & ".txt", 8, True, -1)
- wTxt.Write FormatText(strContent)
- wTxt.Close
- If Not Err.Number = 0 Then Doc2Txt = True
- End Function
-
-
- ' 创建 Word 对象
- Function Word_Init()
- Set objWord = CreateObject("Word.Application")
- If Not Err.Number = 0 Then
- Msgbox "错误:无法创建 Word 对象,你可能没有安装 Office 。"
- WScript.Quit(999)
- End If
- If Not objWord.Application.Version >= 12.0 Then
- Msgbox "警告:请使用 Office 2007 以上版本。"
- End If
- ' 隐藏运行,屏蔽提示
- objWord.Visible = False
- objWord.DisplayAlerts = False
- Set Word_Init = objWord
- End Function
-
-
- '将转换后的TXT追加到指定文件
- Function CreatTxtFile(ByVal strFolderPath, ByVal strFilePath)
- Set fso = CreateObject("Scripting.FileSystemObject")
- If Not fso.FileExists(strFilePath & ".txt") Then Exit Function
- ' 整理路径
- strSubFolderName = Mid(strFilePath, Len(strFolderPath) + 2)
- strSubFolderName = Left(strSubFolderName, InStr(strSubFolderName, "\") - 1)
- strTxtFile = strFolderPath & "\" & strSubFolderName & "\" & strSubFolderName & ".txt"
-
- ' 打开转换后的TXT文件
- Set rTxt = fso.OpenTextFile(strFilePath & ".txt", 1, False, -1)
- strText = rTxt.ReadAll()
- rTxt.Close
- ' 删除转换后的文件
- fso.DeleteFile strFilePath & ".txt", True
- ' 将转换后的TXT追加到指定文件
- Set wTxt = fso.OpenTextFile(strTxtFile, 8, True, -1)
- wTxt.Write strText
- wTxt.Close
- End Function
-
-
- ' 以命令提示符环境运行(保留参数)
- Sub CommandMode(ByVal sTitle)
- If (LCase(Right(WScript.FullName,11)) = "wscript.exe") Then
- Dim i, sArgs
- For i = 1 To WScript.Arguments.Count
- sArgs = sArgs & " " & """" & WScript.Arguments(i-1) & """"
- Next
- CreateObject("WScript.Shell").Run( _
- "cmd /c Title " & sTitle & " &Cscript.exe //NoLogo """ & _
- WScript.ScriptFullName & """ " & sArgs & " &pause"),3
- Wscript.Quit
- End If
- End Sub
-
-
- ' 格式化字符串
- Function FormatText(ByVal str)
- ' 删除空行
- str = Replace(str, vbLf, "")
- str = Replace(str, vbCr, vbCrLf)
- str = regEx_replace("^\s*\r\n", str, "")
- ' 取前300行
- arrStr = Split(str, vbCrLf)
- If UBound(arrStr)>(300-1) Then ReDim Preserve arrStr(300-1)
- str = Join(arrStr, vbCrLf)
- FormatText = str
- End Function
-
-
- ' 正则表达式替换
- Function regEx_replace(ByVal sPattern, ByVal str, ByVal sReplace)
- Dim regEx ' 建立变量。
- Set regEx = CreateObject("VBScript.RegExp") ' 建立正则表达式。
- regEx.Pattern = sPattern ' 设置模式。
- regEx.IgnoreCase = True ' 设置是否区分字符大小写。
- regEx.Global = True ' 设置全局可用性。
- regEx.MultiLine = True ' 多行匹配模式
- regEx_replace = regEx.Replace(str, sReplace) ' 作替换。
- Set regEx = Nothing
- End Function
-
-
- ' 浏览文件夹
- Function BrowseForFolder(ByVal strTips)
- Dim objFolder
- Set objFolder = CreateObject("Shell.Application").BrowseForFolder (&H0, strTips, &H0010 + &H0001)
- If (Not objFolder Is Nothing) Then BrowseForFolder = objFolder.Self.Path 'objFolder.Items().Item().Path
- End Function
-
-
- ' 获取文件夹所有文件夹、文件列表(数组)
- Function ScanFolder(ByVal strPath)
- Dim arr()
- ReDim Preserve arr(0)
- Call SCAN_FOLDER(arr, strPath)
- ReDim Preserve arr(UBound(arr) - 1)
- ScanFolder = arr
- End Function
- Function SCAN_FOLDER(ByRef arr, ByVal folderSpec)
- On Error Resume Next
- Dim fso, objItems, objFile, objFolder
- Set fso = CreateObject("Scripting.FileSystemObject")
- Set objItems = fso.GetFolder(folderSpec)
- If Right(folderSpec, 1) <> "\" Then folderSpec = folderSpec & "\"
- If (Not fso.FolderExists(folderSpec)) Then Exit Function
- For Each objFile In objItems.Files
- arr(UBound(arr)) = objFile.Path
- ReDim Preserve arr(UBound(arr) + 1)
- Next
- For Each objFolder In objItems.subfolders
- Call SCAN_FOLDER(arr, objFolder.Path)
- Next
- arr(UBound(arr)) = folderSpec
- ReDim Preserve arr(UBound(arr) + 1)
- End Function
复制代码
作者: wyx567 时间: 2014-9-11 16:27
回复 12# yu2n
您好,使用新的代码,和上一次的情况一样,出现显示修复和上次截图的对话框。
我把对话框的文件发出来您有空看看
链接:http://pan.baidu.com/s/1pJnxfEb 密码:8vlf
真心感谢您的无私帮助!!
作者: yu2n 时间: 2014-9-11 20:33
回复 13# wyx567
这个Doc文档带了宏,并且设置了开启时弹出这个窗口。
百度暂时找不到在运行前移除这些VBA过程的方法。
作者: wyx567 时间: 2014-9-12 12:44
回复 14# yu2n
好的,这样的文件并不多,可以解决。
非常感谢您的帮助!
作者: yu2n 时间: 2014-9-12 14:13
本帖最后由 yu2n 于 2014-9-12 14:20 编辑
回复 15# wyx567
总算让我找到这篇文章:- VBA打开文件时(临时)禁用宏
- http://club.excelhome.net/thread-1001802-1-1.html
复制代码
于是,可以消灭那个对话框了。- CommandMode "VBS 批量转多文件夹内DOC为TXT,再合并TXT By Yu2n@qq.com"
-
- Main
- Sub Main()
- On Error Resume Next
- ' 选择文件夹
- Dim strFolder, arrPath, strPath, nFileCount, i
- WScript.Echo "请选择 Word 文件路径:"
- strFolder = BrowseForFolder("请选择 Word 文件路径:")
- If strFolder = "" Then Exit Sub
- arrPath = ScanFolder(strFolder)
- ' 统计个数,用于显示进度
- For Each strPath In arrPath
- If LCase(Right(strPath,4))=".doc" Or LCase(Right(strPath,5))=".docx" Then
- nFileCount = nFileCount + 1
- End If
- Next
- ' 执行转换
- Set objWord = Word_Init()
- For Each strPath In arrPath
- If LCase(Right(strPath,4))=".doc" Or LCase(Right(strPath,5))=".docx" Then
- i = i + 1
- ' 显示进度
- WScript.Echo "[" & i & "/" & nFileCount & "]" & strPath
- ' 执行转换
- Doc2Txt objWord, strPath
- ' 追加TXT
- CreatTxtFile strFolder, strPath
- End If
- Next
- ' 退出
- objWord.Quit
- Msgbox "完成!"
- End Sub
-
-
- ' 打开DOC,另存为
- Function Doc2Txt(objWord, FilePath)
- On Error Resume Next
- Set fso = CreateObject("Scripting.Filesystemobject")
- If Not fso.FileExists(FilePath) Then Exit Function
- Set objDoc = objWord.Documents.Open(FilePath)
- ' 方法一、另存为,wyx567反映个别Docx出错
- ' Const wdFormatText = 2
- ' Const Encoding = 1200
- ' Const wdCRLF = 0
- ' objWord.ActiveDocument.SaveAs FilePath & ".txt", wdFormatText, False, "", False, _
- ' "", False, False, False, False, False, Encoding, False, False, wdCRLF
- ' 方法二、直接获取Doc文本内容
- strContent = objDoc.Content
- objDoc.Close False
- ' 保存Doc文本内容到txt文件
- Set wTxt = fso.OpenTextFile(FilePath & ".txt", 2, True, -1)
- wTxt.Write FormatText(strContent)
- wTxt.Close
- If Not Err.Number = 0 Then Doc2Txt = True
- End Function
-
-
- ' 创建 Word 对象
- Function Word_Init()
- Const msoAutomationSecurityForceDisable = 3
- Set objWord = CreateObject("Word.Application")
- If Not Err.Number = 0 Then
- Msgbox "错误:无法创建 Word 对象,你可能没有安装 Office 。"
- WScript.Quit(999)
- End If
- If Not objWord.Application.Version >= 12.0 Then
- Msgbox "警告:请使用 Office 2007 以上版本。"
- End If
- ' 隐藏运行,屏蔽提示
- objWord.Visible = False
- objWord.DisplayAlerts = False
- ' 禁用以编程方式打开的所有文件中的所有宏,而不显示任何安全警告。
- ' VBA打开文件时(临时)禁用宏
- ' http://club.excelhome.net/thread-1001802-1-1.html
- objWord.AutomationSecurity = msoAutomationSecurityForceDisable
- Set Word_Init = objWord
- End Function
-
-
- '将转换后的TXT追加到指定文件
- Function CreatTxtFile(ByVal strFolderPath, ByVal strFilePath)
- Set fso = CreateObject("Scripting.FileSystemObject")
- If Not fso.FileExists(strFilePath & ".txt") Then Exit Function
- ' 整理路径
- strSubFolderName = Mid(strFilePath, Len(strFolderPath) + 2)
- If InStr(strSubFolderName, "\") > 0 Then
- strSubFolderName = Left(strSubFolderName, InStr(strSubFolderName, "\") - 1)
- strTxtFile = strFolderPath & "\" & strSubFolderName & "\" & strSubFolderName & ".txt"
- Else
- strTxtFile = strFolderPath & "\" & strSubFolderName & ".txt"
- End If
- ' 打开转换后的TXT文件
- Set rTxt = fso.OpenTextFile(strFilePath & ".txt", 1, False, -1)
- strText = rTxt.ReadAll()
- rTxt.Close
- ' 删除转换后的文件
- fso.DeleteFile strFilePath & ".txt", True
- ' 将转换后的TXT追加到指定文件
- Set wTxt = fso.OpenTextFile(strTxtFile, 8, True, -1)
- wTxt.Write strText
- wTxt.Close
- End Function
-
-
- ' 以命令提示符环境运行(保留参数)
- Sub CommandMode(ByVal sTitle)
- If (LCase(Right(WScript.FullName,11)) = "wscript.exe") Then
- Dim i, sArgs
- For i = 1 To WScript.Arguments.Count
- sArgs = sArgs & " " & """" & WScript.Arguments(i-1) & """"
- Next
- CreateObject("WScript.Shell").Run( _
- "cmd /c Title " & sTitle & " &Cscript.exe //NoLogo """ & _
- WScript.ScriptFullName & """ " & sArgs & " &pause"),3
- Wscript.Quit
- End If
- End Sub
-
-
- ' 格式化字符串
- Function FormatText(ByVal str)
- ' 删除空行
- str = Replace(str, vbLf, "")
- str = Replace(str, vbCr, vbCrLf)
- str = regEx_replace("^\s*\r\n", str, "")
- ' 取前300行
- arrStr = Split(str, vbCrLf)
- If UBound(arrStr)>(300-1) Then ReDim Preserve arrStr(300-1)
- str = Join(arrStr, vbCrLf)
- FormatText = str
- End Function
-
-
- ' 正则表达式替换
- Function regEx_replace(ByVal sPattern, ByVal str, ByVal sReplace)
- Dim regEx ' 建立变量。
- Set regEx = CreateObject("VBScript.RegExp") ' 建立正则表达式。
- regEx.Pattern = sPattern ' 设置模式。
- regEx.IgnoreCase = True ' 设置是否区分字符大小写。
- regEx.Global = True ' 设置全局可用性。
- regEx.MultiLine = True ' 多行匹配模式
- regEx_replace = regEx.Replace(str, sReplace) ' 作替换。
- Set regEx = Nothing
- End Function
-
-
- ' 浏览文件夹
- Function BrowseForFolder(ByVal strTips)
- Dim objFolder
- Set objFolder = CreateObject("Shell.Application").BrowseForFolder (&H0, strTips, &H0010 + &H0001)
- If (Not objFolder Is Nothing) Then BrowseForFolder = objFolder.Self.Path 'objFolder.Items().Item().Path
- End Function
-
-
- ' 获取文件夹所有文件夹、文件列表(数组)
- Function ScanFolder(ByVal strPath)
- Dim arr()
- ReDim Preserve arr(0)
- Call SCAN_FOLDER(arr, strPath)
- ReDim Preserve arr(UBound(arr) - 1)
- ScanFolder = arr
- End Function
- Function SCAN_FOLDER(ByRef arr, ByVal folderSpec)
- On Error Resume Next
- Dim fso, objItems, objFile, objFolder
- Set fso = CreateObject("Scripting.FileSystemObject")
- Set objItems = fso.GetFolder(folderSpec)
- If Right(folderSpec, 1) <> "\" Then folderSpec = folderSpec & "\"
- If (Not fso.FolderExists(folderSpec)) Then Exit Function
- For Each objFile In objItems.Files
- arr(UBound(arr)) = objFile.Path
- ReDim Preserve arr(UBound(arr) + 1)
- Next
- For Each objFolder In objItems.subfolders
- Call SCAN_FOLDER(arr, objFolder.Path)
- Next
- arr(UBound(arr)) = folderSpec
- ReDim Preserve arr(UBound(arr) + 1)
- End Function
复制代码
作者: g99 时间: 2014-12-28 22:16
回复 16# yu2n
运行流畅,速度极快!
欢迎光临 批处理之家 (http://bbs.bathome.net/) |
Powered by Discuz! 7.2 |