This is the Windows build of html2text (http://www.mbayer.de/html2text/), extracted from http://www.opencats.org/downloads/setupResumeIndexingTools.exe.
This is html2text, version 1.3.2a
Usage:
  html2text -help
  html2text -version
  html2text [ -unparse | -check ] [ -debug-scanner ] [ -debug-parser ] \
     [ -rcfile <file> ] [ -style ( compact | pretty ) ] [ -width <w> ] \
     [ -o <file> ] [ -nobs ] [ -ascii ] [ <input-url> ] ...
Formats HTML document(s) read from <input-url> or STDIN and generates ASCII
text.
  -help          Print this text and exit
  -version       Print program version and copyright notice
  -unparse       Generate HTML instead of ASCII output
  -check         Do syntax checking only
  -debug-scanner Report parsed tokens on STDERR (debugging)
  -debug-parser  Report parser activity on STDERR (debugging)
  -rcfile <file> Read <file> instead of "$HOME/.html2textrc"
  -style compact Create a "compact" output format (default)
  -style pretty  Insert some vertical space for nicer output
  -width <w>     Optimize for screen widths other than 79
  -o <file>      Redirect output into <file>
  -nobs          Do not use backspaces for boldface and underlining
  -ascii         Use plain ASCII for output instead of ISO-8859-1
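Example:
C:\>curl http://www.gnu.org/software/sed/manual/sed.html | html2text -style pretty -nobs | sed !d >sed.txt
html2text ends its output lines with a line feed only, without a carriage return, so the output is piped through sed to convert the line endings.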
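If sed is not at hand, the same line-ending conversion can be sketched in Python. This is only an illustration of the LF to CRLF step, not part of html2text; the function name lf_to_crlf and the script name lf_to_crlf.py are made up for this example.

```python
import re
import sys


def lf_to_crlf(data: bytes) -> bytes:
    """Turn every bare LF into CRLF; existing CRLF pairs are left untouched."""
    # The negative lookbehind skips any LF that already follows a CR.
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


if __name__ == "__main__":
    # Work on raw bytes so Python's own newline translation does not interfere.
    sys.stdout.buffer.write(lf_to_crlf(sys.stdin.buffer.read()))
```

Assuming the script is saved as lf_to_crlf.py, it would take the place of the sed step: curl http://www.gnu.org/software/sed/manual/sed.html | html2text -style pretty -nobs | python lf_to_crlf.py >sed.txt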
http://bcn.bathome.net/s/tool/index.html?key=html2text