Title: [Exercise-009] Sorting very large numbers in batch
Author: pusofalse Time: 2008-8-4 08:26 Title: [Exercise-009] Sorting very large numbers in batch
a.txt contains 20 lines of randomly generated digit sequences, as follows:
- 2928326128601232462131283250710027308938740594716691200992050511576
- 5352129649530193383124730478244772348721985707222557212265817305
- 217141333532296179938475175265792931789219830308392472584606305
- 2371620291160322081050531817416284225477019123161801285941026814244
- 283929972304551060318886921731765136928849135391662294051194618754
- 1809165929787147057932949630411324311737224509104016550662932273
- 27396236084901303873154718299242931819623155304661177528921164510335
- 254221462410491137971033914630292752245114969186002809930190939425
- 1085287492160525651862932475207612387312368408826675135332406418337
- 2567810118246621010283281198810903279355871571118961177731143829148
- 23727111515524141721964179351992331180134926914198081871053303186
- 378579502856625703213542353218420835730692264021219729654278515442
- 30215186011014395001656818458819061824708536511543271701327524725
- 223702764213159156022932717903282522044350522584222768193271431422354
- 3079720530119542370417125702274761144023302102641160114921224469221
- 15642298214000242538839193816839550322381321993212316517861828002
- 13042178002978222022331319116624809338275899045263351248023569
- 11252165681825711849278422768716060438517976169102391532289954712000
- 105084292396529699311371735329685626410510259482788519645152723476
- 29674179062831103792824121564178225289202161443911094228581583531951
Using pure batch, output the sequences in ascending order of numeric value, as follows:
- 13042178002978222022331319116624809338275899045263351248023569
- 217141333532296179938475175265792931789219830308392472584606305
- 1809165929787147057932949630411324311737224509104016550662932273
- 5352129649530193383124730478244772348721985707222557212265817305
- 15642298214000242538839193816839550322381321993212316517861828002
- 23727111515524141721964179351992331180134926914198081871053303186
- 30215186011014395001656818458819061824708536511543271701327524725
- 105084292396529699311371735329685626410510259482788519645152723476
- 254221462410491137971033914630292752245114969186002809930190939425
- 283929972304551060318886921731765136928849135391662294051194618754
- 378579502856625703213542353218420835730692264021219729654278515442
- 1085287492160525651862932475207612387312368408826675135332406418337
- 2371620291160322081050531817416284225477019123161801285941026814244
- 2567810118246621010283281198810903279355871571118961177731143829148
- 2928326128601232462131283250710027308938740594716691200992050511576
- 3079720530119542370417125702274761144023302102641160114921224469221
- 11252165681825711849278422768716060438517976169102391532289954712000
- 27396236084901303873154718299242931819623155304661177528921164510335
- 29674179062831103792824121564178225289202161443911094228581583531951
- 223702764213159156022932717903282522044350522584222768193271431422354
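Every value is far larger than the largest number cmd can compute with.
Requirements: correct output, no temporary files, and code that is efficient and general. Solutions that complete the task earn bonus points depending on the approach.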
----------------------------------------------
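So far no code gets the best of both worlds. For code that is concise and efficient but not general, see the first code in post 2 and the code in post 3.
For code that is general to some extent, see the second code in post 2 and the code in post 6.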
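Author: batman Time: 2008-8-4 14:03
First, the approach:
This exercise really is a hard one. Why? The original poster wants every huge value in the text file sorted.
All of the values are far beyond the largest number cmd can compute with, so the usual comparison methods do not work. That is the first difficulty.
Second, the values are randomly generated and their digit counts vary; a value may even run to several lines or dozens of lines. If we walk through each value character by character
to determine the maximum line length, efficiency becomes an obstacle that this kind of solution cannot get past. Third, the original poster forbids temporary
files, which shuts the door on solutions that use findstr /o to obtain the maximum line length.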
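To make the cost of the character-by-character method concrete, here is a minimal sketch of counting a string's length one character at a time. The variable names s and len and the sample value are assumptions for illustration, not code from this thread.

```batch
@echo off
setlocal enabledelayedexpansion
rem Sample value chosen for illustration only.
set "s=5352129649530193383124730478244772348721985707222557212265817305"
set "len=0"
:count
rem Look at the single character at position len; stop when there is none left.
if not "!s:~%len%,1!"=="" (
    set /a len+=1
    goto count
)
echo Length: %len%
```

Each character costs one pass through the loop, so the work grows with the total number of digits in the file, which is the efficiency concern raised above.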
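To sum up:
For the code to be general it has to find the maximum line length. One way is to count character by character, which has a serious efficiency problem.
The second way is to use findstr /o to get the character offset of every line in one pass and then derive the maximum line length from those offsets. This is much faster than
counting character by character, but findstr /o alone cannot give the length of the last line, because no offset follows it. A line break has to be appended after the last line,
and doing that without modifying the original file requires a temporary file.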
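As a rough sketch of the findstr /o idea, the loop below derives each line's length from the difference between consecutive offsets. It assumes that 1.txt exists and uses CRLF line endings; prev and len are names made up for this illustration.

```batch
@echo off
setlocal enabledelayedexpansion
rem Assumption: 1.txt exists in the current directory and uses CRLF line endings.
set "prev="
for /f "delims=:" %%o in ('findstr /o ".*" 1.txt') do (
    rem %%o is the offset at which the current line starts.
    if defined prev (
        rem Length of the previous line = offset difference minus 2 for CR and LF.
        set /a len=%%o-prev-2
        echo Line starting at offset !prev! has !len! characters
    )
    set "prev=%%o"
)
```

No length is printed for the final line, since nothing follows it to subtract from. That is the limitation described above.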
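When writing code we generally follow four principles: high efficiency, generality, conciseness, and avoiding temporary files where possible. The most important
is efficiency, followed by generality; conciseness and the presence of temporary files are not major considerations. Following that order of priority, I
offer the two solutions below: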
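1. Slightly less general (no problem at all as long as each value fits on one line), efficient, concise, no temporary file:
- @echo off&setlocal enabledelayedexpansion
- rem Build a padding string of 80 # characters.
- for /l %%i in (1,1,80) do set "kong=!kong!#"
- for /f %%i in (1.txt) do (
- rem Append the padding and keep the first 80 characters, then strip the number so only the padding this line needs is left.
- set "str=%%i%kong%"
- set "a=!str:~,80!"
- set "a=!a:%%i=!"
- rem Store padding plus number as a variable name so that the set command can list them in sorted order.
- set "_!a!%%i=a"
- )
- for /f "delims==_" %%i in ('set _') do (
- set "str=%%i"
- echo !str:#=!
- )
- pause>nul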
Of course, raising the value 80 makes it more general.
2. Highly general, average efficiency, creates a temporary file, more complex code:
- @echo off&setlocal enabledelayedexpansion
- set "max=0"&set "a=0"
- for /f %%i in (1.txt) do echo %%i>>2.txt
- echo.>>2.txt
- for /f "tokens=1,2* delims=:" %%i in ('findstr /n /o .* 2.txt') do (
- set /a n+=1,m=n-1
- set "num=%%i"&set "_!n!=%%j"&set "#%%i=%%k"
- if !m! gtr 0 set /a a=_!n!-_!m!-2
- if !max! lss !a! set "max=!a!"
- )
- set /a num-=1
- for /l %%i in (1,1,%max%) do set "kong=!kong!#"
- for /l %%i in (1,1,%num%) do (
- set "str=!#%%i!%kong%"
- set "a=!str:~,%max%!"
- call,set "a=%%a:!#%%i!=%%"
- set ".!a!!#%%i!=a"
- )
- for /f "delims==." %%i in ('set .') do (
- set "str=%%i"
- echo !str:#=!
- )
- del /q 2.txt&pause>nul
[ Last edited by batman on 2008-8-5 21:14 ]
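Author: terse Time: 2008-8-4 16:39
- @echo off&setlocal enabledelayedexpansion
- rem Build a string of 100 zeros for left padding.
- for /l %%i in (1,1,100) do set "var=0!var!"
- for /f %%i in (1.txt) do (
- set str=!var!%%i
- rem Keep the last 100 characters as the sort key. The random suffix keeps duplicate values apart.
- set .!str:~-100! !random!=a
- )
- rem List the keys in sorted order and strip the leading zeros when printing.
- for /f "delims=.= " %%i in ('set .') do for /f "tokens=* delims=0" %%i in ("%%i") do echo %%i
- pause>nul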
Author: pusofalse Time: 2008-8-4 21:52
Only while debugging did I realize how troublesome this exercise really is.
Hats off to the poster above, a real expert.
Author: pusofalse Time: 2008-8-4 22:07
batman, your second code has a bug. Test text:
- 29324200852651210028213071109630551685419682237192661910031596813525985
- 139192824221705323683853099069511460582426579937521145284152777127372
- 3228832607224652155316459166842936030363170611320231631619428405179384047
- 812170830663138941085534183172610653136294485195711016316241388511507
- 461336171716126377142221699522246153038595327243922576267943609636
- 1638925082316833036661013264793101030929370185741649225072170182874123709
- 19104273682033120216783266533081314891292107581812412876152371871916174
- 28765483830845209484709271702280288292290214606465320169291811087021312
- 309619124727589342731614454314507972257492438810339484727971340026983
- 12631365789691379909125415226544733128052344013430802923375228394920
- 1209434542725110215429171429642266631332923718109092547688512906917377
- 9235325441420415502171072042026910159313029658501976719241297361608917193
- 104303563083330908200213037916087115501115715163303019513237753157722853
- 13541756476689943590415015749186092594311926102972518323503012611090
- 32555129948775117222104620611451024122155601418467473071538131259929122
- 2548313857163852897313874201491508320410217331915310293928115978926830836
- 1728916573194511826201557603196381113155825309211851122813285308953794
- 279911732611489993427112229571538210465273573238478681351713760130927475
- 203542497728828465536044342329627237153851428124837111485541856124091
- 11167192753125043631539774133216256184840111392039610101233269561536
- 4903155797025161392428517782198905721614416681144701698415315575419818
- 1622389334882239291124631810189733645149042278127844293232621112717646
- 20215127882431066880816236289679811281382028731393254147717121765097
- 16645326181395012332177468835293330012549029775211691470493332222112797
- 933428889205984801230861004178721772312389185772899226893711797343359
- 16120191173270725617184072248627813152502180710713269551966628181211112497
- 1632311921601282512366484925858327402625626369309771407222363122614443
- 293592703625624960888281127241740451826035343113994824114651110092
- 893354022094424096154953052425998786321972607611409284852914205609188
- 1420612643180582349118041199391970330622603629175101501699075131221324749
[ Last edited by pusofalse on 2008-8-4 22:09 ]
Author: pusofalse Time: 2008-8-4 23:12
I didn't save it and ended up writing it twice. Posting it here so it doesn't get lost again.
- @echo off&setlocal enabledelayedexpansion
- set m=0
- for /f "tokens=1,* delims=:" %%a in ('findstr/o .* 1.txt') do (
- set/a n+=1,l=n-1,y+=1
- set ..!n!=%%a
- set ##!y!=%%b
- if !n! geq 2 (
- call,set/a s=%%..!n!%%-%%..!l!%%-2,line+=1
- call,set "_!s!=%%_!s!%%%%##!line!%% "
- if !s! geq !m! set m=!s!
- )
- )
- for /f "skip=1 delims=:" %%a in ('^(echo !##%y%!^&echo.^)^|findstr/o .*') do set/a final=%%a-3
- call,set "_%final%=%%_!final!%% !##%y%!"
- if %final% geq !m! set m=%final%
- for /l %%a in (1 1 %m%) do (
- if defined _%%a (
- for %%i in (!_%%a!) do set -%%i=faith
- for /f "delims=-=" %%s in ('set -') do (
- echo %%s
- set "-%%s="
- )
- )
- )
- pause>nul
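Author: youxi01 Time: 2008-8-4 23:20
Personally I agree more with the approach in post 3.
This problem was also discussed on the DOS Union forum (dos联盟) long ago.
Back then, in a discussion with 随风, we also concluded that padding is accurate and efficient.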
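Author: batman Time: 2008-8-5 20:24
Originally posted by pusofalse on 2008-8-4 21:52
Only while debugging did I realize how troublesome this exercise really is.
Hats off to the poster above, a real expert.
Thanks for pointing that out; my second code has been fixed. In fact the code in post 3 is not general, as the test below shows:
Test text:
1.txt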
- 29324200852651210028213071109630551685419682237192661910031596813525985
- 13919282422170532368385309906951146058242657993752114528415277712737232
- 28832607224652155316459166842936030363170611320231631619428405179384047
- 81217083066313894108553418317261065313629448519571101631624138851150746
- 13361717161263771422216995222461530385953272439225762679436096361638
- 99999999999999999999999999999999999999999999999999999999999999999999999
- 99999999999999999999999999999999999999999999999999999999999999999999999
- 99999999999999999999999999999999999999999999999999999999999999999999999
- 999999999999999999999999999999999999999999999999999999999
Running the code from post 3 gives:
- 29448519571101631624138851150746133617171612637714222169952224615303859532724392
- 25762679436096361638
- 99999999999999999999999999999999999999999999999999999999999999999999999999999999
- 99999999999999999999
Running the second, general code from my post 2 gives:
- 99999999999999999999999999999999999999999999999999999999999999999999999999999999
- 99999999999999999999999999999999999999999999999999999999999999999999999999999999
- 99999999999999999999999999999999999999999999999999999999999999999999999999999999
- 99999999999999999999999999999
- 29324200852651210028213071109630551685419682237192661910031596813525985139192824
- 22170532368385309906951146058242657993752114528415277712737232288326072246521553
- 16459166842936030363170611320231631619428405179384047812170830663138941085534183
- 17261065313629448519571101631624138851150746133617171612637714222169952224615303
- 85953272439225762679436096361638
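PS: The code in post 3 and my first code work the same way, namely padding on the left (with # in mine, with 0 in post 3). Mine pads to 80 characters (for values that fit on
one line), while post 3 pads to 100 (values longer than 100 digits give wrong results). My code also seems to be slightly more
efficient.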
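A stripped-down sketch of the padding idea may help: once every value is left-padded to the same width, the alphabetical order in which set lists variable names coincides with numeric order. The key_ prefix, the width of 12 and the three sample values are assumptions for illustration; the thread's codes use widths of 80 or 100 and read the values from a file.

```batch
@echo off
setlocal enabledelayedexpansion
rem Assumptions: width 12, prefix key_, and three sample values.
set "pad=000000000000"
for %%n in (98765 1234567 42) do (
    rem Prepend the zeros and keep only the last 12 characters.
    set "s=%pad%%%n"
    set "key_!s:~-12!=1"
)
rem The set command lists variables sorted by name, so the padded keys come out smallest first.
for /f "tokens=2 delims=_=" %%k in ('set key_') do (
    rem Strip the leading zeros again before printing.
    for /f "tokens=* delims=0" %%v in ("%%k") do echo %%v
)
```

The sketch is intended to print the three values in ascending order. As in the thread's codes, any value longer than the chosen width would be truncated and sorted incorrectly, and any variables already in the environment whose names start with key_ would also be listed.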
Author: batman Time: 2008-8-5 20:38
Originally posted by pusofalse on 2008-8-4 23:12
I didn't save it and ended up writing it twice. Posting it here so it doesn't get lost again.
@echo off&setlocal enabledelayedexpansion
set m=0
for /f "tokens=1,* delims=:" %%a in ('findstr/o .* 1.txt') do (
set/a n+=1,l=n-1,y+=1
set ...
Pipes are both cumbersome and slow; wouldn't a temporary file be better?
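Author: terse Time: 2008-8-6 13:22
Originally posted by batman on 2008-8-5 20:24
Thanks for pointing that out; my second code has been fixed. In fact the code in post 3 is not general, as the test below shows:
Test text:
1.txt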
29324200852651210028213071109630551685419682237192661910031596813525985
13919282422170532368385309906 ...
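Yes, as you pointed out, my code in post 3 is not very general, so I changed it to a multi-step algorithm. Tried on a file of a bit over 100 KB, it seems far more efficient, and it still uses no temporary file.
For some reason, your general code from post 2 failed on that 100 KB file on my machine.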
- @echo off&setlocal enabledelayedexpansion
- for /f "skip=1 tokens=1* delims=:" %%i in ('findstr /o ".*" 2.txt') do (
- set/a m+=2,n=%%i-m-t,t=%%i-m
- set str=%%j
- if !n! gtr !d! set/a d=n
- )
- for /f "skip=1 delims=:" %%i in ('^(echo %str%^&echo.^)^|findstr /o ".*"') do set/a m=%%i-3
- if %m% gtr %d% set/a d=m
- for /l %%i in (1,1,%d%) do set "var=$!var!"
- for /f "delims=" %%i in (2.txt) do (
- set str=!var!%%i
- set .!str:~-%d%! $!random! !random! !random!=a
- )
- for /f "delims=.=$" %%i in ('set .') do echo %%i
- pause
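I had left out the $ on the last line; added.
Sigh, I found that my calculation still has problems; still fixing.
Duplicate lines and spaces: it can only handle one of the two for now, until there is a complete solution.
This is how I would like to handle duplicate lines and spaces. It comes down to a single delimiter: if the text itself contains the delimiter, it cannot be handled.
[ Last edited by terse on 2008-8-7 01:44 ]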
Author: huahua0919 Time: 2008-8-6 18:14
After many tests I have not seen set's sorting go wrong:
- @echo off&setlocal enabledelayedexpansion
- for /f %%i in (a.txt) do (
- set .0000000000000000000000000000000000000000000%%i=a
- )
- for /f "tokens=1 delims=.=" %%i in ('set .') do (
- set m=%%i
- set m=!m:~-90!
- set _!m!=b
- )
- echo Ascending order:
- for /f "tokens=1* delims=0" %%i in ('set _') do (
- set /a n+=1
- for /f "delims==" %%a in ("%%j") do (
- set _!n!=%%a
- echo %%a)
- )
- echo Descending order:
- for /l %%i in (%n% -1 1) do (
- call echo %%_%%i%%
- )
- pause
Author: terse Time: 2008-8-6 18:33
Originally posted by huahua0919 on 2008-8-6 18:14
After many tests I have not seen set's sorting go wrong:
@echo off&setlocal enabledelayedexpansion
for /f %%i in (a.txt) do (
set .0000000000000000000000000000000000000000000%%i=a
)
for /f "tokens=1 delims=.=" %%i in ('set .') ...
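Judging by the fact that you keep the last 90 characters and prepend 43 zeros: if a line in your test file is longer than 90 characters, or if the longest and shortest lines differ by more than 43 characters, the result is wrong in theory.
Also, for descending order you could try set _^|sort/r inside the FOR; that should save one FOR.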
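A minimal sketch of that suggestion, under assumed names: the big_ prefix, the width of 12 and the sample values are made up for illustration. The listing produced by set is piped through sort /r inside the for /f command, so no second loop is needed to reverse the order.

```batch
@echo off
setlocal enabledelayedexpansion
rem Assumptions: width 12, prefix big_, and three sample values.
set "pad=000000000000"
for %%n in (98765 1234567 42) do (
    set "s=%pad%%%n"
    set "big_!s:~-12!=1"
)
rem The pipe must be escaped with a caret inside the for /f command string.
for /f "tokens=2 delims=_=" %%k in ('set big_ ^| sort /r') do (
    rem Strip the leading zeros before printing.
    for /f "tokens=* delims=0" %%v in ("%%k") do echo %%v
)
```

This relies on sort being available on the machine, which is not always the case, and the pipe starts an extra process.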
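Author: huahua0919 Time: 2008-8-6 18:38
I'm at an internet cafe, and the system here happens not to have the sort command, so I didn't use it.
As for the issue above, I tested against the existing data. One could also compute the difference between the longest and shortest lines and then choose the number of zeros to add and digits to keep accordingly, which should fix it. But I don't understand the case mentioned above about exceeding 100; why does that happen?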
Author: huahua0919 Time: 2008-8-6 18:43
Actually if can do the comparison, as long as every line of the text is rewritten in a form like this:
- 0002.371620.291160.322081.050531.817416.28422.54770.19123.16180.1285.94102.6814.244
As long as every segment has the same length, if can compare the values, and set is not needed.
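One possible reading of this idea, sketched under assumptions: instead of dotted segments, the two values are left-padded to one common width and compared as quoted strings, which makes if compare them as text rather than as 32-bit numbers. The width of 25 and the names a, b, pa and pb are made up for illustration.

```batch
@echo off
setlocal
rem Two sample values of different lengths, chosen for illustration.
set "a=29283261286012324621"
set "b=5352129649530193383"
rem Assumption: 25 is at least as long as the longest value.
set "pad=0000000000000000000000000"
set "pa=%pad%%a%"
set "pb=%pad%%b%"
set "pa=%pa:~-25%"
set "pb=%pb:~-25%"
rem The quotes force a string comparison; with equal lengths it follows numeric order.
if "%pa%" lss "%pb%" (echo %a% is smaller) else (echo %b% is smaller or equal)
```

Sorting a whole file this way would still need a comparison loop over all lines, so this only shows the comparison step.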