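Title: [Text Processing] Need to extract runs of consecutive numbers from text
Author: ai8866 Time: 2020-6-16 15:08 Title: Need to extract runs of consecutive numbers from text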
00001
00002
00003
00004
00008
00010
00013
00014
00015
00016
00021
00022
00023
00026
00030
00031
00035
00036
00049
00060
00061
00062
00065
Extract the runs of 4 consecutive numbers.
After extraction, the text should look like this:
00001
00002
00003
00004
00013
00014
00015
00016
00021
00022
00023
00024
Author: went Time: 2020-6-16 16:38
Save the data as 1.txt.
Read each line with for /f, then check the lines one after another:
@echo off
setlocal enabledelayedexpansion
rem The leading 1 keeps set /a from parsing values such as 00008 as invalid octal.
rem When a line follows the previous value the divisor is 0, set /a fails,
rem and the fallback branch stores the line in the next slot.
cd.>"out.txt"
(
    for /f %%i in (1.txt) do (
        if "!v1!"=="" (
            set "v1=%%i"
        ) else (
            if "!v2!"=="" (
                set /a "1/(1!v1!+1-1%%i)" >nul 2>nul && ( set "v1=%%i" ) || set "v2=%%i"
            ) else (
                if "!v3!"=="" (
                    set /a "1/(1!v2!+1-1%%i)" >nul 2>nul && ( set "v1=%%i" & set "v2=" ) || set "v3=%%i"
                ) else (
                    if "!v4!"=="" (
                        set /a "1/(1!v3!+1-1%%i)" >nul 2>nul && ( set "v1=%%i" & set "v2=" & set "v3=" ) || set "v4=%%i"
                    )
                )
            )
        )
        if not "!v4!"=="" (
            echo !v1!
            echo !v2!
            echo !v3!
            echo !v4!
            echo.
            set "v1=" & set "v2=" & set "v3=" & set "v4="
        )
    )
)>"out.txt"
type "out.txt"
pause&exit
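The test `1/(1!v1!+1-1%%i)` in the script above packs two tricks into one expression. Prefixing each value with `1` keeps `set /a` from reading a zero-padded value as octal (where `00008` is not a valid number), and the division fails exactly when the current line is the previous value plus one. A minimal Python sketch of the same test follows; `is_next` and `parses_as_octal` are hypothetical helper names, not part of the original script.

```python
def is_next(prev: str, cur: str) -> bool:
    """Mirror the batch test 1/(1<prev>+1-1<cur>).

    A ZeroDivisionError means cur == prev + 1, i.e. the lines are consecutive.
    """
    try:
        1 / (int("1" + prev) + 1 - int("1" + cur))
    except ZeroDivisionError:
        return True
    return False


def parses_as_octal(text: str) -> bool:
    """Report whether text is a valid base-8 number, as set /a would require
    for a value with a leading zero."""
    try:
        int(text, 8)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(is_next("00007", "00008"))   # consecutive lines
    print(is_next("00008", "00010"))   # gap of two
    print(parses_as_octal("00008"))    # 8 is not an octal digit
```

The prefix trick assumes every line has the same width: with unpadded input, `is_next("9", "10")` compares 19 with 110 and reports no match.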
Author: ai8866 Time: 2020-6-16 17:02
Thank you very much, expert!
Author: WHY Time: 2020-6-16 21:56
Last edited by WHY on 2020-6-17 01:03
@echo off
setlocal enabledelayedexpansion
rem Slide a window of four lines, s1 to s4, over the file.
rem The window is printed when its four values are consecutive.
for /f %%i in (a.txt) do (
    set "s1=!s2!"
    set "s2=!s3!"
    set "s3=!s4!"
    set "s4=%%i"
    if defined s1 (
        set /a n1=1!s1! + 3, n2=1!s2! + 2, n3=1!s3! + 1, n4=1!s4!
        if "!n1!" == "!n2!" if "!n2!" == "!n3!" if "!n3!" == "!n4!" (
            echo;!s1!
            echo;!s2!
            echo;!s3!
            echo;!s4!
            Rem endlocal & setlocal enabledelayedexpansion
        )
    )
)
pause
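The script above keeps the last four lines in `s1` to `s4` and prints them whenever they form a run. As a rough Python sketch of that sliding-window idea (the name `sliding_runs` and its `length` parameter are illustrative assumptions, not taken from the post):

```python
from collections import deque


def sliding_runs(lines, length=4):
    """Return every window of `length` adjacent lines whose values rise by 1."""
    window = deque(maxlen=length)  # oldest line drops out automatically
    found = []
    for line in lines:
        window.append(line)
        if len(window) == length:
            nums = [int(s) for s in window]
            if all(b - a == 1 for a, b in zip(nums, nums[1:])):
                found.append(list(window))
    return found


if __name__ == "__main__":
    sample = ["00001", "00002", "00003", "00004", "00005", "00008"]
    for run in sliding_runs(sample):
        print(run)
```

Because the window advances one line at a time, a run longer than four lines is reported more than once, as overlapping windows.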
Author: flashercs Time: 2020-6-16 23:12
Last edited by flashercs on 2020-6-17 06:33
@echo off
REM Find arithmetic progressions with step=1 and length arraylen
setlocal enabledelayedexpansion
set in=series.txt
set out=ArithmetricProgression.txt
set arraylen=4
set step=1
if %arraylen% lss 1 (
    echo The sequence length must be an integer greater than 0
    goto end
)
REM init: clear array
for /f "delims==" %%A in ('2^>nul set arr[') do set %%A=
set /a i=begin=0,end=begin+arraylen-1
(
    for /f "delims=" %%A in (%in%) do (
        if !i! gtr %begin% (
            set /a p=i-1
            for %%P in (!p!) do set /a sub=1%%A-1!arr[%%P]!
            if !sub! neq %step% (
                set /a i=begin
            )
        )
        set arr[!i!]=%%A
        if !i! equ %end% (
            for /l %%B in (%begin%,1,%end%) do echo !arr[%%B]!
            echo.
            set /a i=begin
        ) else (
            set /a i+=1
        )
    )
)

:end
endlocal
pause
exit /b
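The script above fills an array slot by slot, restarts from slot 0 when the difference to the previous line is not `step`, and prints the array once it holds `arraylen` entries. A small Python sketch of that buffering scheme, under the assumption that the input is a list of numeric strings (`fixed_runs`, `length` and `step` are illustrative names):

```python
def fixed_runs(lines, length=4, step=1):
    """Collect non-overlapping runs of exactly `length` lines whose
    neighbouring values differ by `step`."""
    if length < 1:
        raise ValueError("length must be an integer greater than 0")
    buf = []
    found = []
    for line in lines:
        # A broken progression discards the buffer; the current line
        # then becomes the first element of a new candidate run.
        if buf and int(line) - int(buf[-1]) != step:
            buf = []
        buf.append(line)
        if len(buf) == length:
            found.append(buf)
            buf = []
    return found


if __name__ == "__main__":
    sample = """00001 00002 00003 00004 00008 00010 00013 00014 00015 00016
    00021 00022 00023 00026 00030 00031 00035 00036 00049 00060 00061 00062
    00065""".split()
    for run in fixed_runs(sample):
        print(" ".join(run))
```

Unlike a sliding window, the buffer is emptied after each hit, so a longer run is cut into consecutive chunks and never reported twice. A negative `step` selects descending runs.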
Author: went Time: 2020-6-16 23:44
Everyone can try this case: the three scripts all produce different output.
Insert 00005.
It depends on how the OP wants the runs to be kept:
00001
00002
00003
00004
00005
00008
00010
00013
00014
00015
00016
00021
00022
00023
00026
00030
00031
00035
00036
00049
00060
00061
00062
00065
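With 00005 inserted, the first run has five members, so "a run of 4" is no longer unambiguous. The following Python sketch shows two possible policies on that run: cutting it into non-overlapping chunks of four, or reporting every window of four. The helper `maximal_runs` is a hypothetical name, and which policy the OP wants is not stated in the thread.

```python
def maximal_runs(values, step=1):
    """Split a list of integers into maximal runs rising by `step`."""
    runs = []
    for v in values:
        if runs and v - runs[-1][-1] == step:
            runs[-1].append(v)   # extends the current run
        else:
            runs.append([v])     # starts a new run
    return runs


values = [1, 2, 3, 4, 5, 8, 10, 13, 14, 15, 16]
long_run = maximal_runs(values)[0]  # the five-member run 1..5

# Policy 1: non-overlapping chunks of four; the leftover 5 is dropped.
chunks = [long_run[i:i + 4] for i in range(0, len(long_run) - 3, 4)]

# Policy 2: every window of four; 2..5 is reported as well.
windows = [long_run[i:i + 4] for i in range(len(long_run) - 3)]

if __name__ == "__main__":
    print("chunks: ", chunks)
    print("windows:", windows)
```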
Author: WHY Time: 2020-6-17 01:13
Reply to 6# went
Personally, I think it is enough to handle the sample in the top post correctly.
Author: flashercs Time: 2020-6-17 06:45
Reply to 6# went
I had overlooked one case; it is fixed now. The sequence length can be changed freely, and the step can be negative.
Author: ivor Time: 2020-6-17 17:32
#!/usr/bin/env python3
# coding: utf-8

# Auto-detect whether the step is negative or positive.

with open('series.txt', 'r') as f:
    arr = [line.strip() for line in f if line.strip()]
result = list()
_next = None
# Sign of the first difference: +1 for ascending input, -1 for descending.
nega_or_posi = 1 if int(arr[1]) > int(arr[0]) else -1
for i in range(len(arr)):
    # int() already accepts leading zeros, so no lstrip('0') is needed.
    if i != len(arr) - 1 and int(arr[i]) + nega_or_posi == int(arr[i + 1]):
        result.append(arr[i])
        _next = arr[i + 1]
    elif isinstance(_next, str):
        # The run just ended: append its last member.
        result.append(_next)
        _next = None

print(result)
Author: ivor Time: 2020-6-17 19:13
Using a third-party library:
import more_itertools as mit

# Strip only whitespace: stripping '0' would also eat trailing zeros (00010 -> 1).
with open('num', 'r') as f:
    iterable = [int(line.strip()) for line in f if line.strip()]
# Keep groups longer than one element, zero-padded back to 5 digits.
result = [[str(i).zfill(5) for i in group]
          for group in map(list, mit.consecutive_groups(iterable))
          if len(group) > 1]
print(result)
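If installing a third-party package is not an option, similar grouping can be sketched with the standard library alone. The version below uses the well-known `itertools.groupby` idiom of keying each value by its distance from its index; it is a stand-in written for illustration, not the implementation used by `more_itertools`, and the function name `consecutive_groups` is reused here only for readability.

```python
from itertools import groupby


def consecutive_groups(values):
    """Group integers into runs of consecutive values.

    Within a run, value - index stays constant, so that difference
    serves as the grouping key.
    """
    groups = []
    for _, grp in groupby(enumerate(values), key=lambda pair: pair[1] - pair[0]):
        groups.append([value for _, value in grp])
    return groups


if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 8, 10, 13, 14]
    runs = [g for g in consecutive_groups(numbers) if len(g) > 1]
    print([[str(n).zfill(5) for n in g] for g in runs])
```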
Welcome to 批处理之家 (http://bbs.bathome.net/)