Splitting a string then splitting by character number
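I have a file with many lines of information, some of which contain a set of data pairs. I want to extract the depth and temperature pairs. These pairs start after character 64, and each pair takes up 17 characters.
What I'm currently doing is cutting the string at character 64 (63 in Python's zero-based counting) and then splitting the string every 17 characters.
This works fine when there is whitespace between the pairs, but some pairs have none because the depth is large.
Here are two examples: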
' 1.9901 954.01'
' 1.43011675.01'
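The temp occupies the first 10 characters and the depth the next 7. So what I'd like to do is split the line so that I can extract all the values individually and then pair them up.
However, I'm having trouble creating splits in increments of 7 or 10. Also, I'm not sure what happens when Python converts a string into a list, and whether the character lengths are preserved.
Here is my working code: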
import os, re
import string
with open('TE_feb_2014.pos', 'r') as file:
Here is a sample data line (one without the problem described above):
00087501 297017Q990066614201402251006TE 42550TEMP01 18D 2.01 -1.2801 50.01 -1.1601 99.01 -0.5901 148.01 -0.8001 197.01 -1.1001 245.01 -1.7501 295.01 -1.7701 301.01 -1.7801 343.01 -1.7301 392.01 -1.6701 441.01 -1.5901 489.01 -1.4501 538.01 -1.1401 587.01 -0.7201 635.01 -0.3201 684.01 0.3501 731.01 0.6201 733.01 0.6201
It looks like you want something like this:
>>> import re
>>> s = "00087501 297017Q990066614201402251006TE 42550TEMP01 18D 2.01 -1.2801 50.01 -1.1601 99.01 -0.5901 148.01 -0.8001 197.01 -1.1001 245.01 -1.7501 295.01 -1.7701 301.01 -1.7801 343.01 -1.7301 392.01 -1.6701 441.01 -1.5901 489.01 -1.4501 538.01 -1.1401 587.01 -0.7201 635.01 -0.3201 684.01 0.3501 731.01 0.6201 733.01 0.6201"
>>> m = re.sub(r'^.{64}', r'', s)  # Remove the first 64 characters from the input string.
>>> re.findall(r'.{1,17}', m)  # Find all matches of at most 17 characters and at least 1 character.
[' 2.01 -1.2801 ', ' 50.01 -1.1601 ', ' 99.01 -0.5901 ', '148.01 -0.8001 ', '197.01 -1.1001 ', '245.01 -1.7501 ', '295.01 -1.7701 ', '301.01 -1.7801 ', '343.01 -1.7301 ', '392.01 -1.6701 ', '441.01 -1.5901 ', '489.01 -1.4501 ', '538.01 -1.1401 ', '587.01 -0.7201 ', '635.01 -0.3201 ', '684.01 0.3501 ', '731.01 0.6201 ', '733.01 0.6201']
>>> for i in re.findall(r'.{1,17}', m):
...     print(i)
...
2.01 -1.2801
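Chunking into 17-character pieces still leaves each pair unsplit, so a pair with no whitespace between the two values (such as '1.43011675.01') cannot be separated with a whitespace split. Since the columns are fixed width, plain string slicing may be enough. Below is a minimal sketch, assuming the layout described in the question (a 64-character header, then repeated fields of 10 characters for temp followed by 7 for depth). The function name `parse_pairs`, its parameters, and the test line are hypothetical, and the field order and widths should be adjusted if the file actually differs.

```python
def parse_pairs(line, header=64, temp_width=10, depth_width=7):
    """Split the fixed-width tail of a line into (temp, depth) float pairs.

    Assumes every pair occupies temp_width + depth_width characters,
    as described in the question; these widths are not verified
    against the real file.
    """
    pair_width = temp_width + depth_width
    data = line[header:]  # drop the header columns
    pairs = []
    for start in range(0, len(data), pair_width):
        chunk = data[start:start + pair_width]
        if not chunk.strip():
            continue  # skip chunks that are only padding
        temp = float(chunk[:temp_width])   # float() ignores surrounding spaces
        depth = float(chunk[temp_width:])
        pairs.append((temp, depth))
    return pairs


# Hypothetical test line: 64 filler characters followed by two 17-character
# pairs, the second of which has no space between temp and depth.
line = "x" * 64 + "    1.9901 954.01" + "    1.43011675.01"
print(parse_pairs(line))  # [(1.9901, 954.01), (1.4301, 1675.01)]
```

Slicing a string never changes its characters, so the spaces inside each field are preserved until `float()` discards them. If a line's trailing whitespace has been stripped so that the last chunk loses part of a field, `float()` will raise a `ValueError`, which is probably the safest way to notice a malformed line.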
Original article by ItWorker. If you repost it, please credit the source: https://blog.ytso.com/267953.html