001.
(base) root@PC1:/home/test# ls ## test data and script
a.fasta test.py
(base) root@PC1:/home/test# cat a.fasta ## test data
>scaffold_1
CCCGGGTAAAACGGGTCTTCAAGAAAACGCTCCTCCGTTAATGCCGGCCGATTCAAATAA
CCTCTGGCAACACCCGCTCCGGCAATGTATAGTTCACCGATACATCCAACAGGCAGCATC
GGCCC
>scaffold_2
CTGTTGCTCCTGTTGCTCCTGTTGATCCCGTTGCACCTGTTGGTCCAGTCGGTCCAATTC
>scaffold_3
TTGATCCAGTGGCTCCGGTTACTCCAGTTGATCCTGTTGCGCCTGTTGCTCCAGTTTCTC
CGGTTGGTCCGGTTGATCCGGTTGCACCTGTTACTCCAGTGGCTCCGGTTACTCCCGTCG
CTGTTGCTCCTGTTGCTCCTGTTGATCCCGTTGCACCTGTTGGTCCAGTCGGTCCAATTC
(base) root@PC1:/home/test# cat test.py ## the script
#!/usr/bin/python
import re

in_file = open("a.fasta", "r")
out_file = open("result.txt", "w")

total_sca = 0      # number of scaffolds (header lines starting with ">")
total_len = 0      # total number of bases
total_len_gc = 0   # total number of G and C bases

for i in in_file:
    i = i.strip()
    if i.startswith(">"):
        total_sca += 1
    else:
        total_len += len(i)
        total_len_gc += len(re.findall("[GCgc]", i))

# write a tab-separated header line and a single line of statistics
print("n_scofflod", "total_len", "total_len_gc", "proportion_gc", file=out_file, sep="\t")
print(total_sca, total_len, total_len_gc, total_len_gc / total_len, file=out_file, sep="\t")

in_file.close()
out_file.close()
(base) root@PC1:/home/test# python test.py ## run the script
(base) root@PC1:/home/test# ls
a.fasta result.txt test.py
(base) root@PC1:/home/test# cat result.txt ## view the statistics
n_scofflod total_len total_len_gc proportion_gc
3 365 203 0.5561643835616439
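The same counting logic can also be wrapped in a reusable function. The following is only a minimal sketch, not the script from this post: the function name `fasta_gc_stats` and the small `demo` sequences are made up for illustration, and `str.count` is used in place of `re.findall`. It also guards against dividing by zero when the input contains no sequence.

```python
import io


def fasta_gc_stats(handle):
    """Return (n_scaffolds, total_len, total_gc) for a FASTA file-like object.

    Hypothetical helper for illustration; it is not part of test.py.
    """
    n_scaffolds = 0
    total_len = 0
    total_gc = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            n_scaffolds += 1  # each header line starts a new scaffold
        else:
            seq = line.upper()  # count lowercase g/c as well
            total_len += len(seq)
            total_gc += seq.count("G") + seq.count("C")
    return n_scaffolds, total_len, total_gc


# Made-up demo input (not the a.fasta data shown above)
demo = io.StringIO(">s1\nGGCC\nAT\n>s2\nacgt\n")
n, length, gc = fasta_gc_stats(demo)
proportion = gc / length if length else 0.0  # avoid ZeroDivisionError on empty input
print(n, length, gc, proportion, sep="\t")
```

To run it on a real file, pass an open file object instead of the `StringIO` demo, for example `with open("a.fasta") as fh: fasta_gc_stats(fh)`; the `with` block closes the file automatically.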
Reference: https://mp.weixin.qq.com/s?__biz=MzIxNzc1Mzk3NQ==&mid=2247491482&idx=1&sn=596fd0f0e7d41757e1e539f3223a8c8c&chksm=97f5af82a08226943da69bca8228480d4b708ca2c89f8008281f140682e8814b43cf49d60762&scene=178&cur_album_id=2403674812188688386#rd
Original article by ItWorker. If you repost it, please credit the source: https://blog.ytso.com/tech/python/279551.html