001. Method 1
[email protected]:/home/test# ls
a.fasta  test.py
[email protected]:/home/test# head -n 5 a.fasta    ## reference genome file
>NC_056054.1 Ovis aries strain OAR_USU_Benz2616 breed Rambouillet chromosome 1, ARS-UI_Ramb_v2.0, whole genome shotgun sequence
CCTGAGATAACACATTTCTGACTTCTGCCAATTTTGCTGAGAAGCCCAAGGACTGTTGAAAATCAAGAAACACCCAAGAG
CAGCCCGGGCCGTTTATGTCTACATACGTGAGCAGGCTCAATGGAGCATGGATAGTGTCCCTCTGGGCATGGCTGCTGGG
TCAGTCCTCACCTCCCGCCCAGGGTCTCTCCTGAGCCACCTTCCTCccccagggcagagggagaaCACCAGGACCCACAC
TGAAGCCTCTCATTGGTGTGACCCTCAGGAGGCATGTCTGGTCTGGGGTTAGACAGAGCCTGTATCAGAGGGGCTGAGAG
[email protected]:/home/test# cat test.py    ## test program
#!/usr/bin/python
# Count A/T/C/G per sequence in a FASTA file and report the GC ratio.
# The float ratios in the output below imply "/" is true division (Python 3).

dict1 = dict()
with open("a.fasta", "r") as in_file:
    for i in in_file:
        i = i.strip()
        if not i:                      # skip blank lines (i[0] would fail on them)
            continue
        if i[0] == ">":
            key = i.split(" ")[0]      # header up to the first space; ">" is kept
            dict1[key] = []
        else:
            dict1[key].append(i.upper())   # soft-masked lowercase bases are counted too

print("chr" + "\ta" + "\tt" + "\tc" + "\tg" + "\tgc_ratio" + "\tlength_chr")
for i, j in dict1.items():
    j = "".join(j)
    a = j.count("A")
    t = j.count("T")
    c = j.count("C")
    g = j.count("G")
    n = j.count("N")
    gc_ratio = (c + g) / (len(j) - n)      # GC fraction over non-N bases
    n_ratio = n / len(j)                   # computed but not printed
    print("{0}\t{1}\t{2}\t{3}\t{4}\t{5}\t{6}".format(i, a, t, c, g, gc_ratio, len(j)))
[email protected]:/home/test# python test.py    ## program output
chr	a	t	c	g	gc_ratio	length_chr
>NC_056054.1	82403250	82167753	57066962	56974237	0.40931875266539836	278617202
>NC_056055.1	73870206	73664071	51359104	51305177	0.4103312258098626	250202058
>NC_056056.1	65030736	65146643	47958587	47951134	0.42421580443997026	226089100
>NC_056057.1	36042375	36096251	24711270	24727203	0.4066429731145337	121578099
>NC_056058.1	31324976	31390024	22730579	22772709	0.4204768791019869	108220788
>NC_056059.1	35575151	35517658	23688161	23687727	0.3999021614967201	118469697
>NC_056060.1	29406577	29709861	21031276	21124704	0.4162631922148832	101274418
>NC_056061.1	27346922	27467584	18457301	18520064	0.40283921219995616	91792871
>NC_056062.1	28059178	28042984	19541289	19535207	0.4105594344480041	95179658
>NC_056063.1	25741738	25783717	17454908	17477108	0.40403698599974086	86459471
>NC_056064.1	16751668	16822146	14486707	14485976	0.4632183158075184	62547497
>NC_056065.1	22959990	23016479	17190305	17236381	0.42817580976766395	80403655
>NC_056066.1	23370039	23543434	18267514	18329848	0.43823489490914563	83511835
>NC_056067.1	17929002	18169794	15175937	15240424	0.45728466069771134	66516657
>NC_056068.1	23961907	23980053	17269787	17324890	0.41914328300049347	82538637
>NC_056069.1	21123795	21200255	14793713	14779101	0.41132272472969056	71897364
>NC_056070.1	20984288	21100700	15531825	15549410	0.42480305427273457	73167223
>NC_056071.1	19199396	19362319	14687520	14734225	0.432777987469305	67984460
>NC_056072.1	16963178	17145902	13196413	13255057	0.4367772419504116	60561550
>NC_056073.1	14560862	14480238	11216357	11192760	0.43554951381448986	51451717
>NC_056074.1	13094151	13171511	10608847	10638185	0.4471864297991606	47514194
>NC_056075.1	14656953	14747140	11051402	11054996	0.4291630223443221	51512491
>NC_056076.1	18129864	18107164	13099588	13103028	0.4196471075331563	62440644
>NC_056077.1	11202652	11267006	10074078	10086500	0.47291734439377725	42630236
>NC_056078.1	12899395	12904427	9529300	9530632	0.42484032878746614	44863754
>NC_056079.1	13069877	13124355	9405991	9451636	0.4185760014919695	45052359
>NC_056080.1	42505821	42516400	29047656	29098348	0.4061376328441594	143171725
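The script above joins every chromosome into one large string before counting. A possible variant, shown here only as a sketch, accumulates the counts line by line with collections.Counter, so no chromosome has to be held in memory as a single string. The function names fasta_base_counts and gc_ratio and the tiny demo input are hypothetical and not part of the original post. Character-by-character counting may well be slower than str.count on a full genome, so this trades speed for memory.

```python
from collections import Counter


def fasta_base_counts(lines):
    """Return {header: Counter of upper-cased characters} for FASTA lines.

    Counts are accumulated per line, so sequences are never joined.
    """
    counts = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:                      # ignore blank lines
            continue
        if line.startswith(">"):
            current = line.split(" ")[0]  # same key rule as test.py (">" kept)
            counts[current] = Counter()
        elif current is not None:
            counts[current].update(line.upper())
    return counts


def gc_ratio(counter):
    """GC fraction over non-N characters, i.e. (c + g) / (length - n)."""
    total = sum(counter.values())
    non_n = total - counter["N"]
    # Assumption: return NaN instead of raising when a sequence is all N.
    return (counter["C"] + counter["G"]) / non_n if non_n else float("nan")


# Tiny made-up FASTA (not real data), used only to exercise the functions.
demo = [
    ">seq1 hypothetical example",
    "ACGTN",
    "ggcc",
    ">seq2",
    "ATAT",
]
demo_counts = fasta_base_counts(demo)

print("chr", "a", "t", "c", "g", "gc_ratio", "length_chr", sep="\t")
for name, counter in demo_counts.items():
    print(name, counter["A"], counter["T"], counter["C"], counter["G"],
          gc_ratio(counter), sum(counter.values()), sep="\t")
```

To run it on the real file, pass an open file handle instead of the demo list, for example fasta_base_counts(open("a.fasta")).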
Reference: https://www.jianshu.com/p/a7b20c2af042
Original article by ItWorker. If you repost it, please credit the source: https://blog.ytso.com/281038.html