Computing GC content and per-chromosome length for the sheep reference genome (ARS-UI_Ramb_v2.0) in Python


 

001. Method 1

[email protected]:/home/test# ls
a.fasta  test.py
[email protected]:/home/test# head -n 5 a.fasta              ## reference genome file
>NC_056054.1 Ovis aries strain OAR_USU_Benz2616 breed Rambouillet chromosome 1, ARS-UI_Ramb_v2.0, whole genome shotgun sequence
CCTGAGATAACACATTTCTGACTTCTGCCAATTTTGCTGAGAAGCCCAAGGACTGTTGAAAATCAAGAAACACCCAAGAG
CAGCCCGGGCCGTTTATGTCTACATACGTGAGCAGGCTCAATGGAGCATGGATAGTGTCCCTCTGGGCATGGCTGCTGGG
TCAGTCCTCACCTCCCGCCCAGGGTCTCTCCTGAGCCACCTTCCTCccccagggcagagggagaaCACCAGGACCCACAC
TGAAGCCTCTCATTGGTGTGACCCTCAGGAGGCATGTCTGGTCTGGGGTTAGACAGAGCCTGTATCAGAGGGGCTGAGAG
[email protected]:/home/test# cat test.py                     ## test script
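#!/usr/bin/python

# Collect the sequence lines of each FASTA record, keyed by the first header field.
dict1 = dict()
key = None

with open("a.fasta", "r") as in_file:
    for i in in_file:
        i = i.strip()
        if not i:                          # skip blank lines
            continue
        if i[0] == ">":
            key = i.split(" ")[0]          # the key keeps the leading ">"
            dict1[key] = []
        elif key is not None:
            dict1[key].append(i.upper())   # upper-case soft-masked (lower-case) bases

# Tab-separated header, then one row per sequence.
print("chr" + "\ta" + "\tt" + "\tc" + "\tg" + "\tgc_ratio" + "\tlength_chr")
for i, j in dict1.items():
    j = "".join(j)
    a = j.count("A")
    t = j.count("T")
    c = j.count("C")
    g = j.count("G")
    n = j.count("N")
    gc_ratio = (c + g) / (len(j) - n)      # GC fraction over non-N bases
    n_ratio = n / len(j)                   # fraction of N bases (computed but not printed)
    print("{0}\t{1}\t{2}\t{3}\t{4}\t{5}\t{6}".format(i, a, t, c, g, gc_ratio, len(j)))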
[email protected]:/home/test# python test.py               ## program output
chr     a       t       c       g       gc_ratio        length_chr
>NC_056054.1    82403250        82167753        57066962        56974237        0.40931875266539836     278617202
>NC_056055.1    73870206        73664071        51359104        51305177        0.4103312258098626      250202058
>NC_056056.1    65030736        65146643        47958587        47951134        0.42421580443997026     226089100
>NC_056057.1    36042375        36096251        24711270        24727203        0.4066429731145337      121578099
>NC_056058.1    31324976        31390024        22730579        22772709        0.4204768791019869      108220788
>NC_056059.1    35575151        35517658        23688161        23687727        0.3999021614967201      118469697
>NC_056060.1    29406577        29709861        21031276        21124704        0.4162631922148832      101274418
>NC_056061.1    27346922        27467584        18457301        18520064        0.40283921219995616     91792871
>NC_056062.1    28059178        28042984        19541289        19535207        0.4105594344480041      95179658
>NC_056063.1    25741738        25783717        17454908        17477108        0.40403698599974086     86459471
>NC_056064.1    16751668        16822146        14486707        14485976        0.4632183158075184      62547497
>NC_056065.1    22959990        23016479        17190305        17236381        0.42817580976766395     80403655
>NC_056066.1    23370039        23543434        18267514        18329848        0.43823489490914563     83511835
>NC_056067.1    17929002        18169794        15175937        15240424        0.45728466069771134     66516657
>NC_056068.1    23961907        23980053        17269787        17324890        0.41914328300049347     82538637
>NC_056069.1    21123795        21200255        14793713        14779101        0.41132272472969056     71897364
>NC_056070.1    20984288        21100700        15531825        15549410        0.42480305427273457     73167223
>NC_056071.1    19199396        19362319        14687520        14734225        0.432777987469305       67984460
>NC_056072.1    16963178        17145902        13196413        13255057        0.4367772419504116      60561550
>NC_056073.1    14560862        14480238        11216357        11192760        0.43554951381448986     51451717
>NC_056074.1    13094151        13171511        10608847        10638185        0.4471864297991606      47514194
>NC_056075.1    14656953        14747140        11051402        11054996        0.4291630223443221      51512491
>NC_056076.1    18129864        18107164        13099588        13103028        0.4196471075331563      62440644
>NC_056077.1    11202652        11267006        10074078        10086500        0.47291734439377725     42630236
>NC_056078.1    12899395        12904427        9529300         9530632         0.42484032878746614     44863754
>NC_056079.1    13069877        13124355        9405991         9451636         0.4185760014919695      45052359
>NC_056080.1    42505821        42516400        29047656        29098348        0.4061376328441594      143171725
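
The script above joins each chromosome into one string before counting, so a whole chromosome is held in memory at once. As a minimal sketch of an alternative (not the method used above), the same counts can be accumulated line by line with `collections.Counter`. The function names `fasta_base_counts`, `gc_ratio` and `report` are hypothetical, and this version drops the leading ">" from the sequence name, which the script above keeps.

```python
from collections import Counter


def fasta_base_counts(lines):
    """Yield (name, Counter of bases) for each FASTA record, one line at a time."""
    name, counts = None, Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, counts
            # Keep only the first header field, without the leading ">".
            name, counts = line[1:].split(" ")[0], Counter()
        elif name is not None:
            counts.update(line.upper())  # count soft-masked bases as upper case
    if name is not None:
        yield name, counts


def gc_ratio(counts):
    """GC fraction over non-N bases: (C + G) / (length - N)."""
    called = sum(counts.values()) - counts["N"]
    return (counts["C"] + counts["G"]) / called if called else 0.0


def report(path):
    """Print the same tab-separated table as the script above."""
    print("chr", "a", "t", "c", "g", "gc_ratio", "length_chr", sep="\t")
    with open(path) as handle:
        for name, c in fasta_base_counts(handle):
            print(name, c["A"], c["T"], c["C"], c["G"],
                  gc_ratio(c), sum(c.values()), sep="\t")
```

Calling `report("a.fasta")` would print one row per sequence. Only the per-base counters are kept in memory, so peak memory no longer grows with chromosome length.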

 

Reference: https://www.jianshu.com/p/a7b20c2af042

 

Original article by ItWorker. If you repost it, please credit the source: https://blog.ytso.com/281038.html
