  • Learning Python: reading chromosome lengths (part 7: reverse-complementing chromosome sequences with a for loop)

    Reverse-complement the chromosome sequences in the FASTA file genome_test.fa and write the result to the file genome_test_RC.fa.

    genome_test.fa

    >chr1
    ATATATATAT
    >chr2
    ATATATATATCGCGCGCGCG
    >chr3
    ATATATATATCGCGCGCGCGATATATATAT
    >chr4
    ATATATATATCGCGCGCGCGATATATATATCGCGCGCGCG
    >chr5
    ATATATATATCGCGCGCGCGATATATATATCGCGCGCGCGATATATATAT

    Create a new file named Reverse_Complement.py and enter the following Python script.

    Python script

    import sys  # import the sys module

    f_fasta = sys.argv[1]  # get the input file name from the command line
    f = open(f_fasta)  # open the input file
    f_RC = open("genome_test_RC.fa", "w+")  # open the output file
    # read the file line by line
    lines = f.readlines()
    for line in lines:
        line = line.strip()  # strip the trailing newline
        if line.startswith(">"):
            chr_id = line + '_RC'
        else:
            # reverse the line, then complement it; the lower-case
            # placeholders keep a base from being swapped twice
            chr_seq = line[::-1].replace('A','t').replace('T','a').replace('C','g').replace('G','c').upper()
            # output the result
            print(chr_id)
            print(chr_seq)

            f_RC.write(chr_id + '\n')
            f_RC.write(chr_seq + '\n')
    f.close()
    f_RC.close()
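
The script above reverse-complements each sequence line on its own, which is fine for genome_test.fa because every sequence sits on a single line. As a rough sketch only, assuming an input where one sequence may be wrapped over several lines, the same idea could be written with `str.maketrans` and `with` blocks. The names `COMPLEMENT`, `reverse_complement`, `read_fasta` and `write_reverse_complement` are hypothetical helpers and are not part of the original script.

```python
# Translation table mapping each base to its complement (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fasta(path):
    """Yield (header, sequence) pairs, joining sequences split over several lines."""
    header, parts = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(parts)
                header, parts = line, []
            else:
                parts.append(line)
    # emit the last record once the file is exhausted
    if header is not None:
        yield header, "".join(parts)


def write_reverse_complement(in_path, out_path):
    """Write every record of in_path to out_path, reverse-complemented."""
    with open(out_path, "w") as out:
        for header, seq in read_fasta(in_path):
            out.write(header + "_RC\n")
            out.write(reverse_complement(seq) + "\n")
```

Calling `write_reverse_complement(sys.argv[1], "genome_test_RC.fa")` after `import sys` would mirror the command-line behaviour of the original script, except that it does not print each record to the screen. The `with` blocks close both files even if an error occurs partway through.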

    From the cmd command line, pass the file name as an argument to run the Python script above on genome_test.fa.

    E:15_pythonDEBUG>python Reverse_Complement.py genome_test.fa

    Result

    genome_test_RC.fa

    >chr1_RC
    ATATATATAT
    >chr2_RC
    CGCGCGCGCGATATATATAT
    >chr3_RC
    ATATATATATCGCGCGCGCGATATATATAT
    >chr4_RC
    CGCGCGCGCGATATATATATCGCGCGCGCGATATATATAT
    >chr5_RC
    ATATATATATCGCGCGCGCGATATATATATCGCGCGCGCGATATATATAT
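
In this output chr1_RC, chr3_RC and chr5_RC are identical to their inputs. That is expected: those test sequences are their own reverse complements, so only chr2 and chr4 visibly change. A small sanity check, sketched here with a hypothetical `reverse_complement` helper that assumes upper-case A/C/G/T input, confirms the chr2 record and shows that applying the operation twice restores the original sequence.

```python
def reverse_complement(seq):
    """Reverse the sequence and swap A<->T, C<->G (upper-case input assumed)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


original = "ATATATATATCGCGCGCGCG"  # chr2 from genome_test.fa
expected = "CGCGCGCGCGATATATATAT"  # chr2_RC from genome_test_RC.fa

assert reverse_complement(original) == expected
# Applying the operation twice must give back the input.
assert reverse_complement(expected) == original
# chr1 is its own reverse complement, so it looks unchanged in the output.
assert reverse_complement("ATATATATAT") == "ATATATATAT"
```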

  • Original article: https://www.cnblogs.com/caicai2019/p/10799760.html