  • Extracting or filtering out multiple sequences from a FASTA file

    A quick Google search turned up few ready-made tools for this.

    Writing your own code works too, but it will not be fast, and rewriting it every time is tedious.
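
    For reference, a hand-written filter can be fairly short. The following is a minimal Python sketch, not QIIME's implementation. The function names `parse_fasta` and `filter_records` are hypothetical, and the sketch assumes the sequence ID is the first whitespace-delimited token of each header line.

```python
from typing import Iterable, Iterator, Set, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None and line.strip():
            # Sequence data may be wrapped over several lines.
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_records(
    records: Iterable[Tuple[str, str]],
    wanted_ids: Set[str],
    negate: bool = False,
) -> Iterator[Tuple[str, str]]:
    """Keep records whose ID is in wanted_ids; drop them instead if negate is True."""
    for header, seq in records:
        fields = header.split()
        # Assumption: the ID is the first whitespace-delimited token of the header.
        seq_id = fields[0] if fields else ""
        if (seq_id in wanted_ids) != negate:
            yield header, seq


if __name__ == "__main__":
    demo = [">seq1 sample A", "ACGT", "ACGT", ">seq2", "GGCC", ">seq3", "TTAA"]
    for header, seq in filter_records(parse_fasta(demo), {"seq1", "seq3"}):
        print(f">{header}\n{seq}")
```

    Keeping the IDs in a set makes each lookup constant time, so the file is read in a single pass.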

    I happened to notice that QIIME's filter_fasta.py has this feature: it extracts multiple sequences based on a list of names.

    filter_fasta.py -f extract_no_N_200.fasta -o remain.fasta -s out.list
    

      

    [REQUIRED]
    
    -f, --input_fasta_fp
    Path to the input fasta file
    -o, --output_fasta_fp
    The output fasta filepath
    [OPTIONAL]
    
    -m, --otu_map
    An OTU map where sequences ids are those which should be retained.
    -s, --seq_id_fp
    A list of sequence identifiers (or tab-delimited lines with a seq identifier in the first field) which should be retained.
    -b, --biom_fp
    A biom file where otu identifiers should be retained.
    -a, --subject_fasta_fp
    A fasta file where the seq ids should be retained.
    -p, --seq_id_prefix
    Keep seqs where seq_id starts with this prefix.
    --sample_id_fp
    Keep seqs where seq_id starts with a sample id listed in this file. Must be newline delimited and may not contain a header.
    -n, --negate
    Discard passed seq ids rather than keep passed seq ids. [default: False]
    --mapping_fp
    Mapping file path (for use with --valid_states). [default: None]
    --valid_states
    Description of sample ids to retain (for use with --mapping_fp). [default: None]
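
    The `-s` option takes a list of identifiers, one per line, optionally followed by further tab-delimited fields. A rough Python sketch of reading such a file is given here. `read_seq_ids` is a hypothetical helper for illustration only, not part of QIIME, and skipping blank lines is an assumption.

```python
from typing import Iterable, Set


def read_seq_ids(lines: Iterable[str]) -> Set[str]:
    """Collect sequence IDs, taking the first tab-delimited field of each line."""
    ids = set()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            # Assumption: blank lines carry no identifier and are skipped.
            continue
        first_field = line.split("\t")[0].strip()
        if first_field:
            ids.add(first_field)
    return ids


if __name__ == "__main__":
    demo = ["seq1\tsome annotation\n", "\n", "seq2\n"]
    print(sorted(read_seq_ids(demo)))
```

    With a real file, the same function could be called as `read_seq_ids(open("out.list"))`, using the list file name from the command above.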
    

    All 600,000 sequences were processed almost instantly.

  • Original article: https://www.cnblogs.com/leezx/p/8619051.html