  • Full and incremental backups of a directory tree in Python

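    Goals:

      1. Accept three parameters: the source directory path, the destination directory path, and the MD5 file.

      2. Run a full backup every Monday and an incremental backup on all other days.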
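    Goal 1 could be met with a small command-line parser. The following is only a sketch using the standard argparse module; the function name parse_args and the argument names src_dir, dst_dir and md5file are assumptions chosen to match the variable names used later in this post.

```python
import argparse


def parse_args(argv=None):
    """Parse the three parameters from goal 1 (names are illustrative)."""
    parser = argparse.ArgumentParser(
        description="Full/incremental directory backup")
    parser.add_argument("src_dir", help="directory to back up")
    parser.add_argument("dst_dir", help="directory that receives the archives")
    parser.add_argument("md5file", help="file that stores the MD5 table")
    # argv=None makes argparse fall back to sys.argv[1:].
    return parser.parse_args(argv)


# Example call with an explicit argument list instead of sys.argv.
args = parse_args(["/a", "/tmp", "/tmp/md5.data"])
print(args.src_dir, args.dst_dir, args.md5file)
```

    Passing an explicit list makes the parser easy to try out interactively; in a real script the call would simply be parse_args().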

    1. Given a path, list every directory and file below it (recursively)

    Method 1: using os.listdir

    The code is as follows:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    
    import os
    import sys
    
    
    def lsdir(folder):
        # Print this directory and its entries, then recurse into subdirectories.
        contents = os.listdir(folder)
        print("%s\n%s\n" % (folder, contents))
        for path in contents:
            full_path = os.path.join(folder, path)
            if os.path.isdir(full_path):
                lsdir(full_path)
    
    
    if __name__ == "__main__":
        lsdir(sys.argv[1])

    • Run the script; the output looks like this:

    [root@localhost python]# python listdir.py /a
    /a
    ['b', 'a.txt']
    
    /a/b
    ['c', 'b.txt']
    
    /a/b/c
    ['c.txt']

    Method 2: using os.walk

    The code is as follows:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    
    import os
    import sys
    
    
    def lsdir(folder):
        # os.walk yields (dirpath, dirnames, filenames) for every directory below folder.
        for path, folders, files in os.walk(folder):
            print("%s\n%s\n" % (path, folders + files))
    
    
    if __name__ == "__main__":
        lsdir(sys.argv[1])

    • Run the script to test it:

    [root@localhost python]# python listdir1.py /a
    /a
    ['b', 'a.txt']
    
    /a/b
    ['c', 'b.txt']
    
    /a/b/c
    ['c.txt']
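    The backup script later in this post only needs the full path of each file, not the per-directory listing. A minimal sketch of that idea, assuming a hypothetical helper named iter_files and a throwaway directory tree that mirrors the /a example:

```python
import os
import tempfile


def iter_files(folder):
    """Yield the full path of every file below folder (recursively)."""
    for path, folders, files in os.walk(folder):
        for fname in files:
            yield os.path.join(path, fname)


# Build a temporary tree shaped like the /a example: a.txt, b/b.txt, b/c/c.txt.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "b", "c"))
for rel in ("a.txt", os.path.join("b", "b.txt"), os.path.join("b", "c", "c.txt")):
    with open(os.path.join(root, rel), "w") as fobj:
        fobj.write("demo\n")

# Paths are printed relative to the temporary root and sorted for stable output.
found = sorted(os.path.relpath(p, root) for p in iter_files(root))
print(found)
```

    A generator keeps memory use flat on large trees, because paths are produced one at a time instead of being collected into a list first.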

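    2. Computing a file's MD5 checksum (read 4 KB at a time until the whole file has been read, then return the hexadecimal MD5 digest)

    The code is as follows: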

    [root@localhost python]# cat md5.py

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    
    import hashlib
    import sys
    
    
    def md5(fname):
        m = hashlib.md5()
        # Open in binary mode so the bytes on disk are hashed unchanged.
        with open(fname, 'rb') as fobj:
            while True:
                data = fobj.read(4096)  # read 4 KB per iteration
                if not data:
                    break
                m.update(data)
        return m.hexdigest()
    
    
    if __name__ == "__main__":
        print(md5(sys.argv[1]))

    • Run the script to test it:

    [root@localhost python]# python md5.py a.txt
    c33da92372e700f98b006dfa5325cf0d
    [root@localhost python]# md5sum a.txt
    c33da92372e700f98b006dfa5325cf0d  a.txt

    * Note: the MD5 value computed by the Python script is identical to the one produced by the md5sum command that ships with Linux.
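    The same kind of cross-check can be done without md5sum. The sketch below hashes a temporary file in 4 KB chunks and compares the result with a one-shot hash of the same bytes; the helper name md5_chunked and the 10000-byte payload are assumptions made for this illustration.

```python
import hashlib
import os
import tempfile


def md5_chunked(fname, chunk_size=4096):
    """Return the hex MD5 digest of a file, reading chunk_size bytes at a time."""
    m = hashlib.md5()
    with open(fname, "rb") as fobj:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for data in iter(lambda: fobj.read(chunk_size), b""):
            m.update(data)
    return m.hexdigest()


# 10000 bytes need three reads: 4096 + 4096 + 1808.
payload = b"x" * 10000
fd, tmp_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as fobj:
    fobj.write(payload)

chunked = md5_chunked(tmp_path)
one_shot = hashlib.md5(payload).hexdigest()
os.remove(tmp_path)

print(chunked == one_shot)
```

    Feeding the data to update() piece by piece gives the same digest as hashing it all at once, which is why the chunked loop is safe for files too large to hold in memory.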

    3. Writing the full and incremental backup script

    The code is as follows:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    
    import time
    import os
    import tarfile
    import hashlib
    
    try:
        import cPickle as p  # Python 2
    except ImportError:
        import pickle as p   # Python 3
    
    
    def md5check(fname):
        # Hash the file in 4 KB chunks; binary mode keeps the bytes unchanged.
        m = hashlib.md5()
        with open(fname, 'rb') as fobj:
            while True:
                data = fobj.read(4096)
                if not data:
                    break
                m.update(data)
        return m.hexdigest()
    
    
    def full_backup(src_dir, dst_dir, md5file):
        base_dir = os.path.basename(src_dir.rstrip('/'))
        back_name = '%s_full_%s.tar.gz' % (base_dir, time.strftime('%Y%m%d'))
        full_name = os.path.join(dst_dir, back_name)
        md5dict = {}
    
        # Archive the whole source directory.
        tar = tarfile.open(full_name, 'w:gz')
        tar.add(src_dir)
        tar.close()
    
        # Record the MD5 of every file so later incremental runs can compare.
        for path, folders, files in os.walk(src_dir):
            for fname in files:
                full_path = os.path.join(path, fname)
                md5dict[full_path] = md5check(full_path)
    
        # Pickle data is binary, so open the file in binary mode.
        with open(md5file, 'wb') as fobj:
            p.dump(md5dict, fobj)
    
    
    
    def incr_backup(src_dir, dst_dir, md5file):
        base_dir = os.path.basename(src_dir.rstrip('/'))
        back_name = '%s_incr_%s.tar.gz' % (base_dir, time.strftime('%Y%m%d'))
        full_name = os.path.join(dst_dir, back_name)
        md5new = {}
    
        # Hash the current state of the tree.
        for path, folders, files in os.walk(src_dir):
            for fname in files:
                full_path = os.path.join(path, fname)
                md5new[full_path] = md5check(full_path)
    
        # Load the table saved by the previous run.
        with open(md5file, 'rb') as fobj:
            md5old = p.load(fobj)
    
        # Archive only files that are new or whose MD5 changed.
        tar = tarfile.open(full_name, 'w:gz')
        for key in md5new:
            if md5old.get(key) != md5new[key]:
                tar.add(key)
        tar.close()
    
        # Save the new table only after the archive has been written.
        with open(md5file, 'wb') as fobj:
            p.dump(md5new, fobj)
    
    
    if __name__ == '__main__':
        src_dir = '/Users/xkops/gxb/'
        dst_dir = '/tmp/'
        md5file = '/Users/xkops/md5.data'
        # Full backup on Mondays, or when no MD5 table exists yet to compare against.
        if time.strftime('%a') == 'Mon' or not os.path.exists(md5file):
            full_backup(src_dir, dst_dir, md5file)
        else:
            incr_backup(src_dir, dst_dir, md5file)

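    • Run the script to test it (before running, change the source directory, destination directory, and MD5 file path to match your environment), then check that today's backup archive was created under /tmp.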
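    The core of incr_backup is the comparison between the old and the new MD5 tables. The sketch below isolates that comparison so it can be tried without touching any files; the function name diff_md5 and the sample paths and digests are made up for illustration. It also reports deleted files, which the script above does not record.

```python
def diff_md5(md5old, md5new):
    """Compare two {path: md5} tables.

    Returns (changed, deleted): paths that are new or modified, and paths
    that were present in the old table but are missing from the new one.
    """
    changed = sorted(k for k in md5new if md5old.get(k) != md5new[k])
    deleted = sorted(k for k in md5old if k not in md5new)
    return changed, deleted


# Made-up tables: b.txt was modified, new.txt was added, old.txt was removed.
old = {"/a/a.txt": "111", "/a/b/b.txt": "222", "/a/old.txt": "333"}
new = {"/a/a.txt": "111", "/a/b/b.txt": "999", "/a/new.txt": "444"}

changed, deleted = diff_md5(old, new)
print(changed)  # ['/a/b/b.txt', '/a/new.txt']
print(deleted)  # ['/a/old.txt']
```

    Because md5old.get(key) returns None for an unknown path, a newly created file always compares as different and ends up in the incremental archive.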

  • Original article: https://www.cnblogs.com/xkops/p/6265950.html