  • Using the fnmatch module in Python

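    The matching power of fnmatch() sits between simple string methods and full regular expressions. When a data-processing task needs nothing more than simple wildcards, it is usually a reasonable choice. The module is mainly intended for matching file names, and its patterns follow Unix shell conventions. The source is short: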

    """Filename matching with shell patterns.
    
    fnmatch(FILENAME, PATTERN) matches according to the local convention.
    fnmatchcase(FILENAME, PATTERN) always takes case in account.
    
    The functions operate by translating the pattern into a regular
    expression.  They cache the compiled regular expressions for speed.
    
    The function translate(PATTERN) returns a regular expression
    corresponding to PATTERN.  (It does not compile it.)
    """
    import os
    import posixpath
    import re
    import functools
    
    __all__ = ["filter", "fnmatch", "fnmatchcase", "translate"]
    
    def fnmatch(name, pat):
        """Test whether FILENAME matches PATTERN.
    
        Patterns are Unix shell style:
    
        *       matches everything
        ?       matches any single character
        [seq]   matches any character in seq
        [!seq]  matches any char not in seq
    
        An initial period in FILENAME is not special.
        Both FILENAME and PATTERN are first case-normalized
        if the operating system requires it.
        If you don't want this, use fnmatchcase(FILENAME, PATTERN).
        """
        name = os.path.normcase(name)
        pat = os.path.normcase(pat)
        return fnmatchcase(name, pat)
    
    @functools.lru_cache(maxsize=256, typed=True)
    def _compile_pattern(pat):
        if isinstance(pat, bytes):
            pat_str = str(pat, 'ISO-8859-1')
            res_str = translate(pat_str)
            res = bytes(res_str, 'ISO-8859-1')
        else:
            res = translate(pat)
        return re.compile(res).match
    
    def filter(names, pat):
        """Return the subset of the list NAMES that match PAT."""
        result = []
        pat = os.path.normcase(pat)
        match = _compile_pattern(pat)
        if os.path is posixpath:
            # normcase on posix is NOP. Optimize it away from the loop.
            for name in names:
                if match(name):
                    result.append(name)
        else:
            for name in names:
                if match(os.path.normcase(name)):
                    result.append(name)
        return result
    
    def fnmatchcase(name, pat):
        """Test whether FILENAME matches PATTERN, including case.
    
        This is a version of fnmatch() which doesn't case-normalize
        its arguments.
        """
        match = _compile_pattern(pat)
        return match(name) is not None
    
    
    def translate(pat):
        """Translate a shell PATTERN to a regular expression.
    
        There is no way to quote meta-characters.
        """
    
        i, n = 0, len(pat)
        res = ''
        while i < n:
            c = pat[i]
            i = i+1
            if c == '*':
                res = res + '.*'
            elif c == '?':
                res = res + '.'
            elif c == '[':
                j = i
                if j < n and pat[j] == '!':
                    j = j+1
                if j < n and pat[j] == ']':
                    j = j+1
                while j < n and pat[j] != ']':
                    j = j+1
                if j >= n:
                    res = res + '\\['
                else:
                    stuff = pat[i:j].replace('\\','\\\\')
                    i = j+1
                    if stuff[0] == '!':
                        stuff = '^' + stuff[1:]
                    elif stuff[0] == '^':
                        stuff = '\\' + stuff
                    res = '%s[%s]' % (res, stuff)
            else:
                res = res + re.escape(c)
        return r'(?s:%s)\Z' % res
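The pattern rules listed in the fnmatch() docstring are easier to see with a few concrete calls. The following is a minimal sketch; the file names are made up for illustration, and fnmatchcase is used so that the results do not depend on the operating system's case rules.

```python
from fnmatch import fnmatchcase

# (name, pattern, expected) triples; all names are hypothetical examples.
cases = [
    ("data.csv", "*.csv", True),            # * matches any run of characters
    ("data1.csv", "data?.csv", True),       # ? matches exactly one character
    ("data.csv", "data?.csv", False),       # ? cannot match zero characters
    ("log_a.txt", "log_[abc].txt", True),   # [seq] matches one character in seq
    ("log_d.txt", "log_[!abc].txt", True),  # [!seq] matches one character not in seq
    (".bashrc", "*rc", True),               # a leading period is not special
    ("Data.CSV", "*.csv", False),           # fnmatchcase never folds case
]

results = [fnmatchcase(name, pat) for name, pat, _ in cases]
for (name, pat, expected), got in zip(cases, results):
    print(f"{name!r:14} {pat!r:18} -> {got}")
```

Unlike a real shell, a name beginning with a period gets no special treatment here, which is why "*rc" matches ".bashrc".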
    

    The fnmatch module exports four functions: ["filter", "fnmatch", "fnmatchcase", "translate"]

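    • filter returns the matching names as a list
    import fnmatch
    import os

    def gen_find(filepat, top):
        """
        Find every file under a directory tree whose name matches a shell-style wildcard pattern
        :param filepat: shell-style wildcard pattern
        :param top: root directory to search
        :return: generator of matching file paths, each prefixed with top
        """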
        for path, _, filenames in os.walk(top):
            for file in fnmatch.filter(filenames, filepat):
                yield os.path.join(path, file)
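To try gen_find without touching real data, it can be pointed at a throwaway directory. This is only a sketch: the file names a.py, notes.txt and pkg/b.py are invented for the demonstration, and gen_find is repeated so the snippet runs on its own.

```python
import fnmatch
import os
import tempfile

def gen_find(filepat, top):
    """Yield paths under top whose file name matches the wildcard filepat."""
    for path, _, filenames in os.walk(top):
        for file in fnmatch.filter(filenames, filepat):
            yield os.path.join(path, file)

with tempfile.TemporaryDirectory() as top:
    # Build a tiny tree: two .py files (one nested) and one .txt file.
    os.makedirs(os.path.join(top, "pkg"))
    for rel in ("a.py", "notes.txt", os.path.join("pkg", "b.py")):
        with open(os.path.join(top, rel), "w"):
            pass
    # Collect results relative to top so they are easy to read.
    found = sorted(os.path.relpath(p, top) for p in gen_find("*.py", top))

print(found)
```

Because the pattern is applied to each bare file name rather than to the full path, a pattern such as "pkg/*.py" would match nothing here.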
    
    • fnmatch

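    # List all the Python files in the tuple
    from fnmatch import fnmatch
    pyfiles = [py for py in ('restart.py', 'index.php', 'file.txt') if fnmatch(py, '*.py')]
    # The string methods startswith() and endswith() are also handy for filtering the contents of a directory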
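For fixed prefixes and suffixes, startswith() and endswith() can do the same job without any pattern at all. A small sketch, assuming a hypothetical directory listing held in filenames:

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing, used only for illustration.
filenames = ["Makefile", "foo.c", "bar.py", "spam.c", "spam.h"]

# endswith() accepts a tuple, so several suffixes can be tested at once.
by_suffix = [n for n in filenames if n.endswith((".c", ".h"))]

# The same selection expressed as a shell-style pattern.
by_pattern = [n for n in filenames if fnmatchcase(n, "*.[ch]")]

# startswith() handles the fixed-prefix case.
by_prefix = [n for n in filenames if n.startswith("spam")]

print(by_suffix)
print(by_pattern)
print(by_prefix)
```

The string methods are the simpler choice when the test is a literal prefix or suffix; a wildcard pattern earns its keep once the variable part sits in the middle of the name.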
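    • fnmatchcase performs case-sensitive matching
    # An often-overlooked feature of fnmatch() and fnmatchcase() is that they are just as useful
    # on strings that are not file names. Suppose, for example, you have a list of street addresses
    from fnmatch import fnmatchcase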
    address = [
        '5412 N CLARK ST',
        '1060 W ADDISON ST',
        '1039 W GRANVILLE AVE',
        '2122 N CLARK ST',
        '4802 N BROADWAY',
    ]
    print([addr for addr in address if fnmatchcase(addr, '* ST')])
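The address example can be pushed a little further with character classes. The sketch below reuses the same list; the second and third patterns are additions for illustration and are not part of the original example.

```python
from fnmatch import fnmatchcase

address = [
    '5412 N CLARK ST',
    '1060 W ADDISON ST',
    '1039 W GRANVILLE AVE',
    '2122 N CLARK ST',
    '4802 N BROADWAY',
]

# Every address that ends in " ST".
streets = [addr for addr in address if fnmatchcase(addr, '* ST')]

# Addresses numbered 54xx that mention CLARK, using two [0-9] classes.
clark_54xx = [addr for addr in address if fnmatchcase(addr, '54[0-9][0-9] *CLARK*')]

# fnmatchcase is case-sensitive, so a lower-case pattern finds nothing.
lowercase = [addr for addr in address if fnmatchcase(addr, '* st')]

print(streets)
print(clark_54xx)
print(lowercase)
```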
    
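    • translate seems to be rarely used. As noted above, fnmatch uses Unix shell-style patterns; translate converts such a pattern into a regular expression. For example:
    from fnmatch import translate
    shell_match = 'Celery_?*.py'
    print(translate(shell_match))
    # Output (the exact text can vary between Python versions): (?s:Celery_..*\.py)\Z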
    

    Celery_..*\.py is the regular-expression form of the pattern: ? becomes ., * becomes .*, and the literal period is escaped.
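Since translate returns a pattern string rather than a compiled object, one possible use is to compile it once and reuse it with the re module. A minimal sketch, with invented file names:

```python
import re
from fnmatch import translate

shell_match = "Celery_?*.py"

# translate() returns a regex string; compiling it is left to the caller.
regex = re.compile(translate(shell_match))

# Hypothetical names chosen to exercise each part of the pattern.
names = ["Celery_app.py", "Celery_.py", "celery_app.py", "Celery_app.pyc"]
matched = [n for n in names if regex.match(n)]

print(matched)
```

Only Celery_app.py passes: Celery_.py has no character for ? to consume, celery_app.py differs in case, and Celery_app.pyc is rejected because the translated expression is anchored at the end of the string.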

  • Original article: https://www.cnblogs.com/xyou/p/10043705.html