  • Using the fnmatch module in Python

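    In matching power, fnmatch() sits between simple string methods and full regular expressions. When a data-processing task needs nothing more than simple wildcards, it is usually a reasonable choice. The module's main purpose is matching file names, and its patterns follow the Unix shell style. The source code is short: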

    """Filename matching with shell patterns.
    
    fnmatch(FILENAME, PATTERN) matches according to the local convention.
    fnmatchcase(FILENAME, PATTERN) always takes case in account.
    
    The functions operate by translating the pattern into a regular
    expression.  They cache the compiled regular expressions for speed.
    
    The function translate(PATTERN) returns a regular expression
    corresponding to PATTERN.  (It does not compile it.)
    """
    import os
    import posixpath
    import re
    import functools
    
    __all__ = ["filter", "fnmatch", "fnmatchcase", "translate"]
    
    def fnmatch(name, pat):
        """Test whether FILENAME matches PATTERN.
    
        Patterns are Unix shell style:
    
        *       matches everything
        ?       matches any single character
        [seq]   matches any character in seq
        [!seq]  matches any char not in seq
    
        An initial period in FILENAME is not special.
        Both FILENAME and PATTERN are first case-normalized
        if the operating system requires it.
        If you don't want this, use fnmatchcase(FILENAME, PATTERN).
        """
        name = os.path.normcase(name)
        pat = os.path.normcase(pat)
        return fnmatchcase(name, pat)
    
    @functools.lru_cache(maxsize=256, typed=True)
    def _compile_pattern(pat):
        if isinstance(pat, bytes):
            pat_str = str(pat, 'ISO-8859-1')
            res_str = translate(pat_str)
            res = bytes(res_str, 'ISO-8859-1')
        else:
            res = translate(pat)
        return re.compile(res).match
    
    def filter(names, pat):
        """Return the subset of the list NAMES that match PAT."""
        result = []
        pat = os.path.normcase(pat)
        match = _compile_pattern(pat)
        if os.path is posixpath:
            # normcase on posix is NOP. Optimize it away from the loop.
            for name in names:
                if match(name):
                    result.append(name)
        else:
            for name in names:
                if match(os.path.normcase(name)):
                    result.append(name)
        return result
    
    def fnmatchcase(name, pat):
        """Test whether FILENAME matches PATTERN, including case.
    
        This is a version of fnmatch() which doesn't case-normalize
        its arguments.
        """
        match = _compile_pattern(pat)
        return match(name) is not None
    
    
    def translate(pat):
        """Translate a shell PATTERN to a regular expression.
    
        There is no way to quote meta-characters.
        """
    
        i, n = 0, len(pat)
        res = ''
        while i < n:
            c = pat[i]
            i = i+1
            if c == '*':
                res = res + '.*'
            elif c == '?':
                res = res + '.'
            elif c == '[':
                j = i
                if j < n and pat[j] == '!':
                    j = j+1
                if j < n and pat[j] == ']':
                    j = j+1
                while j < n and pat[j] != ']':
                    j = j+1
                if j >= n:
                    res = res + '\\['
                else:
                    stuff = pat[i:j].replace('\\','\\\\')
                    i = j+1
                    if stuff[0] == '!':
                        stuff = '^' + stuff[1:]
                    elif stuff[0] == '^':
                        stuff = '\\' + stuff
                    res = '%s[%s]' % (res, stuff)
            else:
                res = res + re.escape(c)
        return r'(?s:%s)\Z' % res
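The four wildcard forms listed in the fnmatch() docstring can be tried directly. The following is a minimal sketch; the file names are hypothetical and chosen only to exercise each form. It uses fnmatchcase() so the results do not depend on the operating system.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, chosen only to exercise each wildcard form.
names = ["data1.csv", "data2.csv", "dataA.csv", "notes.txt", ".hidden.csv"]

# '*' matches any run of characters; a leading period is not special.
star = [n for n in names if fnmatchcase(n, "*.csv")]

# '?' matches exactly one character.
single = [n for n in names if fnmatchcase(n, "data?.csv")]

# '[seq]' matches one character from the set (ranges are allowed).
digits = [n for n in names if fnmatchcase(n, "data[0-9].csv")]

# '[!seq]' matches one character that is not in the set.
non_digits = [n for n in names if fnmatchcase(n, "data[!0-9].csv")]

print(star)
print(single)
print(digits)
print(non_digits)
```

Note that ".hidden.csv" is matched by "*.csv", unlike in a shell glob, because an initial period gets no special treatment here.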
    

    The fnmatch module exports four functions, listed in __all__: ["filter", "fnmatch", "fnmatchcase", "translate"]

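    • filter: returns the matching names as a list
    import fnmatch
    import os

    def gen_find(filepat, top):
        """
        Find every file under a directory tree whose name matches a shell-style wildcard pattern.
        :param filepat: shell-style wildcard pattern
        :param top: root directory to search
        :return: generator of file paths (absolute only if top is absolute)
        """
        for path, _, filenames in os.walk(top):
            for file in fnmatch.filter(filenames, filepat):
                yield os.path.join(path, file)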
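To try the generator without touching real data, one option is to run it against a throwaway directory tree. This is only a sketch: gen_find is repeated so the block runs on its own, and the demo() helper and every file name in it are hypothetical.

```python
import fnmatch
import os
import tempfile


def gen_find(filepat, top):
    """Same generator as above, repeated so this sketch is self-contained."""
    for path, _, filenames in os.walk(top):
        for name in fnmatch.filter(filenames, filepat):
            yield os.path.join(path, name)


def demo():
    # Build a small temporary tree; all names here are hypothetical.
    with tempfile.TemporaryDirectory() as top:
        os.makedirs(os.path.join(top, "pkg", "sub"))
        for rel in ("a.py", "readme.txt", "pkg/b.py", "pkg/sub/c.py", "pkg/sub/d.log"):
            with open(os.path.join(top, *rel.split("/")), "w"):
                pass  # create an empty file
        # Report paths relative to the temp root, sorted for stable output.
        return sorted(
            os.path.relpath(p, top).replace(os.sep, "/")
            for p in gen_find("*.py", top)
        )


found = demo()
print(found)
```

Because gen_find is a generator, nothing is walked until it is iterated, which is why demo() consumes it inside the with block, before the temporary directory is removed.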
    
    • fnmatch

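    from fnmatch import fnmatch

    # List all the Python files in the tuple
    pyfiles = [py for py in ('restart.py', 'index.php', 'file.txt') if fnmatch(py, '*.py')]
    # The string methods startswith() and endswith() are also handy for filtering the contents of a directory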
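As a rough guide to when string methods are enough and when a wildcard reads better, here is a small comparison. The file names are hypothetical, and fnmatchcase() is used so the result is the same on every platform.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing.
files = ("restart.py", "index.php", "file.txt", "setup.py", "test_api.py")

# str.endswith accepts a tuple of suffixes, which covers plain extension checks.
by_suffix = [f for f in files if f.endswith((".py", ".php"))]

# Once the condition involves both ends of the name, one pattern is more compact...
tests_only = [f for f in files if fnmatchcase(f, "test_*.py")]

# ...than the equivalent pair of string-method calls.
tests_only_str = [f for f in files if f.startswith("test_") and f.endswith(".py")]

print(by_suffix)
print(tests_only)
```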
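    • fnmatchcase: case-sensitive matching
    # An often-overlooked feature of fnmatch() and fnmatchcase() is that they are also useful on strings that are not file names. For example, suppose you have a list of street addresses:
    from fnmatch import fnmatchcase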
    address = [
        '5412 N CLARK ST',
        '1060 W ADDISON ST',
        '1039 W GRANVILLE AVE',
        '2122 N CLARK ST',
        '4802 N BROADWAY',
    ]
    print([addr for addr in address if fnmatchcase(addr, '* ST')])
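The difference between fnmatch() and fnmatchcase() follows from the source shown earlier: fnmatch() runs both arguments through os.path.normcase() first, so its answer depends on the operating system. The sketch below illustrates this with a made-up file name, and shows lowercasing both sides as one possible way to get a portable case-insensitive match.

```python
import os
from fnmatch import fnmatch, fnmatchcase

# Hypothetical name and pattern that differ only in case.
name, pattern = "REPORT.TXT", "*.txt"

# fnmatchcase never normalizes case, so this is False on every platform.
strict = fnmatchcase(name, pattern)

# fnmatch applies os.path.normcase first: the result is True where normcase
# lowercases (Windows) and False where it is a no-op (POSIX).
platform_dependent = fnmatch(name, pattern)
expected = fnmatchcase(os.path.normcase(name), os.path.normcase(pattern))

# A portable case-insensitive match: lowercase both sides yourself.
insensitive = fnmatchcase(name.lower(), pattern.lower())

print(strict, platform_dependent, insensitive)
```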
    
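    • translate: this one seems to be rarely used. As noted above, fnmatch patterns follow the Unix shell style; translate converts such a pattern into a regular expression. For example:
    from fnmatch import translate

    shell_match = 'Celery_?*.py'
    print(translate(shell_match))
    # Output (exact text can vary between Python releases): (?s:Celery_..*\.py)\Z
    

    Celery_..*\.py is the regular-expression form of the pattern: ? becomes ., * becomes .*, and the literal dot is escaped as \. so that it matches only a period.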
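One practical use of translate() is to compile the pattern once and reuse the regex object yourself. The sketch below, with hypothetical file names, checks that the compiled regex agrees with fnmatchcase() on the same inputs.

```python
import re
from fnmatch import fnmatchcase, translate

shell_match = "Celery_?*.py"

# translate() returns a string; compiling it is left to the caller.
regex = re.compile(translate(shell_match))

# Hypothetical names: one match and three near misses.
names = ["Celery_app.py", "Celery_.py", "Celery_a.pyc", "celery_app.py"]

via_regex = [n for n in names if regex.match(n)]
via_fnmatch = [n for n in names if fnmatchcase(n, shell_match)]

print(via_regex)
print(via_fnmatch)
```

"Celery_.py" fails because ? needs one character between the underscore and the rest of the name, "Celery_a.pyc" fails because the translated regex is anchored at the end of the string, and "celery_app.py" fails on case.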

  • Original article: https://www.cnblogs.com/xyou/p/10043705.html