  • pyparsing: defining custom parsing rules

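    1. Word(token)

    Matches a word made up of characters from an allowed set. A common mistake is writing Word("expr") to match the specific string "expr": the argument is a set of characters, so it matches any run of the letters e, x, p and r. To match an exact string, use Literal or Keyword.

     - alphas: letters
     - nums: digits
     - alphanums: letters and digits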
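    The difference between a character set and an exact string is easy to miss, so here is a minimal sketch of it. The input strings are made up for illustration; the camelCase method names follow the rest of this article (newer pyparsing releases also offer snake_case equivalents).

```python
import pyparsing as pp

# Word("expr") treats its argument as a set of allowed characters: e, x, p, r.
loose = pp.Word("expr")
loose_result = loose.parseString("rexx peer").asList()
print(loose_result)  # ['rexx']

# Literal matches the exact string and nothing else.
exact = pp.Literal("expr")
exact_result = exact.parseString("expr").asList()
print(exact_result)  # ['expr']

# The same input that Word accepted is rejected by Literal.
literal_rejected = False
try:
    exact.parseString("rexx")
except pp.ParseException as err:
    literal_rejected = True
    print("no match:", err)
```

    Keyword behaves like Literal but additionally requires that the match is not immediately followed by further identifier characters, which is usually what you want for reserved words.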

     

    2. Suppress

    Matches an expression but drops its tokens from the results.

    import pyparsing as pp
    
    source = "a , b, c, d"
    wd = pp.Word(pp.alphas)
    
    # Without Suppress, the commas appear in the results.
    wd_list = wd + pp.ZeroOrMore(',' + wd)
    print(wd_list.parseString(source))
    # ['a', ',', 'b', ',', 'c', ',', 'd']
    
    # With Suppress, the commas are matched but discarded.
    wd_list = wd + pp.ZeroOrMore(pp.Suppress(',') + wd)
    print(wd_list.parseString(source))
    # ['a', 'b', 'c', 'd']
    

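     3. Group

     Group wraps the tokens matched by an expression in a nested list, so they stay together in the results. It does not join them into a single string; that is what Combine does.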

    from pyparsing import *
    
    wd = Word(alphas)
    comma = Literal(",")
    greetee = OneOrMore(wd)
    end = oneOf("! ?")
    greeting = wd + comma + greetee + end
    print(greeting.parseString("Hello,World!"))
    # ['Hello', ',', 'World', '!']
    
    # Wrap each word in Group: every word becomes its own sublist.
    wd = Group(Word(alphas))
    comma = Literal(",")
    greetee = OneOrMore(wd)
    end = oneOf("! ?")
    greeting = wd + comma + greetee + end
    print(greeting.parseString("Hello,World!"))
    # [['Hello'], ',', ['World'], '!']
    
    # Suppress the punctuation so only the words remain.
    wd = Word(alphas)
    comma = Literal(",").suppress()
    greetee = OneOrMore(wd)
    end = oneOf("! ?").suppress()
    greeting = wd + comma + greetee + end
    print(greeting.parseString("Hello,World!"))
    # ['Hello', 'World']
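    Grouping and merging are easy to confuse, so the following sketch parses the same input three ways: flat, wrapped in Group, and wrapped in Combine. The version-number grammar is a made-up example, not part of the original article.

```python
from pyparsing import Word, nums, Group, Combine

integer = Word(nums)
version = integer + "." + integer

# Plain expression: three separate tokens.
flat = version.parseString("3.11").asList()
# Group: the same tokens, nested in one sublist.
grouped = Group(version).parseString("3.11").asList()
# Combine: the tokens joined into a single string.
merged = Combine(version).parseString("3.11").asList()

print(flat)     # ['3', '.', '11']
print(grouped)  # [['3', '.', '11']]
print(merged)   # ['3.11']
```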
    

     4. setResultsName: give each matched token a readable name

     Naming a matched token lets you look it up by name on the resulting ParseResults object, the way you would with a dict.

    from pyparsing import *
    
    integer = Word(nums)
    date_str = integer("year") + '/' + integer("month") + '/' + integer("day")
    # integer("year") is shorthand for integer.setResultsName("year")
    data = date_str.parseString('2019/04/17')
    
    # year,type:<class 'str'>,value:2019
    print('year,type:%s,value:%s' % (type(data.year), data.year))
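    The example above reads the named result as an attribute. As a small sketch of the dict-style access mentioned earlier, the same parse can also be queried by key or converted to a plain dict. The "hour" key is a hypothetical name used only to show the lookup of a missing result.

```python
from pyparsing import Word, nums

integer = Word(nums)
date_str = integer("year") + "/" + integer("month") + "/" + integer("day")
data = date_str.parseString("2019/04/17")

# Key access, like a dict.
month = data["month"]
print(month)  # 04

# Named results as a plain dict; unnamed tokens such as '/' are left out.
as_dict = data.asDict()
print(as_dict)

# All tokens, named or not, in match order.
as_list = data.asList()
print(as_list)  # ['2019', '/', '04', '/', '17']

# get() with a default for a name that was never matched.
hour = data.get("hour", "n/a")
print(hour)  # n/a
```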

    5. setParseAction: post-process each parsed token

      The handler can be any function you define. It takes the three arguments below:

    - s    = the original string being parsed
    - loc  = the location of the matching substring
    - toks = a list of the matched tokens

    For example, to convert the date fields from the previous example to int, attach a parse action like this:
    from pyparsing import *
    
    integer = Word(nums).setParseAction(lambda s,lo,tokens:int(tokens[0]))
    
    date_str = (integer("year")+'/'+integer("month")+'/'+integer("day"))
    data = date_str.parseString('2019/04/17')
    
    # year,type:<class 'int'>,value:2019
    print('year,type:%s,value:%s' %(type(data.year),data.year))
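    The lambda above only uses the token list. To show what the other two arguments carry, here is a sketch of a parse action that records the match location and replaces the token with its upper-cased form. The function name record and the list seen are hypothetical names chosen for this illustration.

```python
from pyparsing import Word, alphas

seen = []  # (location, original token) pairs collected by the parse action

def record(s, loc, toks):
    # s is the full input, loc the offset where this match starts,
    # toks the tokens matched by the expression the action is attached to.
    seen.append((loc, toks[0]))
    # The return value replaces the matched token in the results.
    return toks[0].upper()

word = Word(alphas).setParseAction(record)
result = (word + word).parseString("hello world")

print(result.asList())  # ['HELLO', 'WORLD']
print(seen)             # [(0, 'hello'), (6, 'world')]
```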
    

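    6. parseString: parse the input string

       instring: the first argument, the string to parse.

     parseAll: the second argument, whether the grammar must match the entire string. 1. When it is True, the pattern must consume the whole input, otherwise an error is raised. 2. Matched tokens are passed to parse actions in the tokens list. The parse action defined above uses tokens[0] because Word(nums) produces a single token; an action attached to a larger expression may receive several tokens.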
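    A short sketch of the parseAll behaviour described above, reusing the date grammar from the earlier sections. The trailing word "extra" is made-up input added to trigger the failure.

```python
from pyparsing import Word, nums, ParseException

integer = Word(nums)
date_str = integer + "/" + integer + "/" + integer

# Default (parseAll=False): parsing stops after the match, trailing text is ignored.
partial = date_str.parseString("2019/04/17 extra").asList()
print(partial)  # ['2019', '/', '04', '/', '17']

# parseAll=True: the grammar must consume the whole input.
rejected = False
try:
    date_str.parseString("2019/04/17 extra", parseAll=True)
except ParseException as err:
    rejected = True
    print("rejected:", err)
```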

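    7. delimitedList: given a single element expression, it matches a list of the form Word, Word, ... with any number of elements. By default the elements are separated by commas.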

    from pyparsing import Word, alphas, alphanums, Combine, oneOf, Optional, delimitedList, Group, Keyword
    
    testdata = """
      int func1(float *vec, int len, double arg1);
      int func2(float **arr, float *vec, int len, double arg1, double arg2);
      """
    # An identifier starts with a letter, followed by letters, digits or underscores.
    ident = Word(alphas, alphanums + "_")
    # A variable type is a base type optionally followed by one or more "*";
    # adjacent=False lets Combine join them even when separated by whitespace.
    vartype = Combine(oneOf("float double int char") + Optional(Word("*")), adjacent=False)
    # Each argument is a type followed by a name; arguments are comma-separated.
    arglist = delimitedList(Group(vartype("type") + ident("name")))
    
    functionCall = Keyword("int") + ident("name") + "(" + arglist("args") + ")" + ";"
    
    for fn,s,e in functionCall.scanString(testdata):
        print(fn.name)
        for a in fn.args:
            print(" - %(name)s (%(type)s)" % a)
    
    # output:
    # func1
    #  - vec (float*)
    #  - len (int)
    #  - arg1 (double)
    # func2
    #  - arr (float**)
    #  - vec (float*)
    #  - len (int)
    #  - arg1 (double)
    #  - arg2 (double)
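    The function-signature example combines several features at once. As a smaller sketch of delimitedList on its own, the snippet below shows the default comma delimiter and the delim and combine arguments. The input strings are invented for illustration.

```python
from pyparsing import Word, alphas, nums, delimitedList

# Default delimiter is a comma; the delimiters are dropped from the results.
names = delimitedList(Word(alphas)).parseString("a, b ,c").asList()
print(names)  # ['a', 'b', 'c']

# A different delimiter can be passed with delim.
ports = delimitedList(Word(nums), delim=";").parseString("80;443;8080").asList()
print(ports)  # ['80', '443', '8080']

# combine=True keeps the delimiters and joins everything into one token.
dotted = delimitedList(Word(alphas), delim=".", combine=True).parseString("os.path.join").asList()
print(dotted)  # ['os.path.join']
```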
    

     

  • Original article: https://www.cnblogs.com/CaesarLinsa/p/10714056.html