  • pyparsing: Building Your Own Parser

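    At work you often need to parse files in different formats. The usual tool is regular expressions, or awk for the simpler cases. This post recommends a less common approach: parsing files with pyparsing.

    What can pyparsing do? Mainly, it makes it easy to define your own tokenizer, so a grammar is easy to extend into a parser of your own.

    Below is an example that parses traceview output.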
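    As a minimal sketch of what "defining your own tokenizer" can look like, the snippet below builds a two-token grammar for the thread-table lines of the trace shown in this post ("<id> <name>"). The names thread_id, thread_line and parse_thread_line are illustrative and not part of the original post; only Word, nums, restOfLine, setParseAction and parseString come from pyparsing.

```python
from pyparsing import Word, nums, restOfLine

# Token 1: a run of digits, converted to int by a parse action.
thread_id = Word(nums).setParseAction(lambda tokens: int(tokens[0]))

# Token 2: everything up to the end of the line (the thread name may
# contain spaces and '#', so a plain Word would be too narrow).
thread_line = thread_id + restOfLine


def parse_thread_line(line):
    """Return (thread id, thread name) for a line such as '16803 AsyncTask #3'."""
    tokens = thread_line.parseString(line)
    # restOfLine keeps the separating space, so strip it here.
    return tokens[0], tokens[1].strip()


print(parse_thread_line("16803 AsyncTask #3"))
```

    Each token is an ordinary Python object, so extending the grammar means combining more such objects with "+".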

    16803 AsyncTask #3
    16804 pool-2-thread-5
    16806 pool-3-thread-1
    16807 uil-pool-2-thread-1
    16808 uil-pool-2-thread-2
    16809 uil-pool-2-thread-3
    16810 uil-pool-2-thread-4
    Trace (threadID action usecs class.method signature):
    16736 xit         0 ..dalvik.system.VMDebug.startMethodTracingFilename (Ljava/lang/String;IIZI)V	VMDebug.java
    16804 xit         0 ..com.android.org.conscrypt.NativeCrypto.EVP_DigestUpdate (Lcom/android/org/conscrypt/OpenSSLDigestContext;[BII)V	NativeCrypto.java
    16736 xit       218 .dalvik.system.VMDebug.startMethodTracing (Ljava/lang/String;IIZI)V	VMDebug.java
    16736 xit       225 android.os.Debug.startMethodTracing (Ljava/lang/String;II)V	Debug.java
    16736 xit       230-android.os.Debug.startMethodTracing (Ljava/lang/String;I)V	Debug.java
    16736 xit       266-java.lang.reflect.Method.invoke (Ljava/lang/Object;[Ljava/lang/Object;Z)Ljava/lang/Object;	Method.java
    16804 ent       528 ..java.lang.ClassLoader.loadClass (Ljava/lang/String;)Ljava/lang/Class;	ClassLoader.java
    16804 ent       543 ...java.lang.ClassLoader.loadClass (Ljava/lang/String;Z)Ljava/lang/Class;	ClassLoader.java
    16804 ent       548 ....java.lang.ClassLoader.findLoadedClass (Ljava/lang/String;)Ljava/lang/Class;	ClassLoader.java
    16804 ent       567 .....java.lang.BootClassLoader.getInstance ()Ljava/lang/BootClassLoader;	ClassLoader.java
    16804 xit       576 .....java.lang.BootClassLoader.getInstance ()Ljava/lang/BootClassLoader;	ClassLoader.java
    16804 xit       681 ....java.lang.ClassLoader.findLoadedClass (Ljava/lang/String;)Ljava/lang/Class;	ClassLoader.java
    16804 ent       689 ....com.uc.base.aerie.hack.ClassLoaderSupport$a.loadClass (Ljava/lang/String;Z)Ljava/lang/Class;	ProGuard
    16804 ent       704 .....java.lang.ClassLoader.getParent ()Ljava/lang/ClassLoader;	ClassLoader.java
    16804 ent       726 ......java.lang.BootClassLoader.loadClass (Ljava/lang/String;Z)Ljava/lang/Class;	ClassLoader.java
    16804 ent       730 .......java.lang.ClassLoader.findLoadedClass (Ljava/lang/String;)Ljava/lang/Class;	ClassLoader.java
    16804 ent       734 ........java.lang.BootClassLoader.getInstance ()Ljava/lang/BootClassLoader;	ClassLoader.java
    16804 xit       740 ........java.lang.BootClassLoader.getInstance ()Ljava/lang/BootClassLoader;	ClassLoader.java
    16804 xit       754 .......java.lang.ClassLoader.findLoadedClass (Ljava/lang/String;)Ljava/lang/Class;	ClassLoader.java
    16804 xit       759 ......java.lang.BootClassLoader.loadClass (Ljava/lang/String;Z)Ljava/lang/Class;	ClassLoader.java
    16804 xit       763 .....java.lang.ClassLoader.loadClass (Ljava/lang/String;)Ljava/lang/Class;	ClassLoader.java

    This is part of the raw log after conversion. The format is fairly regular, so the grammar can be defined like this:

    import os

    from pyparsing import (Combine, Group, Literal, ParseException, Suppress,
                           Word, ZeroOrMore, alphas, nums)


    semiFlag = Literal(";")
    dotFlag = Suppress(Literal("."))
    # Leading dots before the class name are matched but dropped from the result.
    multiDot = ZeroOrMore(dotFlag)

    threadID = Word(nums, max=5)
    actionField = Word(alphas)      # "ent" or "xit"
    usecsField = Word(nums, max=8)

    # Used for both the dotted class.method name and the source file name.
    clsField = Word(alphas + ".")
    # Signature such as (Ljava/lang/String;Z)Ljava/lang/Class;
    methodField = Combine("(" + ZeroOrMore(Word(alphas + ";/")) + ")"
                          + Word(alphas + "/") + semiFlag)

    regex = threadID + actionField + usecsField + multiDot + Group(clsField + methodField) + clsField


    with open(os.path.join(os.getcwd(), "StepBeforeFirstDraw_o.txt"), "r") as f:
        in_trace = False
        # Iterate over the file so the loop ends at EOF.
        for line in f:
            if "threadID action usecs" in line:
                # Header line: trace records start on the next line.
                in_trace = True
                continue
            if in_trace:
                try:
                    print(regex.parseString(line).asList())
                except ParseException:
                    # Lines the grammar does not cover are skipped.
                    pass
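    The grammar above returns positional tokens. A possible variation, sketched here, attaches result names so that later code can read fields by name instead of by index. The names tid, action, usecs, call, method, signature and source are assumptions chosen for illustration; the token definitions themselves mirror the ones in the post.

```python
from pyparsing import (Combine, Group, Literal, Suppress, Word, ZeroOrMore,
                       alphas, nums)

dots = ZeroOrMore(Suppress(Literal(".")))
cls = Word(alphas + ".")
sig = Combine("(" + ZeroOrMore(Word(alphas + ";/")) + ")"
              + Word(alphas + "/") + Literal(";"))

# Calling an element with a string returns a copy that stores its
# match under that name in the parse results.
record = (Word(nums, max=5)("tid")
          + Word(alphas)("action")
          + Word(nums, max=8)("usecs")
          + dots
          + Group(cls("method") + sig("signature"))("call")
          + cls("source"))

sample = ("16804 ent       528 ..java.lang.ClassLoader.loadClass "
          "(Ljava/lang/String;)Ljava/lang/Class;\tClassLoader.java")
r = record.parseString(sample)

print(r["tid"], r["action"], r["usecs"])
print(r["call"]["method"], r["call"]["signature"])
print(r["source"])
```

    The positional view is still available through r.asList(), so named and indexed access can be mixed.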

    The parsed result is:

    /usr/bin/python2.7 /home/alex/workspace/virtual_space/project/calclex.py
    ['16804', 'ent', '528', ['java.lang.ClassLoader.loadClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '543', ['java.lang.ClassLoader.loadClass', '(Ljava/lang/String;Z)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '548', ['java.lang.ClassLoader.findLoadedClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '567', ['java.lang.BootClassLoader.getInstance', '()Ljava/lang/BootClassLoader;'], 'ClassLoader.java']
    ['16804', 'xit', '576', ['java.lang.BootClassLoader.getInstance', '()Ljava/lang/BootClassLoader;'], 'ClassLoader.java']
    ['16804', 'xit', '681', ['java.lang.ClassLoader.findLoadedClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '704', ['java.lang.ClassLoader.getParent', '()Ljava/lang/ClassLoader;'], 'ClassLoader.java']
    ['16804', 'ent', '726', ['java.lang.BootClassLoader.loadClass', '(Ljava/lang/String;Z)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '730', ['java.lang.ClassLoader.findLoadedClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '734', ['java.lang.BootClassLoader.getInstance', '()Ljava/lang/BootClassLoader;'], 'ClassLoader.java']
    ['16804', 'xit', '740', ['java.lang.BootClassLoader.getInstance', '()Ljava/lang/BootClassLoader;'], 'ClassLoader.java']
    ['16804', 'xit', '754', ['java.lang.ClassLoader.findLoadedClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'xit', '759', ['java.lang.BootClassLoader.loadClass', '(Ljava/lang/String;Z)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'xit', '763', ['java.lang.ClassLoader.loadClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'xit', '771', ['java.lang.ClassLoader.loadClass', '(Ljava/lang/String;Z)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'xit', '774', ['java.lang.ClassLoader.loadClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '809', ['java.lang.ClassLoader.loadClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '814', ['java.lang.ClassLoader.loadClass', '(Ljava/lang/String;Z)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '818', ['java.lang.ClassLoader.findLoadedClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '822', ['java.lang.BootClassLoader.getInstance', '()Ljava/lang/BootClassLoader;'], 'ClassLoader.java']
    ['16804', 'xit', '827', ['java.lang.BootClassLoader.getInstance', '()Ljava/lang/BootClassLoader;'], 'ClassLoader.java']
    ['16804', 'xit', '842', ['java.lang.ClassLoader.findLoadedClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '853', ['java.lang.ClassLoader.getParent', '()Ljava/lang/ClassLoader;'], 'ClassLoader.java']
    ['16804', 'xit', '857', ['java.lang.ClassLoader.getParent', '()Ljava/lang/ClassLoader;'], 'ClassLoader.java']
    ['16804', 'ent', '861', ['java.lang.ClassLoader.loadClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '865', ['java.lang.BootClassLoader.loadClass', '(Ljava/lang/String;Z)Ljava/lang/Class;'], 'ClassLoader.java']
    ['16804', 'ent', '869', ['java.lang.ClassLoader.findLoadedClass', '(Ljava/lang/String;)Ljava/lang/Class;'], 'ClassLoader.java']

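    This output is already convenient for secondary processing, and the parsing rules are more readable than an equivalent regular expression.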
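    One possible form of such secondary processing, given only as a sketch: pair each "ent" record with the matching "xit" record on the same thread and report the elapsed usecs. The function name method_durations and the pairing rule (a per-thread stack, with unmatched exits ignored) are assumptions made for this example, not something the original post specifies. The input rows have the same shape as the parsed lists shown above.

```python
def method_durations(records):
    """Pair ent/xit rows per thread and return (thread id, method, elapsed usecs)."""
    stacks = {}      # thread id -> stack of (method, signature, entry usecs)
    durations = []
    for tid, action, usecs, (method, signature), _source in records:
        stack = stacks.setdefault(tid, [])
        if action == "ent":
            stack.append((method, signature, int(usecs)))
        elif action == "xit" and stack and stack[-1][:2] == (method, signature):
            # Exit matches the innermost open call on this thread.
            _, _, start = stack.pop()
            durations.append((tid, method, int(usecs) - start))
        # Any other exit (its entry was never seen or was skipped) is ignored.
    return durations


rows = [
    ["16804", "ent", "548", ["java.lang.ClassLoader.findLoadedClass", "(Ljava/lang/String;)Ljava/lang/Class;"], "ClassLoader.java"],
    ["16804", "ent", "567", ["java.lang.BootClassLoader.getInstance", "()Ljava/lang/BootClassLoader;"], "ClassLoader.java"],
    ["16804", "xit", "576", ["java.lang.BootClassLoader.getInstance", "()Ljava/lang/BootClassLoader;"], "ClassLoader.java"],
    ["16804", "xit", "681", ["java.lang.ClassLoader.findLoadedClass", "(Ljava/lang/String;)Ljava/lang/Class;"], "ClassLoader.java"],
]
for item in method_durations(rows):
    print(item)
```

    For the four sample rows this reports 9 usecs for getInstance and 133 usecs for findLoadedClass, the latter including the time spent in the nested call.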

  • Original article: https://www.cnblogs.com/alexkn/p/7129168.html