  • Python: quickly parsing QQ group chat logs with regular expressions

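    Use a regular expression to parse a QQ group chat log, so that posting activity can be analyzed by date, by member, and along similar dimensions.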

    A header line in the raw text looks like this:
    2014-03-28 15:04:25 №┽◎Eagle(369029696)

    After parsing:
    yyyy=2014
    mm = 03
    dd = 28
    hh = 15
    mi = 04
    ss = 25
    nick = №┽◎Eagle
    qq = 369029696
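
To make the mapping from the sample line to these fields concrete, here is a minimal sketch that parses that single header line. The named groups and the helper `parse_header` are illustrative assumptions, not part of the original script; only the field layout comes from the sample above.

```python
import re

# Same field layout as the sample header line; the group names are illustrative.
HEADER = re.compile(
    r'(?P<yyyy>\d{4})-(?P<mm>\d{2})-(?P<dd>\d{2}) '
    r'(?P<hh>\d{2}):(?P<mi>\d{2}):(?P<ss>\d{2}) '
    r'(?P<nick>.*)\((?P<qq>.*?)\)$'
)

def parse_header(line):
    """Return a dict of fields for a header line, or None if it does not match."""
    m = HEADER.match(line.strip())
    return m.groupdict() if m else None

fields = parse_header("2014-03-28 15:04:25 №┽◎Eagle(369029696)")
print(fields)
```

The greedy `.*` for the nickname backtracks to the last opening parenthesis, so a nickname that itself contains parentheses should still leave the QQ number in the final group.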

    The code is as follows:

    # -*- coding: utf-8 -*-
    """
     zhangbo2012
     http://www.cnblogs.com/zhangbo2012/
    """
    import re
    
    def resolving_by_user(filepath):
        # Read the whole export at once. The encoding is an assumption;
        # adjust it to match how the chat log was saved.
        with open(filepath, 'r', encoding='utf-8') as rf:
            filecontent = rf.read()
    
        resolving_result = {}
    
        # Header line format: 2014-03-28 15:04:25 №┽◎Eagle(369029696)
        p = re.compile(r'(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2}) (.*)\((.*?)\)\n')
        for yyyy, mm, dd, hh, mi, ss, nick, qq in p.findall(filecontent):
            if qq in resolving_result:
                temps = resolving_result[qq]
                temps["nick"] = nick  # keep the most recently seen nickname
                temps["worldcnt"] += 1
            else:
                resolving_result[qq] = {"qq": qq, "nick": nick, "worldcnt": 1}
    
        # One row per member: QQ number and message count, right-aligned
        for value in resolving_result.values():
            print(repr(value['qq']).rjust(15) + repr(value['worldcnt']).rjust(10))
    
    if __name__ == '__main__':
        resolving_by_user("2.txt")
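
The script above groups messages by member. The date dimension mentioned at the start can be handled in much the same way. The following is a sketch under stated assumptions: the function names `count_by_date` and `count_by_hour` and the sample log text are hypothetical, and `collections.Counter` is swapped in for the hand-rolled dictionary.

```python
import re
from collections import Counter

# Capture the date and the hour of each header line; the rest is only matched.
HEADER = re.compile(
    r'^(\d{4}-\d{2}-\d{2}) (\d{2}):\d{2}:\d{2} .*\(.*?\)$',
    re.MULTILINE,
)

def count_by_date(text):
    """Count header lines (one per message) for each calendar date."""
    return Counter(date for date, _hour in HEADER.findall(text))

def count_by_hour(text):
    """Count header lines for each hour of the day, across all dates."""
    return Counter(hour for _date, hour in HEADER.findall(text))

# Hypothetical log excerpt, only to exercise the functions.
sample = (
    "2014-03-28 15:04:25 №┽◎Eagle(369029696)\n"
    "hello\n"
    "2014-03-28 15:06:10 Bob(10001)\n"
    "hi\n"
    "2014-03-29 09:00:00 №┽◎Eagle(369029696)\n"
    "morning\n"
)

by_date = count_by_date(sample)
by_hour = count_by_hour(sample)
for date, cnt in sorted(by_date.items()):
    print(date.rjust(15) + str(cnt).rjust(10))
```

Using `re.MULTILINE` with `^` and `$` anchors each match to a whole line, so message bodies that merely contain a timestamp in the middle of a sentence are less likely to be counted as headers.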
     
  • Original article: https://www.cnblogs.com/zhangbo2012/p/3700699.html