  • Getting Chrome Browser History with Python

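    Summary: On a whim, I recently wanted to see my own browsing history in Google Chrome, so I wrote a small Python program to retrieve it.

    The source code is adapted, with minor changes, from an article by a foreign developer.

    The code is as follows: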

    #Count visits recorded in the Google Chrome browsing history
    #time@2017-08-30
    #author@yooongchun
    
    import os  
    import sqlite3  
    import operator  
    from collections import OrderedDict  
    import matplotlib.pyplot as plt  
    
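    def parse(url):  
        """Return the domain of url without a leading 'www.', or None if url has no '//'."""
        try:  
            parsed_url_components = url.split('//')  
            sublevel_split = parsed_url_components[1].split('/', 1)  
            domain = sublevel_split[0]
            # strip only a leading "www." so names such as "awww.example" stay intact
            if domain.startswith("www."):
                domain = domain[4:]
            return domain
        except IndexError:  
            print('URL format error!') 
            return None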
    
    def analyze(results):  
        prompt = input("[.] Type <c> to print or <p> to plot\n[>] ")
    
        if prompt == "c":
            # write one "site<TAB>count" line per domain
            with open('./history.txt','w') as f:
                for site, count in results.items():
                    f.write('{}\t{}\n'.format(site, count))
        elif prompt == "p":
            key=[]
            value=[]
            for k,v in results.items():
                key.append(k)
                value.append(v)
            n=min(25, len(value))  # plot at most 25 sites, fewer if the history is short
            X=range(n)
            Y=value[:n]
            plt.bar(X,Y,align='edge')
            plt.xticks(X,key[:n],rotation=45)
            for x,y in zip(X,Y):
                plt.text(x+0.4, y+0.05,str(y), ha='center', va= 'bottom')
            plt.show()
        else:  
            print("[.] Uh?")  
            quit()  
    
    if __name__=='__main__':
        #path to user's history database (Chrome)  
        data_path=r'C:\Users\CNYOZHA22\AppData\Local\Google\Chrome\User Data\Default'
    
        #Chrome stores the history in an SQLite file named "History"
        history_db = os.path.join(data_path, 'History')  
    
        #querying the db  
        c = sqlite3.connect(history_db)  
        cursor = c.cursor()  
        select_statement = "SELECT urls.url, urls.visit_count FROM urls, visits WHERE urls.id = visits.url;"  
        cursor.execute(select_statement)  
    
        results = cursor.fetchall() #list of (url, visit_count) tuples  
        c.close()
    
        sites_count = {} #dict makes iterations easier :D  
    
        for url, count in results:  
            domain = parse(url)  
            if domain is None:  
                continue  #skip URLs that could not be parsed  
            sites_count[domain] = sites_count.get(domain, 0) + 1  
    
        sites_count_sorted = OrderedDict(sorted(sites_count.items(), key=operator.itemgetter(1), reverse=True))  
    
        analyze(sites_count_sorted)  
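
One practical caveat: while Chrome is running, opening its history file directly often fails with "database is locked". A minimal sketch of one way around this, assuming it is acceptable to read a temporary copy of the file, is shown below. The function name `count_visits_by_domain` and the copy name `History_copy` are made up for illustration and are not part of the original script. The sketch also uses `urllib.parse.urlparse` and `collections.Counter` from the standard library in place of the manual string splitting and dict counting above.

```python
import os
import shutil
import sqlite3
import tempfile
from collections import Counter
from urllib.parse import urlparse


def count_visits_by_domain(history_db):
    """Count visits per domain, reading a temporary copy of the database.

    history_db is the path to a Chrome "History" SQLite file.
    """
    # Copy the file first so the query runs against the copy,
    # not against the file the browser may currently hold locked.
    tmp_dir = tempfile.mkdtemp()
    tmp_db = os.path.join(tmp_dir, 'History_copy')  # hypothetical name
    shutil.copy2(history_db, tmp_db)
    try:
        conn = sqlite3.connect(tmp_db)
        try:
            # Same join as in the script above: one row per recorded visit.
            rows = conn.execute(
                "SELECT urls.url FROM urls, visits WHERE urls.id = visits.url"
            ).fetchall()
        finally:
            conn.close()
    finally:
        shutil.rmtree(tmp_dir, ignore_errors=True)

    counts = Counter()
    for (url,) in rows:
        domain = urlparse(url).netloc
        if domain.startswith('www.'):
            domain = domain[4:]
        if domain:  # URLs without a network location are skipped
            counts[domain] += 1
    return counts
```

With this helper, `count_visits_by_domain(history_db).most_common(25)` would give the 25 most visited domains as `(domain, count)` pairs, which could then be passed to the plotting code above. A copy made while the browser is writing may miss the most recent visits.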
  • Original article: https://www.cnblogs.com/yczha/p/13160264.html