zoukankan      html  css  js  c++  java
  • 抓取网页内容生成kindle电子书

    参考:

    • http://calibre-ebook.com/download_linux
    • http://blog.codinglabs.org/articles/convert-html-to-kindle-book.html

    The Linux Command Line

    #TLCL.recipe
    from calibre.web.feeds.recipes import BasicNewsRecipe
    class The_Linux_Command_Line(BasicNewsRecipe):
     
        title = 'The Linux Command Line'
        description = 'The Linux Command Line'
        cover_url = 'http://img5.douban.com/lpic/s7056078.jpg'
     
        url_pre = 'http://billie66.github.io/TLCL/book/'
        no_stylesheets = True
        keep_only_tags = [{ 'class': 'typo' }]    #内容的寻找范围
     
        def parse_index(self):
            soup = self.index_to_soup(self.url_pre)#目录页
     
            div = soup.find('div', {'class': 'contents'})#目录页的寻找范围
     
            articles = []
            for link in div.findAll('a'):
                    
                til = link.contents[0].strip()
                url = self.url_pre + link['href']
                a = { 'title': til, 'url': url }
     
                articles.append(a)
     
            results = [('The Linux Command Line', articles)]
     
            return results
  • 相关阅读:
    php错误抑制符
    php执行运算符
    php中一个经典的!==的用法
    php实现简单验证码的功能
    jquery是什么
    php连接符
    php与java语法的区别
    考雅思策略
    php魔术常量
    PHP中数据类型转换的三种方式
  • 原文地址:https://www.cnblogs.com/flowjacky/p/4461595.html
Copyright © 2011-2022 走看看