  • ipynb to pdf

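    Q: How do I convert a Jupyter notebook to a PDF document?
    A: I tried several Python packages, xhtml2pdf among them, and none of them worked.
    The official documentation mentions pandoc as another route, but installing a heavyweight LaTeX distribution plus pandoc seemed like too much trouble.

    Fortunately, I found an open-source alternative: the wkhtmltopdf program.
    A Python script calls wkhtmltopdf by running it as a command-line instruction, and that does the job. It is exactly what I was hoping for: simple and elegant.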

    Introduction to wkhtmltopdf

    wkhtmltopdf is a self-contained executable (written in C++).

    The basic invocation is:

    "c:\Program Files\bin\wkhtmltopdf.exe" https://github.com/mementum/backtrader/blob/master/docs2/signal_strategy/signal_strategy.rst signal_strategy.pdf

    Loading pages (1/6)
    Counting pages (2/6)
    Resolving links (4/6)
    Loading headers and footers (5/6)
    Printing pages (6/6)
    Done
    

    C:\Documents and Settings\Administrator\duanqs\strategy_study>dir *.pdf

     Volume in drive C is 160GB_XP
     Volume Serial Number is EC5F-C44B
    
     Directory of C:\Documents and Settings\Administrator\duanqs\strategy_study
    
    2017-04-17  14:47           120,295 signal_strategy.pdf
    2017-04-17  13:32           597,111 backtest.pdf
                   2 File(s)        717,406 bytes
                   0 Dir(s)  19,999,031,296 bytes free
    

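    You can test it on the command line first. For any other needs, run wkhtmltopdf --help on the command line to look up the available options.

    For a very long page, you can use this command: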

    wkhtmltopdf.exe http://passport.yupsky.com/ac/register e:\yupskyreg.pdf -H --outline

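    Here:
    -H displays the extended help (it is a help switch, not something the conversion itself needs)
    --outline adds an outline (bookmarks) panel on the left side of the PDF (this is the default setting)

    It can also convert in batches: list several input pages, separated by spaces, before the output file name.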
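
    The batch form described above can be sketched in Python as follows. This is a minimal sketch, not the author's script: the helper names build_command and convert are hypothetical, and the executable path is only an assumed install location. It places every input page before the single output file, which is how the wkhtmltopdf command line expects them.

```python
import subprocess

# Assumed install location; adjust to wherever wkhtmltopdf lives on your machine.
WKHTMLTOPDF = "C:/Program Files/wkhtmltopdf/bin/wkhtmltopdf.exe"


def build_command(inputs, output, exe=WKHTMLTOPDF):
    """Return the argument list: executable, every input page, then the output PDF."""
    if not inputs:
        raise ValueError("at least one input page is required")
    return [exe] + list(inputs) + [output]


def convert(inputs, output, exe=WKHTMLTOPDF):
    """Run wkhtmltopdf and return its exit code (0 means success)."""
    # Passing a list (no shell=True) avoids quoting problems with spaces in paths.
    return subprocess.call(build_command(inputs, output, exe))
```

    For example, convert(["a.html", "b.html"], "combined.pdf") would ask wkhtmltopdf to render both pages into one PDF; the file names here are placeholders.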

    Python script (a .py script that wraps the wkhtmltopdf.exe command line):

    
    # coding: utf-8
    
    '''
    IPython/Jupyter Problems saving notebook as PDF - Stack Overflow  
    http://stackoverflow.com/questions/29156653/ipython-jupyter-problems-saving-notebook-as-pdf
    
    This Python script has a GUI for selecting, in a file explorer dialog, the IPython Notebook you want to convert to PDF.
    The approach with wkhtmltopdf is the only approach I found that works well and provides high-quality PDFs.
    Other approaches described here are problematic: syntax highlighting does not work or graphs are messed up.
    
    You'll need to install wkhtmltopdf: http://wkhtmltopdf.org/downloads.html
    and Nbconvert
    
    pip install nbconvert
    # OR
    conda install nbconvert
    
    '''
    # Script adapted from CloudCray
    # Original Source: https://gist.github.com/CloudCray/994dd361dece0463f64a
    # 2016--06-29
    # This will create both an HTML and a PDF file
    
    import subprocess
    import os
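    try:  # Python 3 module names
        from tkinter import Tk
        from tkinter.filedialog import askopenfilename
    except ImportError:  # Python 2 module names, as in the original script
        from Tkinter import Tk
        from tkFileDialog import askopenfilename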
    
    WKHTMLTOPDF_PATH = "C:/Program Files/wkhtmltopdf/bin/wkhtmltopdf"  # or wherever you keep it
    
    def export_to_html(filename):
        # "ipython nbconvert" is the older, deprecated spelling of this command
        cmd = 'jupyter nbconvert --to html "{0}"'
        subprocess.call(cmd.format(filename), shell=True)
        return filename.replace(".ipynb", ".html")
    
    
    def convert_to_pdf(filename):
        cmd = '"{0}" "{1}" "{2}"'.format(WKHTMLTOPDF_PATH, filename, filename.replace(".html", ".pdf"))
        subprocess.call(cmd, shell=True)
        return filename.replace(".html", ".pdf")
    
    
    def export_to_pdf(filename):
        fn = export_to_html(filename)
        return convert_to_pdf(fn)
    
    def main():
        print("Export IPython notebook to PDF")
        print("    Please select a notebook:")
    
        Tk().withdraw() # Starts in folder from which it is started, keep the root window from appearing 
        x = askopenfilename() # show an "Open" dialog box and return the path to the selected file
        x = str(x.split("/")[-1])
    
        print(x)
    
        if not x:
            print("No notebook selected.")
            return 0
        else:
            fn = export_to_pdf(x)
            print("File exported as:\n\t{0}".format(fn))
            return 1
    
    main()
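
    The script above keeps only the base name of the selected file, so it has to be started from the notebook's own folder, and it derives output names with str.replace. A hedged variant of the two conversion steps is sketched below; it keeps the full path and passes argument lists instead of shell strings. The names with_extension and export_notebook are hypothetical, the wkhtmltopdf location is the same assumed path as in the script, and --output-dir is used so that the HTML file lands next to the notebook.

```python
import os
import subprocess

# Same assumed location as in the script above; adjust as needed.
WKHTMLTOPDF_PATH = "C:/Program Files/wkhtmltopdf/bin/wkhtmltopdf"


def with_extension(path, ext):
    """Swap the file extension while keeping the directory part intact."""
    root, _ = os.path.splitext(path)
    return root + ext


def export_notebook(notebook_path):
    """Convert notebook -> HTML -> PDF and return the path of the PDF."""
    notebook_path = os.path.abspath(notebook_path)
    out_dir = os.path.dirname(notebook_path)
    html_path = with_extension(notebook_path, ".html")
    pdf_path = with_extension(notebook_path, ".pdf")
    # Step 1: nbconvert writes <name>.html into out_dir.
    subprocess.check_call(
        ["jupyter", "nbconvert", "--to", "html", "--output-dir", out_dir, notebook_path]
    )
    # Step 2: wkhtmltopdf renders that HTML file to PDF.
    subprocess.check_call([WKHTMLTOPDF_PATH, html_path, pdf_path])
    return pdf_path
```

    check_call raises CalledProcessError when either tool exits with a non-zero status, so a failed step does not silently produce a missing file.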
    

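    For the record, here is how my attempt with xhtml2pdf went.

    After installing it and writing a script, the run got stuck mainly inside the html5lib package.
    The exceptions involved:
    inputstream
    CSS parser
    and so on.
    I could not resolve them, so I gave up on it.

    Install xhtml2pdf, then switch html5lib from the version pip pulled in (0.999999999) to version 1.0b8.

    Here is the log: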

    C:\Documents and Settings\Administrator>pip install xhtml2pdf
    Collecting xhtml2pdf
      Downloading xhtml2pdf-0.0.6.zip (120kB)
        100% |████████████████████████████████| 122kB 467kB/s
    Collecting html5lib (from xhtml2pdf)
      Using cached html5lib-0.999999999-py2.py3-none-any.whl
    Collecting pyPdf2 (from xhtml2pdf)
      Downloading PyPDF2-1.26.0.tar.gz (77kB)
        100% |████████████████████████████████| 81kB 10kB/s
    Requirement already satisfied: Pillow in d:\anaconda2\lib\site-packages (from xhtml2pdf)
    Collecting reportlab>=2.2 (from xhtml2pdf)
      Downloading reportlab-3.4.0-cp27-cp27m-win32.whl (2.1MB)
        100% |████████████████████████████████| 2.1MB 261kB/s
    Collecting webencodings (from html5lib->xhtml2pdf)
      Downloading webencodings-0.5.1-py2.py3-none-any.whl
    Requirement already satisfied: setuptools>=18.5 in d:\anaconda2\lib\site-packages (from html5lib->xhtml2pdf)
    Requirement already satisfied: six in d:\anaconda2\lib\site-packages (from html5lib->xhtml2pdf)
    Requirement already satisfied: pip>=1.4.1 in d:\anaconda2\lib\site-packages (from reportlab>=2.2->xhtml2pdf)
    Requirement already satisfied: packaging>=16.8 in d:\anaconda2\lib\site-packages (from setuptools>=18.5->html5lib->xhtml
    2pdf)
    Requirement already satisfied: appdirs>=1.4.0 in d:\anaconda2\lib\site-packages (from setuptools>=18.5->html5lib->xhtml2
    pdf)
    Requirement already satisfied: pyparsing in d:\anaconda2\lib\site-packages (from packaging>=16.8->setuptools>=18.5->html
    5lib->xhtml2pdf)
    Building wheels for collected packages: xhtml2pdf, pyPdf2
      Running setup.py bdist_wheel for xhtml2pdf ... done
      Stored in directory: C:\Documents and Settings\Administrator\Local Settings\Application Data\pip\Cache\wheels\ec\eb\db
    \13a2be9c15f492c65086709a69042924ebfb7aa4c4cc7284f1
      Running setup.py bdist_wheel for pyPdf2 ... done
      Stored in directory: C:\Documents and Settings\Administrator\Local Settings\Application Data\pip\Cache\wheels\86\6a\6a
    \1ce004a5996894d33d93e1fb1b67c30973dc945cc5875a1dd0
    Successfully built xhtml2pdf pyPdf2
    Installing collected packages: webencodings, html5lib, pyPdf2, reportlab, xhtml2pdf
    Successfully installed html5lib-0.999999999 pyPdf2-1.26.0 reportlab-3.4.0 webencodings-0.5.1 xhtml2pdf-0.0.6
    
    C:\Documents and Settings\Administrator>pip install html5lib==1.0b8
    Collecting html5lib==1.0b8
      Downloading html5lib-1.0b8.tar.gz (889kB)
        100% |████████████████████████████████| 890kB 311kB/s
    Requirement already satisfied: six in d:\anaconda2\lib\site-packages (from html5lib==1.0b8)
    Building wheels for collected packages: html5lib
      Running setup.py bdist_wheel for html5lib ... done
      Stored in directory: C:\Documents and Settings\Administrator\Local Settings\Application Data\pip\Cache\wheels\d4\d1\bb
    \a6b6f9f204af55c9bb8c97eae2a78b690b7150a7b850bb9403
    Successfully built html5lib
    Installing collected packages: html5lib
      Found existing installation: html5lib 0.999999999
        Uninstalling html5lib-0.999999999:
          Successfully uninstalled html5lib-0.999999999
    Successfully installed html5lib-1.0b8
    
    C:\Documents and Settings\Administrator>
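
    Because the outcome depends on which html5lib release ends up installed next to xhtml2pdf, it can help to confirm the versions before debugging further. The following is a minimal sketch for Python 3.8 or later (the session above ran Python 2.7 under Anaconda2, where importlib.metadata is not available); the helper name installed_version is hypothetical.

```python
from importlib import metadata  # standard library from Python 3.8 onward


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Report the two packages involved in the failed attempt.
    for name in ("xhtml2pdf", "html5lib"):
        print(name, installed_version(name))
```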
    
    
  • Original post: https://www.cnblogs.com/duan-qs/p/6722557.html