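  • Using the built-in image export feature of Adobe Acrobat 7.0 Professional (repost)

    I have been looking into PDF-to-image conversion lately. Nothing I found online was practical, and I was wary of third-party programs, so I decided to use the image export feature built into Adobe Acrobat 7.0 Professional. Material on it is very scarce. After a long search, the post below turned out to be the most useful. To be honest, I am only using someone else's work here; none of it is my own.

    Original: http://fidodido2010.spaces.live.com/blog/cns!42DBF9483C966838!129.entry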

    ---------------------------------------------------------------------------------------------------------------------------

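    A COM solution for converting PDF to other formats
    Background:

    I have been doing a lot of conversion between image formats recently. LEADTOOLS R13 had always handled these conversions, but it turned out that some PDFs cannot be opened with LEADTOOLS, so I had to pick Acrobat up again.

    After a good deal of digging I found an approach, but then got stuck on GetJSObject(), a method of AcroExch.PDDoc. It is supposed to return a JavaScript object, but C++ has no corresponding type declaration for it, so all you get is an IDispatch, and I could not work out the calling mechanism or its type information at all. Nearly everything I could find by searching was written for VB. Only one poor soul had asked about "Using GetJSObject() in C++", and the answer was "since it involves low level COM API's to get directly to the IDispatch for the object.".

    I was about to bang my head on the floor, or move the whole project over to VB, when it occurred to me: why not write a small COM component in VB dedicated to calling this interface, and call that component from C++?

    So I got to work.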

     

     

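    Writing the Acrobat COM component in VB6:

    Create a VB6 ActiveX DLL project, rename the project to MPDF2SIMG (Multi-page PDF to Single-page Image), rename the class module to Converter, and add a reference to "Adobe Acrobat 7.0 Type Library". The full code of the module is as follows: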

    Option Explicit

    Dim oApp As Acrobat.CAcroApp
    Dim oMultiPageDoc As Acrobat.CAcroPDDoc
    Dim oSinglePageDoc As Acrobat.CAcroPDDoc
    Dim JSO As Object


    Private Sub Class_Initialize()
        ' VB6 requires Set when assigning object references.
        Set oApp = CreateObject("AcroExch.App")
        Set oMultiPageDoc = CreateObject("AcroExch.PDDoc")
    End Sub

    ' Splits SourcePDF into single-page documents and saves each page to
    ' TargetFolder. Returns the number of pages converted, or -1 on failure.
    Public Function ConvertPDF(ByVal SourcePDF As String, _
        ByVal TargetFolder As String, _
        ByVal TargetFormat As String, _
        ByVal StartImgNumber As Integer) As Integer

        Dim iNumbers As Integer
        Dim i As Integer
        Dim OutPath As String
        Dim OutFile As String

        ' Make sure the target folder ends with a backslash.
        OutPath = TargetFolder
        If Right(OutPath, 1) <> "\" Then OutPath = OutPath & "\"

        On Error GoTo err1

        ' Open returns False when the file cannot be opened.
        If Not oMultiPageDoc.Open(SourcePDF) Then GoTo err1

        iNumbers = oMultiPageDoc.GetNumPages

        For i = 0 To iNumbers - 1
            ' Copy page i into a new, empty single-page document.
            Set oSinglePageDoc = CreateObject("AcroExch.PDDoc")
            oSinglePageDoc.Create
            oSinglePageDoc.InsertPages -1, oMultiPageDoc, i, 1, 0
            ' SaveAs is only reachable through the JavaScript object.
            Set JSO = oSinglePageDoc.GetJSObject
            ' Note: the extension is fixed to ".tif" regardless of TargetFormat.
            OutFile = OutPath & Format(i + StartImgNumber, "00000000") & _
                ".tif"
            JSO.SaveAs OutFile, "com.adobe.acrobat." & TargetFormat
            Set JSO = Nothing
            Set oSinglePageDoc = Nothing
        Next

        oMultiPageDoc.Close
        oApp.CloseAllDocs
        ConvertPDF = iNumbers
        Exit Function
    err1:
        ConvertPDF = -1
    End Function

    Private Sub Class_Terminate()
        Set oMultiPageDoc = Nothing
        Set oSinglePageDoc = Nothing
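    End Sub

    Then compile it into a DLL.

    How to use the DLL:

    1. Run regsvr32 mpdf2simg.dll on the machine to register the DLL.

    2. In the C++ program that uses the DLL, import the type library of the COM component with the following code:

    #import "E:\project\Converter\mpdf2simg.dll"
    using namespace MPDF2SIMG;

    3. Declare a COM smart-pointer variable and create an instance with the following code: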

    _ConverterPtr    pConverter;
    HRESULT hr = pConverter.CreateInstance(_T("MPDF2SIMG.Converter"));
    if(FAILED(hr))
    {
        //do something if failed.
        ...
    }

    4. Call the COM interface:

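    // _bstr_t releases each BSTR automatically; AllocSysString() would leak it here.
    int nConv = pConverter->ConvertPDF(
        _bstr_t(_T("xxxxxx\\source.pdf")),
        _bstr_t(_T("d:\\TargetPath")),
        _bstr_t(_T("tiff")),
        nStart);

    This call converts the specified SourcePDF into consecutive single-page TIFF files under TargetPath. The file names are 8-digit numbers, and the numbering starts at the value given by nStart.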

    On success the call returns the number of pages converted; on failure it returns -1.
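The naming rule described above (target folder, a trailing backslash if one is missing, then an 8-digit zero-padded number) can be reproduced on the C++ side, for example to locate the files after the call returns. The following is only a sketch: `makeOutputPath` is a hypothetical helper that is not part of the DLL, and it assumes the `.tif` extension used by the VB6 code.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of MPDF2SIMG): predicts the path the VB6
// loop writes for a given zero-based page index and start number.
std::string makeOutputPath(std::string folder, int pageIndex, int startNumber)
{
    // Mirror the VB6 check: append a backslash only when one is missing.
    if (folder.empty() || folder.back() != '\\')
        folder += '\\';

    // Same result as Format(n, "00000000") for non-negative numbers.
    char name[32];
    std::snprintf(name, sizeof(name), "%08d.tif", pageIndex + startNumber);
    return folder + name;
}
```

With the call shown earlier and nStart set to 1, the first page would be expected at `makeOutputPath("d:\\TargetPath", 0, 1)`.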

     

     

    Other supported formats:

    Value                                   Valid extensions
    "com.adobe.acrobat.eps"                 eps
    "com.adobe.acrobat.html-3-20"           html, htm
    "com.adobe.acrobat.html-4-01-css-1-00"  html, htm
    "com.adobe.acrobat.jpeg"                jpeg, jpg, jpe
    "com.adobe.acrobat.jp2k"                jpf, jpx, jp2, j2k, j2c, jpc
    "com.adobe.acrobat.doc"                 doc
    "com.adobe.acrobat.png"                 png
    "com.adobe.acrobat.ps"                  ps
    "com.adobe.acrobat.rtf"                 rtf
    "com.adobe.acrobat.accesstext"          txt
    "com.adobe.acrobat.plain-text"          txt
    "com.adobe.acrobat.tiff"                tiff, tif
    "com.adobe.acrobat.xml-1-00"            xml
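If a caller wants to pick the conversion ID from a file extension, the table can be turned into a lookup. This is a sketch only: `conversionIdForExtension` is a hypothetical name, and the extensions html, htm and txt are deliberately left out because the table lists two IDs for each of them, so the caller has to choose.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper: maps a file extension (without the dot) to the
// conversion ID from the table. Returns an empty string for unknown or
// ambiguous extensions (html, htm, txt).
std::string conversionIdForExtension(std::string ext)
{
    // Compare case-insensitively by lowering the input first.
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    static const std::map<std::string, std::string> table = {
        {"eps",  "com.adobe.acrobat.eps"},
        {"jpeg", "com.adobe.acrobat.jpeg"},
        {"jpg",  "com.adobe.acrobat.jpeg"},
        {"jpe",  "com.adobe.acrobat.jpeg"},
        {"jpf",  "com.adobe.acrobat.jp2k"},
        {"jpx",  "com.adobe.acrobat.jp2k"},
        {"jp2",  "com.adobe.acrobat.jp2k"},
        {"j2k",  "com.adobe.acrobat.jp2k"},
        {"j2c",  "com.adobe.acrobat.jp2k"},
        {"jpc",  "com.adobe.acrobat.jp2k"},
        {"doc",  "com.adobe.acrobat.doc"},
        {"png",  "com.adobe.acrobat.png"},
        {"ps",   "com.adobe.acrobat.ps"},
        {"rtf",  "com.adobe.acrobat.rtf"},
        {"tiff", "com.adobe.acrobat.tiff"},
        {"tif",  "com.adobe.acrobat.tiff"},
        {"xml",  "com.adobe.acrobat.xml-1-00"},
    };

    auto it = table.find(ext);
    return it == table.end() ? std::string() : it->second;
}
```

Note that ConvertPDF in this post expects only the suffix after "com.adobe.acrobat." (for example "tiff"), so the prefix would have to be stripped before passing the value to it.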


     

     

     

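    Known issues and bugs:

    If the C++ program is compiled with the multi-byte character set and TargetPath contains Chinese characters, the conversion fails: the ConvertPDF call brings up an Acrobat dialog saying the file cannot be saved, and after clicking OK, ConvertPDF returns -1. The Unicode character set has not been tested.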
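One cautious way to avoid hitting that dialog is to check the target path before calling ConvertPDF. The sketch below is an assumption on my part rather than something from the original post: `hasNonAsciiBytes` is a hypothetical helper that reports whether a narrow (multi-byte) string contains any byte outside 7-bit ASCII, which is always the case for Chinese characters in encodings such as GBK.

```cpp
#include <string>

// Hypothetical pre-check: true when the narrow string contains at least one
// byte above 0x7F, i.e. something other than plain ASCII.
bool hasNonAsciiBytes(const std::string& path)
{
    for (unsigned char c : path)
    {
        if (c > 0x7F)
            return true;
    }
    return false;
}
```

A caller could reject such a path up front, or convert into an ASCII-only temporary folder and move the files afterwards.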

     

     

    Additional notes:

    To use the Acrobat COM interfaces, Adobe Acrobat (not Reader) must be installed on the machine.

    -----------------------------------My own version is simpler, using Adobe Acrobat Professional-------------------------------

            Dim gApp As Acrobat.CAcroApp
            Dim oMultiPageDoc As Acrobat.CAcroPDDoc
            Dim oSinglePageDoc As Acrobat.CAcroPDDoc
            Dim iNumbers As Integer
            Dim StartImgNumber As Integer
            Dim OutFile As String
            Dim i As Integer
            Dim jso As Object
            gApp = CreateObject("AcroExch.App")
            oMultiPageDoc = CreateObject("AcroExch.PDDoc")

            'The PDF and the generated files must be in the same folder
            If oMultiPageDoc.Open("F:\test.pdf") Then

                iNumbers = oMultiPageDoc.GetNumPages

                For i = 0 To iNumbers - 1
                    oSinglePageDoc = CreateObject("AcroExch.PDDoc")
                    oSinglePageDoc.Create()
                    oSinglePageDoc.InsertPages(-1, oMultiPageDoc, i, 1, 0)
                    jso = oSinglePageDoc.GetJSObject
                    OutFile = Format(i + StartImgNumber, "00000000") & ".png"
                    jso.SaveAs("F:\" & OutFile, "com.adobe.acrobat.png")
                    jso = Nothing
                    oSinglePageDoc = Nothing
                Next

            End If

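    -----Bonus: a Ghostscript version----------------

    "C:\Program Files\gs\gs8.61\bin\gswin32c.exe" -dSAFER -dBATCH -dNOPAUSE -r300 -sDEVICE=png16m -dGraphicsAlphaBits=4 -sOutputFile="F:\test\%03d.png" "F:\test.pdf"

    Ghostscript replaces %03d in the output file name with the page number, so each page goes to its own PNG file.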
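If the Ghostscript command is launched from a program instead of typed by hand, the argument list can be assembled in code. The sketch below only builds the arguments and does not start the process. `buildGhostscriptArgs` is a hypothetical name, the switches are the ones from the command above, and the output pattern is assumed to contain a page-number placeholder such as %03d.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the Ghostscript arguments (without the
// executable itself) for rendering a PDF to 24-bit PNG files.
std::vector<std::string> buildGhostscriptArgs(const std::string& inputPdf,
                                              const std::string& outputPattern,
                                              int dpi)
{
    return {
        "-dSAFER", "-dBATCH", "-dNOPAUSE",
        "-r" + std::to_string(dpi),        // resolution in dots per inch
        "-sDEVICE=png16m",
        "-dGraphicsAlphaBits=4",
        "-sOutputFile=" + outputPattern,   // e.g. F:\test\%03d.png
        inputPdf,                          // the input file comes last
    };
}
```

Passing the arguments as separate strings to a process API also avoids quoting problems with paths that contain spaces.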

  • Original post: https://www.cnblogs.com/panzhilei/p/1846806.html