  • Installing newspaper

    Operating system: CentOS 6.5, 32-bit

    Python environment: Python 2.7

    Installing newspaper directly with pip ran into problems installing lxml, so lxml is installed separately first.

    1. Install lxml

    http://mdba.cn/?p=86

    yum install libxml* -y
    yum install libxslt* -y
     
    wget http://lxml.de/files/lxml-3.1.2.tgz
    tar xzvf lxml-3.1.2.tgz
    cd lxml-3.1.2
    python setup.py build
    python setup.py install
     
    # Verify that the installation succeeded
    shell > python
    >>> import lxml
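A bare `import lxml` only loads the package itself; the compiled extension lives in `lxml.etree`, so importing that too is a somewhat stronger check that the build worked. The following is a minimal sketch of such a check. The helper name `check_import` is hypothetical and not part of lxml or the original post.

```python
from __future__ import print_function
import importlib


def check_import(name):
    """Return (ok, version) for a module name; version is None if unknown."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        # The module (or its compiled extension) could not be loaded.
        return (False, None)
    # Not every module exposes __version__, so fall back to None.
    return (True, getattr(module, '__version__', None))


if __name__ == '__main__':
    # lxml.etree is the C extension built by "python setup.py build".
    for name in ('lxml', 'lxml.etree'):
        ok, version = check_import(name)
        print('%-12s %s %s' % (name, 'OK' if ok else 'MISSING', version or ''))
```

If `lxml` is reported as OK but `lxml.etree` as MISSING, the likely cause is that the libxml2/libxslt development headers were not found at build time.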

    2. Install newspaper
    https://github.com/codelucas/newspaper

    pip install newspaper
    curl https://raw.github.com/codelucas/newspaper/master/download_corpora.py | python2.7

    3. Test

    from newspaper import Article
    url = 'http://edition.cnn.com/2014/08/14/world/meast/gaza-couple-wedding-at-unrwa-shelter/index.html?hpt=hp_c2'
    
    a = Article(url, language='en') # English (this CNN article is not Chinese)
    a.download()
    a.parse()
    print a.title
    
    url = 'http://www.tuicool.com/articles/fYneUz'
    a = Article(url, language='zh') # Chinese
    a.download()
    a.parse()
    print a.title
    
    url = 'http://www.tuicool.com/articles/AJJ7nu3'
    a = Article(url, language='zh') # Chinese
    a.download()
    a.parse()
    print a.title
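The test above repeats the same download/parse/print sequence for each URL. As a sketch only, the repetition could be folded into a small helper. The names `fetch_title` and `fetch_titles` are hypothetical and not part of newspaper; only `Article`, `download()`, `parse()` and `title` come from the library, as used in the original test. The `fetch` parameter is an assumption added so the loop can be exercised without network access.

```python
from __future__ import print_function


def fetch_title(url, language='zh'):
    """Download and parse one article, returning its title."""
    # Imported lazily so this module loads even if newspaper is missing.
    from newspaper import Article
    a = Article(url, language=language)
    a.download()
    a.parse()
    return a.title


def fetch_titles(urls, fetch=fetch_title):
    """Map each URL to its title, or to None if fetching failed."""
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except Exception as exc:
            # Keep going so one bad URL does not abort the whole run.
            print('failed: %s (%s)' % (url, exc))
            results[url] = None
    return results
```

Calling `fetch_titles` with the two tuicool URLs from the test would return a dict of titles keyed by URL, with `None` for any article that could not be downloaded or parsed.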
    

      



  • Original post: https://www.cnblogs.com/huiwq1990/p/3913851.html