  • Web Scraping using Python Scrapy_BS4

    Use BeautifulSoup and Python to scrape a website

    Libraries:

    • urllib, for downloading the page
    • BeautifulSoup (bs4), for parsing HTML data

    Web scraping script

    from urllib.request import urlopen
    from bs4 import BeautifulSoup
    
    quotes_page = "https://bluelimelearning.github.io/my-fav-quotes/"
    
    # Download the page; the context manager closes the connection when done.
    with urlopen(quotes_page) as client:
        page_html = client.read()
    
    page_soup = BeautifulSoup(page_html, "html.parser")
    
    # Each quote sits in its own <div class="quotes"> block.
    quotes = page_soup.find_all("div", {"class": "quotes"})
    
    for quote in quotes:
        # The quote text is the first <p class="aquote"> in the block.
        fav_quote = quote.find_all("p", {"class": "aquote"})
        aquote = fav_quote[0].text.strip()
    
        # The author is the first <p class="author"> in the block.
        fav_authors = quote.find_all("p", {"class": "author"})
        author = fav_authors[0].text.strip()
    
        print(aquote)
        print(author)
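
    The extraction logic can also be tried offline, without fetching the page. The following is a minimal sketch under stated assumptions: SAMPLE_HTML is hypothetical markup modeled only on the class names the script looks for (quotes, aquote, author), and extract_quotes is an illustrative helper name, not part of the original script. Unlike the script, it skips a block that lacks either element instead of raising an IndexError.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption), modeled on the class names
# that the scraping script searches for.
SAMPLE_HTML = """
<div class="quotes">
  <p class="aquote"> The journey of a thousand miles begins with one step. </p>
  <p class="author">Lao Tzu</p>
</div>
<div class="quotes">
  <p class="aquote">A quote with no author paragraph.</p>
</div>
"""


def extract_quotes(html):
    """Return (quote, author) pairs, skipping blocks that lack either element."""
    page_soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for block in page_soup.find_all("div", class_="quotes"):
        quote_tag = block.find("p", class_="aquote")
        author_tag = block.find("p", class_="author")
        if quote_tag is None or author_tag is None:
            continue  # incomplete block: skip it rather than raise
        pairs.append((quote_tag.get_text().strip(), author_tag.get_text().strip()))
    return pairs


if __name__ == "__main__":
    for quote, author in extract_quotes(SAMPLE_HTML):
        print(quote)
        print(author)
```

    Returning a list of pairs rather than printing inside the loop keeps the parsing step separate from whatever is done with the results afterwards.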

    Running the scraping script against the live quotes page succeeds.

    The complete output follows:

    I hear and i forget. I see and i remember. I do and i understand.
    Confucious
    Feeling gratitude and not expressing it is like wrapping a present and not giving it.
    William Arthur Ward
    Our greatest glory is not in never falling but in rising every time we fall.
    Confucious
    The secret of getting aheadis getting started.
    Mark Twain
    Believe you can   and you're halfway there.
    Theodore Roosevelt
    Resentment is like drinking Poison and  waiting for your enemies to die.
    Nelson Mandela
    Silence is a true friend   who never betrays.
    Confucius
    The best way to find yourself is to   lose yourself in the service of others.
    Mahatma Gandhi
    Never succumb to the temptation of bitterness.
    Martin Luther King Jnr
    The journey of a thousand miles  begins with one step.
    Lao Tzu
    It is health that is real wealth and  not pieces of gold and silver.
    Mahatma Gandhi
    Yesterday is not ours to recover  but tomorrow is ours to win or lose.
    Lyndon B Johnson
    It's not what happens to you  but how you react to it that matters .
    Epictetus
    Beware of what you become  in pursuit of what you want.
    Jim Rohn
    The best revenge  is massive success.
    Frank Sinatra
    Do not take life too seriously You will never get out of it alive.
    Elbert Hubbard
    Don't judge each day by the harvest you reap  but by the seeds that yiu plant.
    Robert Loius Stevenson
    Your attitude and not your aptitude  will determine your altitude
    Zig Ziglar
    Imagination is more important  than knowledge.
    Albert Einstein
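
    Several lines of the output contain runs of repeated spaces, since strip() only trims the ends of each string. If that matters, one possible cleanup is sketched below. It assumes that collapsing all internal whitespace is acceptable, and collapse_spaces is a hypothetical helper name. It does not repair words that the page itself runs together or misspells.

```python
def collapse_spaces(text):
    """Collapse every run of whitespace into one space and trim both ends."""
    # str.split() with no arguments splits on any whitespace run and
    # discards empty pieces, so joining with " " normalizes the spacing.
    return " ".join(text.split())


if __name__ == "__main__":
    raw = "Believe you can   and you're halfway there."
    print(collapse_spaces(raw))
```

    In the scraping loop this could be applied in place of the bare strip() calls, for example collapse_spaces(fav_quote[0].text).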

  • Original article: https://www.cnblogs.com/keepmoving1113/p/11788677.html