  • Web Scraping using Python Scrapy_BS4

    Use BeautifulSoup and Python to scrape a website

    Libraries:

    • urllib, for downloading the page
    • BeautifulSoup (bs4), for parsing the HTML data
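
    As a rough sketch of the download step that urllib covers, the helper below fetches a URL with a timeout and decodes the body to text. The name fetch_html, the 10-second timeout, and the UTF-8 fallback are assumptions made for illustration, not part of the original script. A data: URL is used so the example runs without network access.

```python
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Download url and return its body as text (hypothetical helper)."""
    # The with-block closes the connection even if read() raises.
    with urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Use the charset declared in the response headers, else assume UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset, errors="replace")


# A data: URL keeps this demonstration offline; a real page URL works the same way.
print(fetch_html("data:text/plain,hello"))
```

    The returned string can be passed straight to BeautifulSoup for parsing.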

    Web scraping script

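    from urllib.request import urlopen as uReq
    from bs4 import BeautifulSoup as soup

    quotes_page = "https://bluelimelearning.github.io/my-fav-quotes/"

    # Download the page; the with-block closes the connection automatically.
    with uReq(quotes_page) as uClient:
        page_html = uClient.read()

    # Parse the raw HTML with Python's built-in parser.
    page_soup = soup(page_html, "html.parser")

    # Each quote sits in its own <div class="quotes"> block.
    quotes = page_soup.find_all("div", {"class": "quotes"})

    for quote in quotes:
        # Quote text: the first <p class="aquote"> inside the block.
        fav_quote = quote.find_all("p", {"class": "aquote"})
        aquote = fav_quote[0].text.strip()

        # Author name: the first <p class="author"> inside the block.
        fav_authors = quote.find_all("p", {"class": "author"})
        author = fav_authors[0].text.strip()

        print(aquote)
        print(author)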
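
    The script above indexes element [0] of each result list, which raises IndexError if a block lacks a quote or an author. One possible variant, sketched below, separates parsing from printing and skips incomplete blocks. The function name extract_quotes and the sample markup are hypothetical; the markup only mirrors the class names the script looks for and is not copied from the real page.

```python
from bs4 import BeautifulSoup


def extract_quotes(page_html):
    """Return a list of (quote, author) tuples found in page_html."""
    page_soup = BeautifulSoup(page_html, "html.parser")
    results = []
    for block in page_soup.find_all("div", class_="quotes"):
        # find() returns None instead of raising when nothing matches.
        quote_tag = block.find("p", class_="aquote")
        author_tag = block.find("p", class_="author")
        if quote_tag is None or author_tag is None:
            continue  # skip blocks missing either piece
        results.append((quote_tag.text.strip(), author_tag.text.strip()))
    return results


# Hypothetical sample markup, assumed to resemble the structure of the page.
sample_html = """
<div class="quotes">
  <p class="aquote"> The journey of a thousand miles begins with one step. </p>
  <p class="author"> Lao Tzu </p>
</div>
<div class="quotes">
  <p class="aquote">A quote with no author paragraph.</p>
</div>
"""

pairs = extract_quotes(sample_html)
for text, name in pairs:
    print(text)
    print(name)
```

    Returning a list rather than printing inside the loop makes the result easy to reuse, for example to write it to a file.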

    The script runs successfully.

    The complete output of the scrape follows:

    I hear and i forget. I see and i remember. I do and i understand.
    Confucious
    Feeling gratitude and not expressing it is like wrapping a present and not giving it.
    William Arthur Ward
    Our greatest glory is not in never falling but in rising every time we fall.
    Confucious
    The secret of getting aheadis getting started.
    Mark Twain
    Believe you can   and you're halfway there.
    Theodore Roosevelt
    Resentment is like drinking Poison and  waiting for your enemies to die.
    Nelson Mandela
    Silence is a true friend   who never betrays.
    Confucius
    The best way to find yourself is to   lose yourself in the service of others.
    Mahatma Gandhi
    Never succumb to the temptation of bitterness.
    Martin Luther King Jnr
    The journey of a thousand miles  begins with one step.
    Lao Tzu
    It is health that is real wealth and  not pieces of gold and silver.
    Mahatma Gandhi
    Yesterday is not ours to recover  but tomorrow is ours to win or lose.
    Lyndon B Johnson
    It's not what happens to you  but how you react to it that matters .
    Epictetus
    Beware of what you become  in pursuit of what you want.
    Jim Rohn
    The best revenge  is massive success.
    Frank Sinatra
    Do not take life too seriously You will never get out of it alive.
    Elbert Hubbard
    Don't judge each day by the harvest you reap  but by the seeds that yiu plant.
    Robert Loius Stevenson
    Your attitude and not your aptitude  will determine your altitude
    Zig Ziglar
    Imagination is more important  than knowledge.
    Albert Einstein
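
    Several quotes in the output above contain runs of extra spaces, because strip() only trims whitespace at the start and end of a string. A minimal sketch of one way to tidy this up is shown below; collapse_whitespace is a hypothetical helper name, not something the original script defines.

```python
def collapse_whitespace(text):
    """Collapse every run of whitespace into one space and trim both ends."""
    # split() with no arguments splits on any whitespace run and drops empties.
    return " ".join(text.split())


# One of the quotes exactly as it appears in the scraped output.
raw_quote = "Believe you can   and you're halfway there."
print(collapse_whitespace(raw_quote))
```

    In the scraping loop, this could be applied to the extracted text in place of strip().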

  • Original article: https://www.cnblogs.com/keepmoving1113/p/11788677.html