  • Web Scraping using Python Scrapy_BS4

    Use BeautifulSoup and Python to scrape a website

    Libraries:

    • urllib, for fetching the page
    • BeautifulSoup (bs4), for parsing the HTML data
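
    To see the parsing step in isolation, here is a minimal sketch that runs BeautifulSoup on an inline HTML string instead of a live page. The sample markup and the helper name extract_quotes are hypothetical; the class names (quotes, aquote, author) are the ones the script in this post looks for.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, modeled on the class names the script targets.
SAMPLE_HTML = """
<div class="quotes">
  <p class="aquote"> The journey of a thousand miles begins with one step. </p>
  <p class="author"> Lao Tzu </p>
</div>
<div class="quotes">
  <p class="aquote">Imagination is more important than knowledge.</p>
  <p class="author">Albert Einstein</p>
</div>
"""


def extract_quotes(html):
    """Return a list of (quote, author) tuples found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for block in soup.find_all("div", class_="quotes"):
        quote = block.find("p", class_="aquote")
        author = block.find("p", class_="author")
        # Skip blocks that lack either element instead of raising.
        if quote is None or author is None:
            continue
        pairs.append((quote.get_text(strip=True), author.get_text(strip=True)))
    return pairs


if __name__ == "__main__":
    for quote, author in extract_quotes(SAMPLE_HTML):
        print(f"{quote} ({author})")
```

    Returning tuples rather than printing keeps the parsing logic easy to test without network access.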

    Web scraping script

    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    quotes_page = "https://bluelimelearning.github.io/my-fav-quotes/"

    # Download the raw HTML; the context manager closes the connection.
    with urlopen(quotes_page) as response:
        page_html = response.read()

    # Parse the page with Python's built-in HTML parser.
    page_soup = BeautifulSoup(page_html, "html.parser")

    # Each quote lives in its own <div class="quotes"> block.
    quotes = page_soup.find_all("div", {"class": "quotes"})

    for quote in quotes:
        # Quote text is in <p class="aquote">, the author in <p class="author">.
        aquote = quote.find("p", {"class": "aquote"}).text.strip()
        author = quote.find("p", {"class": "author"}).text.strip()

        print(aquote)
        print(author)
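
    The script assumes the download always succeeds. As a hedged sketch, the fetch step could be wrapped in a helper that sets a timeout and reports failures instead of raising. The name fetch_html and the 10-second default are assumptions made for illustration, not part of the original script.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Return the page body as text, or None if the request fails."""
    try:
        with urlopen(url, timeout=timeout) as response:
            # Fall back to UTF-8 when no charset is declared.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except HTTPError as err:
        # The server answered, but with an error status such as 404.
        print(f"Server returned HTTP {err.code} for {url}")
    except URLError as err:
        # The request could not be completed (DNS failure, refused connection, ...).
        print(f"Could not reach {url}: {err.reason}")
    except OSError as err:
        # Covers lower-level failures such as a timeout while reading the body.
        print(f"Network error for {url}: {err}")
    return None
```

    Calling fetch_html(quotes_page) would then return the HTML as a string that can be passed straight to BeautifulSoup, or None when the page cannot be downloaded.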

    The script runs successfully. Its complete output is shown below.

    I hear and i forget. I see and i remember. I do and i understand.
    Confucious
    Feeling gratitude and not expressing it is like wrapping a present and not giving it.
    William Arthur Ward
    Our greatest glory is not in never falling but in rising every time we fall.
    Confucious
    The secret of getting aheadis getting started.
    Mark Twain
    Believe you can   and you're halfway there.
    Theodore Roosevelt
    Resentment is like drinking Poison and  waiting for your enemies to die.
    Nelson Mandela
    Silence is a true friend   who never betrays.
    Confucius
    The best way to find yourself is to   lose yourself in the service of others.
    Mahatma Gandhi
    Never succumb to the temptation of bitterness.
    Martin Luther King Jnr
    The journey of a thousand miles  begins with one step.
    Lao Tzu
    It is health that is real wealth and  not pieces of gold and silver.
    Mahatma Gandhi
    Yesterday is not ours to recover  but tomorrow is ours to win or lose.
    Lyndon B Johnson
    It's not what happens to you  but how you react to it that matters .
    Epictetus
    Beware of what you become  in pursuit of what you want.
    Jim Rohn
    The best revenge  is massive success.
    Frank Sinatra
    Do not take life too seriously You will never get out of it alive.
    Elbert Hubbard
    Don't judge each day by the harvest you reap  but by the seeds that yiu plant.
    Robert Loius Stevenson
    Your attitude and not your aptitude  will determine your altitude
    Zig Ziglar
    Imagination is more important  than knowledge.
    Albert Einstein
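
    Several quotes above contain runs of extra spaces (for example "Believe you can   and you're halfway there."), because strip() only trims the ends of a string. A minimal sketch of collapsing the internal whitespace as well is shown here; normalize_whitespace is a hypothetical helper name. It does not repair missing spaces or typos that come from the page itself.

```python
def normalize_whitespace(text):
    """Collapse runs of spaces, tabs, and newlines into single spaces."""
    # str.split() with no arguments splits on any whitespace run and
    # drops leading and trailing whitespace.
    return " ".join(text.split())


if __name__ == "__main__":
    raw = "Believe you can   and you're halfway there."
    print(normalize_whitespace(raw))
```

    In the scraping loop, this helper could be applied to the element text in place of strip().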

  • Original article: https://www.cnblogs.com/keepmoving1113/p/11788677.html