  • Web Scraping using Python Scrapy_BS4

    Use BeautifulSoup and Python to scrape a website

    Libraries:

    • urllib: fetching the page
    • BeautifulSoup (bs4): parsing HTML data
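
As a rough sketch of the urllib step on its own, the helper below downloads a page and reports a failure instead of raising. The function name `fetch_html`, the 10-second timeout, and the `User-Agent` value are illustrative assumptions, not part of the original post.

```python
from urllib.error import URLError
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10):
    """Return the response body as bytes, or None if the request fails.

    Hypothetical helper: the name, timeout, and header are assumptions.
    """
    # Some servers reject requests without a browser-like User-Agent.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        # The with-block closes the connection even if read() fails.
        with urlopen(request, timeout=timeout) as response:
            return response.read()
    except URLError as err:
        # HTTPError is a subclass of URLError, so HTTP error codes land here too.
        print(f"Could not fetch {url}: {err}")
        return None
```

A caller would check for `None` before handing the bytes to the parser, for example `html = fetch_html(quotes_page)`.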

    Web scraping script

    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    quotes_page = "https://bluelimelearning.github.io/my-fav-quotes/"

    # Download the page; the with-block closes the connection automatically.
    with urlopen(quotes_page) as response:
        page_html = response.read()

    # Parse the HTML with Python's built-in parser.
    page_soup = BeautifulSoup(page_html, "html.parser")

    # Each quote sits in its own <div class="quotes"> block.
    quotes = page_soup.find_all("div", {"class": "quotes"})

    for quote in quotes:
        # The quote text and the author are separate <p> elements.
        fav_quote = quote.find_all("p", {"class": "aquote"})
        aquote = fav_quote[0].text.strip()

        fav_authors = quote.find_all("p", {"class": "author"})
        author = fav_authors[0].text.strip()

        print(aquote)
        print(author)
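
To try the parsing step without touching the network, the sketch below runs the same kind of lookup on an inline HTML string. The sample markup is hypothetical: it only mirrors the class names (`quotes`, `aquote`, `author`) the script searches for, and the real page may be structured differently. The helper name `extract_quotes` is also an assumption. Unlike the script, it tolerates a block with no author instead of failing on index `[0]`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mirrors the class names used in the script.
SAMPLE_HTML = """
<div class="quotes">
  <p class="aquote"> The journey of a thousand miles  begins with one step. </p>
  <p class="author"> Lao Tzu </p>
</div>
<div class="quotes">
  <p class="aquote">Imagination is more important  than knowledge.</p>
</div>
"""


def extract_quotes(html):
    """Return a list of (quote, author) tuples; author is None when missing."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for block in soup.find_all("div", class_="quotes"):
        quote_tag = block.find("p", class_="aquote")
        author_tag = block.find("p", class_="author")
        if quote_tag is None:
            continue  # skip blocks that carry no quote paragraph
        author = author_tag.get_text().strip() if author_tag else None
        results.append((quote_tag.get_text().strip(), author))
    return results


pairs = extract_quotes(SAMPLE_HTML)
for quote_text, author in pairs:
    print(quote_text)
    print(author)
```

`find` returns `None` when nothing matches, which is why the helper can check for a missing element rather than catching an `IndexError`.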

    Running the script

    The script prints each quote followed by its author. The full output:

    I hear and i forget. I see and i remember. I do and i understand.
    Confucious
    Feeling gratitude and not expressing it is like wrapping a present and not giving it.
    William Arthur Ward
    Our greatest glory is not in never falling but in rising every time we fall.
    Confucious
    The secret of getting aheadis getting started.
    Mark Twain
    Believe you can   and you're halfway there.
    Theodore Roosevelt
    Resentment is like drinking Poison and  waiting for your enemies to die.
    Nelson Mandela
    Silence is a true friend   who never betrays.
    Confucius
    The best way to find yourself is to   lose yourself in the service of others.
    Mahatma Gandhi
    Never succumb to the temptation of bitterness.
    Martin Luther King Jnr
    The journey of a thousand miles  begins with one step.
    Lao Tzu
    It is health that is real wealth and  not pieces of gold and silver.
    Mahatma Gandhi
    Yesterday is not ours to recover  but tomorrow is ours to win or lose.
    Lyndon B Johnson
    It's not what happens to you  but how you react to it that matters .
    Epictetus
    Beware of what you become  in pursuit of what you want.
    Jim Rohn
    The best revenge  is massive success.
    Frank Sinatra
    Do not take life too seriously You will never get out of it alive.
    Elbert Hubbard
    Don't judge each day by the harvest you reap  but by the seeds that yiu plant.
    Robert Loius Stevenson
    Your attitude and not your aptitude  will determine your altitude
    Zig Ziglar
    Imagination is more important  than knowledge.
    Albert Einstein
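
Several lines in the output above contain runs of two or three spaces, because `strip()` only trims the ends of a string. A minimal sketch of one way to tidy that up follows; the helper name `normalize_whitespace` is an assumption, not something from the original script.

```python
def normalize_whitespace(text):
    """Collapse every run of whitespace into one space and trim the ends."""
    # str.split() with no arguments splits on any whitespace run
    # and discards empty pieces, so joining with " " rebuilds clean text.
    return " ".join(text.split())


raw = "Believe you can   and you're halfway there."
cleaned = normalize_whitespace(raw)
print(cleaned)  # Believe you can and you're halfway there.
```

In the loop, this would replace `.text.strip()` with `normalize_whitespace(fav_quote[0].text)`. It cannot restore a space that is missing in the page itself, as in "aheadis".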

  • Original article: https://www.cnblogs.com/keepmoving1113/p/11788677.html