  • Amber is an implementation of the Smalltalk language that runs on top of the JavaScript runtime.

    Amber is an implementation of the Smalltalk-80 language. It is designed to make client-side development faster and easier. It allows developers to write client-side heavy web applications in Smalltalk.

    Amber includes an integrated development environment with a class browser, workspace, transcript, object inspector and debugger.

    Amber is written in itself, including the parser and compiler, and compiles into efficient JavaScript, mapping one-to-one with the JS equivalent.

    You can join the Google Group or the #amber-lang IRC channel on freenode.

    webscraping - Python library for web scraping - Google Project Hosting

    Overview

    The webscraping library aims to make web scraping easier.

    All code is pure Python and has been run on multiple Linux servers and Windows machines, as well as on Google App Engine.

    Examples

    common

    >>> from webscraping import common
    >>> common.remove_tags('hello <b>world</b>!')
    'hello world!'
    
    >>> common.extract_domain('http://www.google.com.au/tos.html')
    'google.com.au'
    
    >>> common.unescape('&lt;hello&nbsp;&amp;&nbsp;world&gt;')
    '<hello & world>'
    
    >>> common.extract_emails('hello richard AT sitescraper DOT net world')
    ['richard@sitescraper.net']
    
    >>> import urllib2  # Python 2 standard library
    >>> cj = common.firefox_cookie()
    >>> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    >>> html = opener.open(url).read()  # fetch url (defined elsewhere) with the current Firefox cookies
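For readers without the library installed, the first three helpers above can be roughly approximated with the Python 3 standard library. This is only a sketch under stated assumptions, not the webscraping implementation: `strip_tags` and `bare_domain` are hypothetical names, the tag removal is a naive regex, and `html.unescape` maps `&nbsp;` to the non-breaking space U+00A0 rather than a plain space.

```python
import re
from html import unescape
from urllib.parse import urlsplit


def strip_tags(html):
    """Drop anything that looks like an HTML tag (naive regex, not a parser)."""
    return re.sub(r"<[^>]+>", "", html)


def bare_domain(url):
    """Return the hostname without a leading 'www.' (hypothetical helper)."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


print(strip_tags("hello <b>world</b>!"))
print(bare_domain("http://www.google.com.au/tos.html"))
# repr() makes the non-breaking spaces produced for &nbsp; visible
print(repr(unescape("&lt;hello&nbsp;&amp;&nbsp;world&gt;")))
```

The regex approach breaks on tags that contain a literal `>` inside an attribute value, so a real parser is the safer choice for untrusted markup.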

    download

    >>> from webscraping import download
    >>> D = download.Download()
    
    >>> # crawl given domain
    >>> domain = ...
    >>> for url in D.crawl(domain):
    ...     html = D.cache[url]  # HTML for each crawled URL, read from the download cache

    pdict

    >>> from webscraping import pdict
    >>> cache = pdict.PersistentDict(CACHE_FILE)  # CACHE_FILE: path to the sqlite database file
    >>> cache['a'] = range(5)  # the value is pickled and stored in the sqlite database
    >>> 'a' in cache
    True
    >>> cache['a']
    [0, 1, 2, 3, 4]
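The idea noted in the comment above (values pickled into a sqlite database) can be sketched in a few lines of Python 3. This is an illustration under assumptions, not the actual `pdict` code: the class name `SketchPersistentDict`, the table name `config`, and its two-column layout are all hypothetical.

```python
import pickle
import sqlite3


class SketchPersistentDict(object):
    """Hypothetical stand-in: a dict-like store that pickles values into sqlite."""

    def __init__(self, filename):
        self._conn = sqlite3.connect(filename)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS config (key TEXT PRIMARY KEY, value BLOB)"
        )
        self._conn.commit()

    def __setitem__(self, key, value):
        # Serialize the value and overwrite any existing row for this key
        blob = sqlite3.Binary(pickle.dumps(value, protocol=2))
        self._conn.execute(
            "INSERT OR REPLACE INTO config (key, value) VALUES (?, ?)", (key, blob)
        )
        self._conn.commit()

    def __getitem__(self, key):
        row = self._conn.execute(
            "SELECT value FROM config WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return pickle.loads(bytes(row[0]))

    def __contains__(self, key):
        row = self._conn.execute(
            "SELECT 1 FROM config WHERE key = ?", (key,)
        ).fetchone()
        return row is not None


# ":memory:" keeps the demo self-contained; pass a file path to persist data
cache = SketchPersistentDict(":memory:")
cache["a"] = list(range(5))
print("a" in cache)
print(cache["a"])
```

Committing after every write keeps the sketch simple at the cost of speed; batching writes inside one transaction would be the usual optimization. Unpickling is only safe for database files you created yourself.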

    xpath

    >>> import urllib2
    >>> from webscraping import xpath
    >>> html = urllib2.urlopen(url).read()  # url is defined elsewhere
    >>> xpath.parse(html, '/html/body/ul[2]/li[@class="info"]/div[1]')
    ['div content']
    >>> xpath.parse(html, '/html/body/ul[2]/li[@class="info"]/a/@href')
    ['url1', 'url2', 'url3']
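To see what the two queries above select, here is a sketch using the limited XPath support in the standard library's `xml.etree.ElementTree`. It is not the webscraping `xpath` module: the sample document is invented for illustration, it must be well-formed XML, the path is written relative to the root element, and the `/@href` step is replaced by a call to `.get("href")` because ElementTree cannot select attributes directly.

```python
import xml.etree.ElementTree as ET

# Invented sample page; real-world HTML is often not well-formed XML
doc = """<html><body>
<ul><li>first list</li></ul>
<ul>
  <li class="info"><div>div content</div><a href="url1">one</a></li>
  <li class="info"><a href="url2">two</a></li>
  <li class="other"><a href="skip">ignored</a></li>
</ul>
</body></html>"""

root = ET.fromstring(doc)  # root is the <html> element

# Second <ul> under <body>, <li class="info"> items, first <div> in each
divs = [d.text for d in root.findall("./body/ul[2]/li[@class='info']/div[1]")]

# Same list items, but collect the href attribute of each <a> child
hrefs = [a.get("href") for a in root.findall("./body/ul[2]/li[@class='info']/a")]

print(divs)
print(hrefs)
```

For pages that are not valid XML, a tolerant HTML parser is needed before any XPath-style query can run.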
  • Original article: https://www.cnblogs.com/lexus/p/2465790.html