zoukankan      html  css  js  c++  java
  • 苏宁易购价格爬取(golang)

    如果商品地址为:http://product.suning.com/0070230548/10608983060.html

    则价格地址:

    http://pas.suning.com/nspcsale_0_000000010608983060_000000010608983060_0070230548_20_021_0210101_500353_1000267_9264_12113_Z001___R9006849_3.3_1___000278188__.html?callback=pcData&_=1558663936729

     

    如果商品地址为:http://product.suning.com/0000000000/144016246.html

    则价格地址:

    http://pas.suning.com/nspcsale_0_000000000144016246_000000000144016246_0000000000_20_021_0210101_500353_1000267_9264_12113_Z001___R9006850_2.86_0___000278188__.html?callback=pcData&_=1558664442552

    参考:3.1、苏宁百万级商品爬取 思路讲解 商品爬取

    python 抓取 苏宁价格地址

     

     python和go共同爬取了相同的数据(135个商品的数据),go用时19.457s,python(未使用任何爬虫框架)用时178.672s

     

    go:

     1 package main
     2 
     3 import (
     4     "fmt"
     5     "io/ioutil"
     6     "net/http"
     7     "regexp"
     8     "strings"
     9 )
    10 
    11 func GetGoodPrice(url string) string {
    12     re := regexp.MustCompile(`com/(.*?).html`)
    13     keynum := re.FindAllStringSubmatch(url, -1)
    14     keynum0 := keynum[0][1]
    15     key0 := strings.Split(keynum0, "/")[0]
    16     key1 := strings.Split(keynum0, "/")[1]
    17     priceurl := "http://pas.suning.com/nspcsale_0_000000000" + key1 + "_000000000" + key1 + "_" + key0 + "_20_021_0210101_500353_1000267_9264_12113_Z001___R9006849_3.3_1___000278188__.html?callback=pcData&_=1558663936729"
    18     if len(key1) == 11 {
    19         priceurl = "http://pas.suning.com/nspcsale_0_0000000" + key1 + "_0000000" + key1 + "_" + key0 + "_20_021_0210101_500353_1000267_9264_12113_Z001___R9006849_3.3_1___000278188__.html?callback=pcData&_=1558663936729"
    20     }
    21 
    22     resp, err := http.Get(priceurl)
    23     if err != nil {
    24         panic(err)
    25     }
    26     if resp.StatusCode != 200 {
    27         fmt.Println("err")
    28     }
    29     s, _ := ioutil.ReadAll(resp.Body)
    30     resp.Body.Close()
    31 
    32     re0 := regexp.MustCompile(`"netPrice":"(.*?)","warrantyList`)
    33     price := re0.FindAllStringSubmatch(string(s), -1)
    34     // fmt.Println(price)
    35     // fmt.Println(priceurl)
    36     return price[0][1]
    37 }
    38 func main() {
    39     url := `http://product.suning.com/0000000000/144016246.html`
    40     price := GetGoodPrice(url)
    41     fmt.Println(price)
    42 }
  • 相关阅读:
    Razor 常用又容易忘记语法
    游览器 reflow
    正则表达式
    migration to end point routing
    js 翻译 c# 注意事项
    Angular 学习笔记 work with excel (导出 excel)
    html4,5 basic
    IIS 服务器配置
    meta 的用途
    正则表达 常用
  • 原文地址:https://www.cnblogs.com/cekong/p/10916498.html
Copyright © 2011-2022 走看看