  • Using Puppeteer

    1. Install

    # --ignore-scripts skips the package install scripts, including the Chromium download
    $ npm i --save puppeteer --ignore-scripts
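
Because `--ignore-scripts` skips the Chromium download, `puppeteer.launch()` has to be pointed at a browser that is already installed. A minimal sketch, assuming the path is supplied through an environment variable called `CHROME_PATH` (a hypothetical name chosen for this example; `executablePath` is the launch option that carries it):

```javascript
// Build launch options for an already-installed browser.
// CHROME_PATH is a hypothetical environment variable used for illustration.
function buildLaunchOptions(env = process.env) {
  const options = { headless: true };
  if (env.CHROME_PATH) {
    // Tell Puppeteer which browser binary to start.
    options.executablePath = env.CHROME_PATH;
  }
  return options;
}

// Usage (inside an async function, with puppeteer installed):
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch(buildLaunchOptions());
```

Keeping the path outside the code lets the same script run on machines where the browser lives in different locations.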
    

    2. API

    • For detailed configuration options, refer to the official documentation.

    2.1 New browser

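    • Create a browser object
    const puppeteer = require('puppeteer');

    const browser = await puppeteer.launch({
        slowMo: 200, // Slow down each operation by 200 ms
        timeout: 15000, // Max time in ms to wait for the browser to start
        ignoreHTTPSErrors: true,
        headless: false,
        devtools: true, // Open developer tools
        defaultViewport: {
            width: 1280,
            height: 1000,
        }
    })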
    

    2.2 Close browser

    • Close browser
    await browser.close();
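
If the script throws before it reaches `browser.close()`, the browser process can be left running. One way to guard against that is a try/finally wrapper. The sketch below is an illustration only: `withBrowser` is a hypothetical helper name, and the `puppeteer` module is passed in as a parameter so the helper itself has no hard dependency.

```javascript
// Run `task` with a freshly launched browser and always close it afterwards,
// even when `task` throws.
async function withBrowser(puppeteer, launchOptions, task) {
  const browser = await puppeteer.launch(launchOptions);
  try {
    return await task(browser);
  } finally {
    await browser.close();
  }
}

// Usage (inside an async function, with puppeteer installed):
//   const puppeteer = require('puppeteer');
//   const title = await withBrowser(puppeteer, { headless: true }, async (browser) => {
//     const page = await browser.newPage();
//     await page.goto('https://example.com');
//     return page.title();
//   });
```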
    

    2.3 New page

    • New page object
    const page = await browser.newPage();
    

    2.4 Close page

    • Close Page
    await page.close();
    
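    2.5 Set page cookies

    • Cookie format
       ...cookies <...Object>
               name <string> required
               value <string> required
               url <string>
               domain <string>
               path <string>
               expires <number> Unix time in seconds.
               httpOnly <boolean>
               secure <boolean>
               sameSite <"Strict"|"Lax">

    • Set cookies from a JSON file
    const fs = require('fs');

    // cookieFilePath points to a JSON file holding an array of cookie objects
    const savedCookies = JSON.parse(fs.readFileSync(cookieFilePath, 'utf8'));
    if (savedCookies) await page.setCookie(...savedCookies);

    2.6 Get page cookies

    const cookies = await page.cookies();

    2.7 Delete page cookies

    // deleteCookie removes only the cookies passed to it; with no arguments it removes nothing
    await page.deleteCookie(...cookies);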
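
The set and get calls above can be combined to keep a session between runs: save the cookies to a file at the end of one run and restore them at the start of the next. A minimal sketch, assuming the cookies are stored as a JSON array. The helper names (`validCookies`, `loadCookies`, `saveCookies`, `restoreCookies`) are hypothetical, and `page` is any object exposing `cookies()` and `setCookie()` as a Puppeteer page does.

```javascript
const fs = require('fs');

// Keep only entries that carry the two required fields, name and value.
function validCookies(cookies) {
  return cookies.filter(
    (c) => c && typeof c.name === 'string' && typeof c.value === 'string'
  );
}

// Read cookies from a JSON file; return [] if the file does not exist
// or does not contain an array.
function loadCookies(filePath) {
  if (!fs.existsSync(filePath)) return [];
  const parsed = JSON.parse(fs.readFileSync(filePath, 'utf8'));
  return Array.isArray(parsed) ? validCookies(parsed) : [];
}

// Write the page's current cookies to a JSON file.
async function saveCookies(page, filePath) {
  const cookies = await page.cookies();
  fs.writeFileSync(filePath, JSON.stringify(cookies, null, 2));
}

// Restore cookies into a page, skipping the call when there are none.
// Resolves to the number of cookies that were set.
async function restoreCookies(page, filePath) {
  const cookies = loadCookies(filePath);
  if (cookies.length > 0) await page.setCookie(...cookies);
  return cookies.length;
}
```

Filtering on `name` and `value` mirrors the two fields marked as required in the cookie format listed above.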
    

    2.8 Open url

      await page.goto('https://www.facebook.com', {
          timeout: 50000, // ms
          waitUntil: ['networkidle0'] // Navigation is considered finished once there have been no network connections for at least 500 ms
      })
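
`page.goto` rejects when navigation does not finish within `timeout`, which can happen on slow pages that rarely reach `networkidle0`. One possible way to handle this is to retry a few times before giving up. This is a sketch under assumptions: `gotoWithRetry` is a hypothetical helper, and the default of three attempts is arbitrary.

```javascript
// Try page.goto up to `attempts` times and rethrow the last error
// if every attempt fails.
async function gotoWithRetry(page, url, options = {}, attempts = 3) {
  let lastError;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await page.goto(url, options);
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}

// Usage (inside an async function):
//   await gotoWithRetry(page, 'https://www.facebook.com', {
//     timeout: 50000,
//     waitUntil: ['networkidle0'],
//   });
```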
    

    2.9 Query the DOM

    // Wait for the element to appear in the DOM
    await page.waitForSelector('li > div > div[aria-label]', { timeout: 20000 });
    
    // Query a single element by selector (resolves to null if nothing matches)
    const btn = await page.$('span div[aria-label]:nth-child(1)');
    
    // Click the element
    await btn.click();
    
    // Query all elements matching a selector
    const doms = await page.$$('li > div > div[aria-label]');
    
    // Read the text content of a child element
    const val = await btn.$eval('div', el => el.textContent);
    
    // Pause for a fixed time
    await page.waitForTimeout(1000); // ms; removed in recent Puppeteer releases, where a setTimeout-based delay does the same job
    
    ...
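
Since `page.$` resolves to `null` when nothing matches, calling `click()` on its result directly can throw. The helpers below sketch a null-safe click, a way to read the text of several elements in one call with `page.$$eval`, and a plain-JavaScript delay. The names `sleep`, `clickIfPresent`, and `textsOf` are hypothetical and chosen for this example.

```javascript
// Resolve after `ms` milliseconds; a plain-JavaScript fixed delay.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Click the first element matching `selector`.
// Resolves to false instead of throwing when no element matches.
async function clickIfPresent(page, selector) {
  const handle = await page.$(selector); // null when nothing matches
  if (!handle) return false;
  await handle.click();
  return true;
}

// Collect the trimmed text of every element matching `selector`.
async function textsOf(page, selector) {
  return page.$$eval(selector, (els) => els.map((el) => el.textContent.trim()));
}

// Usage (inside an async function):
//   if (await clickIfPresent(page, 'span div[aria-label]:nth-child(1)')) {
//     await sleep(1000);
//   }
//   const labels = await textsOf(page, 'li > div > div[aria-label]');
```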
    

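    3. Notes

    • The element queries above use CSS selectors, which correspond to document.querySelector and document.querySelectorAll in the browser console; see the documentation for other kinds of selectors.
    • While using the API I found that some methods appear to be buggy and do not return the expected values, and many of the related issues on GitHub remain unresolved.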

    4. Appendix

  • Original article: https://www.cnblogs.com/xpengp/p/14036068.html