  • Reading a page's complete HTML with PuppeteerSharp (.NET Core)

    1. Install PuppeteerSharp with NuGet

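    Install the package either through the NuGet package manager UI or from the command line (for example, dotnet add package PuppeteerSharp).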

    2. Initialize the browser

    await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);

    3. The code

    using (Browser browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true }))
    {
        using (var page = await browser.NewPageAsync())
        {
            // Set the size of the browser viewport
            await page.SetViewportAsync(new ViewPortOptions
            {
                Width = 1024,
                Height = 768
            });
            await page.GoToAsync("http://www.baidu.com");
            var html = await page.GetContentAsync();
    
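            // Requires: using System.IO; using System.Text;
            var sourceFile = "";
            // Encode as UTF-8 explicitly so the bytes match StreamReader's default decoding
            var memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(html));
            var sr = new StreamReader(memoryStream);
            sourceFile = sr.ReadToEnd();
            
            // Analyze the page source here
            sr.Close();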
            
            // Save the page as an image
            //await page.ScreenshotAsync(@"D:1.png",
            //    new ScreenshotOptions() { FullPage = true, Type = ScreenshotType.Png });
        }
    }
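
The three steps above can be combined into a single helper class. What follows is only a minimal sketch, not the original author's code. It reuses the PuppeteerSharp calls shown in this article (newer releases of the library may name some of these members differently). The names PageSource, FetchHtmlAsync and ExtractTitle are hypothetical, waiting for Networkidle0 is an assumption about when the page counts as fully rendered, and the regular-expression title lookup is just one illustrative way to "analyze the source".

```csharp
using System;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
using PuppeteerSharp;

public static class PageSource
{
    // Hypothetical helper: initializes the browser (step 2), loads the URL
    // and returns the rendered HTML of the page (step 3).
    public static async Task<string> FetchHtmlAsync(string url)
    {
        // Same initialization call as in step 2 of the article.
        await new BrowserFetcher().DownloadAsync(BrowserFetcher.DefaultRevision);

        using (var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true }))
        using (var page = await browser.NewPageAsync())
        {
            // Set the size of the browser viewport.
            await page.SetViewportAsync(new ViewPortOptions { Width = 1024, Height = 768 });

            // Assumption: waiting until the network is idle gives page scripts
            // time to finish building the DOM before the HTML is read.
            await page.GoToAsync(url, WaitUntilNavigation.Networkidle0);

            // GetContentAsync already returns a string, so no stream round trip is needed.
            return await page.GetContentAsync();
        }
    }

    // Hypothetical analysis step: returns the trimmed text of the first
    // <title> element, or null when the HTML has no title.
    public static string ExtractTitle(string html)
    {
        var match = Regex.Match(
            html,
            @"<title[^>]*>(.*?)</title>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return match.Success ? match.Groups[1].Value.Trim() : null;
    }
}
```

From an async Main method, this could be used as var html = await PageSource.FetchHtmlAsync("http://www.baidu.com"); followed by PageSource.ExtractTitle(html). For anything beyond a quick check, a real HTML parser is usually a safer choice than a regular expression.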
  • Original article: https://www.cnblogs.com/ykbb/p/11947035.html