  • C#: using regular expressions to get tag attributes or values from a web page's source

    1. Fetch the page source from a URL:

    using System.IO;
    using System.Net;
    using System.Text;

            // Downloads the page at PageUrl and returns its HTML source, decoded as UTF-8.
            private string GetHtmlinfo(string PageUrl)
            {
                WebRequest request = WebRequest.Create(PageUrl);
                // The using statements close the response, stream, and reader even if reading throws.
                using (WebResponse response = request.GetResponse())
                using (Stream resStream = response.GetResponseStream())
                using (StreamReader sr = new StreamReader(resStream, Encoding.UTF8))
                {
                    return sr.ReadToEnd();
                }
            }
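The method above always decodes the response as UTF-8, which garbles pages served in another charset. A minimal sketch of one way to choose the encoding from the `Content-Type` header follows. `CharsetSketch` and `PickEncoding` are hypothetical names that do not appear in the original article. The sketch falls back to UTF-8 when no charset is declared or the runtime does not recognize it.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class CharsetSketch
{
    // Hypothetical helper: returns the encoding named in a Content-Type header value,
    // or UTF-8 when no charset is given or the name is not supported on this runtime.
    public static Encoding PickEncoding(string contentType)
    {
        Match m = Regex.Match(contentType ?? "",
            @"charset\s*=\s*[""']?(?<cs>[\w\-]+)", RegexOptions.IgnoreCase);
        if (!m.Success)
        {
            return Encoding.UTF8;
        }
        try
        {
            return Encoding.GetEncoding(m.Groups["cs"].Value);
        }
        catch (ArgumentException)
        {
            // Unknown or unregistered charset name: fall back to UTF-8.
            return Encoding.UTF8;
        }
    }
}
```

Under these assumptions, the reader in step 1 could be created as `new StreamReader(resStream, CharsetSketch.PickEncoding(response.ContentType))`. A charset declared only in a `<meta>` tag inside the page is not covered by this sketch.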

    2. Get the value inside a tag:

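    using System.Text.RegularExpressions;

            /// <summary>
            /// Gets the text content of the specified tag in a string.
            /// </summary>
            /// <param name="str">The string to search (the HTML source)</param>
            /// <param name="title1">The opening tag name, or an HTML fragment ending inside the opening tag; it is inserted into the pattern unescaped</param>
            /// <param name="title2">The closing tag name</param>
            /// <returns>The text between the tags, or an empty string if nothing matches</returns>
            public static string GetTitleContent(string str, string title1, string title2)
            {
                // Pattern: opening tag, captured text containing no nested tags, closing tag.
                string tmpStr = string.Format("<{0}[^>]*?>(?<Text>[^<]*)</{1}>", title1, title2);

                Match TitleMatch = Regex.Match(str, tmpStr, RegexOptions.IgnoreCase);

                string result = TitleMatch.Groups["Text"].Value;
                return result;
            }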

    Example:
     HTML source: <span class="t1_tx">现排名:<b class="color1">20</b>

     Parameters: title1 = @"span class=""t1_tx"">现排名:<b class=""color1""";

                 title2 = "b";
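The example parameters can be tried with a small self-contained sketch. `TagValueDemo` and `RunExample` are hypothetical names added for illustration. The sketch assumes the closing-tag part of the pattern is written as `</{1}>` with no space, because a literal space there would not match `</b>`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagValueDemo
{
    // Same approach as GetTitleContent: build the pattern from the two tag
    // arguments and return the named group "Text".
    public static string GetTitleContent(string str, string title1, string title2)
    {
        string pattern = string.Format("<{0}[^>]*?>(?<Text>[^<]*)</{1}>", title1, title2);
        Match m = Regex.Match(str, pattern, RegexOptions.IgnoreCase);
        return m.Groups["Text"].Value;
    }

    // Runs the example from the article: the HTML fragment and both parameters.
    public static string RunExample()
    {
        string html = @"<span class=""t1_tx"">现排名:<b class=""color1"">20</b>";
        string title1 = @"span class=""t1_tx"">现排名:<b class=""color1""";
        return GetTitleContent(html, title1, "b"); // returns "20"
    }
}
```

Because `title1` goes into the pattern unescaped, any regex metacharacters in it are interpreted as pattern syntax. Passing it through `Regex.Escape` first is one option when the fragment should be matched literally.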

    3. Get an attribute of a tag:

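            /// <summary>
            /// Gets the value of an attribute of the specified tag in a string.
            /// </summary>
            /// <param name="str">The string to search (the HTML source)</param>
            /// <param name="title">The tag name</param>
            /// <param name="attrib">The attribute name</param>
            /// <returns>The attribute value, or an empty string if nothing matches</returns>
            // Note: this has the same name and parameter types as the method in step 2,
            // so the two cannot live in the same class unless one of them is renamed.
            public static string GetTitleContent(string str, string title, string attrib)
            {
                // Verbatim string (@"...") so that "", \s and \1 reach the regex engine intact.
                // Group 1 captures the optional quote; \1 requires the same quote to close the value.
                string tmpStr = string.Format(@"<{0}[^>]*?{1}=(['""]?)(?<url>[^'""\s>]+)\1[^>]*>", title, attrib);

                Match TitleMatch = Regex.Match(str, tmpStr, RegexOptions.IgnoreCase);

                string result = TitleMatch.Groups["url"].Value;
                return result;
            }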
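A short usage sketch for the attribute pattern follows. `TagAttributeDemo`, `GetTagAttribute`, and `RunExample` are hypothetical names, and the sample HTML is made up for illustration. The sketch assumes the pattern is a verbatim string (`@"..."`), since `\s`, `\1`, and the doubled quotes are not valid in a regular C# string literal.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagAttributeDemo
{
    // Hypothetical name; the pattern is the one from step 3, written as a verbatim string.
    public static string GetTagAttribute(string str, string title, string attrib)
    {
        string pattern = string.Format(
            @"<{0}[^>]*?{1}=(['""]?)(?<url>[^'""\s>]+)\1[^>]*>", title, attrib);
        Match m = Regex.Match(str, pattern, RegexOptions.IgnoreCase);
        return m.Groups["url"].Value;
    }

    // Extracts the href of a sample link (made-up HTML).
    public static string RunExample()
    {
        string html = @"<a class=""nav"" href=""https://example.com/page"">link</a>";
        return GetTagAttribute(html, "a", "href");
    }
}
```

The value group excludes whitespace, so a quoted value containing a space, such as `class="a b"`, does not match and an empty string is returned. Unquoted values such as `<img src=logo.png>` do match, because the quote group is optional.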
  • Original article: https://www.cnblogs.com/zyh-club/p/5256702.html