  • ASP.NET regular expression examples

    These patterns come up often in ASP.NET when you fetch a web page and need to pull pieces out of its HTML document.

    Edited by: 曹永思 (博客园)

    1. Get the markup inside a div with a given class

    Get the markup inside <div class="imgList2">****</div>.

    Method 1:

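    // Verbatim string: "" is a literal quote, and \s \S need no extra escaping.
    string g = @"<div[^>]*?class=""imgList2""[^>]*>(?<html>[\s\S]*?)</div>";
    Regex reg = new Regex(g, RegexOptions.None);
    MatchCollection mc = reg.Matches(strResult);
    string v = "";
    foreach (Match m in mc)
    {
        // The named group holds only the inner markup; m.Value would also include the <div> wrapper.
        // The lazy match stops at the first </div>, so nested divs are cut short.
        v += m.Groups["html"].Value + "\n";
    }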
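To try Method 1 on a small input, the same pattern can be wrapped in a helper. This is only a minimal sketch: the class name DivExtractor and the method GetInnerHtml are hypothetical names chosen for illustration, and the class name is assumed to be the whole value of the class attribute.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DivExtractor
{
    // Hypothetical helper (not from the original post): returns the inner markup
    // of every <div> whose class attribute is exactly className.
    public static List<string> GetInnerHtml(string html, string className)
    {
        // Regex.Escape keeps characters such as '.' in className literal.
        string pattern = "<div[^>]*?class=\"" + Regex.Escape(className)
                       + "\"[^>]*>(?<html>[\\s\\S]*?)</div>";

        var result = new List<string>();
        foreach (Match m in Regex.Matches(html, pattern))
        {
            // Only the named group, without the surrounding <div> tags.
            result.Add(m.Groups["html"].Value);
        }
        return result;
    }
}
```

For the input `<div class="imgList2"><img src="a.png"></div><div class="other">x</div>`, calling `DivExtractor.GetInnerHtml(html, "imgList2")` should return a single entry, `<img src="a.png">`. As with Method 1 itself, a div nested inside the target div would end the match early.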

    Method 2 (a general-purpose helper that returns the content between a given start and end string):

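    string list_a_group_str = GetValue(strResult.Trim(), "<div class=\"imgList2\">", "</div>");

    // start and end are inserted into the pattern as-is, so they are treated as regular expressions.
    public static string GetValue(string str, string start, string end)
    {
        // Lookbehind and lookahead keep the delimiters out of the match;
        // [\s\S]*? lazily matches any character, including newlines.
        Regex regex = new Regex(
            "(?<=(" + start + @"))[\s\S]*?(?=(" + end + "))",
            RegexOptions.Multiline | RegexOptions.Singleline);
        return regex.Match(str).Value;
    }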
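Because GetValue splices start and end into the pattern unescaped, delimiters that contain regex metacharacters such as ( . ? or [ are interpreted as pattern syntax. When the delimiters are meant as literal text, one option is to escape them first. A minimal sketch, assuming a hypothetical helper named GetBetweenLiteral (the name and the class HtmlSnippets are not from the original post):

```csharp
using System.Text.RegularExpressions;

public static class HtmlSnippets
{
    // Hypothetical variant of GetValue: start and end are treated as literal
    // text because Regex.Escape neutralizes any regex metacharacters in them.
    public static string GetBetweenLiteral(string input, string start, string end)
    {
        string pattern = "(?<=" + Regex.Escape(start) + @")[\s\S]*?(?=" + Regex.Escape(end) + ")";

        // Match.Value is an empty string when nothing matches.
        return Regex.Match(input, pattern).Value;
    }
}
```

For example, `HtmlSnippets.GetBetweenLiteral("a (1) <b>x</b> (2) c", "(1)", "(2)")` should return ` <b>x</b> ` (with the surrounding spaces), whereas passing the unescaped "(1)" to GetValue would be read as a capture group matching the single character 1.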

    2. Get the href and text of every a tag

    Get the href and text of every a tag inside <div class="page both"></div>.

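    string list_page_group_str = GetValue(strResult.Trim(), "<div class=\"page both\">", "</div>");
    // (['""]?) captures the optional opening quote and \1 requires the same character to close the value.
    // \b keeps tags such as <abbr> inside the link text from being mistaken for an <a> tag.
    Regex reg = new Regex(@"(?is)<a(?:(?!href=).)*href=(['""]?)(?<url>[^""\s>]*)\1[^>]*>(?<text>(?:(?!</?a\b).)*)</a>");
    MatchCollection mc = reg.Matches(list_page_group_str);
    foreach (Match m in mc)
    {
        string url = m.Groups["url"].Value + "\n";
        string text = m.Groups["text"].Value + "\n";
    }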
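The loop above builds url and text but does not keep them. One way to collect the results is to return them as pairs. This is a sketch under stated assumptions: LinkExtractor and ExtractLinks are hypothetical names, and the pattern is the one from this section with the `\s`, `\1` and `\b` escapes written out.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Group 1 is the optional opening quote; \1 requires the same quote to close the href value.
    private static readonly Regex AnchorRegex = new Regex(
        @"(?is)<a(?:(?!href=).)*href=(['""]?)(?<url>[^""\s>]*)\1[^>]*>(?<text>(?:(?!</?a\b).)*)</a>");

    // Hypothetical helper: returns (href, text) pairs for every <a> tag found in html.
    public static List<KeyValuePair<string, string>> ExtractLinks(string html)
    {
        var links = new List<KeyValuePair<string, string>>();
        foreach (Match m in AnchorRegex.Matches(html))
        {
            links.Add(new KeyValuePair<string, string>(
                m.Groups["url"].Value,
                m.Groups["text"].Value.Trim()));
        }
        return links;
    }
}
```

Given `<a href="/p/1">1</a> <a class='n' href='/p/2'>Next</a>`, ExtractLinks should return the pairs (/p/1, 1) and (/p/2, Next), which covers both double-quoted and single-quoted href values.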
  • Original article: https://www.cnblogs.com/yonsy/p/4777760.html