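  • Filtering HTML tags in ASP.NET while keeping line breaks and spaces

    This article shows how to filter HTML tags in ASP.NET while keeping line breaks and spaces. It covers the method commonly found online and an improvement on it, and may be a useful reference.
     

    This article walks through an example of filtering HTML tags in ASP.NET while keeping only line breaks and spaces. The details are as follows.

    I found a method for filtering HTML tags online. I don't know whose version is the original, since many copies are identical. I copied the method as-is; the code is below: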

    // Requires: using System.Text.RegularExpressions; using System.Web;
    /// <summary>
    /// Removes HTML markup.
    /// </summary>
    /// <param name="Htmlstring">Source text that contains HTML</param>
    /// <returns>The text with the markup removed</returns>
    public static string NoHTML(string Htmlstring)
    {
      // Remove scripts (Singleline lets . match newlines inside a multi-line script)
      Htmlstring = Regex.Replace(Htmlstring, @"<script[^>]*?>.*?</script>", "",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);
      // Remove HTML tags
      Htmlstring = Regex.Replace(Htmlstring, @"<(.[^>]*)>", "",
        RegexOptions.IgnoreCase);
      // Remove a line break together with the whitespace that follows it
      Htmlstring = Regex.Replace(Htmlstring, @"([\r\n])[\s]+", "",
        RegexOptions.IgnoreCase);
      // Remove leftover comment delimiters
      Htmlstring = Regex.Replace(Htmlstring, @"-->", "", RegexOptions.IgnoreCase);
      Htmlstring = Regex.Replace(Htmlstring, @"<!--.*", "", RegexOptions.IgnoreCase);
      // Decode common character entities
      Htmlstring = Regex.Replace(Htmlstring, @"&(quot|#34);", "\"",
        RegexOptions.IgnoreCase);
      Htmlstring = Regex.Replace(Htmlstring, @"&(amp|#38);", "&",
        RegexOptions.IgnoreCase);
      Htmlstring = Regex.Replace(Htmlstring, @"&(lt|#60);", "<",
        RegexOptions.IgnoreCase);
      Htmlstring = Regex.Replace(Htmlstring, @"&(gt|#62);", ">",
        RegexOptions.IgnoreCase);
      Htmlstring = Regex.Replace(Htmlstring, @"&(nbsp|#160);", "   ",
        RegexOptions.IgnoreCase);
      Htmlstring = Regex.Replace(Htmlstring, @"&(iexcl|#161);", "\xa1",
        RegexOptions.IgnoreCase);
      Htmlstring = Regex.Replace(Htmlstring, @"&(cent|#162);", "\xa2",
        RegexOptions.IgnoreCase);
      Htmlstring = Regex.Replace(Htmlstring, @"&(pound|#163);", "\xa3",
        RegexOptions.IgnoreCase);
      Htmlstring = Regex.Replace(Htmlstring, @"&(copy|#169);", "\xa9",
        RegexOptions.IgnoreCase);
      // Drop any other numeric entity
      Htmlstring = Regex.Replace(Htmlstring, @"&#(\d+);", "",
        RegexOptions.IgnoreCase);

      // Strings are immutable, so the result of Replace must be assigned back
      Htmlstring = Htmlstring.Replace("<", "");
      Htmlstring = Htmlstring.Replace(">", "");
      Htmlstring = Htmlstring.Replace("\r\n", "");
      Htmlstring = HttpContext.Current.Server.HtmlEncode(Htmlstring).Trim();
      return Htmlstring;
    }

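    The code above was copied straight from the web. It does strip every HTML tag, but it is not what I want: it filters too aggressively. When the input comes from a textarea, I need to keep the spaces and line breaks.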
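
    To make the problem concrete, here is a minimal sketch that isolates the whitespace-collapsing step of the copied method. The class and method names (WhitespaceLossDemo, CollapseLineBreaks) are made up for this illustration and do not appear in the original code.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    public static class WhitespaceLossDemo
    {
        // The same pattern the copied method uses: a \r or \n followed by
        // one or more whitespace characters is deleted outright.
        public static string CollapseLineBreaks(string text)
        {
            return Regex.Replace(text, @"([\r\n])[\s]+", "");
        }

        public static void Main()
        {
            // Two lines as a textarea would submit them, second line indented.
            string input = "line one\r\n  line two";
            Console.WriteLine(CollapseLineBreaks(input)); // prints "line oneline two"
        }
    }
    ```

    The \r is matched by the character class and the \n plus the indentation are swallowed by [\s]+, so both the line break and the leading spaces disappear.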

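    So I modified the method myself. A line break in a textarea is \r\n, so I match those sequences and replace them with <br>; that way the text wraps correctly when it is read back from the database and rendered on the page. Spaces are replaced with the HTML space entity (&nbsp;), and the job is done.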
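
    The idea can be sketched on its own, separate from the tag stripping. This is only an illustration under a few assumptions: TextareaFormatting and PreserveLayout are hypothetical names, and System.Net.WebUtility.HtmlEncode stands in for HttpContext.Current.Server.HtmlEncode so the snippet runs outside a web request.

    ```csharp
    using System;
    using System.Net;
    using System.Text.RegularExpressions;

    public static class TextareaFormatting
    {
        // Encode first so anything the user typed is neutralised,
        // then add our own <br> and &nbsp; markup afterwards.
        public static string PreserveLayout(string plainText)
        {
            string encoded = WebUtility.HtmlEncode(plainText);
            // \r\n is listed first so a Windows line break yields one <br>, not two.
            encoded = Regex.Replace(encoded, @"\r\n|\r|\n", "<br>");
            // Whatever whitespace remains (spaces, tabs) becomes a non-breaking space.
            return Regex.Replace(encoded, @"\s", "&nbsp;");
        }

        public static void Main()
        {
            Console.WriteLine(PreserveLayout("a <b>\r\n  c"));
            // prints: a&nbsp;&lt;b&gt;<br>&nbsp;&nbsp;c
        }
    }
    ```

    The order matters: if the <br> tags were inserted before encoding, HtmlEncode would turn them into &lt;br&gt; and they would show up as literal text.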

    // Requires: using System.Text.RegularExpressions; using System.Web;
    /// <summary>
    /// Removes HTML markup but keeps line breaks (\r\n), emitted as br tags.
    /// </summary>
    /// <param name="Htmlstring">Source text that contains HTML</param>
    /// <returns>The text with the markup removed</returns>
    public static string NewNoHTML(string Htmlstring)
    {
        //Htmlstring.Replace("\r\n", "%r%n").Replace("<br>","%br%").Replace("<br/>","%br&%").Replace("\n","%n");
        // Remove scripts (Singleline lets . match newlines inside a multi-line script)
        Htmlstring = Regex.Replace(Htmlstring, @"<script[^>]*?>.*?</script>", "",
          RegexOptions.IgnoreCase | RegexOptions.Singleline);
        // Remove HTML tags
        Htmlstring = Regex.Replace(Htmlstring, @"<(.[^>]*)>", "",
          RegexOptions.IgnoreCase);

        // Remove leftover comment delimiters
        Htmlstring = Regex.Replace(Htmlstring, @"-->", "", RegexOptions.IgnoreCase);
        Htmlstring = Regex.Replace(Htmlstring, @"<!--.*", "", RegexOptions.IgnoreCase);
        // Decode common character entities
        Htmlstring = Regex.Replace(Htmlstring, @"&(quot|#34);", "\"",
          RegexOptions.IgnoreCase);
        Htmlstring = Regex.Replace(Htmlstring, @"&(amp|#38);", "&",
          RegexOptions.IgnoreCase);
        Htmlstring = Regex.Replace(Htmlstring, @"&(lt|#60);", "<",
          RegexOptions.IgnoreCase);
        Htmlstring = Regex.Replace(Htmlstring, @"&(gt|#62);", ">",
          RegexOptions.IgnoreCase);
        Htmlstring = Regex.Replace(Htmlstring, @"&(nbsp|#160);", "   ",
          RegexOptions.IgnoreCase);
        Htmlstring = Regex.Replace(Htmlstring, @"&(iexcl|#161);", "\xa1",
          RegexOptions.IgnoreCase);
        Htmlstring = Regex.Replace(Htmlstring, @"&(cent|#162);", "\xa2",
          RegexOptions.IgnoreCase);
        Htmlstring = Regex.Replace(Htmlstring, @"&(pound|#163);", "\xa3",
          RegexOptions.IgnoreCase);
        Htmlstring = Regex.Replace(Htmlstring, @"&(copy|#169);", "\xa9",
          RegexOptions.IgnoreCase);
        // Drop any other numeric entity
        Htmlstring = Regex.Replace(Htmlstring, @"&#(\d+);", "",
          RegexOptions.IgnoreCase);

        // Strings are immutable, so the result of Replace must be assigned back
        Htmlstring = Htmlstring.Replace("<", "");
        Htmlstring = Htmlstring.Replace(">", "");
        //Htmlstring.Replace("\r\n", "");
        Htmlstring = HttpContext.Current.Server.HtmlEncode(Htmlstring);
        // Turn line breaks into <br>; \r\n goes first so it yields a single <br>
        Htmlstring = Regex.Replace(Htmlstring, @"((\r\n))", "<br>");
        Htmlstring = Regex.Replace(Htmlstring, @"(\r|\n)", "<br>");
        // Turn the remaining whitespace into the HTML space entity
        Htmlstring = Regex.Replace(Htmlstring, @"(\s)", "&nbsp;");
        return Htmlstring;
    }

    This filter can be used to sanitize content that users type in and submit for publishing.

    I hope this article helps with your ASP.NET programming.

  • Original article: https://www.cnblogs.com/Lethe/p/6222555.html