1. Stripping HTML tags:
/**
 * Strips the HTML tags wrapped around the text.
 * @author CY
 */
public class TrimHTML {
    public static void main(String[] args) {
        String d3 = "<div id='mylinks'><a id='blog_nav_sitehome' class='menu' href='http://www.cnblogs.com/'>博客园</a> <a id='blog_nav_myhome' class='menu' href='http://www.cnblogs.com/tenWood/'>首页</a> <a id='blog_nav_newpost' class='menu' rel='nofollow' href='https://i.cnblogs.com/EditPosts.aspx?opt=1'>新随笔</a> <a id='blog_nav_contact' class='menu' rel='nofollow' href='https://msg.cnblogs.com/send/%E6%9C%89%E7%82%B9%E6%87%92%E6%83%B0%E7%9A%84%E5%B0%8F%E9%9D%92%E5%B9%B4'>联系</a> <a id='blog_nav_rss' class='menu' href='http://www.cnblogs.com/tenWood/rss'>订阅</a><a id='blog_nav_rss_image' href='http://www.cnblogs.com/tenWood/rss'><img src='//www.cnblogs.com/images/xml.gif' alt='订阅'></a> <a id='blog_nav_admin' class='menu' rel='nofollow' href='https://i.cnblogs.com/'>管理</a></div>";

        // Remove every substring of the form <...> that contains no nested '<' or '>'.
        String result = d3.replaceAll("<[^<>]+>", "");
        System.out.println(result);
    }
}
Output:
博客园 首页 新随笔 联系 订阅 管理
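The same expression can be wrapped in a small reusable helper. The following is only a sketch: the class name `HtmlTagStripper` and the method `stripTags` are hypothetical names chosen for illustration, and the pattern is compiled once instead of on every call. It also shows one known limit of the regex approach: a `>` inside an attribute value ends the match early.

```java
import java.util.regex.Pattern;

public class HtmlTagStripper {
    // Same expression as above: '<', one or more characters other than '<' or '>', then '>'.
    private static final Pattern TAG = Pattern.compile("<[^<>]+>");

    /** Returns the input with every tag-like substring removed; null becomes "". */
    public static String stripTags(String html) {
        if (html == null) {
            return "";
        }
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // prints: Hello, world!
        System.out.println(stripTags("<p>Hello, <b>world</b>!</p>"));

        // A '>' inside an attribute value ends the match early and leaves debris.
        // prints: b'>link
        System.out.println(stripTags("<a title='a>b'>link</a>"));
    }
}
```

For input that may contain such attribute values, comments, or scripts, a real HTML parser is usually the safer choice; the regex is adequate for simple, well-behaved fragments like the one above.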
Method 2 (reference blog post: http://www.cnblogs.com/devinzhang/archive/2012/05/09/2491619.html):
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TrimHTML2 {
    public static void main(String[] args) {
        String d3 = "<div id='mylinks'><a id='blog_nav_sitehome' class='menu' href='http://www.cnblogs.com/'>博客园</a> <a id='blog_nav_myhome' class='menu' href='http://www.cnblogs.com/tenWood/'>首页</a> <a id='blog_nav_newpost' class='menu' rel='nofollow' href='https://i.cnblogs.com/EditPosts.aspx?opt=1'>新随笔</a> <a id='blog_nav_contact' class='menu' rel='nofollow' href='https://msg.cnblogs.com/send/%E6%9C%89%E7%82%B9%E6%87%92%E6%83%B0%E7%9A%84%E5%B0%8F%E9%9D%92%E5%B9%B4'>联系</a> <a id='blog_nav_rss' class='menu' href='http://www.cnblogs.com/tenWood/rss'>订阅</a><a id='blog_nav_rss_image' href='http://www.cnblogs.com/tenWood/rss'><img src='//www.cnblogs.com/images/xml.gif' alt='订阅'></a> <a id='blog_nav_admin' class='menu' rel='nofollow' href='https://i.cnblogs.com/'>管理</a></div>我是尾巴";

        // Match anything between '<' and the next '>'.
        Pattern p = Pattern.compile("<([^>]*)>", Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(d3);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Copy the text before this tag into sb and replace the tag with nothing.
            m.appendReplacement(sb, "");
        }
        // Copy whatever follows the last tag (here: 我是尾巴).
        m.appendTail(sb);
        System.out.println(sb.toString());
    }
}
Output:
博客园 首页 新随笔 联系 订阅 管理我是尾巴
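When the replacement is always the empty string, the `find()` / `appendReplacement` / `appendTail` loop produces the same result as a single `Matcher.replaceAll("")` call. The loop form becomes useful when the replacement depends on the tag that was matched. The sketch below illustrates both; the class name `TagLoopDemo` and the rule that turns `<br>` into a newline are assumptions made for illustration and are not part of the original post.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagLoopDemo {
    private static final Pattern TAG = Pattern.compile("<([^>]*)>");

    // Loop form: each match is handled individually, so the replacement can vary per tag.
    public static String stripWithLoop(String html) {
        Matcher m = TAG.matcher(html);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Hypothetical rule: <br>, <br/> and <br /> become a newline, all other tags vanish.
            String inner = m.group(1).trim().toLowerCase(Locale.ROOT);
            String replacement = inner.matches("br\\s*/?") ? "\n" : "";
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // One-call form: equivalent to the loop whenever the replacement is always "".
    public static String stripAll(String html) {
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripAll("<p>x</p>tail"));        // prints: xtail
        System.out.println(stripWithLoop("a<BR/>b<i>c</i>")); // prints "a", then "bc" on the next line
    }
}
```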
------