Two HTML parsers for C#
1) Html Agility Pack
Example snippet:
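// requires: using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm");
// Select every <a> element that has an href attribute.
// Note: SelectNodes returns null when nothing matches.
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    HtmlAttribute att = link.Attributes["href"];
    att.Value = FixLink(att); // FixLink is your own helper that returns the rewritten URL
}
doc.Save("file.htm");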
2) NSoup, the .NET port of JSoup
Html Agility Pack project page: http://htmlagilitypack.codeplex.com/
Of the two, NSoup is the one I recommend more.
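// The options below are alternatives; each declares doc, so use one at a time.

// Option 1: parse an HTML string that is already in memory
NSoup.Nodes.Document doc = NSoup.NSoupClient.Parse(HtmlString);

// Option 2: let NSoup fetch the page itself
NSoup.Nodes.Document doc = NSoup.NSoupClient.Connect("http://www.oschina.net/").Get();

// Option 3: download the bytes with WebClient, decode them as UTF-8, then parse
WebClient webClient = new WebClient();
String HtmlString = Encoding.GetEncoding("utf-8").GetString(webClient.DownloadData("http://www.oschina.net/"));
NSoup.Nodes.Document doc = NSoup.NSoupClient.Parse(HtmlString);

// Option 4: parse the response stream of a WebRequest, specifying the charset
WebRequest webRequest = WebRequest.Create("http://www.oschina.net/");
NSoup.Nodes.Document doc = NSoup.NSoupClient.Parse(webRequest.GetResponse().GetResponseStream(), "utf-8");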