
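BlogEngine.NET Search

BlogEngine.NET's search is really quite powerful, though maybe I'm just easily impressed. From an article an experienced developer wrote a while back, I learned that it supports OpenSearch.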

[Figures A and B: the browser's search-engine list, without and with the blog open]

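In figure A the blog is not open, so the browser's search-engine list lacks the option to add "Name of the blog" that appears in figure B. Pretty neat.
The only difference is that the page in B carries one extra tag: <link href="http://localhost:52457/BlogEngine.NET/opensearch.axd" title="Name of the blog" rel="search" type="application/opensearchdescription+xml">
Once "Name of the blog" has been added, it can be selected as a search engine, and searches made with it land on the search page of our blog.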
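
The document served at opensearch.axd is not shown here, so what follows is only a rough sketch of what an OpenSearch description could look like when generated from C#. The class name `OpenSearchSketch`, the method `BuildDescription`, and both parameters are hypothetical; only the element names and the namespace come from the OpenSearch 1.1 format, and BlogEngine's real handler may emit more fields.

```csharp
using System.Text;
using System.Xml;

public static class OpenSearchSketch
{
    // Namespace defined by the OpenSearch 1.1 format.
    private const string Ns = "http://a9.com/-/spec/opensearch/1.1/";

    // Builds a minimal description document.
    // blogName and searchPageUrl are hypothetical parameters supplied by the caller.
    public static string BuildDescription(string blogName, string searchPageUrl)
    {
        var settings = new XmlWriterSettings { Indent = true, OmitXmlDeclaration = true };
        var sb = new StringBuilder();
        using (var writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("OpenSearchDescription", Ns);
            writer.WriteElementString("ShortName", Ns, blogName);
            writer.WriteElementString("Description", Ns, "Search " + blogName);

            // The browser replaces {searchTerms} with what the user typed.
            writer.WriteStartElement("Url", Ns);
            writer.WriteAttributeString("type", "text/html");
            writer.WriteAttributeString("template", searchPageUrl + "?q={searchTerms}");
            writer.WriteEndElement(); // Url

            writer.WriteEndElement(); // OpenSearchDescription
        }
        return sb.ToString();
    }
}
```

Calling `OpenSearchSketch.BuildDescription("Name of the blog", "http://localhost:52457/BlogEngine.NET/search.aspx")` produces a template URL that points at the same search.aspx?q=... page the blog's own search box uses.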

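Now let's see how BlogEngine actually performs the search.
Debugging the usual way: enter a term and click search. The browser navigates to http://localhost:52457/BlogEngine.NET/search.aspx?q=1, where the value after q is the search term.
Step by step: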

protected override void OnLoad(EventArgs e)
{
    base.OnLoad(e);

    rep.ItemDataBound += new RepeaterItemEventHandler(rep_ItemDataBound);

    var term = Request.QueryString["q"];
    if (!Utils.StringIsNullOrWhitespace(term))
    {
        bool includeComments = (Request.QueryString["comment"] == "true");

        var encodedTerm = Server.HtmlEncode(term);
        Page.Title = Server.HtmlEncode(Resources.labels.searchResultsFor) + " '" + encodedTerm + "'";
        h1Headline.InnerHtml = Resources.labels.searchResultsFor + " '" + encodedTerm + "'";

        Uri url;
        if (!Uri.TryCreate(term, UriKind.Absolute, out url))
        {
            List<IPublishable> list = Search.Hits(term, includeComments);
            BindSearchResult(list);
        }
        else
        {
            SearchByApml(url);
        }
    }
    else
    {
        Page.Title = Resources.labels.search;
        h1Headline.InnerHtml = Resources.labels.search;
    }
}
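
The branch on Uri.TryCreate in OnLoad decides whether the term is treated as keywords (Search.Hits) or as an absolute URL (SearchByApml). A minimal sketch of that check in isolation, assuming nothing beyond the standard `System.Uri` API; the names `SearchTermRouting` and `LooksLikeUrl` are hypothetical and not part of BlogEngine.

```csharp
using System;

public static class SearchTermRouting
{
    // Returns true when the term parses as an absolute URI, which is the
    // case OnLoad hands to SearchByApml; otherwise the term goes to Search.Hits.
    public static bool LooksLikeUrl(string term)
    {
        Uri url;
        return Uri.TryCreate(term, UriKind.Absolute, out url);
    }
}
```

With this helper, `LooksLikeUrl("1")` and `LooksLikeUrl("blogengine search")` are false, so both would take the keyword path, while `LooksLikeUrl("http://example.com/apml.xml")` is true.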

The line List<IPublishable> list = Search.Hits(term, includeComments); is the one to follow.

/// <summary>
/// Searches all the posts and returns a ranked result set.
/// </summary>
/// <param name="searchTerm">The term to search for</param>
/// <param name="includeComments">True to include a post's comments and their authors in search</param>
/// <returns>A list of IPublishable.</returns>
public static List<IPublishable> Hits(string searchTerm, bool includeComments)
{
    lock (SyncRoot)
    {
        var results = BuildResultSet(searchTerm, includeComments);
        var items = results.ConvertAll(ResultToPost);
        results.Clear();
        OnSearcing(searchTerm);
        return items;
    }
}

It searches all content and returns a ranked result set. The real work is clearly done in BuildResultSet, so we keep following.

/// <summary>
/// Builds the results set and ranks it.
/// </summary>
/// <param name="searchTerm">
/// The search Term.
/// </param>
/// <param name="includeComments">
/// The include Comments.
/// </param>
private static List<Result> BuildResultSet(string searchTerm, bool includeComments)
{
    var results = new List<Result>();
    var term = CleanContent(searchTerm.ToLowerInvariant().Trim(), false);
    // Split the query on spaces and OR the words together: "a b" -> "(a|b)".
    var terms = term.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    var regex = string.Format(CultureInfo.InvariantCulture, "({0})", string.Join("|", terms));

    foreach (var entry in Catalog)
    {
        var result = new Result();
        if (!(entry.Item is Comment))
        {
            // Title matches weigh 20, content matches 1, description matches 2.
            var titleMatches = Regex.Matches(entry.Title, regex).Count;
            result.Rank = titleMatches * 20;

            var postMatches = Regex.Matches(entry.Content, regex).Count;
            result.Rank += postMatches;

            var descriptionMatches = Regex.Matches(entry.Item.Description, regex).Count;
            result.Rank += descriptionMatches * 2;
        }
        else if (includeComments)
        {
            var commentMatches = Regex.Matches(entry.Content + entry.Title, regex).Count;
            result.Rank += commentMatches;
        }

        if (result.Rank > 0)
        {
            result.Item = entry.Item;
            results.Add(result);
        }
    }

    results.Sort();
    return results;
}

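Leaving aside for now what Catalog is, all the matching here serves to compute result.Rank: the more matches, the higher the rank, and the earlier the item appears in the results. A title match counts 20, a description match 2, and a content match 1. Results with a rank above 0 are added to the List<Result>, which is then sorted with Sort(). No comparer is passed, so the default one is used, and that relies on the CompareTo that BlogEngine implements on Result: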

/// <summary>
/// Compares the current object with another object of the same type.
/// </summary>
/// <param name="other">
/// An object to compare with this object.
/// </param>
/// <returns>
/// A 32-bit signed integer that indicates the relative order of the objects being compared.
/// Less than zero: this object is less than the other parameter.
/// Zero: this object is equal to other.
/// Greater than zero: this object is greater than other.
/// </returns>
public int CompareTo(Result other)
{
    return other.Rank.CompareTo(this.Rank);
}
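
To make the weighting and the reversed comparison concrete, here is a small self-contained sketch. It mirrors the weights and the pattern building from BuildResultSet and the comparison from Result.CompareTo, but `RankedItem`, `RankingSketch`, `BuildPattern`, and `Score` are hypothetical names, and the real CleanContent step is reduced to lower-casing and trimming. As in the original, the terms are not regex-escaped, so the sketch assumes plain words.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public sealed class RankedItem : IComparable<RankedItem>
{
    public string Title;
    public int Rank;

    // Reversed comparison: a higher rank sorts first.
    public int CompareTo(RankedItem other)
    {
        return other.Rank.CompareTo(this.Rank);
    }
}

public static class RankingSketch
{
    // Splits the query on spaces and ORs the words: " Cat  dog " -> "(cat|dog)".
    public static string BuildPattern(string query)
    {
        var terms = query.ToLowerInvariant().Trim()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return "(" + string.Join("|", terms) + ")";
    }

    // Title matches weigh 20, content matches 1, description matches 2.
    public static int Score(string pattern, string title, string content, string description)
    {
        return Regex.Matches(title, pattern).Count * 20
             + Regex.Matches(content, pattern).Count
             + Regex.Matches(description, pattern).Count * 2;
    }
}
```

For example, with the pattern "(cat|dog)", a title "cat", content "cat dog cat", and description "dog" score 1 * 20 + 3 + 1 * 2 = 25. Calling Sort() on a List<RankedItem> with no comparer then orders the items from highest rank to lowest, because CompareTo compares other against this rather than the other way round.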

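The method finally returns the sorted List<Result>. Because CompareTo compares other.Rank against this.Rank, the order is descending by rank. So what is Catalog? It is the collection being searched, a Collection<Entry>. Here is the structure of Entry: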

/// <summary>
/// A search optimized post object cleansed from HTML and stop words.
/// </summary>
internal struct Entry
{
    #region Constants and Fields

    /// <summary>
    ///     The content of the post cleansed for stop words and HTML
    /// </summary>
    internal string Content;

    /// <summary>
    ///     The post object reference
    /// </summary>
    internal IPublishable Item;

    /// <summary>
    ///     The title of the post cleansed for stop words
    /// </summary>
    internal string Title;

    #endregion
}

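Looking back at the matching in BuildResultSet, it now makes sense: the regex runs against the Title and Content of each Entry. We know there is a collection to be searched, so how is it built?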

/// <summary>
/// Initializes static members of the <see cref="Search"/> class.
/// </summary>
static Search()
{
    BuildCatalog();
    Post.Saved += Post_Saved;
    Page.Saved += Page_Saved;
    BlogSettings.Changed += delegate { BuildCatalog(); };
    Post.CommentAdded += Post_CommentAdded;
    Post.CommentRemoved += delegate { BuildCatalog(); };
    Comment.Approved += Post_CommentAdded;
}

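The static constructor calls BuildCatalog to build the search catalog and also subscribes to events on Post, Page, and so on, so the catalog is updated whenever they change. This also shows that the catalog holds several kinds of objects; what they have in common is that they all implement the IPublishable interface.
With the search term on one side and the catalog to search on the other, returning the matching set follows naturally.
This search reminds me of Lucene.NET, which also ranks by weight. Lucene's tokenization is far more advanced, though. Here each space-separated keyword is matched as a whole: "ABC" finds only text containing "ABC", not text that contains just "A", "B", or "C".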
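
The pattern in the static constructor (build once, then rebuild when content changes) can be sketched on its own. This is a simplified illustration, not BlogEngine's code: `CatalogSketch`, `ContentChanged`, `Source`, and `RaiseContentChanged` are hypothetical stand-ins for the Post/Page events and the real data store, and "cleaning" is reduced to lower-casing.

```csharp
using System;
using System.Collections.Generic;

public static class CatalogSketch
{
    // Hypothetical stand-in for Post.Saved, Page.Saved and similar events.
    public static event Action ContentChanged;

    // Hypothetical data source; the real catalog reads posts, pages and comments.
    public static readonly List<string> Source = new List<string>();

    private static readonly object SyncRoot = new object();
    private static List<string> catalog = new List<string>();

    // Runs once before first use: builds the catalog and subscribes to changes.
    static CatalogSketch()
    {
        BuildCatalog();
        ContentChanged += BuildCatalog;
    }

    public static int Count
    {
        get { lock (SyncRoot) { return catalog.Count; } }
    }

    // Called by the content layer after something is saved or removed.
    public static void RaiseContentChanged()
    {
        var handler = ContentChanged;
        if (handler != null) handler();
    }

    private static void BuildCatalog()
    {
        lock (SyncRoot)
        {
            // Simplified "cleaning": lower-case each document.
            catalog = Source.ConvertAll(s => s.ToLowerInvariant());
        }
    }
}
```

Adding an item to `Source` does not change `Count` until `RaiseContentChanged()` is called, which is the same trade-off the real design makes: searches read a prepared catalog, and the cost of rebuilding is paid when content changes rather than on every query.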

Original article: https://www.cnblogs.com/whosedream/p/2259823.html