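  • Lucene + PanGu Word Segmentation

    1: Lucene

    Lucene is a full-text search framework, not an application product. It is therefore not something you can use out of the box like www.baidu.com or Google Desktop; it only provides the tools with which you can build such products.

    2: PanGu segmentation

    PanGu (盘古分词) is a word-segmentation component for Chinese and English text.

    Combining these two components makes it possible to segment Chinese text and run full-text search over it.

    First, an overview of the main concepts.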

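    //1. Document
                //Document: the document object, i.e. one record of raw data.
    
                //2. Field
                //If a field must appear in the final results, it has to be stored; otherwise do not store it.
                //If you want to search on a field, that field must be indexed.
                //How do you decide whether a field should be analyzed (tokenized)?
                //The field must be indexed in the first place. If its value is indivisible, it does not need to be analyzed. Example: ID.
                //In Lucene 4 and later, the subclasses DoubleField, FloatField, IntField, LongField and StringField are always indexed but never analyzed, and they are not necessarily stored. The Store constructor argument decides: Store.YES stores the value, Store.NO does not.
                //TextField is both indexed and analyzed. StringField is indexed but not analyzed.
                //StoredField is always stored and never indexed.
                //The code in this post targets the Lucene.Net 3.0 API, where the same choices are expressed through Field.Store and Field.Index.
    
                //3. Directory
                //FSDirectory: a file-system directory; the index lives on the local disk.
                //Trade-off: slightly slower, but the data is safe on disk.
                //RAMDirectory: an in-memory directory; the index lives in memory.
                //Trade-off: fast, but not safe, because the index is lost when the process exits.
    
                //4. Analyzer (tokenizer class). Chinese text needs a third-party library for segmentation, such as PanGu.
                //It provides the segmentation algorithm used to split document data into terms.
    
                //5. IndexWriterConfig (index writer configuration class)
                //1) Holds the configuration: the Lucene version and the analyzer type
                //2) Controls whether existing data in the index is cleared
    
                //6. IndexWriter (index writer class)
                //The index writing tool: it adds (creates), deletes and updates index entries.
                //Entries can be created one at a time or in batches.
    
                //Core API
                //5.2.2.1 QueryParser (query parser)
                //1) QueryParser (parser for a single field)
                //2) MultiFieldQueryParser (parser for multiple fields)
    
                //5.2.2.2 Query (query object holding the keywords to search for)
                //1) Obtain a query object by parsing keywords with a QueryParser
                //2) Build a custom query object (advanced queries)
                //Advanced queries are built by instantiating subclasses of Query directly.
    
                //5.2.2.3 IndexSearcher (index search object that executes the search)
                //IndexSearcher provides fast search, sorting and scoring.
                //IndexSearcher depends on an IndexReader.
                //The result is the top N hits after scoring and sorting. N is the second argument of Search, e.g. searcher.Search(query, 100).
    
                //5.2.2.4 TopDocs (query result object)
                //Searching through an IndexSearcher returns a TopDocs object.
                //TopDocs contains two pieces of information:
                //int totalHits: the total number of matching documents (TotalHits in Lucene.Net)
                //ScoreDoc[] scoreDocs: the array of scored documents (ScoreDocs in Lucene.Net)
    
                //5.2.2.5 ScoreDoc (scored document object)
                //ScoreDoc contains two pieces of data:
                //int doc: the document number, a unique number Lucene assigns to the document
                //float score: the document's score
                //With the number in hand, the actual document still has to be loaded by that number.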
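
    To make the concepts above concrete, here is a minimal sketch of the whole index-then-search flow. It is not the author's code: it assumes the Lucene.Net 3.0.3 package, uses StandardAnalyzer in place of PanGuAnalyzer and RAMDirectory in place of FSDirectory so that it needs no files on disk, and the class name LuceneFlowSketch, the field names and the sample values are made up for illustration.

    ```csharp
    // Sketch only: assumes Lucene.Net 3.0.3. StandardAnalyzer stands in for PanGuAnalyzer.
    using System;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;
    using Lucene.Net.QueryParsers;
    using Lucene.Net.Search;
    using Lucene.Net.Store;
    using LuceneVersion = Lucene.Net.Util.Version;

    public static class LuceneFlowSketch   // hypothetical name
    {
        // Indexes one sample document in memory, searches it, and returns the hit count.
        public static int IndexAndSearch(string keyword)
        {
            var dir = new RAMDirectory();                                  // Directory
            var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_30);  // Analyzer

            using (var writer = new IndexWriter(dir, analyzer, IndexWriter.MaxFieldLength.LIMITED))
            {
                var doc = new Document();                                  // Document
                // ID is indivisible: stored and indexed, but not analyzed.
                doc.Add(new Field("ID", "1", Field.Store.YES, Field.Index.NOT_ANALYZED));
                // NAME is free text: stored, indexed and analyzed.
                doc.Add(new Field("NAME", "Lucene full text search", Field.Store.YES, Field.Index.ANALYZED));
                writer.AddDocument(doc);
                writer.Commit();
            }

            // Open the reader only after the commit so it sees the new document.
            using (var reader = IndexReader.Open(dir, true))
            using (var searcher = new IndexSearcher(reader))
            {
                var parser = new QueryParser(LuceneVersion.LUCENE_30, "NAME", analyzer);
                TopDocs top = searcher.Search(parser.Parse(keyword), 10);  // top 10 hits
                foreach (ScoreDoc hit in top.ScoreDocs)
                {
                    // ScoreDoc only carries the document number; load the stored fields by that number.
                    Document found = searcher.Doc(hit.Doc);
                    Console.WriteLine(found.Get("ID") + " " + found.Get("NAME") + " " + hit.Score);
                }
                return top.TotalHits;
            }
        }
    }
    ```

    Calling LuceneFlowSketch.IndexAndSearch("search") walks through every object in the list: Directory, Analyzer, IndexWriter, Document and Field on the write side, then QueryParser, Query, IndexSearcher, TopDocs and ScoreDoc on the read side.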

    Now the code:

    1: Interfaces:

       public interface IDBService<TContext> where TContext : Microsoft.EntityFrameworkCore.DbContext
        {
            TContext Context { get; }
            
        }
    // Wrapper interface for the generic methods. The generic implementations do the same work as the search class below.
    public interface IService<T>
        {
            List<T> QueryToList(int limit);
            List<T> QueryToList(int start, int end);
            List<T> QueryToList(string txt);
            List<T> QueryAll();
            T QuerySignle(string guid);
            T QuerySignle(int pk_id);
        }

    2: Search class:

    using System;
    using System.Collections.Generic;
    using System.Text;
    using LuceneCore.Respority;
    using LuceneCore.DB;
    using Lucene.Net.Store;
    using LuceneCore.Util;
    using System.Linq;
    
    namespace LuceneCore.Respority
    {
        class DbContextService<TDbContext> : IDBService<TDbContext> where TDbContext : Microsoft.EntityFrameworkCore.DbContext
        {
            public TDbContext Context { get; }
            private List<Type> listType = new List<Type>();
    
    
            //private static FSDirectory dir = FSDirectory.Open("");
            private Lucene.Net.Index.IndexWriter writer = null;// new Lucene.Net.Index.IndexWriter(dir, new Lucene.Net.Analysis.PanGu.PanGuAnalyzer(), Lucene.Net.Index.IndexWriter.MaxFieldLength.LIMITED);
            private Lucene.Net.Index.IndexReader reader = null;
    
            public DbContextService(TDbContext _Context)
            {
                Context = _Context;
                foreach (System.Reflection.PropertyInfo pro in Context.GetType().GetProperties())
                {
                    if (pro.PropertyType.GenericTypeArguments != null && pro.PropertyType.GenericTypeArguments.Length > 0)
                    {
                        listType.Add(pro.PropertyType.GenericTypeArguments[0]);
                    }
                }
            }
            public void Test<T>(T t)
            {
                CreateIndexWriter(@"C:\Users\FanLin\Desktop\index");
                var list = new List<T>();
                list.Add(t);
                CreateIndex(list);
                QueryByKeyWord<T>("搜索", false, null, 100, null, "NAME");
                var query1 = CreateQuery<T,Lucene.Net.Search.TermQuery, Util.Attributes.TermQueryAttribute>("搜索");
                var query = CreateBooleanQuery((query1, Lucene.Net.Search.Occur.SHOULD));
                QueryBySpecidQuery(query);
    
            }
            /// <summary>
            /// Creates the IndexWriter and IndexReader.
            /// </summary>
            /// <param name="filePath">Directory that holds the index files</param>
            /// <returns></returns>
            public (Lucene.Net.Index.IndexWriter, Lucene.Net.Index.IndexReader) CreateIndexWriter(string filePath)
            {
                FSDirectory dir = FSDirectory.Open(filePath);
                if (writer == null)
                {
                    writer = new Lucene.Net.Index.IndexWriter(dir, new Lucene.Net.Analysis.PanGu.PanGuAnalyzer(), Lucene.Net.Index.IndexWriter.MaxFieldLength.LIMITED);
                    writer.MergeFactor = 100;// how often segments are merged; the default is 10
                    writer.UseCompoundFile = true;// use the compound file format to reduce the number of index files
                }
                //var directory = System.IO.Directory.CreateDirectory(filePath);
                if (reader == null)
                {
                    reader = Lucene.Net.Index.IndexReader.Open(dir, false);
                }
                return (writer, reader);
            }
            /// <summary>
            /// Commits pending changes.
            /// </summary>
            private void CommitIndexWriter()
            {
                if (writer == null)
                    return;
                writer.Commit();
            }
            // Flushes buffered changes.
            private void FlushIndexWriter()
            {
                if (writer == null)
                    return;
                writer.Flush(true, true, true);
            }
            /// <summary>
            /// "Close" step. Note that this only flushes; the writer itself is not disposed here.
            /// </summary>
            private void CloseIndexWriter()
            {
                if (writer == null)
                    return;
                writer.Flush(true, true, true);
            }
            /// <summary>
            /// Builds index entries for a collection of entities.
            /// </summary>
            public void CreateIndex<T>(IEnumerable<T> tList)
            {
                // var config = new IndexWriterConfig(Lucene.Net.Util.LuceneVersion.LUCENE_48, _analyzer);
                if (writer == null) return;
                // Walk the entities and add each one to the index.
                foreach (var entity in tList)
                {
                    this.CreateIndex(entity);
                }
                this.CommitIndexWriter();
                this.CloseIndexWriter();
            }
            /// <summary>
            /// Builds the index entry for a single entity: adds it if the index is empty, otherwise updates it.
            /// </summary>
            private void CreateIndex<T>(T tSignal)
            {
                if (writer == null) return;
                if (writer.NumDocs() <= 0)
                {
                    writer.AddDocument(tSignal.ToDocument());
                }
                else
                {
                    this.UpdateIndex(tSignal);
                }
                //if (cmit) this.CommitIndexWriter(true);
            }
            /// <summary>
            /// Adds or updates the index entry for one entity.
            /// </summary>
            private void UpdateIndex<T>(T t)
            {
    
                if (writer == null) return;
                if (reader == null) return;
                // UpdateDocument first deletes any document that matches the key term, then adds the new one.
                var key = t.GetCustomAttributeColumn_KeyAndValue<T, System.ComponentModel.DataAnnotations.KeyAttribute>();
                writer.UpdateDocument(new Lucene.Net.Index.Term(key.field, key.txt), t.ToDocument());
            }
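            /// <summary>
            /// Updates the index entries for a batch of entities.
            /// </summary>
            /// <param name="ts">Entities to update</param>
            public void UpdateIndexEnumerable<T>(IEnumerable<T> ts)
            {
                // Each update deletes the existing document first, if there is one.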
                foreach (var t in ts)
                {
                    UpdateIndex(t);
                }
                this.CommitIndexWriter();
                this.CloseIndexWriter();
            }
            /// <summary>
            /// Deletes the index entry for one entity.
            /// </summary>
            private void DeleteIndex<T>(T t)
            {
                if (writer == null) return;
                // Delete through the writer so the change is part of the writer's next commit.
                var key = t.GetCustomAttributeColumn_KeyAndValue<T, System.ComponentModel.DataAnnotations.KeyAttribute>();
                writer.DeleteDocuments(new Lucene.Net.Index.Term(key.field, key.txt));
            }
            /// <summary>
            /// Deletes the index entries for a batch of entities.
            /// </summary>
            public void DeleteIndex<T>(IEnumerable<T> tEnum)
            {
                if (writer == null) return;
                foreach (var t in tEnum)
                {
                    DeleteIndex(t);
                }
                this.CommitIndexWriter();
                this.CloseIndexWriter();
            }
            /// <summary>
            /// Deletes every entry in the index.
            /// </summary>
            public void DeleteAllIndex()
            {
                if (writer == null) return;
                writer.DeleteAll();
            }
    
            /// <summary>
            /// Searches by keyword.
            /// </summary>
            /// <param name="keyWord">Search keyword</param>
            /// <param name="isMulti">Whether to search across multiple fields</param>
            /// <param name="filter">Filter to apply; may be null</param>
            /// <param name="n">Maximum number of hits to return</param>
            /// <param name="sort">Sort order; pass null to rank by score</param>
            /// <param name="filterColumn">Names of the fields to search</param>
            /// <returns></returns>
            public IEnumerable<T> QueryByKeyWord<T>(string keyWord, bool isMulti, Lucene.Net.Search.Filter filter, int n, Lucene.Net.Search.Sort sort, params string[] filterColumn)
            {
                // IndexSearcher runs the search and handles scoring and ranking; it depends on an IndexReader.
                Lucene.Net.Search.IndexSearcher searcher = new Lucene.Net.Search.IndexSearcher(reader);
                // Validate the field list against the requested mode.
                if (isMulti && filterColumn.Length <= 1)
                {
                    throw new ArgumentException("Multi-field search requires more than one field name.", nameof(filterColumn));
                }
                if (!isMulti && filterColumn.Length != 1)
                {
                    throw new ArgumentException("Single-field search accepts exactly one field name.", nameof(filterColumn));
                }
                // Parser arguments: Lucene version, field(s) to search, analyzer.
                Lucene.Net.QueryParsers.QueryParser queryParser = null;
                if (!isMulti)
                {
                    // Single-field search
                    queryParser = new Lucene.Net.QueryParsers.QueryParser(Lucene.Net.Util.Version.LUCENE_30, filterColumn[0], new Lucene.Net.Analysis.PanGu.PanGuAnalyzer());
                }
                else
                {
                    // Multi-field search
                    queryParser = new Lucene.Net.QueryParsers.MultiFieldQueryParser(Lucene.Net.Util.Version.LUCENE_30, filterColumn, new Lucene.Net.Analysis.PanGu.PanGuAnalyzer());
                }
                // Parse the keyword into a Query.
                Lucene.Net.Search.Query query = queryParser.Parse(keyWord);
                // TopDocs holds the top n hits, honoring the optional filter and sort.
                Lucene.Net.Search.TopDocs topDocs = sort == null
                    ? searcher.Search(query, filter, n)
                    : searcher.Search(query, filter, n, sort);
                IList<T> values = new List<T>();
                foreach (Lucene.Net.Search.ScoreDoc sDoc in topDocs.ScoreDocs)
                {
                    // Load the stored document by its number and convert it to an entity.
                    Lucene.Net.Documents.Document document = reader.Document(sDoc.Doc);
                    values.Add(document.ToEntity<T>());
                }
                return values;
            }
            /// <summary>
            /// Runs a prebuilt Query.
            /// </summary>
            /// <param name="query">The query to execute</param>
            private void QueryBySpecidQuery(Lucene.Net.Search.Query query)
            {
                if (query == null) return;
                if (reader == null) return;
                // IndexSearcher runs the search and handles scoring and ranking; it depends on an IndexReader.
                Lucene.Net.Search.IndexSearcher searcher = new Lucene.Net.Search.IndexSearcher(reader);
                // Fetch the top 100 hits as a TopDocs result.
                Lucene.Net.Search.TopDocs topDocs = searcher.Search(query, 100);
                // Each ScoreDoc carries a document number and a score.
                foreach (Lucene.Net.Search.ScoreDoc sDoc in topDocs.ScoreDocs)
                {
                    int id = sDoc.Doc;
                    float score = sDoc.Score;
                    // Load the stored document by its number.
                    Lucene.Net.Documents.Document document = reader.Document(id);
                    // Convert the document to an entity here if needed.
    
                }
            }
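            // TermQuery can be written as "field:key", e.g. "content:lucene".
            // In a BooleanQuery, AND is written with '+' and OR with a space, e.g. "content:java content:perl".
            // WildcardQuery uses '?' and '*', e.g. "content:use*".
            // PhraseQuery uses '~' for the slop, e.g. content:"中日"~5.
            // PrefixQuery uses '*', e.g. "中*".
            // FuzzyQuery uses '~', e.g. "content:wuzza~".
            // RangeQuery uses '[]' or '{}': the former is an inclusive range, the latter an exclusive one, e.g. "time:[20060101 TO 20060130]". Note that TO is case-sensitive.
            // For example, "articles whose title or body contains lucene and whose time is between 20060101 and 20060130" can be written as "+(title:lucene content:lucene) +time:[20060101 TO 20060130]".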
    
            /*
             #region Special queries
             /// <summary>
             /// Exact single-term query
             /// </summary>
             public void GetTermQuery(string content)
             {
                 List<string> fids = GetCustomAttributeForQuery<Util.Attributes.TermQueryAttribute>();
                 if (fids != null && fids.Count() > 0)
                 {
                     Lucene.Net.Search.Query query = new Lucene.Net.Search.TermQuery(new Lucene.Net.Index.Term(fids[0], content));
                 }
             }
             /// <summary>
             /// Wildcard query
             /// </summary>
             public void GetWildcardQuery(string charContent)
             {
                 List<string> fids = GetCustomAttributeForQuery<Util.Attributes.WildcardQueryAttribute>();
                 if (fids != null && fids.Count() > 0)
                 {
                     Lucene.Net.Search.Query query = new Lucene.Net.Search.WildcardQuery(new Lucene.Net.Index.Term(fids[0], charContent));
                 }
                 //QueryLucene(query);
             }
             /// <summary>
             /// Fuzzy query
             /// </summary>
             public void GetFuzzyQuery(string content)
             {
                 List<string> fids = GetCustomAttributeForQuery<Util.Attributes.FuzzyQueryAttribute>();
                 if (fids != null && fids.Count() > 0)
                 {
                     Lucene.Net.Search.Query query = new Lucene.Net.Search.FuzzyQuery(new Lucene.Net.Index.Term(fids[0], content));
                 }
                 //QueryLucene(query);
             }
             /// <summary>
             /// Finds documents in which two terms occur within the given distance (slop) of each other
             /// </summary>
             public void GetPhraseQuery(int slop, string prv, string next)
             {
                 Lucene.Net.Search.PhraseQuery query = new Lucene.Net.Search.PhraseQuery();
                 query.Slop = slop;
                 query.Add(new Lucene.Net.Index.Term("content", prv));
                 query.Add(new Lucene.Net.Index.Term("content", next));
             }
             /// <summary>
             /// Matches terms that start with the given prefix
             /// </summary>
             public void GetPrefixQuery(string content)
             {
                 Lucene.Net.Search.PrefixQuery query = new Lucene.Net.Search.PrefixQuery(new Lucene.Net.Index.Term("content", content));
             }
             #endregion
            */
            /// <summary>
            /// Returns the names of the properties of T that carry the given attribute.
            /// </summary>
            /// <typeparam name="Attr">Attribute type to look for</typeparam>
            /// <returns></returns>
            private static List<string> GetCustomAttributeForQuery<T, Attr>()
            {
                var pro = typeof(T).GetProperties().Where(x => x.IsDefined(typeof(Attr), false));
                var list = pro.Select(x => x?.Name).ToList();
                return list;
                // var prop=
                //return typeof(T).GetProperties().Select(x =>
                //{
                //    if (x.IsDefined(typeof(Attr), false))
                //    {
                //        return x.Name;
                //    }
                //    else
                //    {
                //        return null;
                //    }
                //}).ToList();
            }
            /// <summary>
            /// Builds a combined (boolean) query.
            /// </summary>
            /// <param name="querysAndoccurs">The queries to combine, each with its logical occurrence</param>
            /// <returns></returns>
            public Lucene.Net.Search.Query CreateBooleanQuery(params (Lucene.Net.Search.Query query, Lucene.Net.Search.Occur occur)[] querysAndoccurs)
            {
                //Lucene.Net.Search.Query query = new Lucene.Net.Search.FuzzyQuery(new Lucene.Net.Index.Term("", ""));
                //Lucene.Net.Search.Query query2 = Lucene.Net.Search.NumericRangeQuery.NewLongRange("", 0, 0, true, true);
                Lucene.Net.Search.BooleanQuery bQuery = new Lucene.Net.Search.BooleanQuery();
                //* Intersection: Occur.MUST + Occur.MUST
                //* Union: Occur.SHOULD + Occur.SHOULD
                //* Negation: Occur.MUST_NOT
                querysAndoccurs.ToList().ForEach(tuple =>
                {
                    bQuery.Add(tuple.query, tuple.occur);
                });
                return bQuery;
            }
    
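            /// <summary>
            /// Creates a query of the requested kind. Only a single query is supported.
            /// </summary>
            /// <typeparam name="U">A query class derived from Query; one of WildcardQuery, PrefixQuery, FuzzyQuery or TermQuery</typeparam>
            /// <typeparam name="Attr">The attribute placed on entity properties to mark which kind of query they take part in; the first marked property is used</typeparam>
            /// <param name="content">Search keyword</param>
            /// <returns></returns>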
            public Lucene.Net.Search.Query CreateQuery<T, U, Attr>(string content) where U : Lucene.Net.Search.Query where Attr : Util.Attributes.SpecidQueryAttribute
            {
                var tp = typeof(U);
                var name = tp.Name;
                //Lucene.Net.Search.Query query1 = new Lucene.Net.Search.TermQuery(new Lucene.Net.Index.Term("", ""));
                switch (name)
                {
                    case "WildcardQuery":
                    case "PrefixQuery":
                    case "FuzzyQuery":
                    case "TermQuery":
                        return Activator.CreateInstance(typeof(U), new Lucene.Net.Index.Term(GetCustomAttributeForQuery<T,Attr>()[0], content)) as Lucene.Net.Search.Query;
                }
                throw new KeyNotFoundException("Invalid type argument <U>");
            }
            /// <summary>
            /// Range query: either a numeric range, or two strings that must occur within a given distance of each other.
            /// </summary>
            /// <typeparam name="U">long, int, float, double or string</typeparam>
            /// <param name="minOrprv">For numeric U, the lower bound of the range; for string, the first term</param>
            /// <param name="maxOrnext">For numeric U, the upper bound of the range; for string, the second term</param>
            /// <param name="whenString">For string U, the allowed distance (slop) between the two terms</param>
            /// <returns></returns>
            public Lucene.Net.Search.Query CreateQuery<T, U>(U minOrprv, U maxOrnext, int whenString = 0)
            {
                Lucene.Net.Search.Query query = null;
                List<string> fids = GetCustomAttributeForQuery<T, Util.Attributes.NumericRangeQueryAttribute>();
                if (fids != null && fids.Count() > 0)
                {
                    // Type.Name returns the CLR name (Int64, Int32, Double, Single, String), not the C# keyword.
                    switch (typeof(U).Name)
                    {
                        case "Int64":
                            query = Lucene.Net.Search.NumericRangeQuery.NewLongRange(fids[0], long.Parse(minOrprv.ToString()), long.Parse(maxOrnext.ToString()), true, true);
                            break;
                        case "Int32":
                            query = Lucene.Net.Search.NumericRangeQuery.NewIntRange(fids[0], int.Parse(minOrprv.ToString()), int.Parse(maxOrnext.ToString()), true, true);
                            break;
                        case "Double":
                            query = Lucene.Net.Search.NumericRangeQuery.NewDoubleRange(fids[0], double.Parse(minOrprv.ToString()), double.Parse(maxOrnext.ToString()), true, true);
                            break;
                        case "Single":
                            query = Lucene.Net.Search.NumericRangeQuery.NewFloatRange(fids[0], float.Parse(minOrprv.ToString()), float.Parse(maxOrnext.ToString()), true, true);
                            break;
                        case "String":
                            var pQuery = new Lucene.Net.Search.PhraseQuery();
                            pQuery.Slop = whenString;
                            pQuery.Add(new Lucene.Net.Index.Term(fids[0], minOrprv.ToString()));
                            pQuery.Add(new Lucene.Net.Index.Term(fids[0], maxOrnext.ToString()));
                            query = pQuery;
                            //  query = new Lucene.Net.Search.MultiPhraseQuery();
                            break;
                        default:
                            // Any other type is rejected.
                            throw new InvalidProgramException("Invalid type argument <U>: only long, float, double, int and string are allowed");
                    }
                }
                return query;
            }
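            /// <summary>
            /// Builds the sort order. To use it, mark the entity's properties with SpecidSortAttribute.
            /// </summary>
            /// <param name="t">An entity whose type carries the sort attributes</param>
            /// <returns></returns>
            public Lucene.Net.Search.Sort SetSort<T>(T t)
            {
                // The field names and types come from the attributes on T.
                // The resulting SortField array defines which field sorts first.
                Lucene.Net.Search.Sort sort = new Lucene.Net.Search.Sort(t.GetSortAttributeValue());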
                return sort;
            }
            /// <summary>
            /// Builds the filter (currently a hard-coded sample range on the "time" field).
            /// </summary>
            /// <returns></returns>
            public Lucene.Net.Search.Filter SetFilter()
            {
                // Lucene.Net.Search.filter
                return Lucene.Net.Search.NumericRangeFilter.NewIntRange("time", 20220112, 20220113, true, true);
            }
        }
    }

     

    3: Extension methods

    public static class ModelExtison
        {
           /// <summary>
           /// Converts an entity to a Document.
           /// </summary>
           /// <typeparam name="T"></typeparam>
           /// <param name="t"></param>
           /// <returns></returns>
            public static Lucene.Net.Documents.Document ToDocument<T>(this T t)
            {
                Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
                foreach (var propertyInfo in t.GetType().GetProperties())
                {
                    var attr = propertyInfo.GetCustomAttributes(typeof(LuceneCore.Util.Attributes.LuceneDefineAttribute), false);
                    if (attr != null)
                    {
                        if (attr.Length > 0)
                        {
                            foreach (LuceneCore.Util.Attributes.LuceneDefineAttribute customAttr in attr)
                            {
                                doc.Add(new Lucene.Net.Documents.Field(propertyInfo.Name, propertyInfo.GetValue(t) == null ? "" : propertyInfo.GetValue(t).ToString(), customAttr.Store, customAttr.Index));
                            }
                        }
                        else
                        {
                            var attrtmp = new LuceneCore.Util.Attributes.LuceneDefineAttribute();
                            doc.Add(new Lucene.Net.Documents.Field(propertyInfo.Name, propertyInfo.GetValue(t) == null ? "" : propertyInfo.GetValue(t).ToString(), attrtmp.Store, attrtmp.Index));
                        }
                    }
                }
                return doc;
            }
            /// <summary>
            /// Converts a Document back to an entity.
            /// </summary>
            /// <typeparam name="T"></typeparam>
            /// <param name="document"></param>
            /// <returns></returns>
            public static T ToEntity<T>(this Lucene.Net.Documents.Document document)
            {
                T t = default(T);
                t = (T)Activator.CreateInstance(typeof(T));
                foreach (System.Reflection.PropertyInfo pro in typeof(T).GetProperties())
                {
                    pro.SetValue(t, Convert.ChangeType(document.Get(pro.Name), pro.PropertyType));
                }
                return t;
            }
            /// <summary>
            /// Returns the name and value of the first property marked with the given attribute.
            /// </summary>
            /// <typeparam name="T"></typeparam>
            /// <typeparam name="Attr"></typeparam>
            /// <param name="obj"></param>
            /// <returns></returns>
            public static (string field, string txt) GetCustomAttributeColumn_KeyAndValue<T,Attr>(this T obj)
            {
                foreach (System.Reflection.PropertyInfo pro in typeof(T).GetProperties())
                {
                    if (pro.IsDefined(typeof(Attr), false))
                    {
                        // Read the value from the entity instance, not from the property name.
                        var val = pro.GetValue(obj);
                        return (pro.Name, val == null ? "" : val.ToString());
                    }
                }
                return ("", "");
            }
            /// <summary>
            /// Builds the SortField array from the properties marked with the sort attribute, using each property's type.
            /// </summary>
            /// <typeparam name="T"></typeparam>
            /// <param name="obj"></param>
            /// <returns></returns>
            public static Lucene.Net.Search.SortField[] GetSortAttributeValue<T>(this T obj)
            {
                // Maps each SortField to the Order declared on its attribute.
                Dictionary<Lucene.Net.Search.SortField, int> keyValuePairs = new Dictionary<Lucene.Net.Search.SortField, int>();
                var defindSpecidAttr = typeof(T).GetProperties().Where(x => x.IsDefined(typeof(Util.Attributes.SpecidSortAttribute), true));
                foreach (var item in defindSpecidAttr)
                {
                    var attr = (Util.Attributes.SpecidSortAttribute)item.GetCustomAttributes(false).FirstOrDefault(x => x.GetType() == typeof(Util.Attributes.SpecidSortAttribute));
                    // The third argument reverses the order (descending) when Desc is true.
                    Lucene.Net.Search.SortField sortField = new Lucene.Net.Search.SortField(item.Name, item.PropertyType.GetSortFieldType(), attr.Desc);
                    keyValuePairs.Add(sortField, attr.Order);
                }
                // Return the sort fields ordered by their declared Order.
                return keyValuePairs.OrderBy(x => x.Value).Select(x => x.Key).ToArray();
            }
            /// <summary>
            /// Maps a CLR type to the matching SortField type constant.
            /// </summary>
            /// <param name="t"></param>
            /// <returns></returns>
            public static int GetSortFieldType(this Type t)
            {
                // Compare the types themselves: Type.Name yields CLR names such as "Single", never C# keywords such as "float".
                return t switch
                {
                    _ when t == typeof(int) => Lucene.Net.Search.SortField.INT,
                    _ when t == typeof(long) => Lucene.Net.Search.SortField.LONG,
                    _ when t == typeof(float) => Lucene.Net.Search.SortField.FLOAT,
                    _ when t == typeof(double) => Lucene.Net.Search.SortField.DOUBLE,
                    _ when t == typeof(string) => Lucene.Net.Search.SortField.STRING,
                    _ when t == typeof(byte) => Lucene.Net.Search.SortField.BYTE,
                    _ when t == typeof(short) => Lucene.Net.Search.SortField.SHORT,
                    _ => -1
                };
            }
    
            public static void GetString<T>(this object t)
            {
                Console.WriteLine(typeof(T).Name);
            }
        }
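
    A note on mapping types by name, as GetSortFieldType and the range CreateQuery do: Type.Name reports the CLR name of a type, so typeof(float).Name is "Single" rather than "float". The sketch below shows one way to do the mapping without string comparisons. It uses only the base class library; the SortKind enum and the SortKindMapper class are hypothetical stand-ins for the Lucene SortField constants, not part of any library.

    ```csharp
    using System;

    // Hypothetical stand-in for the SortField type constants.
    public enum SortKind { Int, Long, Float, Double, String, Byte, Short, Unknown }

    public static class SortKindMapper   // hypothetical name
    {
        // Maps a CLR type to a sort kind via TypeCode, avoiding comparisons on Type.Name.
        public static SortKind ToSortKind(Type t)
        {
            switch (Type.GetTypeCode(t))
            {
                case TypeCode.Int32: return SortKind.Int;
                case TypeCode.Int64: return SortKind.Long;
                case TypeCode.Single: return SortKind.Float;   // C# float
                case TypeCode.Double: return SortKind.Double;
                case TypeCode.String: return SortKind.String;
                case TypeCode.Byte: return SortKind.Byte;
                case TypeCode.Int16: return SortKind.Short;    // C# short
                default: return SortKind.Unknown;
            }
        }
    }
    ```

    For example, SortKindMapper.ToSortKind(typeof(float)) yields SortKind.Float, whereas a check such as t.Name == "float" never matches.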

    4: Attributes

    4.1 (with members)

    // Attribute describing how a property is stored and indexed in Lucene
     [AttributeUsage(AttributeTargets.Property)]
        class LuceneDefineAttribute : Attribute
        {
            public LuceneDefineAttribute()
            {
                Store = Lucene.Net.Documents.Field.Store.YES;
                Index = Lucene.Net.Documents.Field.Index.ANALYZED;
            }
            public Lucene.Net.Documents.Field.Store Store { get; set; }
            public Lucene.Net.Documents.Field.Index Index { get; set; }
        }
    
    // Sort attribute
     [System.AttributeUsage(System.AttributeTargets.Property)]
        internal class SpecidSortAttribute : System.Attribute
        {
            public SpecidSortAttribute()
            {
                Desc = true;
                Order = 0;
            }
            public bool Desc { get; set; }
            public int Order { get; set; }
        }

    4.2 (marker attributes with no members; they only tag properties so that reflection can find them)

      public class SpecidQueryAttribute : Attribute
        {
        }
    public class TermQueryAttribute : SpecidQueryAttribute
        {
        }
        public class WildcardQueryAttribute : SpecidQueryAttribute
        {
        }
        public class FuzzyQueryAttribute : SpecidQueryAttribute
        {
        }
        public class NumericRangeQueryAttribute : SpecidQueryAttribute
        {
        }
        public class PhraseQueryAttribute : SpecidQueryAttribute
        {
        }
        public class PrefixQueryAttribute : SpecidQueryAttribute
        {
        }
    class SpecidFilterAttribute:Attribute
        {
        }
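
    Since the entity class itself is not shown in this post, here is a minimal sketch of how such marker attributes can be applied and then discovered through reflection. Everything in it is illustrative: MarkerQueryAttribute, TermMarkerAttribute, the Article entity and the MarkerDemo helper are hypothetical names that only mirror the pattern of SpecidQueryAttribute and GetCustomAttributeForQuery above.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical marker attributes: no members, they only tag properties.
    [AttributeUsage(AttributeTargets.Property)]
    public class MarkerQueryAttribute : Attribute { }

    public class TermMarkerAttribute : MarkerQueryAttribute { }

    // Hypothetical entity: only Name is tagged for term queries.
    public class Article
    {
        public int Id { get; set; }

        [TermMarker]
        public string Name { get; set; }

        public string Body { get; set; }
    }

    public static class MarkerDemo   // hypothetical name
    {
        // Returns the names of the properties of TEntity that carry TAttr.
        public static List<string> PropertiesMarkedWith<TEntity, TAttr>() where TAttr : Attribute
        {
            return typeof(TEntity).GetProperties()
                .Where(p => p.IsDefined(typeof(TAttr), false))
                .Select(p => p.Name)
                .ToList();
        }
    }
    ```

    With these definitions, MarkerDemo.PropertiesMarkedWith<Article, TermMarkerAttribute>() returns a list that contains only "Name", which is the field name a term query would then be built against.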

    Usage:

     public void Test<T>(T t)
            {
                CreateIndexWriter(@"C:\Users\FanLin\Desktop\index");
                var list = new List<T>();
                list.Add(t);
                CreateIndex(list);
                QueryByKeyWord<T>("搜索", false, null, 100, null, "NAME");
                var query1 = CreateQuery<T,Lucene.Net.Search.TermQuery, Util.Attributes.TermQueryAttribute>("搜索");
                var query = CreateBooleanQuery((query1, Lucene.Net.Search.Occur.SHOULD));
                QueryBySpecidQuery(query);
    
            }

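    Summary: this is a simple wrapper around Lucene, written mainly to get familiar with it, even though I rarely need it in day-to-day work. Besides, what could be more fun than copying code?

              I originally wanted to inject the DbContext so that T would be handled generically, but I have not found a good way to do that yet, so for now the methods simply take a generic type parameter.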


  • Original post: https://www.cnblogs.com/fanlin92/p/15802013.html