  • Implementing SQL-Style Queries and Statistics with Solr (repost)

    Original article: http://shiyanjun.cn/archives/78.html

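    Cloudera has released Impala, a query and analytics tool built on the Hadoop platform. Anyone familiar with SQL can use Impala to run queries and analyses, although Impala's SQL differs subtly from the SQL of relational databases.
    Below, we design a table and use its data to map SQL query and aggregation statements onto equivalent Solr queries. The translation is an interesting exercise, and it shows off some of Solr's more useful features.
    The structure of the example table is shown in the figure:

    [Figure: table structure]

    For each case below, we give a SQL query or aggregation statement, translate it into a Solr query, and then compare the results.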

    Query comparison

    • Combined-condition query

    SQL query:

    SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
    FROM v_i_event
    WHERE prov_id = 1 AND net_type = 1 AND area_id = 10304 AND time_type = 1 AND time_id >= 20130801 AND time_id <= 20130815
    ORDER BY log_id LIMIT 10;

    Query result, as shown in the figure:
    [Figure: SQL query result]
    Solr query URL:

    http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=prov_id:1 AND net_type:1 AND area_id:10304 AND time_type:1 AND time_id:[20130801 TO 20130815]&sort=log_id asc&start=0&rows=10

    Query result:

    <response>
        <lst name="responseHeader">
            <int name="status">0</int>
            <int name="QTime">4</int>
        </lst>
        <result name="response" numFound="77" start="0">
            <doc>
                <int name="log_id">6827</int>
                <long name="start_time">1375072117</long>
                <long name="end_time">1375081683</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10304</int>
                <int name="idt_id">11002</int>
                <int name="cnt">0</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6827</int>
                <long name="start_time">1375072117</long>
                <long name="end_time">1375081683</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10304</int>
                <int name="idt_id">11000</int>
                <int name="cnt">0</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6851</int>
                <long name="start_time">1375142158</long>
                <long name="end_time">1375146391</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10304</int>
                <int name="idt_id">14001</int>
                <int name="cnt">5</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6851</int>
                <long name="start_time">1375142158</long>
                <long name="end_time">1375146391</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10304</int>
                <int name="idt_id">11002</int>
                <int name="cnt">23</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6851</int>
                <long name="start_time">1375142158</long>
                <long name="end_time">1375146391</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10304</int>
                <int name="idt_id">10200</int>
                <int name="cnt">55</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6851</int>
                <long name="start_time">1375142158</long>
                <long name="end_time">1375146391</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10304</int>
                <int name="idt_id">14000</int>
                <int name="cnt">4</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6851</int>
                <long name="start_time">1375142158</long>
                <long name="end_time">1375146391</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10304</int>
                <int name="idt_id">11000</int>
                <int name="cnt">1</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6851</int>
                <long name="start_time">1375142158</long>
                <long name="end_time">1375146391</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10304</int>
                <int name="idt_id">10201</int>
                <int name="cnt">31</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6851</int>
                <long name="start_time">1375142158</long>
                <long name="end_time">1375146391</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10304</int>
                <int name="idt_id">8002</int>
                <int name="cnt">8</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6851</int>
                <long name="start_time">1375142158</long>
                <long name="end_time">1375146391</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10304</int>
                <int name="idt_id">8000</int>
                <int name="cnt">30</int>
                <int name="net_type">1</int>
            </doc>
        </result>
    </response>

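    Comparing the two results, they are identical except for the ordering by idt_id (ascending in Impala, descending in Solr). Neither query sorts on idt_id, so the order of rows that share a log_id is not guaranteed by either system.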
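The URL above contains raw spaces and brackets, which many HTTP clients will not accept as typed. The following is a minimal Python sketch, not part of the original article, of how the same request could be assembled with proper escaping. The helper name `build_select_url` is hypothetical; the host, port and collection are copied from the article.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL copied from the article; adjust host, port and collection as needed.
SOLR_SELECT = "http://slave1:8888/solr-cloud/i_event/select"


def build_select_url(filters, fields, sort, rows, start=0):
    """Build an escaped Solr select URL (hypothetical helper).

    filters: Solr filter clauses, joined with AND like a SQL WHERE clause.
    fields:  field names to return, like a SQL SELECT list.
    """
    params = {
        "q": "*:*",                      # match everything; filtering happens in fq
        "fl": ",".join(fields),
        "fq": " AND ".join(filters),
        "sort": sort,                    # like ORDER BY
        "start": start,
        "rows": rows,                    # like LIMIT
    }
    return SOLR_SELECT + "?" + urlencode(params)


url = build_select_url(
    filters=[
        "prov_id:1",
        "net_type:1",
        "area_id:10304",
        "time_type:1",
        "time_id:[20130801 TO 20130815]",   # closed range: >= and <=
    ],
    fields=["log_id", "start_time", "end_time", "prov_id", "city_id",
            "area_id", "idt_id", "cnt", "net_type"],
    sort="log_id asc",
    rows=10,
)
print(url)
```

The printed URL carries the same parameters as the one in the article, with spaces, colons and brackets escaped.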

    • Grouped statistics on a single field

    SQL query:

    SELECT prov_id, SUM(cnt) AS sum_cnt, AVG(cnt) AS avg_cnt, MAX(cnt) AS max_cnt, MIN(cnt) AS min_cnt, COUNT(cnt) AS count_cnt
    FROM v_i_event
    GROUP BY prov_id;

    Query result, as shown in the figure:
    [Figure: GROUP BY query result]
    Solr query URL:

    http://slave1:8888/solr-cloud/i_event/select?q=*:*&stats=true&stats.field=cnt&rows=0&indent=true

    Query result:

    <response>
        <lst name="responseHeader">
            <int name="status">0</int>
            <int name="QTime">2</int>
        </lst>
        <result name="response" numFound="4088" start="0"></result>
        <lst name="stats">
            <lst name="stats_fields">
                <lst name="cnt">
                    <double name="min">0.0</double>
                    <double name="max">1258.0</double>
                    <long name="count">4088</long>
                    <long name="missing">0</long>
                    <double name="sum">32587.0</double>
                    <double name="sumOfSquares">9170559.0</double>
                    <double name="mean">7.971379647749511</double>
                    <double name="stddev">46.69344567709268</double>
                    <lst name="facets" />
                </lst>
            </lst>
        </lst>
    </response>

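    Comparing the results, Solr returns additional statistics, such as the standard deviation (stddev), and the values it shares with the SQL query agree. Note that this URL computes the statistics over all matching documents rather than per prov_id (the facets element in the response is empty); a per-group breakdown needs a stats.facet parameter, as in the multi-field statistics example later in this article.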
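To make the correspondence with the SQL aggregates concrete, here is a small Python sketch that reads a trimmed copy of the stats response above and renames Solr's statistics to their SQL counterparts. The function name `sql_aggregates` and the mapping table are illustrative assumptions, not part of Solr.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the stats response shown above.
STATS_XML = """
<response>
  <lst name="stats">
    <lst name="stats_fields">
      <lst name="cnt">
        <double name="min">0.0</double>
        <double name="max">1258.0</double>
        <long name="count">4088</long>
        <long name="missing">0</long>
        <double name="sum">32587.0</double>
        <double name="mean">7.971379647749511</double>
      </lst>
    </lst>
  </lst>
</response>
"""

# Solr statistic name -> SQL aggregate it corresponds to.
SOLR_TO_SQL = {"sum": "SUM", "mean": "AVG", "max": "MAX",
               "min": "MIN", "count": "COUNT"}


def sql_aggregates(xml_text, field):
    """Return {SQL aggregate name: value} for one stats.field entry."""
    root = ET.fromstring(xml_text.strip())
    node = root.find(
        f"./lst[@name='stats']/lst[@name='stats_fields']/lst[@name='{field}']"
    )
    result = {}
    for child in node:
        name = child.get("name")
        if name in SOLR_TO_SQL:
            result[SOLR_TO_SQL[name]] = float(child.text)
    return result


aggs = sql_aggregates(STATS_XML, "cnt")
print(aggs)
```

Statistics with no SQL counterpart in the query above, such as missing, are simply skipped by the mapping.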

    • IN-condition query

    SQL query:

    SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
    FROM v_i_event
    WHERE prov_id = 1 AND net_type = 1 AND city_id IN(106,103) AND idt_id IN(12011,5004,6051,6056,8002) AND time_type = 1 AND time_id >= 20130801 AND time_id <= 20130815
    ORDER BY log_id, start_time DESC LIMIT 10;

    Query result, as shown in the figure:
    [Figure: IN query result]
    Solr query URL:

    http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=prov_id:1 AND net_type:1 AND (city_id:106 OR city_id:103) AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND time_id:[20130801 TO 20130815]&sort=log_id asc,start_time desc&start=0&rows=10

    Or, with one fq parameter per condition:

    http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=prov_id:1&fq=net_type:1&fq=(city_id:106 OR city_id:103)&fq=(idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002)&fq=time_type:1&fq=time_id:[20130801 TO 20130815]&sort=log_id asc,start_time desc&start=0&rows=10

    Query result:

    <response>
        <lst name="responseHeader">
            <int name="status">0</int>
            <int name="QTime">6</int>
        </lst>
        <result name="response" numFound="63" start="0">
            <doc>
                <int name="log_id">6553</int>
                <long name="start_time">1374054184</long>
                <long name="end_time">1374054254</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10307</int>
                <int name="idt_id">12011</int>
                <int name="cnt">0</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6553</int>
                <long name="start_time">1374054184</long>
                <long name="end_time">1374054254</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10307</int>
                <int name="idt_id">5004</int>
                <int name="cnt">2</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6555</int>
                <long name="start_time">1374055060</long>
                <long name="end_time">1374055158</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">70104</int>
                <int name="idt_id">5004</int>
                <int name="cnt">3</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6555</int>
                <long name="start_time">1374055060</long>
                <long name="end_time">1374055158</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">70104</int>
                <int name="idt_id">12011</int>
                <int name="cnt">0</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6595</int>
                <long name="start_time">1374292508</long>
                <long name="end_time">1374292639</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10307</int>
                <int name="idt_id">5004</int>
                <int name="cnt">4</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6611</int>
                <long name="start_time">1374461233</long>
                <long name="end_time">1374461245</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10307</int>
                <int name="idt_id">5004</int>
                <int name="cnt">1</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6612</int>
                <long name="start_time">1374461261</long>
                <long name="end_time">1374461269</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10307</int>
                <int name="idt_id">5004</int>
                <int name="cnt">1</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6612</int>
                <long name="start_time">1374461261</long>
                <long name="end_time">1374461269</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10307</int>
                <int name="idt_id">12011</int>
                <int name="cnt">0</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6613</int>
                <long name="start_time">1374461422</long>
                <long name="end_time">1374461489</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10307</int>
                <int name="idt_id">6056</int>
                <int name="cnt">1</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6613</int>
                <long name="start_time">1374461422</long>
                <long name="end_time">1374461489</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10307</int>
                <int name="idt_id">6051</int>
                <int name="cnt">1</int>
                <int name="net_type">1</int>
            </doc>
        </result>
    </response>

    Comparing the query results, they are identical.
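The IN translation used above is mechanical: each SQL IN list becomes a parenthesised group of OR clauses. A short Python sketch of that rule, covering both URL styles (one combined fq, or one fq per condition), might look as follows. The helper `in_clause` is a hypothetical name introduced here for illustration.

```python
from urllib.parse import urlencode


def in_clause(field, values):
    """Render SQL `field IN (v1, v2, ...)` as a parenthesised Solr OR group."""
    return "(" + " OR ".join(f"{field}:{v}" for v in values) + ")"


filters = [
    "prov_id:1",
    "net_type:1",
    in_clause("city_id", [106, 103]),
    in_clause("idt_id", [12011, 5004, 6051, 6056, 8002]),
    "time_type:1",
    "time_id:[20130801 TO 20130815]",
]

# Style 1: a single fq parameter with every condition joined by AND.
single_fq = " AND ".join(filters)

# Style 2: one fq parameter per condition; repeated keys need a list of pairs.
query_string = urlencode([("q", "*:*")] + [("fq", f) for f in filters])

print(single_fq)
print(query_string)
```

Both styles select the same documents, since Solr intersects multiple fq parameters. A practical difference is that each fq is cached as its own filter, so separate parameters can be reused across requests that share a condition.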

    • Open-interval range query

    SQL query:

    SELECT log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type
    FROM v_i_event
    WHERE net_type = 1 AND idt_id IN(12011,5004,6051,6056,8002) AND time_type = 1 AND start_time >= 1373598465 AND end_time < 1374055254
    ORDER BY log_id, start_time, idt_id DESC LIMIT 30;

    Query result, as shown in the figure:
    [Figure: open-interval query result]
    Solr query URLs (three equivalent forms; each expresses the open upper bound as a closed range on start_time plus an exclusion of the value 1374055254; note that the SQL above bounds end_time, whereas these URLs bound start_time):

    http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1 AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND start_time:[1373598465 TO 1374055254]&fq=-start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30

    http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1 AND (idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002) AND time_type:1 AND start_time:[1373598465 TO 1374055254] AND -start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30

    http://slave1:8888/solr-cloud/i_event/select?q=*:*&fl=log_id,start_time,end_time,prov_id,city_id,area_id,idt_id,cnt,net_type&fq=net_type:1&fq=idt_id:12011 OR idt_id:5004 OR idt_id:6051 OR idt_id:6056 OR idt_id:8002&fq=time_type:1&fq=start_time:[1373598465 TO 1374055254]&fq=-start_time:1374055254&sort=log_id asc,start_time asc,idt_id desc&start=0&rows=30

    Query result:

    <response>
        <lst name="responseHeader">
            <int name="status">0</int>
            <int name="QTime">5</int>
        </lst>
        <result name="response" numFound="4" start="0">
            <doc>
                <int name="log_id">6553</int>
                <long name="start_time">1374054184</long>
                <long name="end_time">1374054254</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10307</int>
                <int name="idt_id">12011</int>
                <int name="cnt">0</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6553</int>
                <long name="start_time">1374054184</long>
                <long name="end_time">1374054254</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">10307</int>
                <int name="idt_id">5004</int>
                <int name="cnt">2</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6555</int>
                <long name="start_time">1374055060</long>
                <long name="end_time">1374055158</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">70104</int>
                <int name="idt_id">12011</int>
                <int name="cnt">0</int>
                <int name="net_type">1</int>
            </doc>
            <doc>
                <int name="log_id">6555</int>
                <long name="start_time">1374055060</long>
                <long name="end_time">1374055158</long>
                <int name="prov_id">1</int>
                <int name="city_id">103</int>
                <int name="area_id">70104</int>
                <int name="idt_id">5004</int>
                <int name="cnt">3</int>
                <int name="net_type">1</int>
            </doc>
        </result>
    </response>
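The half-open range used in these URLs (lower bound included, upper bound excluded) can be generated by a small helper. The sketch below is an illustration rather than the article's own code: `half_open_range` is a hypothetical name, and the mixed-bracket form `[low TO high}` assumes Solr 4.0 or later, where the standard query parser accepts one inclusive and one exclusive bound in the same range.

```python
def half_open_range(field, low, high, mixed_brackets=False):
    """Return Solr fq clauses meaning low <= field < high.

    mixed_brackets=False reproduces the article's approach.
    mixed_brackets=True uses a single clause (assumes Solr 4.0+).
    """
    if mixed_brackets:
        # '[' makes the lower bound inclusive, '}' makes the upper exclusive.
        return [f"{field}:[{low} TO {high}}}"]
    # Closed range first, then a negative filter that drops the upper bound.
    return [f"{field}:[{low} TO {high}]", f"-{field}:{high}"]


article_style = half_open_range("start_time", 1373598465, 1374055254)
single_clause = half_open_range("start_time", 1373598465, 1374055254,
                                mixed_brackets=True)

print(article_style)
print(single_clause)
```

Each returned string is meant to be sent as its own fq parameter, as in the first and third URLs above.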
    • Grouped statistics on multiple fields (count only)

    SQL query:

    SELECT city_id, area_id, COUNT(cnt) AS count_cnt
    FROM v_i_event
    WHERE prov_id = 1 AND net_type = 1
    GROUP BY city_id, area_id;

    Query result, as shown in the figure:
    [Figure: multi-field GROUP BY query result]
    Solr query URL:

    http://slave1:8888/solr-cloud/i_event/select?q=*:*&facet=true&facet.pivot=city_id,area_id&fq=prov_id:1 AND net_type:1&rows=0&indent=true

    Query result:

    <response>
        <lst name="responseHeader">
            <int name="status">0</int>
            <int name="QTime">72</int>
        </lst>
        <result name="response" numFound="1171" start="0"></result>
        <lst name="facet_counts">
            <lst name="facet_queries" />
            <lst name="facet_fields" />
            <lst name="facet_dates" />
            <lst name="facet_ranges" />
            <lst name="facet_pivot">
                <arr name="city_id,area_id">
                    <lst>
                        <str name="field">city_id</str>
                        <int name="value">103</int>
                        <int name="count">678</int>
                        <arr name="pivot">
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">10307</int>
                                <int name="count">298</int>
                            </lst>
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">10315</int>
                                <int name="count">120</int>
                            </lst>
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">10317</int>
                                <int name="count">86</int>
                            </lst>
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">10304</int>
                                <int name="count">67</int>
                            </lst>
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">10310</int>
                                <int name="count">49</int>
                            </lst>
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">70104</int>
                                <int name="count">48</int>
                            </lst>
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">10308</int>
                                <int name="count">6</int>
                            </lst>
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">0</int>
                                <int name="count">2</int>
                            </lst>
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">10311</int>
                                <int name="count">2</int>
                            </lst>
                        </arr>
                    </lst>
                    <lst>
                        <str name="field">city_id</str>
                        <int name="value">0</int>
                        <int name="count">463</int>
                        <arr name="pivot">
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">0</int>
                                <int name="count">395</int>
                            </lst>
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">10307</int>
                                <int name="count">68</int>
                            </lst>
                        </arr>
                    </lst>
                    <lst>
                        <str name="field">city_id</str>
                        <int name="value">106</int>
                        <int name="count">10</int>
                        <arr name="pivot">
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">10304</int>
                                <int name="count">10</int>
                            </lst>
                        </arr>
                    </lst>
                    <lst>
                        <str name="field">city_id</str>
                        <int name="value">110</int>
                        <int name="count">8</int>
                        <arr name="pivot">
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">0</int>
                                <int name="count">8</int>
                            </lst>
                        </arr>
                    </lst>
                    <lst>
                        <str name="field">city_id</str>
                        <int name="value">118</int>
                        <int name="count">8</int>
                        <arr name="pivot">
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">10316</int>
                                <int name="count">8</int>
                            </lst>
                        </arr>
                    </lst>
                    <lst>
                        <str name="field">city_id</str>
                        <int name="value">105</int>
                        <int name="count">4</int>
                        <arr name="pivot">
                            <lst>
                                <str name="field">area_id</str>
                                <int name="value">0</int>
                                <int name="count">4</int>
                            </lst>
                        </arr>
                    </lst>
                </arr>
            </lst>
        </lst>
    </response>

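    Comparing the results above: the Solr response is nested, so the groups have to be merged into flat (city_id, area_id, count) rows to obtain the final statistics. Once merged, the result matches the SQL result.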
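The merge step can be sketched in a few lines of Python. The example below parses a shortened copy of the pivot response above (two of the six city groups) and flattens it into rows shaped like the SQL GROUP BY output. The function name `flatten_pivot` is an assumption made for this illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the facet.pivot response above (two city groups only).
PIVOT_XML = """
<response>
  <lst name="facet_counts">
    <lst name="facet_pivot">
      <arr name="city_id,area_id">
        <lst>
          <str name="field">city_id</str>
          <int name="value">106</int>
          <int name="count">10</int>
          <arr name="pivot">
            <lst>
              <str name="field">area_id</str>
              <int name="value">10304</int>
              <int name="count">10</int>
            </lst>
          </arr>
        </lst>
        <lst>
          <str name="field">city_id</str>
          <int name="value">0</int>
          <int name="count">463</int>
          <arr name="pivot">
            <lst>
              <str name="field">area_id</str>
              <int name="value">0</int>
              <int name="count">395</int>
            </lst>
            <lst>
              <str name="field">area_id</str>
              <int name="value">10307</int>
              <int name="count">68</int>
            </lst>
          </arr>
        </lst>
      </arr>
    </lst>
  </lst>
</response>
"""


def flatten_pivot(xml_text, pivot_name):
    """Flatten a two-level pivot into (outer_value, inner_value, count) rows."""
    root = ET.fromstring(xml_text.strip())
    top = root.find(f".//lst[@name='facet_pivot']/arr[@name='{pivot_name}']")
    rows = []
    for outer in top.findall("lst"):
        outer_value = int(outer.find("int[@name='value']").text)
        for inner in outer.findall("arr[@name='pivot']/lst"):
            inner_value = int(inner.find("int[@name='value']").text)
            inner_count = int(inner.find("int[@name='count']").text)
            rows.append((outer_value, inner_value, inner_count))
    return rows


rows = flatten_pivot(PIVOT_XML, "city_id,area_id")
for city_id, area_id, count_cnt in rows:
    print(city_id, area_id, count_cnt)
```

Each printed row corresponds to one (city_id, area_id) group of the SQL query, with the inner pivot count playing the role of count_cnt.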

    • Grouped statistics on multiple fields (supports count, sum, max, min, and other functions)

    Solr handles independent grouped statistics over several fields in a single request well. This is equivalent to executing two SQL statements with GROUP BY clauses, each of which aggregates on one field only.
    SQL query:

    SELECT city_id, COUNT(cnt) AS count_cnt
    FROM v_i_event
    WHERE prov_id = 1 AND net_type = 1
    GROUP BY city_id;

    SELECT area_id, COUNT(cnt) AS count_cnt
    FROM v_i_event
    WHERE prov_id = 1 AND net_type = 1
    GROUP BY area_id;

    The SQL results are not shown again here.
    Solr query URL:

    http://slave1:8888/solr-cloud/i_event/select?q=*:*&stats=true&stats.field=cnt&f.cnt.stats.facet=city_id&f.cnt.stats.facet=area_id&fq=prov_id:1 AND net_type:1&rows=0&indent=true

    Query result:

    <response>
        <lst name="responseHeader">
            <int name="status">0</int>
            <int name="QTime">6</int>
        </lst>
        <result name="response" numFound="1171" start="0"></result>
        <lst name="stats">
            <lst name="stats_fields">
                <lst name="cnt">
                    <double name="min">0.0</double>
                    <double name="max">167.0</double>
                    <long name="count">1171</long>
                    <long name="missing">0</long>
                    <double name="sum">3701.0</double>
                    <double name="sumOfSquares">249641.0</double>
                    <double name="mean">3.1605465414175917</double>
                    <double name="stddev">14.260812879164407</double>
                    <lst name="facets">
                        <lst name="city_id">
                            <lst name="0">
                                <double name="min">0.0</double>
                                <double name="max">167.0</double>
                                <long name="count">463</long>
                                <long name="missing">0</long>
                                <double name="sum">2783.0</double>
                                <double name="sumOfSquares">238819.0</double>
                                <double name="mean">6.010799136069115</double>
                                <double name="stddev">21.92524420257807</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="110">
                                <double name="min">0.0</double>
                                <double name="max">1.0</double>
                                <long name="count">8</long>
                                <long name="missing">0</long>
                                <double name="sum">3.0</double>
                                <double name="sumOfSquares">3.0</double>
                                <double name="mean">0.375</double>
                                <double name="stddev">0.5175491695067657</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="106">
                                <double name="min">0.0</double>
                                <double name="max">0.0</double>
                                <long name="count">10</long>
                                <long name="missing">0</long>
                                <double name="sum">0.0</double>
                                <double name="sumOfSquares">0.0</double>
                                <double name="mean">0.0</double>
                                <double name="stddev">0.0</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="105">
                                <double name="min">0.0</double>
                                <double name="max">0.0</double>
                                <long name="count">4</long>
                                <long name="missing">0</long>
                                <double name="sum">0.0</double>
                                <double name="sumOfSquares">0.0</double>
                                <double name="mean">0.0</double>
                                <double name="stddev">0.0</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="103">
                                <double name="min">0.0</double>
                                <double name="max">55.0</double>
                                <long name="count">678</long>
                                <long name="missing">0</long>
                                <double name="sum">915.0</double>
                                <double name="sumOfSquares">10819.0</double>
                                <double name="mean">1.3495575221238938</double>
                                <double name="stddev">3.7625525739676986</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="118">
                                <double name="min">0.0</double>
                                <double name="max">0.0</double>
                                <long name="count">8</long>
                                <long name="missing">0</long>
                                <double name="sum">0.0</double>
                                <double name="sumOfSquares">0.0</double>
                                <double name="mean">0.0</double>
                                <double name="stddev">0.0</double>
                                <lst name="facets" />
                            </lst>
                        </lst>
                        <lst name="area_id">
                            <lst name="10308">
                                <double name="min">0.0</double>
                                <double name="max">1.0</double>
                                <long name="count">6</long>
                                <long name="missing">0</long>
                                <double name="sum">1.0</double>
                                <double name="sumOfSquares">1.0</double>
                                <double name="mean">0.16666666666666666</double>
                                <double name="stddev">0.408248290463863</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="10310">
                                <double name="min">0.0</double>
                                <double name="max">5.0</double>
                                <long name="count">49</long>
                                <long name="missing">0</long>
                                <double name="sum">40.0</double>
                                <double name="sumOfSquares">108.0</double>
                                <double name="mean">0.8163265306122449</double>
                                <double name="stddev">1.2528878206593208</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="0">
                                <double name="min">0.0</double>
                                <double name="max">167.0</double>
                                <long name="count">409</long>
                                <long name="missing">0</long>
                                <double name="sum">2722.0</double>
                                <double name="sumOfSquares">238550.0</double>
                                <double name="mean">6.6552567237163816</double>
                                <double name="stddev">23.243931908854</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="10311">
                                <double name="min">0.0</double>
                                <double name="max">0.0</double>
                                <long name="count">2</long>
                                <long name="missing">0</long>
                                <double name="sum">0.0</double>
                                <double name="sumOfSquares">0.0</double>
                                <double name="mean">0.0</double>
                                <double name="stddev">0.0</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="10304">
                                <double name="min">0.0</double>
                                <double name="max">55.0</double>
                                <long name="count">77</long>
                                <long name="missing">0</long>
                                <double name="sum">370.0</double>
                                <double name="sumOfSquares">9476.0</double>
                                <double name="mean">4.805194805194805</double>
                                <double name="stddev">10.064318107786017</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="70104">
                                <double name="min">0.0</double>
                                <double name="max">3.0</double>
                                <long name="count">48</long>
                                <long name="missing">0</long>
                                <double name="sum">51.0</double>
                                <double name="sumOfSquares">117.0</double>
                                <double name="mean">1.0625</double>
                                <double name="stddev">1.1560433254047038</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="10307">
                                <double name="min">0.0</double>
                                <double name="max">12.0</double>
                                <long name="count">366</long>
                                <long name="missing">0</long>
                                <double name="sum">274.0</double>
                                <double name="sumOfSquares">768.0</double>
                                <double name="mean">0.7486338797814208</double>
                                <double name="stddev">1.2418218134151426</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="10315">
                                <double name="min">0.0</double>
                                <double name="max">4.0</double>
                                <long name="count">120</long>
                                <long name="missing">0</long>
                                <double name="sum">143.0</double>
                                <double name="sumOfSquares">359.0</double>
                                <double name="mean">1.1916666666666667</double>
                                <double name="stddev">1.2588899560996694</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="10316">
                                <double name="min">0.0</double>
                                <double name="max">0.0</double>
                                <long name="count">8</long>
                                <long name="missing">0</long>
                                <double name="sum">0.0</double>
                                <double name="sumOfSquares">0.0</double>
                                <double name="mean">0.0</double>
                                <double name="stddev">0.0</double>
                                <lst name="facets" />
                            </lst>
                            <lst name="10317">
                                <double name="min">0.0</double>
                                <double name="max">5.0</double>
                                <long name="count">86</long>
                                <long name="missing">0</long>
                                <double name="sum">100.0</double>
                                <double name="sumOfSquares">262.0</double>
                                <double name="mean">1.1627906976744187</double>
                                <double name="stddev">1.3093371930442208</double>
                                <lst name="facets" />
                            </lst>
                        </lst>
                    </lst>
                </lst>
            </lst>
        </lst>
    </response>
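Each per-group block in the response above corresponds to the SQL aggregates: `sum` to SUM, `mean` to AVG, `max` to MAX, `min` to MIN and `count` to COUNT. As a quick sanity check, the sketch below recomputes `mean` and `stddev` for the `10304` group from its `count`, `sum` and `sumOfSquares`. The helper name `mean_and_stddev` is made up for illustration, and the sample (n - 1) form of the standard deviation is an assumption, chosen because it is consistent with the figures in this response.

```python
import math

# Values copied from the <lst name="10304"> block of the response above.
stats_10304 = {"count": 77, "sum": 370.0, "sumOfSquares": 9476.0}


def mean_and_stddev(count, total, sum_of_squares):
    """Recompute the mean and the sample standard deviation from running sums.

    Hypothetical helper: assumes the (n - 1) denominator, which matches
    the numbers in this particular response.
    """
    mean = total / count
    if count < 2:
        return mean, 0.0
    variance = (sum_of_squares - total * total / count) / (count - 1)
    # Guard against tiny negative values caused by floating-point rounding.
    return mean, math.sqrt(max(variance, 0.0))


mean, stddev = mean_and_stddev(
    stats_10304["count"], stats_10304["sum"], stats_10304["sumOfSquares"]
)
print(mean, stddev)
```

The two printed values can be compared against the `mean` and `stddev` entries of the `10304` group in the response.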
    • Grouped statistics over several fields combined (with functions such as count, sum, max and min)

    SQL query:

    SELECT city_id, area_id, SUM(cnt) AS sum_cnt, AVG(cnt) AS avg_cnt, MAX(cnt) AS max_cnt, MIN(cnt) AS min_cnt, COUNT(cnt) AS count_cnt
    FROM v_i_event
    WHERE prov_id = 1 AND net_type = 1
    GROUP BY city_id, area_id;

    Query result, as shown in the figure:
    [Figure: group_join_2]
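    Solr currently has no simple way to support this kind of query. To make such a statistic possible, the schema has to be designed for it: define one field as multi-valued, then group and aggregate over its multiple values. If the application's query and analysis patterns are fairly fixed, so that you know in advance which fields will be grouped together, you can plan for multi-valued fields at schema design time to meet this requirement.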
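The sketch below illustrates the general idea of planning for the grouping at index time. It is not the multi-valued design described above but a simpler, swapped-in variant: a hypothetical single-valued field named `city_area` that stores `city_id` and `area_id` joined into one key, so that one field stands in for the two-column GROUP BY. The field name, the `_` separator and both helper functions are assumptions made for illustration. The base URL is the one used in the earlier queries, and `stats.facet` is the StatsComponent parameter for per-value statistics (its availability depends on the Solr version).

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base select URL taken from the earlier queries in this article.
SOLR_SELECT = "http://slave1:8888/solr-cloud/i_event/select"


def composite_key(city_id, area_id):
    """Build the value of the hypothetical `city_area` field at index time."""
    return f"{city_id}_{area_id}"


def build_grouped_stats_url(base_url=SOLR_SELECT, group_field="city_area"):
    """Build a stats query on `cnt`, broken down by the combined field."""
    params = {
        "q": "*:*",
        "fq": "prov_id:1 AND net_type:1",  # same filter as the SQL WHERE clause
        "rows": 0,                          # only the statistics are needed
        "stats": "true",
        "stats.field": "cnt",
        "stats.facet": group_field,         # assumed combined grouping field
    }
    return f"{base_url}?{urlencode(params)}"


url = build_grouped_stats_url()
print(composite_key(103, 10304))
print(url)
```

Each key in the resulting statistics would then represent one (city_id, area_id) pair, and the client can split it on the separator to recover the two columns of the SQL result.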

  • Original URL: https://www.cnblogs.com/davidwang456/p/4818749.html