zoukankan      html  css  js  c++  java
  • spark统计

    http://www.myexception.cn/sql/2004512.html

    http://blog.csdn.net/ssw_1990/article/details/52220466

    http://www.tuicool.com/articles/uIRZFv

    http://mt.sohu.com/20160514/n449468405.shtml

    http://blog.csdn.net/asongoficeandfire/article/details/21490101

    http://confluence.jetbrains.com/display/IntelliJIDEA/Working+with+Scala+Console

    http://my.oschina.net/jamesju/blog/83659

     经过资料查询,做几个实验。

    基本与sql的实现方式一致,方便理解。

    第一步 实现分析 
    所有订单中每年最畅销货品: 
    1、求出每年每个货品的销售金额

    scala>select c.theyear,b.itemid,sum(b.amount) as sumofamount from tbStock a join tbStockDetail b on a.ordernumber=b.ordernumber join tbDate  c on a.dateid=c.dateid group by c.theyear,b.itemid

    2、求出每年单品销售的最大金额

    scala>select d.theyear,max(d.sumofamount) as maxofamount from (select c.theyear,b.itemid,sum(b.amount) as sumofamount from tbStock a join tbStockDetail b on a.ordernumber=b.ordernumber join tbDate  c on a.dateid=c.dateid group by c.theyear,b.itemid) d group by d.theyear

    3、求出每年与销售额最大相符的货品就是最畅销货品

    scala>select distinct  e.theyear,e.itemid,f.maxofamount from (select c.theyear,b.itemid,sum(b.amount) as sumofamount from tbStock a join tbStockDetail b on a.ordernumber=b.ordernumber join tbDate  c on a.dateid=c.dateid group by c.theyear,b.itemid) e join (select d.theyear,max(d.sumofamount) as maxofamount from (select c.theyear,b.itemid,sum(b.amount) as sumofamount from tbStock a join tbStockDetail b on a.ordernumber=b.ordernumber join tbDate  c on a.dateid=c.dateid group by c.theyear,b.itemid) d group by d.theyear) f on (e.theyear=f.theyear and e.sumofamount=f.maxofamount) order by e.theyear

    第二步 实现SQL语句

    scala>hiveContext.sql("select distinct  e.theyear,e.itemid,f.maxofamount from (select c.theyear,b.itemid,sum(b.amount) as sumofamount from tbStock a join tbStockDetail b on a.ordernumber=b.ordernumber join tbDate  c on a.dateid=c.dateid group by c.theyear,b.itemid) e join (select d.theyear,max(d.sumofamount) as maxofamount from (select c.theyear,b.itemid,sum(b.amount) as sumofamount from tbStock a join tbStockDetail b on a.ordernumber=b.ordernumber join tbDate  c on a.dateid=c.dateid group by c.theyear,b.itemid) d group by d.theyear) f on (e.theyear=f.theyear and e.sumofamount=f.maxofamount) order by e.theyear").collect().foreach(println)
  • 相关阅读:
    [转]移动端实现垂直居中的几种方法
    MySQL常见的两种存储引擎:MyISAM与InnoDB的爱恨情仇
    关于分布式计算的一些概念
    一份最中肯的Java学习路线+资源分享(拒绝傻逼式分享)
    Java多线程学习(八)线程池与Executor 框架
    深入理解工厂模式
    深入理解单例模式
    Java NIO 之 Buffer(缓冲区)
    Java NIO 概览
    分布式系统的经典基础理论
  • 原文地址:https://www.cnblogs.com/wcLT/p/5826329.html
Copyright © 2011-2022 走看看