
    3. Spark MLlib Deep Learning: Convolutional Neural Network 3.2

    http://blog.csdn.net/sunbow0

    Chapter 3: Convolutional Neural Network

    2 Fundamentals and Source Code Analysis

    2.1 Convolutional Neural Network Fundamentals

    1) Background:

    For the basics, search Google or Baidu. There is plenty of introductory material and a quick read is enough, but much of it does not explain the details clearly.

    For a treatment that does spell out the details, refer to the two articles below; the prerequisite is that you already have a basic understanding.

    2) Key references:

    http://www.cnblogs.com/fengfenggirl/p/cnn_implement.html

    http://www.cnblogs.com/tornadomeet/archive/2013/05/05/3061457.html

    2.2 Deep Learning CNN Source Code Analysis

    2.2.1 CNN Code Structure

    The CNN source code consists mainly of two classes, CNN and CNNModel. The code structure is as follows:


    CNN structure: (class diagram from the original post, not reproduced here)

    CNNModel structure: (class diagram from the original post, not reproduced here)
    2.2.2 CNN Training Process

    (The training-process flowchart from the original post is not reproduced here.)


    2.2.3 CNN Analysis

    (1) CNNLayers

    /**
     * types: layer type ("c" = convolution, "s" = subsampling/pooling)
     * outputmaps: number of output feature maps
     * kernelsize: convolution kernel size
     * k: convolution kernels
     * b: biases
     * dk: partial derivatives of the kernels
     * db: partial derivatives of the biases
     * scale: pooling size
     */
    case class CNNLayers(
      types: String,
      outputmaps: Double,
      kernelsize: Double,
      scale: Double,
      k: Array[Array[BDM[Double]]],
      b: Array[Double],
      dk: Array[Array[BDM[Double]]],
      db: Array[Double]) extends Serializable

    CNNLayers: a user-defined data type that stores the parameter information of each network layer.
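    A minimal construction sketch (illustrative values only, not from the original post; BDM is Breeze's DenseMatrix as in the original code): a pooling ("s") layer entry for two input maps with scale 2 carries dummy 1×1 kernels and zero biases, much like CnnSetup below builds it.

      // hypothetical example of a pooling-layer entry
      val poolLayer = CNNLayers(
        types = "s", outputmaps = 0.0, kernelsize = 0.0, scale = 2.0,
        k = Array(Array(BDM.zeros[Double](1, 1))),   // placeholder kernels (unused by "s" layers)
        b = Array.fill(2)(0.0),                      // one zero bias per input map
        dk = Array(Array(BDM.zeros[Double](1, 1))),
        db = Array.fill(2)(0.0))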

    (2) CnnSetup

    Initializes the convolutional neural network parameters, building the CNN layer by layer from the configuration.

    /** Initialize the parameters of each CNN layer. */
    def CnnSetup: (Array[CNNLayers], BDM[Double], BDM[Double], Double) = {
      var inputmaps1 = 1.0
      var mapsize1 = mapsize
      var confinit = ArrayBuffer[CNNLayers]()
      for (l <- 0 to layer - 1) { // for each layer
        val type1 = types(l)
        val outputmap1 = outputmaps(l)
        val kernelsize1 = kernelsize(l)
        val scale1 = scale(l)
        val layersconf = if (type1 == "s") { // per-layer parameter initialization
          mapsize1 = mapsize1 / scale1
          val b1 = Array.fill(inputmaps1.toInt)(0.0)
          val ki = Array(Array(BDM.zeros[Double](1, 1)))
          new CNNLayers(type1, outputmap1, kernelsize1, scale1, ki, b1, ki, b1)
        } else if (type1 == "c") {
          mapsize1 = mapsize1 - kernelsize1 + 1.0
          val fan_out = outputmap1 * math.pow(kernelsize1, 2)
          val fan_in = inputmaps1 * math.pow(kernelsize1, 2)
          val ki = ArrayBuffer[Array[BDM[Double]]]()
          for (i <- 0 to inputmaps1.toInt - 1) { // input map
            val kj = ArrayBuffer[BDM[Double]]()
            for (j <- 0 to outputmap1.toInt - 1) { // output map
              val kk = (BDM.rand[Double](kernelsize1.toInt, kernelsize1.toInt) - 0.5) * 2.0 * sqrt(6.0 / (fan_in + fan_out))
              kj += kk
            }
            ki += kj.toArray
          }
          val b1 = Array.fill(outputmap1.toInt)(0.0)
          inputmaps1 = outputmap1
          new CNNLayers(type1, outputmap1, kernelsize1, scale1, ki.toArray, b1, ki.toArray, b1)
        } else {
          val ki = Array(Array(BDM.zeros[Double](1, 1)))
          val b1 = Array(0.0)
          new CNNLayers(type1, outputmap1, kernelsize1, scale1, ki, b1, ki, b1)
        }
        confinit += layersconf
      }
      val fvnum = mapsize1(0, 0) * mapsize1(0, 1) * inputmaps1
      val ffb = BDM.zeros[Double](onum, 1)
      val ffW = (BDM.rand[Double](onum, fvnum.toInt) - 0.5) * 2.0 * sqrt(6.0 / (onum + fvnum))
      (confinit.toArray, ffb, ffW, alpha)
    }
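    As an illustration of the size bookkeeping above (a hypothetical LeNet-style configuration, not taken from the original post): a convolution layer shrinks each map by kernelsize − 1, a pooling layer divides it by scale, and fvnum is the length of the flattened feature vector fed to the output layer.

      // hypothetical size trace: input 28x28 -> conv 5x5 (12 maps) -> pool 2 -> conv 5x5 -> pool 2
      val convOut1 = 28 - 5 + 1        // 24
      val poolOut1 = convOut1 / 2      // 12
      val convOut2 = poolOut1 - 5 + 1  // 8
      val poolOut2 = convOut2 / 2      // 4
      val fvnumExample = poolOut2 * poolOut2 * 12 // 192, so ffW would be onum x 192 and ffb onum x 1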

    (3) expand

    The Kronecker-product expansion method.

     /**
      * Kronecker-product expansion: each element of a is replicated into an
      * s(0) x s(1) block (equivalent to kron(a, ones(s(0), s(1)))).
      */
     def expand(a: BDM[Double], s: Array[Int]): BDM[Double] = {
       // e.g. val a = BDM((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
       //      val s = Array(3, 2)
       val sa = Array(a.rows, a.cols)
       var tt = new Array[Array[Int]](sa.length)
       for (ii <- sa.length - 1 to 0 by -1) {
         var h = BDV.zeros[Int](sa(ii) * s(ii))
         h(0 to sa(ii) * s(ii) - 1 by s(ii)) := 1
         tt(ii) = Accumulate(h).data
       }
       var b = BDM.zeros[Double](tt(0).length, tt(1).length)
       for (j1 <- 0 to b.rows - 1) {
         for (j2 <- 0 to b.cols - 1) {
           b(j1, j2) = a(tt(0)(j1) - 1, tt(1)(j2) - 1)
         }
       }
       b
     }
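    A hypothetical usage sketch (assuming expand and Breeze's DenseMatrix, aliased BDM as in the original code, are in scope): each element of a is replicated into a 2 × 2 block.

      val a = BDM((1.0, 2.0), (3.0, 4.0))
      val e = expand(a, Array(2, 2))
      // e:
      // 1.0  1.0  2.0  2.0
      // 1.0  1.0  2.0  2.0
      // 3.0  3.0  4.0  4.0
      // 3.0  3.0  4.0  4.0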

    (4) convn

    The convolution computation method.

     /**
      * convn: 2-D convolution with "valid" or "full" shape.
      */
     def convn(m0: BDM[Double], k0: BDM[Double], shape: String): BDM[Double] = {
       // e.g. val m0 = BDM((1.0, 1.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0), (0.0, 1.0, 1.0, 0.0), (0.0, 1.0, 1.0, 0.0))
       //      val k0 = BDM((1.0, 1.0), (0.0, 1.0))
       // e.g. val m0 = BDM((1.0, 1.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0))
       //      val k0 = BDM((1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0))
       val out1 = shape match {
         case "valid" =>
           val m1 = m0
           val k1 = k0.t // the "valid" branch correlates with the transposed kernel
           val row1 = m1.rows - k1.rows + 1
           val col1 = m1.cols - k1.cols + 1
           var m2 = BDM.zeros[Double](row1, col1)
           for (i <- 0 to row1 - 1) {
             for (j <- 0 to col1 - 1) {
               val r1 = i
               val r2 = r1 + k1.rows - 1
               val c1 = j
               val c2 = c1 + k1.cols - 1
               val mi = m1(r1 to r2, c1 to c2)
               m2(i, j) = (mi :* k1).sum
             }
           }
           m2
         case "full" =>
           // zero-pad m0 by (kernel size - 1) on every side
           var m1 = BDM.zeros[Double](m0.rows + 2 * (k0.rows - 1), m0.cols + 2 * (k0.cols - 1))
           for (i <- 0 to m0.rows - 1) {
             for (j <- 0 to m0.cols - 1) {
               m1((k0.rows - 1) + i, (k0.cols - 1) + j) = m0(i, j)
             }
           }
           // rotate the kernel by 180 degrees
           val k1 = Rot90(Rot90(k0))
           val row1 = m1.rows - k1.rows + 1
           val col1 = m1.cols - k1.cols + 1
           var m2 = BDM.zeros[Double](row1, col1)
           for (i <- 0 to row1 - 1) {
             for (j <- 0 to col1 - 1) {
               val r1 = i
               val r2 = r1 + k1.rows - 1
               val c1 = j
               val c2 = c1 + k1.cols - 1
               val mi = m1(r1 to r2, c1 to c2)
               m2(i, j) = (mi :* k1).sum
             }
           }
           m2
       }
       out1
     }
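    A hypothetical usage sketch (a symmetric kernel is chosen so the transpose taken in the "valid" branch has no effect; convn and BDM are assumed to be in scope as above):

      val m = BDM((1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0))
      val k = BDM((1.0, 0.0), (0.0, 1.0))
      val valid = convn(m, k, "valid") // 2 x 2 result: ((6, 8), (12, 14))
      val full  = convn(m, k, "full")  // 4 x 4 result: output size is (3 + 2 - 1) x (3 + 2 - 1)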

    (5) CNNtrain

    Trains the neural network.

    Input parameters: train_d, the training data RDD; opts, the training options.

    Output: CNNModel, the trained model.
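    A hypothetical call sketch (the variable names and values are illustrative assumptions; cnn stands for a configured CNN instance whose configuration API is not shown in this excerpt). The opts array packs batchsize, numepochs and the validation fraction in that order, exactly as CNNtrain reads them below.

      // train_d: RDD[(BDM[Double], BDM[Double])] of (label matrix, feature matrix) pairs
      val opts = Array(100.0, 5.0, 0.2) // batchsize = 100, numepochs = 5, 20% held out for validation
      val cnnModel = cnn.CNNtrain(train_d, opts)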

    /**
     * Runs the convolutional neural network training algorithm.
     */
    def CNNtrain(train_d: RDD[(BDM[Double], BDM[Double])], opts: Array[Double]): CNNModel = {
      val sc = train_d.sparkContext
      var initStartTime = System.currentTimeMillis()
      var initEndTime = System.currentTimeMillis()
      // parameter initialization
      var (cnn_layers, cnn_ffb, cnn_ffW, cnn_alpha) = CnnSetup
      // split the samples into training data and cross-validation data
      val validation = opts(2)
      val splitW1 = Array(1.0 - validation, validation)
      val train_split1 = train_d.randomSplit(splitW1, System.nanoTime())
      val train_t = train_split1(0)
      val train_v = train_split1(1)
      // m: number of training samples
      val m = train_t.count
      // number of batches
      val batchsize = opts(0).toInt
      val numepochs = opts(1).toInt
      val numbatches = (m / batchsize).toInt
      var rL = Array.fill(numepochs * numbatches.toInt)(0.0)
      var n = 0
      // numepochs is the number of training epochs
      for (i <- 1 to numepochs) {
        initStartTime = System.currentTimeMillis()
        val splitW2 = Array.fill(numbatches)(1.0 / numbatches)
        // randomly split the samples for each batch according to the split weights
        for (l <- 1 to numbatches) {
          // broadcast the current weights
          val bc_cnn_layers = sc.broadcast(cnn_layers)
          val bc_cnn_ffb = sc.broadcast(cnn_ffb)
          val bc_cnn_ffW = sc.broadcast(cnn_ffW)

          // sample split for this batch
          val train_split2 = train_t.randomSplit(splitW2, System.nanoTime())
          val batch_xy1 = train_split2(l - 1)

          // CNNff: forward propagation
          // net = cnnff(net, batch_x);
          val train_cnnff = CNN.CNNff(batch_xy1, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)

          // CNNbp: backward propagation
          // net = cnnbp(net, batch_y);
          val train_cnnbp = CNN.CNNbp(train_cnnff, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)

          // weight update
          // net = cnnapplygrads(net, opts);
          val train_nnapplygrads = CNN.CNNapplygrads(train_cnnbp, bc_cnn_ffb, bc_cnn_ffW, cnn_alpha)
          cnn_ffW = train_nnapplygrads._1
          cnn_ffb = train_nnapplygrads._2
          cnn_layers = train_nnapplygrads._3

          // error and loss
          // output error
          // net.L = 1/2 * sum(net.e(:) .^ 2) / size(net.e, 2);
          val rdd_loss1 = train_cnnbp._1.map(f => f._5)
          val (loss2, counte) = rdd_loss1.treeAggregate((0.0, 0L))(
            seqOp = (c, v) => {
              // c: (e, count), v: (m)
              val e1 = c._1
              val e2 = (v :* v).sum
              val esum = e1 + e2
              (esum, c._2 + 1)
            },
            combOp = (c1, c2) => {
              // c: (e, count)
              val e1 = c1._1
              val e2 = c2._1
              val esum = e1 + e2
              (esum, c1._2 + c2._2)
            })
          val Loss = (loss2 / counte.toDouble) * 0.5
          if (n == 0) {
            rL(n) = Loss
          } else {
            rL(n) = 0.99 * rL(n - 1) + 0.01 * Loss // running average of the batch loss
          }
          n = n + 1
        }
        initEndTime = System.currentTimeMillis()
        // print progress
        printf("epoch: numepochs = %d , Took = %d seconds; batch train mse = %f.\n", i, scala.math.ceil((initEndTime - initStartTime).toDouble / 1000).toLong, rL(n - 1))
      }
      // training error and cross-validation error
      // Full-batch train mse
      var loss_train_e = 0.0
      var loss_val_e = 0.0
      loss_train_e = CNN.CNNeval(train_t, sc.broadcast(cnn_layers), sc.broadcast(cnn_ffb), sc.broadcast(cnn_ffW))
      if (validation > 0) loss_val_e = CNN.CNNeval(train_v, sc.broadcast(cnn_layers), sc.broadcast(cnn_ffb), sc.broadcast(cnn_ffW))
      printf("epoch: Full-batch train mse = %f, val mse = %f.\n", loss_train_e, loss_val_e)
      new CNNModel(cnn_layers, cnn_ffW, cnn_ffb)
    }

    (6) CNNff

    Forward propagation: computes the output of every layer, from the input layer through the hidden layers to the output layer, i.e. the output value of every node.

    Input parameters:

    batch_xy1: the sample data

    bc_cnn_layers: the parameters of each layer

    bc_cnn_ffb: the bias parameters

    bc_cnn_ffW: the weight parameters

    Output:

    the computation results of every layer.
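    Restating the formulas from the MATLAB-style comments in the code below (sigm is the logistic sigmoid, * denotes a "valid" convolution, s the pooling scale):

      $$a^{(l)}_j = \mathrm{sigm}\Big(\sum_{i} a^{(l-1)}_i * k^{(l)}_{ij} + b^{(l)}_j\Big) \qquad \text{(convolution layer)}$$
      $$a^{(l)}_j = \mathrm{downsample}_s\Big(a^{(l-1)}_j * \tfrac{1}{s^2}\mathbf{1}_{s \times s}\Big) \qquad \text{(pooling layer: mean filter, then keep every } s\text{-th row and column)}$$
      $$o = \mathrm{sigm}(W_{ff}\, f_v + b_{ff}) \qquad (f_v = \text{flattened feature maps of the last layer})$$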

     

     /**
      * CNNff: forward propagation.
      * Computes the output value of every node in the network.
      */
     def CNNff(
       batch_xy1: RDD[(BDM[Double], BDM[Double])],
       bc_cnn_layers: org.apache.spark.broadcast.Broadcast[Array[CNNLayers]],
       bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
       bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]]): RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double])] = {
       // 1: a(1) = [x]
       val train_data1 = batch_xy1.map { f =>
         val label = f._1
         val features = f._2
         val nna1 = Array(features)
         val nna = ArrayBuffer[Array[BDM[Double]]]()
         nna += nna1
         (label, nna)
       }
       // layers 2 .. n
       val train_data2 = train_data1.map { f =>
         val label = f._1
         val nn_a = f._2
         var inputmaps1 = 1.0
         val n = bc_cnn_layers.value.length
         // for each layer
         for (l <- 1 to n - 1) {
           val type1 = bc_cnn_layers.value(l).types
           val outputmap1 = bc_cnn_layers.value(l).outputmaps
           val kernelsize1 = bc_cnn_layers.value(l).kernelsize
           val scale1 = bc_cnn_layers.value(l).scale
           val k1 = bc_cnn_layers.value(l).k
           val b1 = bc_cnn_layers.value(l).b
           val nna1 = ArrayBuffer[BDM[Double]]()
           if (type1 == "c") {
             for (j <- 0 to outputmap1.toInt - 1) { // output map
               // create temp output map
               var z = BDM.zeros[Double](nn_a(l - 1)(0).rows - kernelsize1.toInt + 1, nn_a(l - 1)(0).cols - kernelsize1.toInt + 1)
               for (i <- 0 to inputmaps1.toInt - 1) { // input map
                 // convolve with corresponding kernel and add to temp output map
                 // z = z + convn(net.layers{l - 1}.a{i}, net.layers{l}.k{i}{j}, 'valid');
                 z = z + convn(nn_a(l - 1)(i), k1(i)(j), "valid")
               }
               // add bias, pass through nonlinearity
               // net.layers{l}.a{j} = sigm(z + net.layers{l}.b{j})
               val nna0 = sigm(z + b1(j))
               nna1 += nna0
             }
             nn_a += nna1.toArray
             inputmaps1 = outputmap1
           } else if (type1 == "s") {
             for (j <- 0 to inputmaps1.toInt - 1) {
               // z = convn(net.layers{l - 1}.a{j}, ones(net.layers{l}.scale) / (net.layers{l}.scale ^ 2), 'valid');
               // net.layers{l}.a{j} = z(1 : net.layers{l}.scale : end, 1 : net.layers{l}.scale : end, :);
               val z = convn(nn_a(l - 1)(j), BDM.ones[Double](scale1.toInt, scale1.toInt) / (scale1 * scale1), "valid")
               val zs1 = z(::, 0 to -1 by scale1.toInt).t + 0.0
               val zs2 = zs1(::, 0 to -1 by scale1.toInt).t + 0.0
               val nna0 = zs2
               nna1 += nna0
             }
             nn_a += nna1.toArray
           }
         }
         // concatenate all end-layer feature maps into a vector
         val nn_fv1 = ArrayBuffer[Double]()
         for (j <- 0 to nn_a(n - 1).length - 1) {
           nn_fv1 ++= nn_a(n - 1)(j).data
         }
         val nn_fv = new BDM[Double](nn_fv1.length, 1, nn_fv1.toArray)
         // feedforward into output perceptrons
         // net.o = sigm(net.ffW * net.fv + repmat(net.ffb, 1, size(net.fv, 2)));
         val nn_o = sigm(bc_cnn_ffW.value * nn_fv + bc_cnn_ffb.value)
         (label, nn_a.toArray, nn_fv, nn_o)
       }
       train_data2
     }

    (7) CNNbp

    Backward propagation: computes the derivatives layer by layer, from the output layer back through the hidden layers to the input layer, i.e. the partial derivative at every node (error back-propagation).

    Input parameters:

    train_cnnff: the forward-propagation results

    bc_cnn_layers: the parameters of each layer

    bc_cnn_ffb: the bias parameters

    bc_cnn_ffW: the weight parameters

    Output:

    the partial derivatives of every layer.
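    Restating the delta formulas from the MATLAB-style comments in the code below (⊙ is the element-wise product):

      $$e = o - y, \qquad \delta_o = e \odot o \odot (1 - o), \qquad \delta_{fv} = W_{ff}^{\top}\,\delta_o$$

    with an extra factor $f_v \odot (1 - f_v)$ applied to $\delta_{fv}$ when the last layer is a convolution ("c") layer.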

     /**
      * CNNbp: backward propagation.
      * Computes the average partial derivatives of the weights.
      */
     def CNNbp(
       train_cnnff: RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double])],
       bc_cnn_layers: org.apache.spark.broadcast.Broadcast[Array[CNNLayers]],
       bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
       bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]]): (RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double], BDM[Double], BDM[Double], BDM[Double], Array[Array[BDM[Double]]])], BDM[Double], BDM[Double], Array[CNNLayers]) = {
       // error: net.e = net.o - y
       val n = bc_cnn_layers.value.length
       val train_data3 = train_cnnff.map { f =>
         val nn_e = f._4 - f._1
         (f._1, f._2, f._3, f._4, nn_e)
       }
       // backprop deltas
       // sensitivity (residual) of the output layer
       // net.od = net.e .* (net.o .* (1 - net.o))
       // net.fvd = (net.ffW' * net.od)
       val train_data4 = train_data3.map { f =>
         val nn_e = f._5
         val nn_o = f._4
         val nn_fv = f._3
         val nn_od = nn_e :* (nn_o :* (1.0 - nn_o))
         val nn_fvd = if (bc_cnn_layers.value(n - 1).types == "c") {
           // net.fvd = net.fvd .* (net.fv .* (1 - net.fv));
           val nn_fvd1 = bc_cnn_ffW.value.t * nn_od
           val nn_fvd2 = nn_fvd1 :* (nn_fv :* (1.0 - nn_fv))
           nn_fvd2
         } else {
           val nn_fvd1 = bc_cnn_ffW.value.t * nn_od
           nn_fvd1
         }
         (f._1, f._2, f._3, f._4, f._5, nn_od, nn_fvd)
       }
       // reshape feature vector deltas into output map style
       val sa1 = train_data4.map(f => f._2(n - 1)(1)).take(1)(0).rows
       val sa2 = train_data4.map(f => f._2(n - 1)(1)).take(1)(0).cols
       val sa3 = 1
       val fvnum = sa1 * sa2

       val train_data5 = train_data4.map { f =>
         val nn_a = f._2
         val nn_fvd = f._7
         val nn_od = f._6
         val nn_fv = f._3
         var nnd = new Array[Array[BDM[Double]]](n)
         val nnd1 = ArrayBuffer[BDM[Double]]()
         for (j <- 0 to nn_a(n - 1).length - 1) {
           val tmp1 = nn_fvd((j * fvnum) to ((j + 1) * fvnum - 1), 0)
           val tmp2 = new BDM(sa1, sa2, tmp1.data)
           nnd1 += tmp2
         }
         nnd(n - 1) = nnd1.toArray
         for (l <- (n - 2) to 0 by -1) {
           val type1 = bc_cnn_layers.value(l).types
           var nnd2 = ArrayBuffer[BDM[Double]]()
           if (type1 == "c") {
             for (j <- 0 to nn_a(l).length - 1) {
               val tmp_a = nn_a(l)(j)
               val tmp_d = nnd(l + 1)(j)
               val tmp_scale = bc_cnn_layers.value(l + 1).scale.toInt
               val tmp1 = tmp_a :* (1.0 - tmp_a)
               val tmp2 = expand(tmp_d, Array(tmp_scale, tmp_scale)) / (tmp_scale.toDouble * tmp_scale)
               nnd2 += (tmp1 :* tmp2)
             }
           } else if (type1 == "s") {
             for (i <- 0 to nn_a(l).length - 1) {
               var z = BDM.zeros[Double](nn_a(l)(0).rows, nn_a(l)(0).cols)
               for (j <- 0 to nn_a(l + 1).length - 1) {
                 // z = z + convn(net.layers{l + 1}.d{j}, rot180(net.layers{l + 1}.k{i}{j}), 'full');
                 z = z + convn(nnd(l + 1)(j), Rot90(Rot90(bc_cnn_layers.value(l + 1).k(i)(j))), "full")
               }
               nnd2 += z
             }
           }
           nnd(l) = nnd2.toArray
         }
         (f._1, f._2, f._3, f._4, f._5, f._6, f._7, nnd)
       }
       // dk, db: gradient calculation
       var cnn_layers = bc_cnn_layers.value
       for (l <- 1 to n - 1) {
         val type1 = bc_cnn_layers.value(l).types
         val lena1 = train_data5.map(f => f._2(l).length).take(1)(0)
         val lena2 = train_data5.map(f => f._2(l - 1).length).take(1)(0)
         if (type1 == "c") {
           for (j <- 0 to lena1 - 1) {
             for (i <- 0 to lena2 - 1) {
               val rdd_dk_ij = train_data5.map { f =>
                 val nn_a = f._2
                 val nn_d = f._8
                 val tmp_d = nn_d(l)(j)
                 val tmp_a = nn_a(l - 1)(i)
                 convn(Rot90(Rot90(tmp_a)), tmp_d, "valid")
               }
               val initdk = BDM.zeros[Double](rdd_dk_ij.take(1)(0).rows, rdd_dk_ij.take(1)(0).cols)
               val (dk_ij, count_dk) = rdd_dk_ij.treeAggregate((initdk, 0L))(
                 seqOp = (c, v) => {
                   // c: (m, count), v: (m)
                   val m1 = c._1
                   val m2 = m1 + v
                   (m2, c._2 + 1)
                 },
                 combOp = (c1, c2) => {
                   // c: (m, count)
                   val m1 = c1._1
                   val m2 = c2._1
                   val m3 = m1 + m2
                   (m3, c1._2 + c2._2)
                 })
               val dk = dk_ij / count_dk.toDouble
               cnn_layers(l).dk(i)(j) = dk
             }
             val rdd_db_j = train_data5.map { f =>
               val nn_d = f._8
               val tmp_d = nn_d(l)(j)
               Bsum(tmp_d)
             }
             val db_j = rdd_db_j.reduce(_ + _)
             val count_db = rdd_db_j.count
             val db = db_j / count_db.toDouble
             cnn_layers(l).db(j) = db
           }
         }
       }
       // net.dffW = net.od * (net.fv)' / size(net.od, 2);
       // net.dffb = mean(net.od, 2);
       val train_data6 = train_data5.map { f =>
         val nn_od = f._6
         val nn_fv = f._3
         nn_od * nn_fv.t
       }
       val train_data7 = train_data5.map { f =>
         val nn_od = f._6
         nn_od
       }
       val initffW = BDM.zeros[Double](bc_cnn_ffW.value.rows, bc_cnn_ffW.value.cols)
       val (ffw2, countfffw2) = train_data6.treeAggregate((initffW, 0L))(
         seqOp = (c, v) => {
           // c: (m, count), v: (m)
           val m1 = c._1
           val m2 = m1 + v
           (m2, c._2 + 1)
         },
         combOp = (c1, c2) => {
           // c: (m, count)
           val m1 = c1._1
           val m2 = c2._1
           val m3 = m1 + m2
           (m3, c1._2 + c2._2)
         })
       val cnn_dffw = ffw2 / countfffw2.toDouble
       val initffb = BDM.zeros[Double](bc_cnn_ffb.value.rows, bc_cnn_ffb.value.cols)
       val (ffb2, countfffb2) = train_data7.treeAggregate((initffb, 0L))(
         seqOp = (c, v) => {
           // c: (m, count), v: (m)
           val m1 = c._1
           val m2 = m1 + v
           (m2, c._2 + 1)
         },
         combOp = (c1, c2) => {
           // c: (m, count)
           val m1 = c1._1
           val m2 = c2._1
           val m3 = m1 + m2
           (m3, c1._2 + c2._2)
         })
       val cnn_dffb = ffb2 / countfffb2.toDouble
       (train_data5, cnn_dffw, cnn_dffb, cnn_layers)
     }

    (8) CNNapplygrads

    Weight update.

    Input parameters:

    train_cnnbp: the output of CNNbp

    bc_cnn_ffb: the network bias parameters

    bc_cnn_ffW: the network weight parameters

    alpha: the learning rate for the update

    Output: (cnn_ffW, cnn_ffb, cnn_layers), the updated weight parameters.
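    A minimal call sketch, mirroring the invocation inside CNNtrain above (cnn_ffW, cnn_ffb and cnn_layers are the mutable vars, and train_cnnbp and the broadcasts are the values, defined there):

      val train_nnapplygrads = CNN.CNNapplygrads(train_cnnbp, bc_cnn_ffb, bc_cnn_ffW, cnn_alpha)
      cnn_ffW = train_nnapplygrads._1     // updated output-layer weights
      cnn_ffb = train_nnapplygrads._2     // updated output-layer biases
      cnn_layers = train_nnapplygrads._3  // layers with updated kernels and biases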

    /**
     * CNNapplygrads: weight update.
     */
     def CNNapplygrads(
       train_cnnbp: (RDD[(BDM[Double], Array[Array[BDM[Double]]], BDM[Double], BDM[Double], BDM[Double], BDM[Double], BDM[Double], Array[Array[BDM[Double]]])], BDM[Double], BDM[Double], Array[CNNLayers]),
       bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
       bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]],
       alpha: Double): (BDM[Double], BDM[Double], Array[CNNLayers]) = {
       val train_data5 = train_cnnbp._1
       val cnn_dffw = train_cnnbp._2
       val cnn_dffb = train_cnnbp._3
       var cnn_layers = train_cnnbp._4
       var cnn_ffb = bc_cnn_ffb.value
       var cnn_ffW = bc_cnn_ffW.value
       val n = cnn_layers.length

       for (l <- 1 to n - 1) {
         val type1 = cnn_layers(l).types
         val lena1 = train_data5.map(f => f._2(l).length).take(1)(0)
         val lena2 = train_data5.map(f => f._2(l - 1).length).take(1)(0)
         if (type1 == "c") {
           for (j <- 0 to lena1 - 1) {
             for (ii <- 0 to lena2 - 1) {
               cnn_layers(l).k(ii)(j) = cnn_layers(l).k(ii)(j) - cnn_layers(l).dk(ii)(j)
             }
             cnn_layers(l).b(j) = cnn_layers(l).b(j) - cnn_layers(l).db(j)
           }
         }
       }

       cnn_ffW = cnn_ffW + cnn_dffw
       cnn_ffb = cnn_ffb + cnn_dffb
       (cnn_ffW, cnn_ffb, cnn_layers)
     }

    (9) CNNeval

    Error computation.
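    The mean squared error computed below, restated from the MATLAB-style comment in CNNtrain (m is the number of samples, o_i the network output and y_i the label of sample i):

      $$L = \frac{1}{2m}\sum_{i=1}^{m}\lVert o_i - y_i\rVert^2$$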

     /**
      * CNNeval: forward propagation plus output-error computation.
      * Computes the output value of every node and the average error.
      */
     def CNNeval(
       batch_xy1: RDD[(BDM[Double], BDM[Double])],
       bc_cnn_layers: org.apache.spark.broadcast.Broadcast[Array[CNNLayers]],
       bc_cnn_ffb: org.apache.spark.broadcast.Broadcast[BDM[Double]],
       bc_cnn_ffW: org.apache.spark.broadcast.Broadcast[BDM[Double]]): Double = {
       // CNNff: forward propagation
       val train_cnnff = CNN.CNNff(batch_xy1, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)
       // error and loss
       // output error
       val rdd_loss1 = train_cnnff.map { f =>
         val nn_e = f._4 - f._1
         nn_e
       }
       val (loss2, counte) = rdd_loss1.treeAggregate((0.0, 0L))(
         seqOp = (c, v) => {
           // c: (e, count), v: (m)
           val e1 = c._1
           val e2 = (v :* v).sum
           val esum = e1 + e2
           (esum, c._2 + 1)
         },
         combOp = (c1, c2) => {
           // c: (e, count)
           val e1 = c1._1
           val e2 = c2._1
           val esum = e1 + e2
           (esum, c1._2 + c2._2)
         })
       val Loss = (loss2 / counte.toDouble) * 0.5
       Loss
     }

    2.2.4 CNNModel Analysis

    (1) CNNModel

    CNNModel: stores the CNN network parameters: cnn_layers, the configuration parameters of each layer; cnn_ffW, the output-layer weights; and cnn_ffb, the output-layer biases.

    class CNNModel(
      val cnn_layers: Array[CNNLayers],
      val cnn_ffW: BDM[Double],
      val cnn_ffb: BDM[Double]) extends Serializable {
    }

    (2) predict

    predict: computes predictions with the model.

     /**
      * Returns the prediction results.
      * Format: (label, feature, predict_label, error)
      */
     def predict(dataMatrix: RDD[(BDM[Double], BDM[Double])]): RDD[PredictCNNLabel] = {
       val sc = dataMatrix.sparkContext
       val bc_cnn_layers = sc.broadcast(cnn_layers)
       val bc_cnn_ffW = sc.broadcast(cnn_ffW)
       val bc_cnn_ffb = sc.broadcast(cnn_ffb)
       // CNNff: forward propagation
       val train_cnnff = CNN.CNNff(dataMatrix, bc_cnn_layers, bc_cnn_ffb, bc_cnn_ffW)
       val rdd_predict = train_cnnff.map { f =>
         val label = f._1
         val nna1 = f._2(0)(0)
         val nnan = f._4
         val error = f._4 - f._1
         PredictCNNLabel(label, nna1, nnan, error)
       }
       rdd_predict
     }

    (3) Loss

    Loss: computes the error from the prediction results.

     /**
      * Computes the output error (mean error).
      */
     def Loss(predict: RDD[PredictCNNLabel]): Double = {
       val predict1 = predict.map(f => f.error)
       // error and loss
       // output error
       val loss1 = predict1
       val (loss2, counte) = loss1.treeAggregate((0.0, 0L))(
         seqOp = (c, v) => {
           // c: (e, count), v: (m)
           val e1 = c._1
           val e2 = (v :* v).sum
           val esum = e1 + e2
           (esum, c._2 + 1)
         },
         combOp = (c1, c2) => {
           // c: (e, count)
           val e1 = c1._1
           val e2 = c2._1
           val esum = e1 + e2
           (esum, c1._2 + c2._2)
         })
       val Loss = (loss2 / counte.toDouble) * 0.5
       Loss
     }
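    A hypothetical end-to-end usage sketch combining the model methods above (cnnModel stands for a CNNModel returned by CNNtrain in section (5); test_d is an RDD of (label matrix, feature matrix) pairs, an illustrative name not taken from the original post):

      val predictions = cnnModel.predict(test_d)  // RDD[PredictCNNLabel]
      val testMse = cnnModel.Loss(predictions)    // 0.5 * mean squared output error
      println(s"test mse = $testMse")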

    When reprinting, please credit the source:

    http://blog.csdn.net/sunbow0

