  • Verilog -- Multiplier: Booth's Algorithm


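    1. Principle

    The idea behind Booth's algorithm is something everyone met in primary or middle school. Take this exercise:
    Compute \(8754 \times 998 = ?\) the easy way.
    Any schoolchild knows the trick:
    \(8754 \times 998 = 8754 \times 1000 - 8754 \times 2\)
    In decimal, multiplying by a power of 10 is trivial: just append zeros. Shortcuts of this kind share one feature: the multiplier is a number such as 999, 1001, 997, or 1002, in which 0 and 9 occur often, so the product reduces to an easy multiplication plus a few additions or subtractions.
    Binary numbers offer this shortcut even more often. For ease of computation, computers usually store values in two's complement, and two's-complement operands are sign-extended during arithmetic. For example, the signed 5-bit value 5'b10110 = -10 is extended to 8'b11110110 before it is added to an 8-bit number. Such values tend to contain long runs of consecutive 1s, which is the same situation as the decimal shortcut above. For example: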

    \[0011110_2 = 0100000_2 - 0000010_2 = 2^5 - 2^1\]

    This is the basic principle by which Booth's algorithm decomposes the multiplier.
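
    The two observations above can be checked with a minimal Python sketch, added here for illustration: sign extension keeps the value unchanged, and a run of consecutive 1s collapses into one subtraction of two powers of two. The helper names `to_signed`, `sign_extend`, and `runs_of_ones` are hypothetical and do not come from the original article.

```python
def to_signed(bits: str) -> int:
    """Interpret a bit string as a two's-complement number."""
    value = int(bits, 2)
    return value - (1 << len(bits)) if bits[0] == "1" else value


def sign_extend(bits: str, width: int) -> str:
    """Widen a two's-complement bit string by repeating its sign bit."""
    return bits[0] * (width - len(bits)) + bits


def runs_of_ones(value: int):
    """For a non-negative value, return (high, low) pairs such that
    value == sum(2**high - 2**low), one pair per run of consecutive 1s."""
    pairs, i = [], 0
    while value >> i:
        if (value >> i) & 1:
            low = i
            while (value >> i) & 1:   # walk to the end of the run
                i += 1
            pairs.append((i, low))
        else:
            i += 1
    return pairs


print(sign_extend("10110", 8), to_signed(sign_extend("10110", 8)))  # 11110110 -10
print(runs_of_ones(0b0011110))                                      # [(5, 1)]
```

    The pair (5, 1) reads as \(2^5 - 2^1\), the same decomposition as in the equation above.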

    2. Generalization

    Let A be the multiplier and B the multiplicand, with:

    \[\begin{align}
    A &= a_{n-1}a_{n-2}\dots a_{1}a_{0} \tag{1}\\
    B &= b_{n-1}b_{n-2}\dots b_{1}b_{0} \tag{2}\\
    A*B &= (0-a_0)\times B\times 2^0 + (a_{0}-a_1)\times B\times 2^1 + \tag{3}\\
    &\quad (a_{1}-a_2)\times B\times 2^2 + \dots + (a_{n-2}-a_{n-1})\times B\times 2^{n-1} \notag\\
    &= B\times \left[-a_{n-1}\times 2^{n-1}+\sum_{i=0}^{n-2}a_i\times 2^i\right] \notag\\
    &= B\times Val(A) \notag
    \end{align}\]

    The final expression Val(A) is exactly the signed value that the two's-complement bit pattern A represents.
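
    As a sanity check of the derivation, the following Python sketch (an illustration only, with hypothetical function names) evaluates the sum in equation (3) with B = 1 and compares it with the two's-complement value for every 7-bit pattern.

```python
def booth_sum(bits: str) -> int:
    """Evaluate sum_i (a_{i-1} - a_i) * 2**i with a_{-1} = 0,
    i.e. equation (3) with B = 1."""
    a = [int(c) for c in reversed(bits)]   # a[0] is the least significant bit
    total, prev = 0, 0                     # prev plays the role of a_{i-1}
    for i, a_i in enumerate(a):
        total += (prev - a_i) * (1 << i)
        prev = a_i
    return total


def twos_complement_value(bits: str) -> int:
    """-a_{n-1} * 2**(n-1) + sum of the remaining weighted bits."""
    n = len(bits)
    return int(bits, 2) - int(bits[0]) * (1 << n)


# Exhaustive check for n = 7: both expressions agree on all 128 patterns.
assert all(
    booth_sum(format(v, "07b")) == twos_complement_value(format(v, "07b"))
    for v in range(128)
)
print(booth_sum("10110"))  # -10
```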

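    3. The algorithm in practice

    The derivation above shows how Booth multiplication decomposes the multiplier. An implementation only needs equation (3), which yields the following encoding table: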

    \(a_i\)   \(a_{i-1}\)   \(a_{i-1}-a_i\)   Operation
    0         0             0                 none (shift only)
    1         0             -1                subtract B
    1         1             0                 none (shift only)
    0         1             +1                add B
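
    The table translates directly into a lookup. The Python sketch below is an illustration with hypothetical names (`BOOTH_TABLE`, `booth_operations`); it scans a bit pattern from the least significant bit and lists the steps that are not plain shifts.

```python
# Key: (a_i, a_{i-1}); value: (coefficient a_{i-1} - a_i, operation).
BOOTH_TABLE = {
    (0, 0): (0, "none"),
    (1, 0): (-1, "subtract B"),
    (1, 1): (0, "none"),
    (0, 1): (+1, "add B"),
}


def booth_operations(bits: str):
    """Return (i, coefficient, operation) for every bit position i,
    starting at the least significant bit with a_{-1} = 0."""
    prev, steps = 0, []
    for i, c in enumerate(reversed(bits)):
        a_i = int(c)
        coeff, op = BOOTH_TABLE[(a_i, prev)]
        steps.append((i, coeff, op))
        prev = a_i
    return steps


steps = booth_operations("0011110")
print([s for s in steps if s[1] != 0])
# [(1, -1, 'subtract B'), (5, 1, 'add B')]
```

    For the pattern 0011110 from section 1 only two steps touch B, matching \(2^5 - 2^1\).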

    A worked example:
    \(N=7,\ B = 22 = (0010110)_2,\ A=-34=-(0100010)_2\)
    First compute the two's complement of -B, which the algorithm needs: \(\overline{-B} = (1101010)_2\)
    and the two's complement of A: \(\overline{A} = (1011110)_2\)

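    The hardware computes as follows:

    First allocate the working space p, of width \(p=2N+1\) bits, as two registers [A] and [Q]. [Q] is N+1 bits wide; [A] holds the remaining N bits, starts at 0, and serves as the accumulator (it is not the multiplier A).

    1. Load the two's complement of the multiplier A into the upper N bits of [Q] and set the lowest bit of [Q] to 0. (This makes \(a_{-1}=0\) when \(i=0\).)
    2. Inspect the two lowest bits of [Q], \(a_i a_{i-1}\): they are 00, so the table says no operation. Shift [A] and [Q] together one bit to the right (arithmetic shift).
    3. Inspect the two lowest bits of [Q] again: they are 10, so the table says subtract B, which is the same as adding the two's complement of -B. Add it to [A], then shift right.
    4. ...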

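    After N = 7 steps, the 2N = 14 bits 11110100010100 (the contents of [A][Q] without the extra lowest bit of [Q]) are the product in two's complement, that is:
    \(B \times A = \overline{11110100010100} = (10001011101100)_{\text{sign-magnitude}} = -748_{10}\)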
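
    The register-level procedure can be replayed in software. The following Python sketch is an illustrative model of the [A]/[Q] scheme described above, not the author's hardware; the function names are hypothetical. It assumes the multiplicand b is not the most negative n-bit value, because negating that value would overflow the n-bit [A] register in this model.

```python
def to_signed(bits: str) -> int:
    """Interpret a bit string as a two's-complement number."""
    value = int(bits, 2)
    return value - (1 << len(bits)) if bits[0] == "1" else value


def booth_multiply(a: int, b: int, n: int) -> str:
    """Multiply the n-bit multiplier a by the multiplicand b and
    return the 2n-bit two's-complement product as a bit string."""
    mask_n = (1 << n) - 1
    width = 2 * n + 1                      # p = 2N + 1 bits in total
    acc = 0                                # [A]: n-bit accumulator
    q = (a & mask_n) << 1                  # [Q]: n+1 bits, bit 0 is a_{-1} = 0
    for _ in range(n):
        pair = q & 0b11                    # (a_i, a_{i-1})
        if pair == 0b10:                   # subtract B
            acc = (acc - b) & mask_n
        elif pair == 0b01:                 # add B
            acc = (acc + b) & mask_n
        combined = (acc << (n + 1)) | q    # [A][Q] as one word
        sign = combined >> (width - 1)
        combined = (combined >> 1) | (sign << (width - 1))  # arithmetic shift
        acc = combined >> (n + 1)
        q = combined & ((1 << (n + 1)) - 1)
    product = ((acc << (n + 1)) | q) >> 1  # drop the extra lowest bit of [Q]
    return format(product, f"0{2 * n}b")


result = booth_multiply(-34, 22, 7)
print(result, to_signed(result))  # 11110100010100 -748
```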

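    How the algorithm matches the formula
    Each term \((a_{i-1}-a_i) \times B \times 2^i\) of the formula corresponds to one step of the algorithm. \((a_{i-1}-a_i)\) selects the coefficient of B. The right shift acts on the [A][Q] register pair, so it shifts the partial product right, which is equivalent to shifting B left; this accounts for the \(\times 2^i\) factor. Because every addition or subtraction of B is applied to [A], the B scaled by \(2^i\) lands on the correct bit positions.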
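
    For the worked example, the correspondence can be written out term by term. With \(\overline{A} = (1011110)_2\) and \(a_{-1}=0\), the coefficients \((a_{i-1}-a_i)\) for \(i = 0,\dots,6\) are \(0,-1,0,0,0,+1,-1\), so only three steps add or subtract B:

    \[\begin{align}
    A \times B &= \left(-2^1 + 2^5 - 2^6\right) \times B \notag\\
    &= (-2 + 32 - 64) \times 22 \notag\\
    &= -34 \times 22 = -748 \notag
    \end{align}\]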

    4. Verilog code

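    Only one implementation is shown here: a sequential circuit built around a state machine, whose computation closely follows the algorithm above.
    It is based on the following code: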

    https://mp.weixin.qq.com/s?__biz=MzU3ODgwMzI5NA==&mid=2247483685&idx=1&sn=d06f3a4ced52b42c48bd978e63e2d1bf&chksm=fd6e8214ca190b023383add3c7b60a2eeffae1f747ea2c1059ebee273e7241116d411384682d&scene=126&sessionid=1589012994&key=826ecc1d344963fb89a1fe763a3a0a3c6d8d706a9ef97ba15b329db58590b56ea54262ec5c331c21ac89e81717147cec8824d56cd54abdbb95c5cf0a5a692b36cc66ac50b6dada9f71b68e893f8cb271&ascene=1&uin=MTk3NDE3MDgyMg%3D%3D&devicetype=Windows+10+x64&version=62090070&lang=zh_CN&exportkey=AQ4%2BwrjRyeNZJrEsZxPofPE%3D&pass_ticket=dkwMmft8fNv1TNAobItN6BuADVUY3SXqwDEWdgd1XXquz3xUPDTVW48UvhGe4gkz

    (code provided by fanhu, fh_w@outlook.com)

    `timescale 1ns/1ps
    module booth_fsm
    # (parameter DATAWIDTH = 8)
    (
      input                        clk,
      input                        rstn,
      input                        en,
      input        [DATAWIDTH-1:0] multiplier,                            
      input        [DATAWIDTH-1:0] multiplicand,
      output reg                   done,
      output reg [2*DATAWIDTH-1:0] product
    );
    
    
    parameter   IDLE   = 2'b00,
                ADD    = 2'b01,
                SHIFT  = 2'b11,
                OUTPUT = 2'b10;
    
    reg  [1:0]              current_state, next_state;  // state registers.
    reg  [2*DATAWIDTH+1:0]  a_reg,s_reg,p_reg,sum_reg;  // computational values.
    reg  [DATAWIDTH-1:0]    iter_cnt;                   // iteration count for determining when done.
    wire [DATAWIDTH:0]      multiplier_neg;             // negative value of multiplier
    
    
    // state register: use nonblocking assignments in the clocked block
    always @(posedge clk or negedge rstn)
      if (!rstn) current_state <= IDLE;
      else current_state <= next_state;
    
    // state transform
    always @(*) begin
      next_state = 2'bx;
      case (current_state)
        IDLE  : if (en) next_state = ADD;
                else    next_state = IDLE;
        ADD   : next_state = SHIFT;
        SHIFT : if (iter_cnt==DATAWIDTH) next_state = OUTPUT;
                else            next_state = ADD;
        OUTPUT: next_state = IDLE;
      endcase
    end
    
    // negative value of multiplier.
    assign multiplier_neg = -{multiplier[DATAWIDTH-1],multiplier}; 
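    // algorithm implementation details.
    // Note: here the bits of `multiplicand` are scanned (low bits of p_reg), while
    // `multiplier` is the value added or subtracted (a_reg, s_reg). The product is the same.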
    always @(posedge clk or negedge rstn) begin
      if (!rstn) begin
        {a_reg,s_reg,p_reg,iter_cnt,done,sum_reg,product} <= 0;
      end else begin
      case (current_state)
        IDLE :  begin
          a_reg    <= {multiplier[DATAWIDTH-1],multiplier,{(DATAWIDTH+1){1'b0}}};
          s_reg    <= {multiplier_neg,{(DATAWIDTH+1){1'b0}}};
          p_reg    <= {{(DATAWIDTH+1){1'b0}},multiplicand,1'b0};
          iter_cnt <= 0;
          done     <= 1'b0;
        end
        ADD  :  begin
          case (p_reg[1:0])
            2'b01       : sum_reg <= p_reg+a_reg; // + multiplier
            2'b10       : sum_reg <= p_reg+s_reg; // - multiplier
            2'b00,2'b11 : sum_reg <= p_reg;       // nop
          endcase
          iter_cnt <= iter_cnt + 1;
        end
        SHIFT :  begin
          p_reg <= {sum_reg[2*DATAWIDTH+1],sum_reg[2*DATAWIDTH+1:1]}; // right shift 
        end
        OUTPUT : begin
          product <= p_reg[2*DATAWIDTH:1];
          done <= 1'b1;
        end
      endcase
     end
    end
    
    endmodule
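
    A software reference model can help when debugging the module. The Python sketch below is an illustration that mirrors the datapath of `booth_fsm` as read from the listing above (`a_reg`, `s_reg`, `p_reg`, `sum_reg`); it ignores the state machine timing, and the function names `booth_fsm_model` and `to_signed` are hypothetical.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret an unsigned bit pattern of the given width as signed."""
    return value - (1 << bits) if value >> (bits - 1) else value


def booth_fsm_model(multiplier: int, multiplicand: int, width: int = 8) -> int:
    """Return the 2*width-bit product pattern that the datapath computes."""
    total = 2 * width + 2                         # width of a_reg/s_reg/p_reg
    mask = (1 << total) - 1
    m = multiplier & ((1 << width) - 1)
    m_ext = ((m >> (width - 1)) << width) | m     # {multiplier[W-1], multiplier}
    m_neg = (-m_ext) & ((1 << (width + 1)) - 1)   # multiplier_neg
    a_reg = m_ext << (width + 1)
    s_reg = m_neg << (width + 1)
    p_reg = (multiplicand & ((1 << width) - 1)) << 1
    for _ in range(width):                        # one ADD + SHIFT per bit
        low = p_reg & 0b11
        if low == 0b01:
            sum_reg = (p_reg + a_reg) & mask      # + multiplier
        elif low == 0b10:
            sum_reg = (p_reg + s_reg) & mask      # - multiplier
        else:
            sum_reg = p_reg                       # nop
        msb = sum_reg >> (total - 1)
        p_reg = (msb << (total - 1)) | (sum_reg >> 1)   # arithmetic right shift
    return (p_reg >> 1) & ((1 << (2 * width)) - 1)      # p_reg[2W:1]


print(booth_fsm_model(3, 5, 8))                     # 15
print(to_signed(booth_fsm_model(-8, -8, 4), 8))     # 64
```

    The extra sign bit in `a_reg` and `s_reg` is what lets the model (and the module) negate the most negative operand without overflow.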
    
    

    testbench:

    `timescale 1ns/1ps
    
    // Basic self-checking test bench (four directed vectors, not exhaustive).
    `define TEST_WIDTH 10
    module booth_fsm_tb;
    
    reg clk;
    reg rstn;
    reg en;
    integer multiplier1;
    integer multiplicand1;
    reg [`TEST_WIDTH-1:0] multiplier;
    reg [`TEST_WIDTH-1:0] multiplicand;
    wire    done;
    
    // Inputs: declare them signed or unsigned as the test requires. Output: no requirement.
    wire signed [2*`TEST_WIDTH-1:0] product;
    wire signed [`TEST_WIDTH-1:0]                m1_in;
    wire signed [`TEST_WIDTH-1:0]                m2_in;
    
    reg  signed [2*`TEST_WIDTH-1:0] product_ref;
    reg  [2*`TEST_WIDTH-1:0] product_ref_u;
    assign m1_in = multiplier[`TEST_WIDTH-1:0];
    assign m2_in = multiplicand[`TEST_WIDTH-1:0];
    
    booth_fsm #(.DATAWIDTH(`TEST_WIDTH)) booth 
    (
      .clk(clk),
      .rstn(rstn),
      .en(en),
      .multiplier(multiplier),                            
      .multiplicand(multiplicand),
      .done  (done),
      .product(product)
     );
    
    always #1 clk = ~clk;
    
    integer num_good;
    integer i;
    initial begin
      clk = 1;
      en = 0;
      rstn = 1;
      #2 rstn = 0; #2 rstn = 1;
      
      num_good = 0;
      multiplier=0;
      multiplicand=0;
      #8;
    
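      for(i=0;i<4;i=i+1) begin
        // Drive operands and en on the falling edge: the DUT samples on the
        // rising edge, so the testbench never races with it.
        @(negedge clk);
        multiplier=10'b10000_00000+i;
        multiplicand=10'b00000_00010+i;
        en = 1;
        @(negedge clk);
        en = 0;                 // operands were latched in IDLE; one pulse is enough

        wait (done == 1);
        @(negedge clk);         // done is still high here and product is stable
        product_ref=m1_in*m2_in;
        if (product_ref !== product)
          $display("MISMATCH: multiplier = %d multiplicand = %d product = %d expected = %d",m1_in,m2_in,product,product_ref);
        else
          num_good = num_good + 1;
        wait (done == 0);
      end
      $display("sim done. num good = %d",num_good);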
    
    end
    
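    initial begin
        // $fsdbDumpvars and $fsdbDumpMDA are Verdi (Novas) FSDB tasks; delete them
        // if the simulator is not linked with the FSDB PLI. $dumpvars is standard.
        $fsdbDumpvars();
        $fsdbDumpMDA();
        $dumpvars();
        #1000 $finish;
     end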
    endmodule
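
    To know what the self-check should see, the reference products for the four test vectors can be computed by hand or with a few lines of Python. This sketch is an illustration; `as_signed` and `expected_products` are hypothetical names, and the bit patterns are the ones driven in the loop above.

```python
TEST_WIDTH = 10


def as_signed(pattern: int, bits: int = TEST_WIDTH) -> int:
    """Interpret a TEST_WIDTH-bit pattern as a signed number."""
    pattern &= (1 << bits) - 1
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


def expected_products():
    """Signed operands and products for the four testbench vectors."""
    rows = []
    for i in range(4):
        m = as_signed(0b10000_00000 + i)   # multiplier   = 10'b10000_00000 + i
        c = as_signed(0b00000_00010 + i)   # multiplicand = 10'b00000_00010 + i
        rows.append((m, c, m * c))
    return rows


for row in expected_products():
    print(row)
# (-512, 2, -1024)
# (-511, 3, -1533)
# (-510, 4, -2040)
# (-509, 5, -2545)
```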
    
    

    [Figure: simulation waveform]

  • Original article: https://www.cnblogs.com/lyc-seu/p/12842399.html