  • CSAPP Floating Point

    Floating Point

    Fractional Binary Numbers

    • Representation
      • Bits to the right of the "binary point" represent fractional powers of \(2\)
      • Represents rational number:

        \[ \sum_{k=-j}^{i} b_k \times 2^k \]

    With this notation we can write down any fractional binary number.

    Fractional Binary Numbers: Examples
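
As a minimal sketch of the sum \(\sum_{k=-j}^{i} b_k \times 2^k\), the snippet below evaluates a fractional binary string bit by bit. The helper name `parse_fractional_binary` and the three sample strings are illustrative choices, not taken from the original notes.

```python
from fractions import Fraction

def parse_fractional_binary(text):
    """Evaluate a string such as '101.11' as the sum of b_k * 2**k."""
    int_part, _, frac_part = text.partition(".")
    total = Fraction(0)
    # Bits left of the binary point carry weights 2**0, 2**1, ... (right to left).
    for k, bit in enumerate(reversed(int_part)):
        total += int(bit) * Fraction(2) ** k
    # Bits right of the binary point carry weights 2**-1, 2**-2, ...
    for k, bit in enumerate(frac_part, start=1):
        total += int(bit) * Fraction(1, 2 ** k)
    return total

for example in ["101.11", "10.111", "1.0111"]:
    print(example, "=", parse_fractional_binary(example))
# 101.11 = 23/4
# 10.111 = 23/8
# 1.0111 = 23/16
```

Each string here is the previous one with the binary point moved one place to the left, so each value is half of the one before it.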

    Observations
    • Divide by 2 by shifting right (unsigned)
    • Multiply by 2 by shifting left
    • Numbers of the form \(0.111111\ldots_2\) are just below 1.0
      • Use the notation \(1.0 - \varepsilon\)
        \(\varepsilon\) depends on how many bits you have to the right of the binary point (with \(n\) bits, \(\varepsilon = 2^{-n}\))
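
The observations above can be checked with a short sketch. The function name `just_below_one` and the sample values are assumptions made for illustration only.

```python
from fractions import Fraction

def just_below_one(n):
    """Value of 0.11...1 in binary with n ones after the binary point."""
    return sum(Fraction(1, 2 ** k) for k in range(1, n + 1))

x = 0b10110          # 22
print(x >> 1)        # 11: shifting right divides by 2
print(x << 1)        # 44: shifting left multiplies by 2

# The gap to 1.0 halves with every extra bit to the right of the binary point.
for n in (1, 4, 8):
    value = just_below_one(n)
    print(n, value, 1 - value)
```

For an odd unsigned value the right shift discards the lowest bit, so the division by 2 rounds down.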

    Representable Numbers

    Limitation #1

    Can only exactly represent numbers of the form
    \(\frac{x}{2^k}\)
    Other rational numbers have repeating bit representations, for example:
    \(1/3\) Representation: \(0.0101010101[01]\ldots_2\)
    \(1/5\) Representation: \(0.001100110011[0011]\ldots_2\)
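
To see where the repeating patterns come from, here is a hedged sketch that generates the bits after the binary point by repeated doubling. The helper `binary_fraction_bits` and its `max_bits` cutoff are hypothetical names introduced for this example.

```python
from fractions import Fraction

def binary_fraction_bits(value, max_bits=16):
    """Return the first max_bits bits after the binary point of 0 <= value < 1,
    plus a flag telling whether the expansion terminated within max_bits."""
    bits = []
    remainder = Fraction(value)
    for _ in range(max_bits):
        if remainder == 0:
            break
        remainder *= 2              # shift the next bit in front of the binary point
        bit = int(remainder >= 1)
        bits.append(str(bit))
        remainder -= bit            # keep only the fractional part
    return "".join(bits), remainder == 0

print(binary_fraction_bits(Fraction(1, 3)))  # ('0101010101010101', False)
print(binary_fraction_bits(Fraction(1, 5)))  # ('0011001100110011', False)
print(binary_fraction_bits(Fraction(5, 8)))  # ('101', True)
```

Only 5/8, which has the form \(x/2^k\), terminates; the other two keep cycling through the same remainders.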

    Limitation #2

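    Just one setting of the binary point within the \(w\) bits
    Limited range of numbers:

    • Moving the binary point right \(\rightarrow\) range of representable numbers \(\uparrow\) (fewer fractional bits, so precision \(\downarrow\))
    • Moving the binary point left \(\rightarrow\) fractional precision \(\uparrow\) (fewer integer bits, so range \(\downarrow\))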
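
The trade-off can be made concrete with a small sketch for an unsigned fixed-point format. The function `fixed_point_range` and the choice of \(w = 8\) are assumptions made for illustration.

```python
def fixed_point_range(w, frac_bits):
    """Largest value and step size of an unsigned w-bit number
    with frac_bits bits to the right of the binary point."""
    step = 2.0 ** -frac_bits           # weight of the least significant bit
    largest = (2 ** w - 1) * step      # all w bits set to 1
    return largest, step

for frac_bits in (0, 4, 8):
    print(frac_bits, fixed_point_range(8, frac_bits))
# 0 (255.0, 1.0)
# 4 (15.9375, 0.0625)
# 8 (0.99609375, 0.00390625)
```

With the same 8 bits, every position the binary point moves left halves both the largest value and the step size.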

    IEEE Floating Point

    Floating Point Representation
    • Numerical Form: \((-1)^s M 2^E\)
      • Sign bit \(s\) determines whether the number is negative or positive
      • Significand \(M\) (mantissa) is normally a fractional value in the range \([1.0, 2.0)\)
      • Exponent \(E\) weights the value by a power of two
    • Encoding
      • MSB is the sign bit \(s\)
      • exp field encodes \(E\) (but is not equal to E)
      • frac field encodes \(M\) (but is not equal to M)
    Precision options
    • Single precision: 32 bits
      \(s\): 1 bit
      \(exp\): 8 bits
      \(frac\): 23 bits
    • Double precision: 64 bits
      \(s\): 1 bit
      \(exp\): 11 bits
      \(frac\): 52 bits
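
As a rough sketch of the single-precision layout (1 + 8 + 23 bits), the fields can be pulled out of a value with Python's standard `struct` module. The function name `float32_fields` is a hypothetical helper, and the sample inputs are chosen for illustration.

```python
import struct

def float32_fields(x):
    """Split x, rounded to IEEE single precision, into (s, exp, frac)."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31               # 1 sign bit
    exp = (bits >> 23) & 0xFF    # 8 exponent bits
    frac = bits & 0x7FFFFF       # 23 fraction bits
    return s, exp, frac

print(float32_fields(1.0))    # (0, 127, 0)
print(float32_fields(-0.75))  # (1, 126, 4194304)
```

The same approach works for double precision with the `">d"` and `">Q"` format codes and the field widths 1, 11, and 52.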
    "Normalized" Values
    • When: exp \(\neq\) \(000\ldots0\) and exp \(\neq\) \(111\ldots1\)
    • Exponent coded as a biased value: E = Exp - Bias (Exp is an unsigned number)
      • Exp: unsigned value of the exp field (because it is stored as an unsigned value, the exponents of two floats can be compared directly as unsigned integers)
      • Bias = \(2^{k-1} - 1\), where \(k\) is the number of exponent bits
        Single precision: 127 (Exp: 1…254, E: -126…127) (Exp never takes the values 000…0 or 111…1)
        Double precision: 1023 (Exp: 1…2046, E: -1022…1023) (Exp never takes the values 000…0 or 111…1)
    • Significand coded with an implied leading 1: M = \(1.xx\ldots x_2\)
    • xxx…x: bits of the frac field (the leading 1 is dropped rather than stored, so we get one extra bit of precision for free)
    • Minimum when frac = 000…0 (M = 1.0)
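
Putting the rules for normalized values together, the following sketch decodes a single-precision number into \(E\), \(M\), and \((-1)^s M 2^E\). The name `decode_normalized_float32` and the sample value 15213.0 are assumptions chosen for illustration.

```python
import struct

BIAS = 127  # 2**(8 - 1) - 1 for the 8-bit exp field of single precision

def decode_normalized_float32(x):
    """Return (E, M, value) for a normalized single-precision number."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    if exp == 0 or exp == 0xFF:
        # All-zeros and all-ones exp fields are not normalized values.
        raise ValueError("not a normalized value")
    E = exp - BIAS                 # biased exponent: E = Exp - Bias
    M = 1 + frac / 2 ** 23         # implied leading 1: M = 1.xx...x in binary
    value = (-1) ** s * M * 2.0 ** E
    return E, M, value

print(decode_normalized_float32(15213.0))  # (13, 1.8570556640625, 15213.0)
print(decode_normalized_float32(1.0))      # (0, 1.0, 1.0)
```

For 1.0 the frac field is all zeros, which gives the minimum significand M = 1.0.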
  • Original article: https://www.cnblogs.com/strategist-614/p/14410364.html