  • How the MySQL optimizer estimates the cost of a full table scan

    The leaf nodes of an InnoDB clustered index store the complete rows (the index key together with the full record data), so scanning the clustered index amounts to scanning the table.

    So how is the cost of a full table scan actually computed?

    The cost is the sum of a CPU cost and an I/O cost.

    The CPU cost charges one cost unit for every 5 record comparisons (the "records" here are index records rather than raw data records, but for the clustered index each index record is one data record).

    How, then, is the total number of records in the table obtained (i.e., the total number of clustered-index records)?

    The method: the index statistics give the number of leaf pages, and a page is 16 KB by default, so the total leaf capacity is leaf page num * 16K.

    Next the minimum length of an index record is computed; since the index may consist of multiple fields, the fields are iterated and their sizes summed. Call the result m.

    total_records = leaf page num * 16K / m is then the number of index records; one clustered-index record corresponds to one data record, so this is the total row count.

    Strictly speaking this is not exact: m, as computed by dict_index_calc_min_rec_len(), is only a minimum record length, and for the clustered index its fields cover every column of the row, not just the primary key. total_records is therefore an upper bound rather than a precise row count, which is exactly why the function is named estimate_rows_upper_bound() and why it multiplies in a safety factor of 2.

    The CPU cost is then total_records / 5 + 1.

    The I/O cost is (double) prebuilt->table->stat_clustered_index_size, i.e. the number of pages in the clustered index; this is what scan_time() returns.
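    Putting the pieces together, the cost model described above can be sketched as a small standalone program. The constants and function names below are illustrative (they are not the server's actual symbols; the real logic is spread across ha_innobase and the optimizer), but the arithmetic mirrors the formulas above:

    ```cpp
    #include <cassert>
    #include <cstdint>

    // Default InnoDB page size (UNIV_PAGE_SIZE) and the "one CPU cost
    // unit per 5 row comparisons" rule described above.
    const uint64_t PAGE_SIZE = 16 * 1024;
    const double   ROWS_PER_CPU_COST = 5.0;

    // Upper bound on row count, mirroring estimate_rows_upper_bound():
    // 2 * (leaf pages * page size) / minimum clustered-index record length.
    uint64_t estimate_rows(uint64_t leaf_pages, uint64_t min_rec_len) {
        return 2 * leaf_pages * PAGE_SIZE / min_rec_len;
    }

    // Full-scan cost = I/O cost (one unit per clustered-index page)
    //                + CPU cost (rows / 5 + 1).
    double full_scan_cost(uint64_t clustered_index_pages,
                          uint64_t leaf_pages, uint64_t min_rec_len) {
        double io_cost  = (double) clustered_index_pages;
        double cpu_cost = estimate_rows(leaf_pages, min_rec_len)
                          / ROWS_PER_CPU_COST + 1;
        return io_cost + cpu_cost;
    }

    int main() {
        // 100 leaf pages, 110 total index pages, minimum record length 40:
        // rows = 2 * 100 * 16384 / 40 = 81920; cpu = 81920/5 + 1 = 16385.
        assert(estimate_rows(100, 40) == 81920);
        assert(full_scan_cost(110, 100, 40) == 110.0 + 16385.0);
        return 0;
    }
    ```

    Note how dominant the CPU term is here: halving the assumed record length doubles the row estimate and hence nearly doubles the total cost, which is why the minimum-length upper bound matters.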

    /******************************************************************//**
    Calculate the time it takes to read a set of ranges through an index
    This enables us to optimise reads for clustered indexes.
    @return    estimated time measured in disk seeks */
    UNIV_INTERN
    double
    ha_innobase::read_time(
    /*===================*/
        uint    index,    /*!< in: key number */
        uint    ranges,    /*!< in: how many ranges */
        ha_rows rows)    /*!< in: estimated number of rows in the ranges */
    {
        ha_rows total_rows;
        double    time_for_scan;
    
        if (index != table->s->primary_key) {
            /* Not clustered */
            return(handler::read_time(index, ranges, rows));
        }
    
        if (rows <= 2) {
    
            return((double) rows);
        }
    
        /* Assume that the read time is proportional to the scan time for all
        rows + at most one seek per range. */
    
        time_for_scan = scan_time();
        
        //estimate_rows_upper_bound() computes the (upper-bound) total row count of the table
        if ((total_rows = estimate_rows_upper_bound()) < rows) {
    
            return(time_for_scan);
        }
    
        return(ranges + (double) rows / (double) total_rows * time_for_scan);
    }
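    The proration done by read_time() above can be sketched as a standalone function. The name and the hard-coded inputs below are illustrative; in the server, time_for_scan comes from scan_time() and total_rows from estimate_rows_upper_bound():

    ```cpp
    #include <cassert>

    // Sketch of ha_innobase::read_time() for the clustered index:
    // cost = one seek per range + a proportional share of the full scan.
    double read_time_sketch(double ranges, double rows,
                            double total_rows, double time_for_scan) {
        if (rows <= 2) return rows;                  // trivially cheap
        if (total_rows < rows) return time_for_scan; // never costs more than a full scan
        return ranges + rows / total_rows * time_for_scan;
    }

    int main() {
        // Reading 1000 of 4000 rows in 1 range, full scan costing 8 seeks:
        // 1 + 1000/4000 * 8 = 3.
        assert(read_time_sketch(1, 1000, 4000, 8) == 3.0);
        // 2 or fewer rows are charged one "seek" each.
        assert(read_time_sketch(3, 2, 100, 50) == 2.0);
        return 0;
    }
    ```

    The upshot is that a clustered-index range read is costed as a fraction of the full-scan cost, so small ranges on the primary key look very cheap to the optimizer.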
    
    /*********************************************************************//**
    Gives an UPPER BOUND to the number of rows in a table. This is used in
    filesort.cc.
    @return    upper bound of rows */
    UNIV_INTERN
    ha_rows
    ha_innobase::estimate_rows_upper_bound(void)
    /*======================================*/
    {
        dict_index_t*    index;
        ulonglong    estimate;
        ulonglong    local_data_file_length;
        ulint        stat_n_leaf_pages;

        DBUG_ENTER("estimate_rows_upper_bound");
    
    
        //get the table's first index, which is the clustered index
        index = dict_table_get_first_index(prebuilt->table);
        
        //number of leaf pages in the clustered index
        stat_n_leaf_pages = index->stat_n_leaf_pages;
    
        //total capacity = number of leaf pages * 16K
        local_data_file_length = ((ulonglong) stat_n_leaf_pages) * UNIV_PAGE_SIZE;
    
    
        /* Calculate a minimum length for a clustered index record and from
        that an upper bound for the number of rows. Since we only calculate
        new statistics in row0mysql.c when a table has grown by a threshold
        factor, we must add a safety factor 2 in front of the formula below. */
        
        //estimate the number of clustered-index records:
        // 2 * leaf pages * 16K / minimum clustered-index record length
        estimate = 2 * local_data_file_length /
                         dict_index_calc_min_rec_len(index);
    
        DBUG_RETURN((ha_rows) estimate);
    }
    
    /*********************************************************************//**
    Calculates the minimum record length in an index. */
    UNIV_INTERN
    ulint
    dict_index_calc_min_rec_len(
    /*========================*/
        const dict_index_t*    index)    /*!< in: index */
    {
        ulint    sum    = 0;
        ulint    i;
    
        //COMPACT row format: the index may consist of multiple fields, so iterate over them and sum the minimum byte counts
        ulint    comp    = dict_table_is_comp(index->table);
    
        if (comp) {
            ulint nullable = 0;
            sum = REC_N_NEW_EXTRA_BYTES;
            for (i = 0; i < dict_index_get_n_fields(index); i++) {
                const dict_col_t*    col
                    = dict_index_get_nth_col(index, i);
                ulint    size = dict_col_get_fixed_size(col, comp);
                sum += size;
                if (!size) {
                    size = col->len;
                    sum += size < 128 ? 1 : 2;
                }
                if (!(col->prtype & DATA_NOT_NULL)) {
                    nullable++;
                }
            }
    
            /* round the NULL flags up to full bytes */
            sum += UT_BITS_IN_BYTES(nullable);
    
            return(sum);
        }

        /* old-style (REDUNDANT) row format */
        for (i = 0; i < dict_index_get_n_fields(index); i++) {
            sum += dict_col_get_fixed_size(
                dict_index_get_nth_col(index, i), comp);
        }

        if (sum > 127) {
            sum += 2 * dict_index_get_n_fields(index);
        } else {
            sum += dict_index_get_n_fields(index);
        }

        sum += REC_N_OLD_EXTRA_BYTES;

        return(sum);
    }
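    As a sketch, the COMPACT-format branch above can be mimicked for a hypothetical two-column table. The Col struct and the column list are invented for illustration, and the hidden system columns a real clustered index carries (DB_TRX_ID, DB_ROLL_PTR) are ignored here; REC_N_NEW_EXTRA_BYTES is 5 in InnoDB:

    ```cpp
    #include <cassert>
    #include <vector>

    // Toy column descriptor: fixed_size is 0 for variable-length columns.
    struct Col { unsigned fixed_size; unsigned max_len; bool not_null; };

    // Minimum COMPACT record length, following dict_index_calc_min_rec_len():
    // fixed columns contribute their size; variable columns contribute only
    // their 1- or 2-byte length prefix (minimum = empty value); NULL flags
    // are rounded up to whole bytes.
    unsigned min_rec_len_compact(const std::vector<Col>& cols) {
        unsigned sum = 5;                   // REC_N_NEW_EXTRA_BYTES
        unsigned nullable = 0;
        for (const Col& c : cols) {
            if (c.fixed_size) {
                sum += c.fixed_size;
            } else {
                sum += (c.max_len < 128) ? 1 : 2;
            }
            if (!c.not_null) nullable++;
        }
        sum += (nullable + 7) / 8;          // UT_BITS_IN_BYTES(nullable)
        return sum;
    }

    int main() {
        // INT NOT NULL primary key (4 bytes fixed) + VARCHAR(100) NULL:
        // 5 extra + 4 + 1 length byte + 1 NULL-bitmap byte = 11.
        std::vector<Col> cols = { {4, 4, true}, {0, 100, false} };
        assert(min_rec_len_compact(cols) == 11);
        return 0;
    }
    ```

    For this hypothetical table, every 16K leaf page would be assumed to hold at most 16384/11 records, which shows how aggressively the minimum length overestimates the row count for wide variable-length rows.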

    The dict_index_t structure

    /** InnoDB B-tree index */
    typedef struct dict_index_struct dict_index_t;
    
    /** Data structure for an index.  Most fields will be
    initialized to 0, NULL or FALSE in dict_mem_index_create(). */
    struct dict_index_struct{
        index_id_t    id;    /*!< id of the index */
        mem_heap_t*    heap;    /*!< memory heap */
        const char*    name;    /*!< index name */
        const char*    table_name;/*!< table name */
        dict_table_t*    table;    /*!< back pointer to table */
    #ifndef UNIV_HOTBACKUP
        unsigned    space:32;
                    /*!< space where the index tree is placed */
        unsigned    page:32;/*!< index tree root page number */
    #endif /* !UNIV_HOTBACKUP */
        unsigned    type:DICT_IT_BITS;
                    /*!< index type (DICT_CLUSTERED, DICT_UNIQUE,
                    DICT_UNIVERSAL, DICT_IBUF, DICT_CORRUPT) */
    #define MAX_KEY_LENGTH_BITS 12
        unsigned    trx_id_offset:MAX_KEY_LENGTH_BITS;
                    /*!< position of the trx id column
                    in a clustered index record, if the fields
                    before it are known to be of a fixed size,
                    0 otherwise */
    #if (1<<MAX_KEY_LENGTH_BITS) < MAX_KEY_LENGTH
    # error (1<<MAX_KEY_LENGTH_BITS) < MAX_KEY_LENGTH
    #endif
        unsigned    n_user_defined_cols:10;
                    /*!< number of columns the user defined to
                    be in the index: in the internal
                    representation we add more columns */
        unsigned    n_uniq:10;/*!< number of fields from the beginning
                    which are enough to determine an index
                    entry uniquely */
        unsigned    n_def:10;/*!< number of fields defined so far */
        unsigned    n_fields:10;/*!< number of fields in the index */
        unsigned    n_nullable:10;/*!< number of nullable fields */
        unsigned    cached:1;/*!< TRUE if the index object is in the
                    dictionary cache */
        unsigned    to_be_dropped:1;
                    /*!< TRUE if this index is marked to be
                    dropped in ha_innobase::prepare_drop_index(),
                    otherwise FALSE. Protected by
                    dict_sys->mutex, dict_operation_lock and
                    index->lock.*/
        dict_field_t*    fields;    /*!< array of field descriptions */
    #ifndef UNIV_HOTBACKUP
        UT_LIST_NODE_T(dict_index_t)
                indexes;/*!< list of indexes of the table */
        btr_search_t*    search_info; /*!< info used in optimistic searches */
        /*----------------------*/
        /** Statistics for query optimization */
        /* @{ */
        ib_int64_t*    stat_n_diff_key_vals;
                    /*!< approximate number of different
                    key values for this index, for each
                    n-column prefix where n <=
                    dict_get_n_unique(index); we
                    periodically calculate new
                    estimates */
        ib_int64_t*    stat_n_non_null_key_vals;
                    /* approximate number of non-null key values
                    for this index, for each column where
                    n < dict_get_n_unique(index); This
                    is used when innodb_stats_method is
                    "nulls_ignored". */
        ulint        stat_index_size;
                    /*!< approximate index size in
                    database pages */
        ulint        stat_n_leaf_pages;
                    /*!< approximate number of leaf pages in the
                    index tree */
        /* @} */
        rw_lock_t    lock;    /*!< read-write lock protecting the
                    upper levels of the index tree */
        trx_id_t    trx_id; /*!< id of the transaction that created this
                    index, or 0 if the index existed
                    when InnoDB was started up */
    #endif /* !UNIV_HOTBACKUP */
    #ifdef UNIV_BLOB_DEBUG
        mutex_t        blobs_mutex;
                    /*!< mutex protecting blobs */
        void*        blobs;    /*!< map of (page_no,heap_no,field_no)
                    to first_blob_page_no; protected by
                    blobs_mutex; @see btr_blob_dbg_t */
    #endif /* UNIV_BLOB_DEBUG */
    #ifdef UNIV_DEBUG
        ulint        magic_n;/*!< magic number */
    /** Value of dict_index_struct::magic_n */
    # define DICT_INDEX_MAGIC_N    76789786
    #endif
    };
  • Original article: https://www.cnblogs.com/taek/p/6652863.html