  • Quantile normalization

    How quantile normalization works:

    A quick illustration of such normalizing on a very small dataset:

    Arrays 1 to 3, genes A to D

    A    5    4    3
    B    2    1    4
    C    3    4    6
    D    4    2    8
    

    For each column, rank the values from lowest to highest and label the ranks i-iv. Tied values share a rank: the two 4s in array 2 both get rank iii.

    A    iv    iii   i
    B    i     i     ii
    C    ii    iii   iii
    D    iii   ii    iv
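
    In R, this ranking step can be sketched with base rank(). The matrix name x and the choice ties.method = "min" are assumptions made here so that the tied 4s in array 2 share rank 3, as in the table above; a different tie rule would give a different table.

    ```r
    # Example data: rows are genes A-D, columns are arrays 1-3
    # (matrix() fills column by column, so each line below is one array)
    x <- matrix(c(5, 2, 3, 4,
                  4, 1, 4, 2,
                  3, 4, 6, 8),
                nrow = 4,
                dimnames = list(c("A", "B", "C", "D"),
                                c("array1", "array2", "array3")))

    # Rank within each column; tied values get the smallest rank of the tie
    ranks <- apply(x, 2, rank, ties.method = "min")
    ranks
    ```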
    

    These rank values are set aside to use later. Go back to the original data and sort the values within each column from lowest to highest. (The first column, 5, 2, 3, 4, becomes 2, 3, 4, 5. The second column, 4, 1, 4, 2, becomes 1, 2, 4, 4. The third column, 3, 4, 6, 8, stays the same because it is already in order.) The result is:

    A    5    4    3    becomes A 2 1 3
    B    2    1    4    becomes B 3 2 4
    C    3    4    6    becomes C 4 4 6
    D    4    2    8    becomes D 5 4 8
    

    Now find the mean of each row of the sorted data. The mean of the k-th row becomes the value assigned to rank k:

    A (2 + 1 + 3)/3 = 2.00 = rank i
    B (3 + 2 + 4)/3 = 3.00 = rank ii
    C (4 + 4 + 6)/3 = 4.67 = rank iii
    D (5 + 4 + 8)/3 = 5.67 = rank iv
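
    The sort-and-average arithmetic can be checked with a few lines of base R. This is only a sketch; the names x, sorted and rank_means are chosen here for illustration.

    ```r
    x <- matrix(c(5, 2, 3, 4,
                  4, 1, 4, 2,
                  3, 4, 6, 8),
                nrow = 4)

    # Sort each column independently, lowest to highest
    sorted <- apply(x, 2, sort)

    # Row k of the sorted matrix holds the k-th smallest value of every array;
    # its mean is the value assigned to rank k
    rank_means <- rowMeans(sorted)
    round(rank_means, 2)  # 2.00 3.00 4.67 5.67
    ```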
    

    Now take the rank table that was set aside earlier and replace each rank with the mean for that rank:

    A    iv    iii   i
    B    i     i     ii
    C    ii    iii   iii
    D    iii   ii    iv
    

    becomes:

    A    5.67    4.67    2.00
    B    2.00    2.00    3.00
    C    3.00    4.67    4.67
    D    4.67    3.00    5.67
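
    Putting the steps together, the whole procedure might be sketched as a small base R function. quantile_normalize_sketch is a hypothetical name, not a function from any package. It assumes a numeric matrix with no missing values and at least two rows, and it uses ties.method = "min" only because that matches the tie handling in this example; packaged implementations may treat ties differently.

    ```r
    quantile_normalize_sketch <- function(x) {
      stopifnot(is.matrix(x), is.numeric(x), nrow(x) >= 2, !anyNA(x))
      # Step 1: rank each column (ties share the smallest rank)
      ranks <- apply(x, 2, rank, ties.method = "min")
      # Steps 2-3: sort each column, then average across each row
      rank_means <- unname(rowMeans(apply(x, 2, sort)))
      # Step 4: replace every rank with the mean for that rank
      normalized <- apply(ranks, 2, function(r) rank_means[r])
      dimnames(normalized) <- dimnames(x)
      normalized
    }

    x <- matrix(c(5, 2, 3, 4,
                  4, 1, 4, 2,
                  3, 4, 6, 8),
                nrow = 4,
                dimnames = list(c("A", "B", "C", "D"), NULL))
    result <- quantile_normalize_sketch(x)
    round(result, 2)
    ```

    For the example data, this reproduces the final table above.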


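    R implementation: the method is designed for array data, so each column of the input must be one array and each row one probe.
    Several R packages provide quantile normalization, including (1) affy and (2) preprocessCore. normalize.quantiles from preprocessCore is very convenient to use: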
    > a<-matrix(1:6,3,2)
    > a
         [,1] [,2]
    [1,]    1    4
    [2,]    2    5
    [3,]    3    6
    > library(preprocessCore)
    > b=normalize.quantiles(a)
    > b
         [,1] [,2]
    [1,]  2.5  2.5
    [2,]  3.5  3.5
    [3,]  4.5  4.5
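
    The output above can be sanity-checked without loading the package: both columns of a are already sorted, so each row of the result should simply be that row's mean. The sketch below builds that matrix in base R; the name expected is introduced here for illustration.

    ```r
    a <- matrix(1:6, 3, 2)

    # Columns are already in ascending order, so row k is also rank k,
    # and every entry in row k is replaced by the mean of row k
    expected <- matrix(rowMeans(a), nrow = nrow(a), ncol = ncol(a))
    expected
    ```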
  • Original article: https://www.cnblogs.com/lmj-sky/p/6036392.html