  • SAS optimization tips (1): options for tracking resource usage, and controlling memory usage with BUFSIZE=, BUFNO=, SASFILE, and IBUFSIZE=

    CPU time: the amount of time the central processing unit (CPU) uses to perform requested tasks such as calculations, reading and writing data, conditional logic, and iterative logic. CPU time is measured when data must be processed in the program data vector.

    I/O: a measurement of the read and write operations that are performed as data and programs are copied from a storage device to memory (input) or from memory to a storage or display device (output).

    1: Using SAS options to track resource usage

    Prefixing an option keyword with NO turns that option off.
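
    As a minimal sketch of the NO prefix, the example below turns the FULLSTIMER system option on, runs a throwaway DATA step, and turns the option off again. The data set name work.numbers and the loop are illustrative assumptions, not part of the original notes, and the statistics written to the log vary by operating environment.

```sas
/* Turn on detailed resource reporting in the SAS log. */
options fullstimer;

/* Any step run from here on gets a resource usage note in the log. */
data work.numbers;              /* hypothetical data set, for illustration only */
  do i = 1 to 100000;
    x = sqrt(i);
    output;
  end;
run;

/* Prefix the keyword with NO to switch the option off again. */
options nofullstimer;
```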

    2: Controlling memory usage

    2.1:Measuring I/O

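    Improvement in I/O can come at the cost of increased memory consumption.

    (The original post used a diagram here to show how the buffers, that is, the part held in memory, relate to I/O.)

    I/O is counted in two parts: transfers from the data set into the buffers, and transfers from the buffers back out to the data set.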

    2.1: How do you change the amount of data read in a single I/O operation?

    2.1.1 page size

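    In SAS, page size and buffer size mean the same thing, so increasing the size naturally reduces the number of I/O operations. The cost is higher memory consumption.

    SAS uses an algorithm to choose a default page size, which works well for multi-tasking SAS programs, but you can also set the page size manually with BUFSIZE=.

    In general, avoid the MIN setting; it can cause unpredictable problems.

    If you copy a data set with PROC COPY, the original page size may not be retained (the SAS documentation notes this for copies into a library that is assigned with a different engine).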
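
    A minimal sketch of setting the page size manually, assuming the company.sales data set used later in these notes. The 64k value and the output name work.sales_big are illustrative assumptions; SAS may adjust the requested value to one that is valid for the operating environment.

```sas
/* Request a 64K page size for one output data set (value is an assumption). */
data work.sales_big(bufsize=64k);
  set company.sales;
run;

/* PROC CONTENTS reports the page size and the number of pages actually used. */
proc contents data=work.sales_big;
run;

/* BUFSIZE=0 tells SAS to go back to choosing the default page size. */
options bufsize=0;
```

    Comparing the page count from PROC CONTENTS before and after the change gives a rough idea of how many fewer transfers a full read needs.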

    2.1.2 number of buffers

    BUFNO= controls the number of buffers that are available for reading or writing a SAS data set with each I/O transfer.

    A value of 10 is recommended.

    Summary: for a small data set, try to allocate enough buffers to read the whole data set in a single pass.
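
    The two usual ways of applying BUFNO= can be sketched as follows, using the value of 10 from the notes above. The output name work.sales_copy is a hypothetical name chosen for illustration.

```sas
/* Session-wide: use 10 buffers for data sets opened from here on. */
options bufno=10;

/* Per data set: apply BUFNO= as a data set option on the input file. */
data work.sales_copy;           /* hypothetical output data set */
  set company.sales(bufno=10);
run;
```

    The data set option is the narrower choice, since it raises memory use only for the one file that benefits from it.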

    3: Using the SASFILE statement

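    The SASFILE statement holds a data set in memory, which cuts down on open/close operations, including the freeing and allocating of memory that goes with them.

    The SASFILE statement opens a SAS data file and allocates enough buffers to hold the entire file in memory.

    Once the data file is read, the data is held in memory, and it is available to subsequent DATA and PROC steps or applications until either a SASFILE CLOSE statement is executed or the program ends, at which point the memory is freed automatically.

    If there is not enough memory, virtual memory or the default number of buffers is used instead.

    Normally SAS frees the buffers automatically at the end of each DATA or PROC step. In the program below, without SASFILE, company.sales would have to be read twice, which wastes resources.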
    /* Read company.sales into memory once and keep it there. */
    sasfile company.sales load;

    proc print data=company.sales;
      var Customer_Age_Group;
    run;

    proc tabulate data=company.sales;
      class Customer_Age_Group;
      var Customer_BirthDate;
      table Customer_Age_Group, Customer_BirthDate*(mean median);
    run;

    /* Release the buffers once both steps have used the file. */
    sasfile company.sales close;

    Summary

    1: If you need to repeatedly process a SAS data file that will fit entirely in memory, use the SASFILE statement to reduce I/O and some CPU usage.

    2: If you use the SASFILE statement and the SAS data file will not fit entirely in memory, the code will execute, but there might be a degradation in performance.

    3: If you repeatedly process only part of a file, it is still best to use SASFILE, which improves efficiency.

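    4: Using IBUFSIZE= to change the index buffer size

    This helps programs that use indexes heavily, but the index must be re-created after the size is changed.

    IBUFSIZE=0 resets the size to the system default.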
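
    A minimal sketch of that sequence, assuming company.sales already has a simple index on Customer_Age_Group (the notes do not say which variables are indexed, so the index name is an assumption). The new size takes effect only for indexes created after the option is set, which is why the index is dropped and built again.

```sas
/* Ask for the largest index page size SAS allows. */
options ibufsize=max;

/* Re-create the index so that it is built with the new page size. */
proc datasets library=company nolist;
  modify sales;
  index delete Customer_Age_Group;   /* assumed existing index */
  index create Customer_Age_Group;
quit;

/* Go back to the system default for indexes created later. */
options ibufsize=0;
```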

  • Original article: https://www.cnblogs.com/yican/p/4121756.html