zoukankan      html  css  js  c++  java
  • NJU Static Program Analysis 03: Data Flow Analysis I

    NJU Static Program Analysis 03: Data Flow Analysis I

    Abstract

    • Understand the three data flow analyses:
      • reaching definitions
      • live variables (in next lecture)
      • available expressions (in next lecture)
    • Can tell the differences and similarities of the three data flow analyses
    • Understand the iterative algorithm and can tell why it is able to terminate

    Notes

    At the beginning, this lecture introduced some preliminaries of data flow analysis:

    • Program Points:

      Program points refers to the points before and after each statement in a program.

    • Input & Output States:

      Each program points is associated with a program state. When inspecting a specific statement, we can name the program point before and after it with input state and output state. Each execution of an IR(Intermediate Representation) statement would transform an input state to another output state. For some more complex situations in a control flow, we can use sign ^ to indicate the "meet" of two states.

      Here are some illustrations:

      image-20210708195303984

      To further abstraction, we can take the transformation corresponding as a function. In this context, considering a statement (s) with the input state ({ m IN}[s]) and the output state ({ m OUT}[s]) we have two different way of analysis. For forward analysis, we inspect the function ({ m OUT}[s] = f_s({ m IN}[s])), and for backward analysis, we inspect the function ({ m IN}[s] = f_s({ m OUT}[s])).

    We can also generalize these notations to BBs(Basic Blocks) in a CFG(Control Flow Graph). For instance, considering a block (B) sequentially consisting the statements (s_1, dots, s_n), we can name that ({ m IN}[B] = { m IN}[s_1]), ({ m OUT}[B] = { m OUT}[s_n]).
    And in forward analysis ({ m OUT}[B] = f_B({ m IN})[B]), we have (f_B = f_{s_n} circ dots, circ f_{s_2} circ f_{s_1}). In case there is a control flow meets at the point before B, we also have ({ m IN}[B] = igwedge_{P o B} { m OUT}[P]). What's more, we can obviously find that to do backward analysis is to do forward analysis on the reverse of the given graph, so these notations are naturally similar in backward analysis contexts.

    The lecture continued to introduce the data flow analysis applications. We would save some complex issues including method calls and aliases to the further lectures, and focus on the inter-procedural analysis.

    The first application is Reaching Definitions Analysis, which is defined as

    A definition d at program point d reaches a point q if there is a path from p to q such that d is not "killed" along the path,

    where a definition (d) of a variable (v) is a statement that assigns a value to v, and "killed" means there is a new definition of (v) occurs along the path.

    Reaching definition can be useful in complier optimization, or some simple undefined variables detection. For example, we can introduce a dummy definition for each variable at the entry(beginning node) of CFG, and if a dummy definition reaches a point p where the corresponding variable is used, bugs may occur here.

    To achieve a sound analysis of reaching definitions, we should do safe-approximation(over-approximation) when implementing the transfer function.
    For a definition D: v = x op y, we generates a definition D of variable v and "kills" all the other definitions in the program that define v, while leaving the remaining income definitions unaffected. Formally we have: ({ m OUT}[B] = gen_B cup ({ m IN}[B] - kill_B)), where ({ m IN}) and ({ m OUT}) refers to the bits compressed state of the reaching definitions; 0 refers to killed while 1 refers to defined. Compressing states to a vector of bits is a common way when doing algorithms concerning states like search or dynamic programming.
    And for the occasion where control flow meets, naturally we should make sure that ({ m IN}[B] = igcup_{P o B}{ m OUT}[P]).

    After these work, we can design a iterative algorithm for data flow analysis(the algorithm is a template not only for reaching definitions analysis):

    image-20210708213219497

    We can give a brief proof that the algorithm would end in finite steps. Considering each time a compressed state changes from (a) to (b), as the implementation of our transfer function indicates, it must satisfies (a or b = b). In another word, the change can only be (0 o 1), while (1 o 0) would be impossible. So in the worst case, the state would become 111111...1, and that is an ending.

  • 相关阅读:
    正确使用c语言中的头文件
    用Python玩转大数据-入门阶段
    Python基础知识进阶(五---2)----程序基本结构、简单分支、异常处理、三大实例分析、基本循环结构、通用循环构造方法、死循环嵌套循环、布尔表达式
    Python基础知识进阶(五---1)----程序基本结构、简单分支、异常处理、三大实例分析、基本循环结构、通用循环构造方法、死循环嵌套循环、布尔表达式
    Python小程序练习(7个) 2017年9月28日---告诉自己多东西练习,而不是去空想
    python基础知识进阶(四) Python语言的math库和random库和实例
    python基础知识进阶(二) Python语言的数据类型
    python基础知识进阶之路(一)
    今天2017.5.23 开始一个web端网上商城项目的开发
    Lambda表达式(转载)
  • 原文地址:https://www.cnblogs.com/Shimarin/p/14988477.html
Copyright © 2011-2022 走看看