zoukankan      html  css  js  c++  java
  • Checking the Performance of FindArray

    FindArray Example

    Let’s create a program that shows how a sample C++ compiler generates code for a function

    named FindArray. Later, we will write an assembly language version of the function, attempting

    to write more efficient code than the C++ compiler. The following FindArray function (in C++)

    searches for a single value in an array of long integers:

    bool FindArray( long searchVal, long array[], long count )
    
    {
    
    for(int i = 0; i < count; i++)
    
    {
    
    if( array[i] == searchVal )
    
    return true;
    
    }
    
    return false;
    
    }

    Linking MASM to Visual C++

    Let’s create a hand-optimized assembly language version of FindArray, named AsmFindArray.

    A few basic principles are applied to the code optimization:

    Move as much processing out of the loop as possible.

    Move stack parameters and local variables to registers.

    Take advantage of specialized string/array processing instructions (in this case, SCASD).

    We will use Microsoft Visual C++ (Visual Studio) to compile the calling C++ program and

    Microsoft MASM to assemble the called procedure. Visual C++ generates 32-bit applications that

    run only in protected mode. We choose Win32 Console as the target application type for the examples

    shown here, although there is no reason why the same procedures would not work in ordinary

    MS-Windows applications. In Visual C++, functions return 8-bit values in AL, 16-bit values in AX,

    32-bit values in EAX, and 64-bit values in EDX:EAX. Larger data structures (structure values,

    arrays, etc.) are stored in a static data location, and a pointer to the data is returned in EAX.

    Our assembly language code is slightly more readable than the code generated by the C++

    compiler because we can use meaningful label names and define constants that simplify the use

    of stack parameters. Here is the complete module listing:

    TITLE AsmFindArray Procedure (AsmFindArray.asm)
    
    .586
    
    .model flat,C
    
    AsmFindArray PROTO,
    
    srchVal:DWORD, arrayPtr:PTR DWORD, count:DWORD
    
    .code
    
    ;-----------------------------------------------
    
    AsmFindArray PROC USES edi,
    
    srchVal:DWORD, arrayPtr:PTR DWORD, count:DWORD
    
    ;
    
    ; Performs a linear search for a 32-bit integer
    
    ; in an array of integers. Returns a boolean
    
    ; value in AL indicating if the integer was found.
    
    ;-----------------------------------------------
    
    true = 1
    
    false = 0
    
    mov eax,srchVal ; search value
    
    mov ecx,count ; number of items
    
    mov edi,arrayPtr ; pointer to array
    
    repne scasd ; do the search
    
    jz returnTrue ; ZF = 1 if found
    
    returnFalse:
    
    mov al,false
    
    jmp short exit
    
    returnTrue:
    
    mov al, true
    
    exit:
    
    ret
    
    AsmFindArray ENDP
    
    END

    Checking the Performance of FindArray

    Test Program It is interesting to check the performance of any assembly language code

    you write against similar code written in C++. To that end, the following C++ test program

    inputs a search value and gets the system time before and after executing a loop that calls

    FindArray one million times. The same test is performed on AsmFindArray. Here is a listing

    of the findarr.h header file, with function prototypes for the assembly language procedure and

    the C++ function:

    // findarr.h
    
    extern "C" {
    
    bool AsmFindArray( long n, long array[], long count );
    
    // Assembly language version
    
    bool FindArray( long n, long array[], long count );
    
    // C++ version
    
    }

    Main C++ Module Here is a listing of main.cpp, the startup program that calls FindArray and

    AsmFindArray:

    // main.cpp - Testing FindArray and AsmFindArray.
    
    #include <iostream>
    
    #include <time.h>
    
    #include "findarr.h"
    
    using namespace std;
    
    int main()
    
    {
    
    // Fill an array with pseudorandom integers.
    
    const unsigned ARRAY_SIZE = 10000;
    
    const unsigned LOOP_SIZE = 1000000;
    
    long array[ARRAY_SIZE];
    
    for(unsigned i = 0; i < ARRAY_SIZE; i++)
    
    array[i] = rand();
    
    long searchVal;
    
    time_t startTime, endTime;
    
    cout << "Enter value to find: ";
    
    cin >> searchVal;
    
    cout << "Please wait. This will take between 10 and 30
    
    seconds...
    ";
    
    // Test the C++ function:
    
    time( &startTime );
    
    bool found = false;
    
    for( int n = 0; n < LOOP_SIZE; n++)
    
    found = FindArray( searchVal, array, ARRAY_SIZE );
    
    time( &endTime );
    
    cout << "Elapsed CPP time: " << long(endTime - startTime)
    
    << " seconds. Found = " << found << endl;
    
    // Test the Assembly language procedure:
    
    time( &startTime );
    
    found = false;
    
    for( int n = 0; n < LOOP_SIZE; n++)
    
    found = AsmFindArray( searchVal, array, ARRAY_SIZE );
    
    time( &endTime );
    
    cout << "Elapsed ASM time: " << long(endTime - startTime)
    
    << " seconds. Found = " << found << endl;
    
    return 0;
    
    }

    Assembly Code versus Nonoptimized C++ Code We compiled the C++ program to a

    Release (non-debug) target with code optimization turned off. Here is the output, showing the

    worst case (value not found):

    Assembly Code versus Compiler Optimization Next, we set the compiler to optimize the

    executable program for speed and ran the test program again. Here are the results, showing the

    assembly code is noticeably faster than the compiler-optimized C++ code:

    Pointers versus Subscripts

    Programmers using older C compilers observed that processing arrays with pointers was more efficient

    than using subscripts. For example, the following version of FindArray uses this approach:

    bool FindArray( long searchVal, long array[], long count )
    
    {
    
    long * p = array;
    
    for(int i = 0; i < count; i++, p++)
    
    if( searchVal == *p )
    
    return true;
    
    return false;
    
    }

    Running this version of FindArray through the Visual C++ compiler produced virtually the

    same assembly language code as the earlier version using subscripts. Because modern compilers

    are good at code optimization, using a pointer variable is no more efficient than using a subscript.

    Here is the loop from the FindArray target code that was produced by the C++ compiler:

    $L176:
    cmp esi, DWORD PTR [ecx]
    je SHORT $L184
    inc eax
    add ecx, 4
    cmp eax, edx
    jl SHORT $L176

    Your time would be well spent studying the output produced by a C++ compiler to learn about

    optimization techniques, parameter passing, and object code implementation. In fact, many computer

    science students take a compiler-writing course that includes such topics. It is also important to

    realize that compilers take the general case because they usually have no specific knowledge about

    individual applications or installed hardware. Some compilers provide specialized optimization for a

    particular processor such as the Pentium, which can significantly improve the speed of compiled

    programs. Hand-coded assembly language can take advantage of string primitive instructions, as

    well as specialized hardware features of video cards, sound cards, and data acquisition boards.

  • 相关阅读:
    01背包
    用动态规划求两个自然数的最大公约数
    编程实现文件的复制功能,要求源文件名及目标文件名在程序运行后根据提示输入
    this和super
    JAVA中static的使用
    结构化异常处理 笔记
    继承和多态 笔记
    javascript 客户端验证和页面特效制作 学习笔记
    定义封装的类类型 笔记
    C# 核心编程结构Ⅱ 笔记
  • 原文地址:https://www.cnblogs.com/dreamafar/p/5995125.html
Copyright © 2011-2022 走看看