  • Rarely executed and almost empty if statement drastically reduces performance in C++

    Question:

    Editor's clarification: When this was originally posted, there were two issues:

    • Test performance drops by a factor of three if seemingly inconsequential statement added
    • Time taken to complete the test appears to vary randomly

    The second issue has been solved: the randomness only occurs when running under the debugger.

    The remainder of this question should be understood as being about the first bullet point above, and in the context of running in VC++ 2010 Express's Release Mode with optimizations "Maximize Speed" and "favor fast code".

    There are still some Comments in the comment section talking about the second point but they can now be disregarded.


    I have a simulation in which adding a simple if statement to the while loop that runs the actual simulation makes performance drop by about a factor of three, even though the body of the if statement is almost never executed. The while loop itself does a lot of calculations (n-body gravity for the solar system, among other things). The statement is:

    if (time - cb_last_orbital_update > 5000000)
    {
        cb_last_orbital_update = time;
    }

    Both time and cb_last_orbital_update are of type double and are defined at the beginning of the main function, which is also where this if statement is. Usually there are computations I want to run there too, but it makes no difference if I delete them. The if statement as shown above has the same effect on performance.

    The variable time is the simulation time, it increases in 0.001 steps in the beginning so it takes a really long time until the if statement is executed for the first time (I also included printing a message to see if it is being executed, but it is not, or at least only when it's supposed to). Regardless, the performance drops by a factor of 3 even in the first minutes of the simulation when it hasn't been executed once yet. If I comment out the line

    cb_last_orbital_update = time;

    then it runs faster again, so it's not the check for

    time - cb_last_orbital_update > 5000000

    either, it's definitely the simple act of writing current simulation time into this variable.

    Also, if I write the current time into another variable instead of cb_last_orbital_update, the performance does not drop. So this might be an issue with assigning a new value to a variable that is used to check if the "if" should be executed? These are all shots in the dark though.

    Disclaimer: I am pretty new to programming, and sorry for all that text.

    I am using Visual C++ 2010 Express. Deactivating the stdafx.h precompiled header function didn't make a difference either.

    EDIT: Basic structure of the program. Note that nowhere besides at the end of the while loop (time += time_interval;) is time changed. Also, cb_last_orbital_update has only 3 occurrences: Declaration / initialization, plus the two times in the if statement that is causing the problem.

    int main(void)
    {
        ...
        double time = 0;
        double time_interval = 0.001;
        double cb_last_orbital_update = 0;
    
        F_Rocket_Preset(time, time_interval, ...);
    
        while(conditions)
        {
        Rocket[active].Stage[Rocket[active].r_stage].F_Update_Stage_Performance(time, time_interval, ...);
        Rocket[active].F_Calculate_Aerodynamic_Variables(time);
        Rocket[active].F_Calculate_Gravitational_Forces(cb_mu, cb_pos_d, time);
        Rocket[active].F_Update_Rotation(time, time_interval, ...);
        Rocket[active].F_Update_Position_Velocity(time_interval, time, ...);
        Rocket[active].F_Calculate_Orbital_Elements(cb_mu);
        F_Update_Celestial_Bodies(time, time_interval, ...);
    
        if (time - cb_last_orbital_update > 5000000.0)
        {
            cb_last_orbital_update = time;
        }
    
        Rocket[active].F_Check_Apoapsis(time, time_interval);
        Rocket[active].F_Status_Check(time, ...);
        Rocket[active].F_Update_Mass (time_interval, time);
        Rocket[active].F_Staging_Check (time, time_interval);
    
        time += time_interval;
    
        if (time > 3.1536E8)
        {
            std::cout << "\n\nBreak main loop! Sim Time: " << time << std::endl;
            break;
        }
        }
    ...
    }

    EDIT 2:

    Here is the difference in the assembly code. On the left is the fast code with the line

    cb_last_orbital_update = time;

    commented out; on the right is the slow code with the line included.

    [Image: side-by-side assembly comparison]

    EDIT 4:

    So, I found a workaround that seems to work just fine so far:

    int cb_orbit_update_counter = 1; // before while loop
    
    if(time - cb_orbit_update_counter * 5E6 > 0)
    {
        cb_orbit_update_counter++;
    }
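
    As a sanity check that the workaround keeps the same update schedule, the sketch below counts how often each form of the check fires. It is only an illustration: the function names are hypothetical, the coarse step used in the usage note is an assumption chosen so the loop finishes quickly, and it says nothing about the performance difference itself.

```cpp
#include <cassert>

// Original form: remember the simulation time of the last update.
// Hypothetical helper, not part of the program in the question.
int count_updates_timestamp(double end_time, double step, double period)
{
    assert(step > 0.0 && period > 0.0);
    double last_update = 0.0;
    int updates = 0;
    for (double t = 0.0; t < end_time; t += step)
    {
        if (t - last_update > period)
        {
            last_update = t;  // store to the variable that the condition reads
            ++updates;
        }
    }
    return updates;
}

// Workaround form: count periods instead of storing the time.
int count_updates_counter(double end_time, double step, double period)
{
    assert(step > 0.0 && period > 0.0);
    int counter = 1;
    int updates = 0;
    for (double t = 0.0; t < end_time; t += step)
    {
        if (t - counter * period > 0)
        {
            ++counter;
            ++updates;
        }
    }
    return updates;
}
```

    Calling both with end_time = 3.1536E8, step = 1000.0 and period = 5E6 gives the same number of updates (63). The two forms are not identical in general: the timestamp form restarts each period from the time the check actually fired, so its trigger times drift by up to one step per period, while the counter form stays aligned to multiples of the period.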

    EDIT 5:

    While that workaround does work, it only works in combination with using __declspec(noinline). I just removed those from the function declarations again to see if that changes anything, and it does.

    EDIT 6: Sorry, this is getting confusing. I tracked down the culprit for the lower performance when removing __declspec(noinline) to this function, which is executed inside the if:

    __declspec(noinline) std::string F_Get_Body_Name(int r_body)
    {
    switch (r_body)
    {
    case 0:
        {
            return ("the Sun");
        }
    case 1:
        {
            return ("Mercury");
        }
    case 2:
        {
            return ("Venus");
        }
    case 3:
        {
            return ("Earth");
        }
    case 4:
        {
            return ("Mars");
        }
    case 5:
        {
            return ("Jupiter");
        }
    case 6:
        {
            return ("Saturn");
        }
    case 7:
        {
            return ("Uranus");
        }
    case 8:
        {
            return ("Neptune");
        }
    case 9:
        {
            return ("Pluto");
        }
    case 10:
        {
            return ("Ceres");
        }
    case 11:
        {
            return ("the Moon");
        }
    default:
        {
            return ("unnamed body");
        }
    }
    
    }

    The if also now does more than just increase the counter:

    if(time - cb_orbit_update_counter * 1E7 > 0)
    {
        F_Update_Orbital_Elements_Of_Celestial_Bodies(args);
        std::cout << F_Get_Body_Name(3) << " SMA: " << cb_sma[3] << "\tPos Earth: " << cb_pos_d[3][0] << " / " << cb_pos_d[3][1] << " / " << cb_pos_d[3][2] <<
        "\tAlt: " << sqrt(pow(cb_pos_d[3][0] - cb_pos_d[0][0],2) + pow(cb_pos_d[3][1] - cb_pos_d[0][1],2) + pow(cb_pos_d[3][2] - cb_pos_d[0][2],2)) << std::endl;
        std::cout << "Time: " << time << "\tcb_o_h[3]: " << cb_o_h[3] << std::endl;
        cb_orbit_update_counter++;
    }

    When I remove __declspec(noinline) from the function F_Get_Body_Name alone, the code gets slower. Similarly, if I remove the call to this function or add __declspec(noinline) again, the code runs faster. All other functions still have __declspec(noinline).

    EDIT 7: So I changed the switch function to

    const std::string cb_names[] = {"the Sun","Mercury","Venus","Earth","Mars","Jupiter","Saturn","Uranus","Neptune","Pluto","Ceres","the Moon","unnamed body"}; // global definition
    const int cb_number = 12; // global definition
    
    std::string F_Get_Body_Name(int r_body)
    {
    if (r_body >= 0 && r_body < cb_number)
    {
        return (cb_names[r_body]);
    }
    else
    {
        return (cb_names[cb_number]);
    }
    }

    and also made another part of the code slimmer. The program now runs fast without any __declspec(noinline). As ElderBug suggested, is this then an issue with the CPU instruction cache, i.e. the code getting too big?
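
    A variant of the lookup-table version can be sketched as follows. It is an illustration under assumptions rather than the code from the program: the name body_name and the std::array layout are hypothetical, and returning a const reference (so that no new std::string is constructed per call) is a design choice whose effect was not measured in the question.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical name; returns a reference into a table built once on first use.
const std::string& body_name(int r_body)
{
    // Body names from the question; the last entry is the fallback.
    static const std::array<std::string, 13> names = {{
        "the Sun", "Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn",
        "Uranus", "Neptune", "Pluto", "Ceres", "the Moon", "unnamed body"
    }};
    const std::size_t fallback = names.size() - 1;

    // Out-of-range indices (negative, or 12 and above) map to the fallback.
    if (r_body < 0 || static_cast<std::size_t>(r_body) >= fallback)
    {
        return names[fallback];
    }
    return names[static_cast<std::size_t>(r_body)];
}
```

    For example, body_name(3) refers to "Earth", while body_name(-1) and body_name(12) both refer to "unnamed body". Deriving the fallback index from the table size keeps the bounds check in step with the table if a body is added later.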

    Answer:


    I'd put my money on Intel's branch predictor. http://en.wikipedia.org/wiki/Branch_predictor

    The processor assumes (time - cb_last_orbital_update > 5000000) to be false most of the time and loads up the execution pipeline accordingly.

    Once the condition (time - cb_last_orbital_update > 5000000) becomes true, the misprediction penalty hits you. You may lose 10 to 20 cycles.

    if (time - cb_last_orbital_update > 5000000)
    {
        cb_last_orbital_update = time;
    }
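
    For a sense of scale, the numbers in the question allow a rough count of how often this branch is actually taken. The sketch below takes the 20-cycle worst case at face value, assumes one misprediction per taken branch and a step that stays at 0.001 for the whole run, and works in whole milliseconds to avoid floating-point rounding. The struct and function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper type for a back-of-the-envelope estimate.
struct BranchEstimate
{
    std::int64_t iterations;         // loop iterations over the whole run
    std::int64_t taken;              // approximate number of times the if body runs
    std::int64_t worst_case_cycles;  // taken * assumed penalty per misprediction
};

// All times are in milliseconds so that the 0.001 s step is the integer 1.
BranchEstimate estimate_branch_cost(std::int64_t end_ms,
                                    std::int64_t step_ms,
                                    std::int64_t period_ms,
                                    std::int64_t penalty_cycles)
{
    BranchEstimate e;
    e.iterations = end_ms / step_ms;
    e.taken = end_ms / period_ms;  // integer division: completed periods
    e.worst_case_cycles = e.taken * penalty_cycles;
    return e;
}
```

    With the question's values (end time 3.1536E8, step 0.001, period 5000000), estimate_branch_cost(315360000000LL, 1, 5000000000LL, 20) gives 315360000000 iterations, 63 taken branches and 1260 cycles in total under these assumptions.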


  • Original source: https://www.cnblogs.com/vigorz/p/10499173.html