VC++ 8 Show-Stopper?

Hi,

I'm seeing a speed hit of around 2X when switching to VC++ 2005. It's so serious, and affects even the most basic programs, that I can only think I must be missing something. Please help!

Building the same (trivial) program on VS 2003 and VS 2005, I'm seeing around a 2x decrease in speed. I'm building for release, and in 2005 I'm also setting /D_SECURE_SCL=0. Can someone explain?

#include "stdafx.h"
#include <windows.h>
#include <vector>
#include <string>
#include <set>
#include <iostream>
using namespace std ;

int _tmain(int argc, _TCHAR* argv[])
{
    const long dwTime = ::GetTickCount();
    {
        vector<string> myarr ;
        set<string> myset ;
        myarr.resize(10000000) ;
        for(vector<string>::iterator it=myarr.begin();it!=myarr.end();it++)
        {
            myset.insert(*it) ;
        }
    }
    std::cout << "Time = " << ::GetTickCount() - dwTime << std::endl ;
    return 0;
}









  • Roosevelt Sheriff - MSFT

    One interesting thing to try is instead of using containers of strings, try containers of ints and make sure /GL optimizations are on.  You'll see 2005 generates code that's about twice as fast. 
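
To try the comparison suggested above, the element type of the original repro can be turned into a template parameter. This is a minimal sketch, assuming a hypothetical helper named `fill_set` (it does not appear in the original post) and whatever element count the caller chooses:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: repeats the loop from the original post for any
// element type T. Every element is value-initialized (0 for int, "" for
// std::string), so the resulting set never holds more than one entry.
template <typename T>
std::size_t fill_set(std::size_t count)
{
    std::vector<T> values(count);
    std::set<T> unique_values;
    for (typename std::vector<T>::const_iterator it = values.begin();
         it != values.end(); ++it)
    {
        unique_values.insert(*it);
    }
    // All inserted values compare equal, so at most one survives.
    assert(unique_values.size() <= 1);
    return unique_values.size();
}
```

Timing `fill_set<int>(n)` and `fill_set<std::string>(n)` with the same GetTickCount bracket as the original program would show how much of the difference comes from the element type rather than from the containers themselves.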

    -Ben



  • cave_troll

    We appear to have gotten down to the root of the issue.

     

    We have a new hand-optimized memcmp in VS 2005. It is much more complicated than before and is no longer inlined as the old version was in 7.1. In the posted example the empty string is inserted over and over again, so the slowdown comes from the added function-call overhead, since both versions can evaluate the empty string quickly. (To insert into a set, you must compare the value you are inserting to the values already present.) For more complicated strings, however (i.e. strings that actually have content), the new version wins even with the function-call overhead. Here are some metrics from comparing two strings to each other 100,000,000 times:

     

    Command line: cl /MD /EHs /D_UNICODE /DUNICODE t3.cpp /D_SECURE_SCL=0 /O2 /link /out:tNew.exe

    Strings compared                                                              8.0     7.1

    “” v “” (as in customer repro)                                                3453    2438
    “a” v “a”                                                                     3719    3688
    “a” v “b”                                                                     3625    3843
    “ab” v “ab”                                                                   3860    3812
    “ab” v “ac”                                                                   3750    3969
    “alligator” v “alligator”                                                     4438    7016
    “gerriatric” v “juvenille”                                                    4203    6844
    “supercalifragilisticexpialidocious” v “supercalifragilisticexpialidocious”   5219    8187

     

    As you can see, for short strings, even 1-character ones, the new optimized version is comparable to the old, while for longer strings 8.0 takes roughly 35 to 40% less time. So in the common case (not comparing empty to empty) we are quite a bit faster, while in the empty v empty case 8.0 takes about 40% longer (3453 vs 2438).
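
The t3.cpp used for these measurements was not posted. A minimal sketch of what such a comparison loop could look like follows, assuming narrow `std::string` operands (the command line above defines UNICODE, so the real test may have used wide strings), C++11 `<chrono>` for timing, and the hypothetical names `count_equal` and `time_compare_ms`:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the unpublished t3.cpp: compares two strings
// `iterations` times and returns how many comparisons reported equality.
// Returning the count gives the loop an observable result.
std::size_t count_equal(const std::string& lhs, const std::string& rhs,
                        std::size_t iterations)
{
    std::size_t equal = 0;
    for (std::size_t i = 0; i < iterations; ++i)
    {
        if (lhs.compare(rhs) == 0)
            ++equal;
    }
    return equal;
}

// Runs count_equal once and returns the elapsed wall-clock milliseconds.
// The number of equal comparisons is written to equal_out.
long long time_compare_ms(const std::string& lhs, const std::string& rhs,
                          std::size_t iterations, std::size_t& equal_out)
{
    assert(iterations > 0);  // timing zero iterations says nothing
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    equal_out = count_equal(lhs, rhs, iterations);
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)
        .count();
}
```

An optimizer may hoist the loop-invariant comparison out of the loop, so a serious benchmark would need to vary the operands or otherwise prevent that; the sketch only shows the shape of the measurement.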

     

    Thanks for reporting the issue. We'll look into making sure that even "" versus "" is faster in the next version than it was in 7.1.

     

    Thanks,
    Ben Anderson
    Visual C++ Team



  • hipswich

    Thanks Liam,

    It's good to know that the main perf difference was due to the use of the single threaded library in the VS 2003 project.  We're still going to see if we can't track down that 8% slowdown you're seeing and will post when we know more. 

    Sorry about the multiple posts earlier, there were intermediate posts I was responding to which have since been deleted. 

     

    Ben Anderson,
    Visual C++ Team



  • margaret lenins carter

    I would agree with Ben.

    The VC2005 compiler provides more optimization techniques and strategies. LTCG (/GL) is much better, and there are new optimization techniques such as PGO.

    Nevertheless, we would be more than happy to look at other perf issues that you might discover.

    Thanks,

       Ayman Shoukry

       VC++ Team



  • CmanFsu

    Thanks for the research, it was the first bit of code I thought of to test my feeling that the compiler was generating slower code.

    -Liam

  • quantum_csfb

    Sorry about the delay, Ben. I'm on GMT.

    The Visual Studio 2005 project was created using the IDE, taking the standard release-build setting for a console app with an additional option of /D_SECURE_SCL=0

    The 2003 project again used the IDE's standard release-build console app settings.

    Last night I went through the command line settings and found the culprit. In 2005 the Runtime Library setting was /MD (multithreaded DLL). In 2003 this setting defaulted to /ML (single-threaded static library).

    When I changed both projects to /MT I found the results were similar to the Visual Studio 2003 default settings.


    There are still speed differences between the 2003 and 2005 builds, but these were more in line with the results Jonathan Caves has reported (about 8%).

    Thanks for looking into it for me

    -Liam



    Vis 2003

    /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /FD /EHsc /ML /GS /Yu"stdafx.h" /Fp"Release/ttstold.pch" /Fo"Release/" /Fd"Release/vc70.pdb" /W3 /nologo /c /Wp64 /Zi /TP



    Vis 2005 created from scratch

    /O2 /GL /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_MBCS" /FD /EHsc /MD /Yu"stdafx.h" /Fp"Release\ttstnew.pch" /Fo"Release\\" /Fd"Release\vc80.pdb" /W3 /nologo /c /Wp64 /Zi /TP /errorReport:prompt



    Vis 2005 converted from Vis 2003 (/MT set in converted project)
    /O2 /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_VC80_UPGRADE=0x0710" /D "_MBCS" /FD /EHsc /MT /Yu"stdafx.h" /Fp"Release\ttstold.pch" /Fo"Release\\" /Fd"Release\vc80.pdb" /W3 /nologo /c /Wp64 /Zi /TP /errorReport:prompt
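
Since the runtime library setting turned out to be the main difference between the two projects, it can help to have each build report which one it was compiled against. This is a minimal sketch that relies on the `_MT` and `_DLL` macros that Visual C++ predefines for /MT and /MD builds; the function name `runtime_library_flavor` is hypothetical:

```cpp
#include <string>

// Reports which CRT flavor this translation unit was compiled against.
// Visual C++ defines _MT for /MT and /MD, and additionally _DLL for /MD.
// On other compilers the macros are normally absent, so the last branch
// is taken.
std::string runtime_library_flavor()
{
#if defined(_MT) && defined(_DLL)
    return "multithreaded DLL (/MD)";
#elif defined(_MT)
    return "multithreaded static (/MT)";
#else
    return "single-threaded static (/ML) or not Visual C++";
#endif
}
```

Printing the result next to the timing output makes it harder to compare two builds that silently differ in this setting.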






  • LudovicoVan

    I've been playing around with this.

    The command-line I am using for both Visual C++ 2003 and Visual C++ 2005 is

    cl /O2 /EHsc a.cpp

    With these compiler options I am seeing a definite difference in execution speed:

    2003: ~1150
    2005: ~1260

    Note: these are rough averages over several runs, but they do show a definite difference: not 2X, but significant all the same (I am running the tests on a pretty high-end laptop).

    On looking at the generated code I noticed the following:

    2003

    ; 19   :         for (vector<string>::iterator it = myarr.begin(); it != myarr.end(); it++)

      0008a 8b 7c 24 2c  mov  edi, DWORD PTR _myarr$64447[esp+72]
      0008e 8b 6c 24 30  mov  ebp, DWORD PTR _myarr$64447[esp+76]
      00092 3b fd   cmp  edi, ebp
      00094 8b f7   mov  esi, edi
      00096 74 16   je  SHORT $L65008


    2005

    ; 19   :         for (vector<string>::iterator it = myarr.begin(); it != myarr.end(); it++)

      00097 8b 74 24 34  mov  esi, DWORD PTR _myarr$73173[esp+80]
      0009b 8b 44 24 38  mov  eax, DWORD PTR _myarr$73173[esp+84]
      0009f 3b f0   cmp  esi, eax
      000a1 76 0d   jbe  SHORT $LL124@main
      000a3 e8 00 00 00 00  call  __invalid_parameter_noinfo
      000a8 8b 44 24 38  mov  eax, DWORD PTR _myarr$73173[esp+84]
      000ac 8d 64 24 00  npad  4
    $LL124@main:
      000b0 39 44 24 34  cmp  DWORD PTR _myarr$73173[esp+80], eax
      000b4 8b f8   mov  edi, eax
      000b6 76 09   jbe  SHORT $LN144@main
      000b8 e8 00 00 00 00  call  __invalid_parameter_noinfo
      000bd 8b 44 24 38  mov  eax, DWORD PTR _myarr$73173[esp+84]
    $LN144@main:
      000c1 3b f7   cmp  esi, edi
      000c3 74 2e   je  SHORT $LN1@main

    Notice the extra parameter validation code in 2005. To get rid of this checking you need to compile with a #define

    cl /O2 /EHsc /D_SECURE_SCL=0 a.cpp

    Doing so reverts the code back to something that looks more like what 2003 produced

    ; 19   :         for (vector<string>::iterator it = myarr.begin(); it != myarr.end(); it++)

      00095 8b 7c 24 30  mov  edi, DWORD PTR _myarr$72955[esp+76]
      00099 8b 6c 24 34  mov  ebp, DWORD PTR _myarr$72955[esp+80]
      0009d 3b fd   cmp  edi, ebp
      0009f 8b f7   mov  esi, edi
      000a1 74 16   je  SHORT $LN1@main

    While this gets back some of the speed, it doesn't get it all back. With this change I see a figure in the ~1240 range for the 2005 compiler.
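
The same switch can also be set in the source instead of on the command line. This is a minimal sketch, assuming the define is placed before every standard library include and has the same value in every translation unit of the program; `sum_elements` is a hypothetical function that only gives the iterators something to do. Compilers other than Visual C++ simply ignore the macro.

```cpp
// Must come before any standard library header; otherwise the headers
// have already been parsed with the default setting.
#ifndef _SECURE_SCL
#define _SECURE_SCL 0
#endif

#include <vector>

// Hypothetical example function: walks the vector with explicit
// iterators, the pattern whose generated code is shown above.
long sum_elements(const std::vector<int>& values)
{
    long total = 0;
    for (std::vector<int>::const_iterator it = values.begin();
         it != values.end(); ++it)
    {
        total += *it;
    }
    return total;
}
```

With a precompiled header such as stdafx.h, the define would have to go at the top of that header, which is one reason the /D form on the command line is usually the less error-prone choice.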

    So there is still something else going on here. I did try Ann's trick of calling _set_sbh_threshold and while this did help a bit (~1230) it did not get rid of the remaining difference.

    One thing I did try was to just measure the time for the loop: i.e. I excluded the construction and destruction of the containers. For this I got ~420 for 2003 and ~500 for 2005. So there is still something about this loop. Note: using for_each instead of a for-loop also gave a slight improvement.
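
The for_each variant was not posted. A minimal sketch of the two loop forms side by side follows, assuming hypothetical names (`InsertInto`, `insert_with_loop`, `insert_with_for_each`) and a function object rather than a lambda so that it also builds with the compilers discussed here:

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Function object that inserts each visited string into a target set.
struct InsertInto
{
    explicit InsertInto(std::set<std::string>& target) : target_(&target) {}
    void operator()(const std::string& value) const { target_->insert(value); }
    std::set<std::string>* target_;
};

// Hand-written loop, in the style of the original post.
void insert_with_loop(const std::vector<std::string>& values,
                      std::set<std::string>& out)
{
    for (std::vector<std::string>::const_iterator it = values.begin();
         it != values.end(); ++it)
    {
        out.insert(*it);
    }
}

// The same work expressed with std::for_each.
void insert_with_for_each(const std::vector<std::string>& values,
                          std::set<std::string>& out)
{
    std::for_each(values.begin(), values.end(), InsertInto(out));
}
```

One visible difference is that std::for_each evaluates `values.end()` once, whereas the hand-written loop evaluates it on every iteration. Whether that accounts for the measured improvement is not established in this thread.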



  • InfraBob

    Rellik - If you'd care to contact me directly (email in my profile), I can follow up with you and the rest of the C++ team.  We'd really like to get to the bottom of this. 

     

    Thanks,
    Ben Anderson
    Visual C++ Team



  • Saranga Amarasinghe

    Hi,

    Thanks for reporting. We will take a look at this. Hold on for now.

    Thanks,
    Nikola

  • Keessie

    Quick question:

    I'm getting the same perf #'s for both '03 and '05 compiling from the command line. 

    What are the exact command line arguments that are getting passed? (If you're compiling from the command line, what command do you use? If you're in the IDE, just copy and paste the command line arguments from the C++ and Linker menus.)

     

    Thanks,

    Ben



  • havana

    Thanks for this detailed info. It's very interesting!

  • Ajish M

    If you could still post your command line options, I would still be interested in seeing them. 

    Thanks,

    Ben



  • Code_Explorer

    Hi!
    I just tried the same. Created a VC.2003 win32 Console application.
    Inserted the code above.
    Created a build.
    Copied the project.
    Converted the project to VC.2005
    Created a build.

    VC2003 = ~1585
    VC2005 = ~1722

    The command line options are the same for both projects:
    /O2 /EHsc /MT

