VC2005 50% Slower than VC2003

I just migrated a graphics application from VC2003 to VC2005. After building it, I ran a small benchmark and found that the VC2005 code was about 50% slower.

The application does a great deal of floating-point and integer calculation. It also converts doubles to ints quite often (the VC2003 version used SSE to handle this).
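As an illustration of the kind of conversion meant here (the post does not show the application's own code), the sketch below truncates a double to an int through the SSE2 CVTTSD2SI instruction. The helper name `truncateToInt` and the macro `HAVE_SSE2_INTRINSICS` are made up for this example. On targets without SSE2 intrinsics it falls back to a plain cast, which has the same round-toward-zero result.

```cpp
#if defined(__SSE2__) || defined(_M_X64) || defined(_M_IX86)
#define HAVE_SSE2_INTRINSICS 1   // hypothetical macro, used only in this sketch
#include <emmintrin.h>           // SSE2 intrinsics
#endif

// Hypothetical helper: convert a double to int, rounding toward zero.
inline int truncateToInt(double v)
{
#ifdef HAVE_SSE2_INTRINSICS
    // _mm_set_sd places v in the low lane of an SSE register;
    // _mm_cvttsd_si32 converts that lane with truncation (CVTTSD2SI).
    return _mm_cvttsd_si32(_mm_set_sd(v));
#else
    // Portable fallback: a C++ cast also truncates toward zero.
    return static_cast<int>(v);
#endif
}
```

With an x87-only build, a truncating cast usually has to adjust the FPU rounding mode around the store, which is one reason the SSE2 path tends to be cheaper in conversion-heavy code.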

The VC2003 version used /G7 and /Ow (which are not available in VC2005).

Compiler Switches Used:

(vc2003) /arch:SSE2 /c /EHa /FD /GF /GL /Gs8192 /Gy /MT /nologo /O2 /Ob2
/Og /Oi /Ot /Ow /Oy /TP /W3 /WX /Zi

(vc2005) /arch:SSE2 /c /EHa /FD /G7 /GF /GL /GS- /Gs8192 /Gy /MT /nologo /O2 /Ob2
/Og /Oi /Ot /Oy /TP /W3 /WX /Zi /fp:fast /D_CRT_SECURE_NO_DEPRECATE /D_SECURE_SCL=0


Q) Did the Pentium 4 optimizer do *that* good a job?

Q) Will /G7 ever come back, or will I be forced to buy the Intel compiler?





  • kamo

    Could you post the sample at http://lab.msdn.microsoft.com/productfeedback/viewfeedback.aspx?feedbackid=f3b99d35-3d84-44c5-950f-fab0a3179168, since the responsible folks are already asking for it?

    Thanks,
      Ayman Shoukry
      VC++ Team

  • klacounte

    I still can't post to the progress report -- when will the "service be available" again?

  • BryanF

    Could you add these details to your bug link so that the folks looking into the issue have more to work with?

    Thanks,
      Ayman Shoukry
      VC++ Team

  • Alexander Ashwin

    The service is available; I can see that it is active. What do you want to check?

    Thanks,
       Ayman Shoukry
       VC++ Team

  • sbf1100

    I'll try to put together a test case.

    For now, is there anything strange about the switches that I'm using?

    Also, how do I get P4-optimized code from VC2005?

  • vkbajoria

    Has there been any progress yet?

    I would like to update my application before the new year, and this is the main sticking point.

  • BiohazrD

    You might find this article interesting; it explains the reasons for removing the /G7 switch. From what I can infer, the code VC2005 generates is already optimized for the P4, as it was with /G7.

    http://blogs.msdn.com/branbray/archive/2005/07/08/437078.aspx

    Does your code use the CRT (C runtime library) at all? If so, you might want to look at this article. There is a pair of redundant TLS calls that will cause a performance hit in 2005.

    http://www.codeproject.com/useritems/Improved2005crt.asp

    Also, take a look at removing RTTI (it's on by default).


  • Arstan

    I believe by default the compiler is optimizing for P4. If you are using the same switches in both cases then it is for sure something that we would like to look at. That is why a test case would be extremely beneficial.

    Thanks,
      Ayman Shoukry
      VC++ Team

  • Luciana69740

    I believe you already reported the issue at http://lab.msdn.microsoft.com/productfeedback/viewfeedback.aspx?feedbackid=f3b99d35-3d84-44c5-950f-fab0a3179168. Make sure you include a reproducible test case at the bug link so that the person investigating can reproduce the issue.

    The responsible folks will take a look and update the bug link with any more details. I will make sure to follow up on the issue from my side.

    Thanks,
      Ayman Shoukry
      VC++ Team

  • RensV

    I posted the message as an attachment to the second thread.  I wasn't able to add a comment, as it told me:

    Service Temporarily Unavailable

    Can't the other people see this forum?

    Second: I tried using a different version of the STL (STLport v5), but there wasn't much difference.

    I did see that VC2003 with STLport v5 was faster than VC2005 with STLport v5. So I am convinced that the issue lies in the code generator, and not in the STL library.

  • Aaron Stern - MSFT

    Creating a test case is not so easy -- the application is over 500K lines of source.

    So for now, I'm going to take the easy way out: change switches and see what happens.

    I tried this:

    Added: /GR- (RTTI Off)
    Added: Modified tidtable.c

    Result: No Change

    Q) I have HT (Hyper-Threading) enabled on my CPU; could that have any effect?

    I'm going to try the "profile guided optimizer" and see how that affects the performance.



  • Jongo

    Thanks!

    I also placed a link to this thread in the internal MS link.

    Thanks,
      Ayman Shoukry
      VC++ Team

  • Kirk Hilse - Microsoft

    Here is sample code that shows a little about what is happening.  This sample shows VC2005 being slower by a measurable amount every time I run it.

    Benchmark2003.exe
    Add Time: 2125
    Time: 6281

    Benchmark2003.exe
    Add Time: 2125
    Time: 6297

    2003 is about 6290 ms

    Benchmark2005.exe
    Add Time: 2156
    Time: 6828

    Benchmark2005.exe
    Add Time: 2156
    Time: 6469

    Benchmark2005.exe
    Add Time: 2110
    Time: 6515

    2005 is about 6600 - so 310 ms diff -- not much, but a measurable result.
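    The averages quoted above can be checked with a few lines of arithmetic. This is only a sketch; the helper names (meanMs, slowdownPercent, printSummary) are made up for the illustration, and the numbers are the "Time:" values from the runs listed above.

```cpp
#include <cstddef>
#include <cstdio>

// Arithmetic mean of n timing samples, in milliseconds.
double meanMs(const int* samples, std::size_t n)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += samples[i];
    return sum / n;
}

// Relative slowdown of `slow` compared with `fast`, in percent.
double slowdownPercent(double fast, double slow)
{
    return (slow - fast) / fast * 100.0;
}

// "Time:" values copied from the benchmark runs in this post.
const int kTimes2003[] = { 6281, 6297 };
const int kTimes2005[] = { 6828, 6469, 6515 };

// Prints both means and the relative difference between them.
void printSummary()
{
    double m2003 = meanMs(kTimes2003, 2);
    double m2005 = meanMs(kTimes2005, 3);
    std::printf("2003: %.0f ms, 2005: %.0f ms, slowdown: %.1f%%\n",
                m2003, m2005, slowdownPercent(m2003, m2005));
}
```

    The means come out at 6289 ms and 6604 ms, a gap of roughly 5%, so this sample reproduces only a small part of the slowdown seen in the full application.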

    It doesn't match my application exactly, as my app uses many more map objects with far fewer entries per map. But it is a start!

    I'm going to keep working on this as time permits, but I wanted to pass along this info to try and solve this problem.

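    //   Code

    #include <windows.h>   // GetTickCount, DWORD
    #include <tchar.h>     // _tmain, _TCHAR
    #include <stdio.h>     // printf
    #include <map>
    // Also include the header for the Str library (file name depends on the library).

    using std::map;

    //    Sample class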
    class __declspec( dllexport ) Object
        {
    public:       
        // Str is an aftermarket String Library
        // that I use because it has a Linux version
        // Trial version may be obtained at:
        // http://www.utilitycode.com/str         
        // if it is needed.  When built, it gives
        //    warning C4800: 'BOOL' : forcing value to
        //    bool 'true' or 'false' (performance warning)
        //
        //   I have modified the code to return bool and not BOOL,
        //   but it had no effect, so I left it alone

        Str        m_csObject;   

        Object()
            {
            }
           
        Object(const Object& rhs)
            {
            m_csObject = rhs.m_csObject;
            }
           
        Object( const char *pcszName )
            {
            m_csObject = pcszName;
            }
           
        virtual ~Object()
            {
            }                   
        };
       
    typedef map<Str,Object>    Object_Map;   


    //    Global Map
    Object_Map    g_Map;


    //    Test Function 1
    const char *getObjectName(
     const char *pcszObject )
        {           
        Object_Map::iterator    itr = g_Map.find( pcszObject );
        if(g_Map.end() != itr)
            {
            return (const char *)((*itr).second).m_csObject;
            }
        return NULL;   
        }           

    int _tmain(int argc, _TCHAR* argv[])
        {
        DWORD    dwTime;
        int        i;
        int        j;
        Str        cs;
       
        dwTime = GetTickCount();
        for(i=0;i<500000;i++)
            {
            cs.Format( "Object:%d", i );
            g_Map[cs] = Object( cs );
            }
        dwTime = GetTickCount() - dwTime;
        printf( "Add Time: %lu\n", dwTime );    // DWORD is an unsigned long
           
        //    Now search       
        dwTime = GetTickCount();
        for(j=0;j<10;j++)
            {       
            for(i=0;i<500000;i++)
                {
                cs.Format( "::A%d", i );
                getObjectName( cs );
                }
            }
        dwTime = GetTickCount() - dwTime;
        printf( "Time: %lu\n", dwTime );    // DWORD is an unsigned long
       
        return 0;
        }


  • Luke Zhang - MSFT

    I have profiled the application.

    It seems that an STL map search is taking a great deal more time than it did with VC2003.

    The map<> uses a string class for the key that is not an STL string.

    Is there a link that talks about the changes to the STL in VC2005?
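    One way to separate the map from the key class is to run the same kind of loop with std::string keys in place of Str. The sketch below is a hypothetical stand-in for the sample posted earlier in the thread, not the original benchmark: the names makeKey, fillMap, countHits and timeLookupsMs are made up, and clock() replaces GetTickCount(). It is written in C++98 style with the intent that either compiler can build it.

```cpp
#include <cstdio>
#include <ctime>
#include <map>
#include <string>

typedef std::map<std::string, std::string> NameMap;

// Build a key such as "Object:42". A 32-bit int prints as at most
// 11 characters, so 16 bytes is enough for the digits and terminator.
std::string makeKey(const char* prefix, int i)
{
    char digits[16];
    std::sprintf(digits, "%d", i);
    return std::string(prefix) + digits;
}

// Insert keys "Object:0" .. "Object:<count-1>".
void fillMap(NameMap& m, int count)
{
    for (int i = 0; i < count; ++i)
    {
        std::string key = makeKey("Object:", i);
        m[key] = key;
    }
}

// Look up "<prefix>0" .. "<prefix><count-1>" and return the number found.
int countHits(const NameMap& m, const char* prefix, int count)
{
    int hits = 0;
    for (int i = 0; i < count; ++i)
        if (m.find(makeKey(prefix, i)) != m.end())
            ++hits;
    return hits;
}

// Time `passes` lookup passes and return the elapsed milliseconds.
double timeLookupsMs(const NameMap& m, const char* prefix, int count, int passes)
{
    std::clock_t start = std::clock();
    int hits = 0;
    for (int p = 0; p < passes; ++p)
        hits += countHits(m, prefix, count);
    std::clock_t stop = std::clock();
    std::printf("hits: %d\n", hits);   // use the result so the loop is kept
    return (stop - start) * 1000.0 / CLOCKS_PER_SEC;
}
```

    Calling fillMap(m, 500000) and then timeLookupsMs(m, "::A", 500000, 10) mirrors the loop counts of the earlier sample, where the "::A" keys never match an inserted "Object:" key and every search is a miss. If the std::string version shows no gap between the two compilers, the Str comparison operator would be the next thing to profile.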
