Hey,
I have a performance problem with the new Visual Studio 2005. My code (a genetic algorithm) is 3 times slower when compiled with Visual Studio 2005 than it was when compiled with Visual Studio 2003.
With Visual Studio 2003, it used to run in 3.0639 seconds, and now it runs in 9.0031 seconds. Both builds are in Release. I tried every optimization option, and also added /D_SECURE_SCL=0 to the compiler options in the project settings, with only minor differences.
Is there any other way to get back to the speed of Visual Studio 2003? It is a bit absurd to upgrade to a new compiler and get worse performance.
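For reference, the same switch can be set in source instead of on the command line. This is a minimal sketch, assuming the macro is defined before any standard library header and identically in every translation unit; sum_values is a hypothetical function used only for illustration. The setting only affects code that goes through standard library containers and iterators, so code that never touches them should not be expected to change.

```cpp
// Must come before ANY standard library header, and must be the same in
// every translation unit and static library linked into the program.
#define _SECURE_SCL 0   // VS2005: turns off checked iterators

#include <cstddef>
#include <vector>

// Hypothetical hot loop: container element access and iterators are the
// operations that checked iterators add validation to.
long sum_values(const std::vector<int>& values)
{
    long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i)
        total += values[i];
    return total;
}
```

On compilers other than Visual C++ the macro is simply ignored.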
Thanks a lot!!!
Antoine Atallah

Program runs slower when compiled with Visual Studio 2005 (in Release with optimization)
RealmRPGer
Overall, the new "new" is 33% slower than the previous "new".
The same thing can be observed in the code of memcpy, where the new 2005 version has an if that checks the validity of the parameters. Since a bunch of those ifs were added here and there in the library, it makes things (overall) 3 times slower for algorithms and other computations. Even pure C is slower in some cases (for example, memcpy and memmove...).
I was hoping there would be some #define to speed things up...
Antoine
Kosmosniks
We are currently investigating the repro case. I will keep you updated once the analysis of the issue is done.
Thanks,
Ayman Shoukry
VC++ Team
Cat man
Hi Ayman,
The code is proprietary and I cannot give it out freely on the web. However, here is a part of the code which exhibits the same problem. Basically, it is a linked list and a bunch of nodes. Inserting 1000000 nodes in the linked list is faster when compiled under Visual Studio 2003 in Release (takes 501 milliseconds on my PC) than when compiled under Visual Studio 2005 in Release (takes 753 milliseconds on my PC).
Here is the sample code (stripped for the web):
#include "stdafx.h"
#include "ALAbstractList.h"
#include "ALIntNode.h"
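#include <stdio.h>
#include <time.h>
int _tmain(int argc, _TCHAR* argv[])
{
    ALAbstractList alalist;
    clock_t time = clock();
    for (int i = 0; i < 1000000; i++)
    {
        alalist.InsertAtEnd(new ALIntNode(i));
    }
    // clock() ticks are milliseconds with the Microsoft CRT (CLOCKS_PER_SEC == 1000).
    // clock_t is a long, so cast and print it with %ld rather than %d.
    printf("Time Spent (in milliseconds) = %ld\n", (long)(clock() - time));
    return 0;
}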
------------------------------
/**
* \author Antoine Atallah
* \version 1.0
* \date 04/25/2005
* \brief Defines a generic Abstract List node which can contain integers
*
*/
#pragma once
#include "alconstants.h"
#include "alabstractnode.h"
/** \class ALIntNode
\brief Abstract Node for Integers
*/
class ALIntNode : public ALAbstractNode
{
public:
    ALIntNode(int data) {i_data = data;}
    virtual ~ALIntNode(void) {} // empty; defined inline so the stripped sample links
    int GetData() {return i_data;}
private:
    int i_data; ///< Data contained by the node!
};
--------------------------------
/**
* \author Antoine Atallah, Aidee Carriere
* \version 1.0
* \date 04/21/2005
* \brief The abstract node is the base class for anything which can be inserted in an
* abstract list (see ALAbstractList.h).
*
*/
#pragma once
#include "stdafx.h"
/*! \class ALAbstractNode
\brief This is a class describing an abstract node (purely virtual class)
*/
class ALAbstractNode
{
    //friend class ALAbstractList;
public:
    /**
     * \brief Constructor of a node: initializes the next and previous links to NULL.
     */
    ALAbstractNode() {palan_nextNode = NULL; palan_prevNode = NULL;}
    /**
     * \brief Virtual Destructor of a node
     */
    virtual ~ALAbstractNode(void) {}
    /**
     * \brief Sets the next node to the current node
     * \param nextNode is a pointer to the new next node (can be NULL to clear it)
     */
    inline void SetNextNode(ALAbstractNode * nextNode) {palan_nextNode = nextNode;}
    /**
     * \brief Gets the next node to the current node
     * \return Returns a pointer to the next node
     */
    inline ALAbstractNode * GetNextNode() {return palan_nextNode;}
    /**
     * \brief Sets the previous node to the current node
     * \param prevNode is a pointer to the new previous node (can be NULL to clear it)
     */
    inline void SetPrevNode(ALAbstractNode * prevNode) {palan_prevNode = prevNode;}
    /**
     * \brief Gets the previous node to the current node
     * \return Returns a pointer to the previous node
     */
    inline ALAbstractNode * GetPrevNode() {return palan_prevNode;}
private:
    ALAbstractNode * palan_nextNode; ///< Pointer to the next node in the list
    ALAbstractNode * palan_prevNode; ///< Pointer to the previous node in the list
};
--------------------------------
/** \file
* \author Antoine Atallah, Aidee Carriere, Jean-Francois Cote
* \version 1.0; Antoine and Aidee
* \version 1.1; Antoine
* \date 04/21/2005, 05/03/2005
*
*/
#pragma once
#include "stdafx.h"
#include "ALAbstractNode.h"
#include "ALConstants.h"
#define AL_END_OF_LIST 65534 ///< Constant used to insert at the end of a list
#define AL_BEGINNING_OF_LIST 0 ///< Constant used to insert at the beginning of a list
// Forward Declaration
class ALAbstractNode;
/*! \class ALAbstractList
\brief This is a class describing the Abstract List system
*/
class ALAbstractList
{
public:
    // Class Constructor
    ALAbstractList();
    virtual ~ALAbstractList(void);
    // Basic list functions for data insertion
    void InsertAtEnd(ALAbstractNode * nodeToInsert);
    void InsertAfter(ALAbstractNode * node, ALAbstractNode * nodeToInsert);
protected:
    ALAbstractNode * palan_firstNode; ///< Pointer to the first node of the list
    ALAbstractNode * palan_lastNode; ///< Pointer to the last node of the list
    int i_listSize; ///< Size of the list
};
---------------------------
#include "stdafx.h"
#include "alabstractlist.h"
/*!
 * \brief
 * The constructor of the list. Initializes an empty list.
 */
ALAbstractList::ALAbstractList()
{
    // initializes the list items
    palan_firstNode = NULL;
    palan_lastNode = NULL;
    i_listSize = 0;
}
/*!
 * \brief
 * Destructor of the abstract list
 */
ALAbstractList::~ALAbstractList(void)
{
    // Removed for code example simplicity...
}
/*!
* \brief
* Inserts a node after the selected node
*
* \param node is the selected node
* \param nodeToInsert is the node to insert after the selected node
* \remarks This function is NOT thread safe
*/
void ALAbstractList::InsertAfter(ALAbstractNode * node, ALAbstractNode * nodeToInsert)
{
    ALAbstractNode * palan_tempNode = node;
    // Increases the size of the list, since insertion is always guaranteed
    i_listSize++;
    // If the list is empty... Insert at the beginning
    if ((palan_firstNode == NULL) && (palan_lastNode == NULL))
    {
        palan_firstNode = nodeToInsert;
        palan_lastNode = nodeToInsert;
        nodeToInsert->SetNextNode(NULL);
        nodeToInsert->SetPrevNode(NULL);
    }
    // We insert at the beginning of the list
    else if (palan_tempNode == NULL)
    {
        nodeToInsert->SetNextNode(palan_firstNode);
        nodeToInsert->SetPrevNode(NULL);
        palan_firstNode->SetPrevNode(nodeToInsert);
        palan_firstNode = nodeToInsert;
    } // if (palan_tempNode == NULL)
    // We insert at the end of the list
    else if (palan_tempNode->GetNextNode() == NULL)
    {
        palan_lastNode->SetNextNode(nodeToInsert);
        nodeToInsert->SetNextNode(NULL);
        nodeToInsert->SetPrevNode(palan_lastNode);
        palan_lastNode = nodeToInsert;
    } // if (palan_tempNode->GetNextNode() == NULL)
    else
    {
        // Sets the link of the node!
        nodeToInsert->SetPrevNode(palan_tempNode);
        nodeToInsert->SetNextNode(palan_tempNode->GetNextNode());
        palan_tempNode->SetNextNode(nodeToInsert);
        nodeToInsert->GetNextNode()->SetPrevNode(nodeToInsert);
    }
}
/*!
 * \brief
 * Inserts a node at the end of the list
 *
 * \param nodeToInsert is the node to insert after the current last node
 * \remarks This function is NOT thread safe
 */
void ALAbstractList::InsertAtEnd(ALAbstractNode * nodeToInsert)
{
    InsertAfter(this->palan_lastNode, nodeToInsert);
}
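One way to see where the time goes in a sample like this is to time the allocations and the pointer updates separately. The following is a minimal sketch, not the code above: Node is a hypothetical stand-in for ALIntNode, and TicksToMs and RunSplitTiming are illustrative helper names. Ticks are converted with CLOCKS_PER_SEC instead of assuming that one tick is one millisecond.

```cpp
#include <cstddef>
#include <cstdio>
#include <ctime>
#include <vector>

// Hypothetical stand-in for ALIntNode: a polymorphic node carrying an int.
struct Node
{
    explicit Node(int v) : value(v), next(NULL), prev(NULL) {}
    virtual ~Node() {}
    int   value;
    Node* next;
    Node* prev;
};

// Converts a clock() tick difference to milliseconds on any platform.
double TicksToMs(std::clock_t ticks)
{
    return 1000.0 * static_cast<double>(ticks) / CLOCKS_PER_SEC;
}

// Allocates `count` nodes, then links them, timing each phase on its own.
// Returns the number of nodes linked so a caller can sanity-check the run.
std::size_t RunSplitTiming(std::size_t count)
{
    std::vector<Node*> nodes(count);

    std::clock_t t0 = std::clock();
    for (std::size_t i = 0; i < count; ++i)
        nodes[i] = new Node(static_cast<int>(i));   // phase 1: operator new only
    std::clock_t t1 = std::clock();

    Node* last = NULL;
    std::size_t linked = 0;
    for (std::size_t i = 0; i < count; ++i)         // phase 2: pointer updates only
    {
        nodes[i]->prev = last;
        if (last)
            last->next = nodes[i];
        last = nodes[i];
        ++linked;
    }
    std::clock_t t2 = std::clock();

    std::printf("allocation: %.1f ms, linking: %.1f ms\n",
                TicksToMs(t1 - t0), TicksToMs(t2 - t1));

    for (std::size_t i = 0; i < count; ++i)
        delete nodes[i];
    return linked;
}
```

Calling RunSplitTiming(1000000) under both compilers would show whether the difference sits in the allocation phase or in the list code itself.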
RobertLevy
Can you send us the command line parameters you are using to build the app in VS 2003 and VS 2005? If you are building in the IDE, you can just send us the build logs.
When you say that "new" is now 33% slower, you are ignoring the cost of the following line in new:
res = _heap_alloc(cb);
This function eventually calls into HeapAlloc which maintains the heap. Heap allocation is orders of magnitude costlier than the additional check in new.
mem* functions should not have any additional checks. If you are seeing slowdown in using these functions please send us a repro case.
Other CRT functions do have the additional parameter tests as a result of Secure CRT work.
Thanks,
Sridhar Madhugiri
Software Developer
VisualC++
---
This posting is provided "AS IS" with no warranties, and confers no rights.
jh55557777
Meher123
Sure, here are the compiler options for 2005:
Compiler: /O2 /Ob1 /Oi /Ot /Oy /GT /GL /D "WIN32" /D "_WINDOWS" /D "_VC80_UPGRADE=0x0710" /D "_MBCS" /FD /EHa /MT /GS- /arch:SSE2 /fp:fast /GR- /Fo"Release\\" /Fd"Release\vc80.pdb" /W2 /nologo /c /Wp64 /TP /wd4996
/errorReport:prompt
Linker: /OUT:"Release/Core.exe" /INCREMENTAL:NO /NOLOGO /MANIFEST
/MANIFESTFILE:"Release\Core.exe.intermediate.manifest" /SUBSYSTEM:CONSOLE
/HEAP:10485760,10485760 /STACK:10485760,10485760
/LARGEADDRESSAWARE:NO /TSAWARE:NO /OPT:REF /OPT:ICF
/OPT:NOWIN98 /LTCG /MACHINE:X86 /FIXED:No /ERRORREPORT:PROMPT Ws2_32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
For 2003:
Compiler: /O2 /Ot /GT /G7 /GA /D "WIN32" /D "_WINDOWS"
/D "_MBCS" /FD /EHsc /MT /GS /arch:SSE2 /Fo"Release/"
/Fd"Release/vc70.pdb" /W2 /nologo /c /Wp64 /TP
Linker: /OUT:"Release/Core.exe" /INCREMENTAL:NO /NOLOGO
/SUBSYSTEM:CONSOLE /HEAP:10485760,10485760 /STACK:10485760,10485760
/OPT:REF /OPT:ICF /MACHINE:X86 /FIXED:No Ws2_32.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
Thanks a lot!
Antoine
Richard Bunce
Hello,
I think Gorm is right about the new... Looks like someone messed it up in 2005, the code in the CRT looks like:
VISUAL STUDIO 2005:
void * operator new( size_t cb )
{
    void *res;
    for (;;) {
        // allocate memory block
        res = _heap_alloc(cb);
        // if successful allocation, return pointer to memory
        if (res)
            break;
        // call installed new handler
        if (!_callnewh(cb))
            break;
        // new handler was successful -- try to allocate again
    }
    RTCCALLBACK(_RTC_Allocate_hook, (res, cb, 0));
    return res;
}
VISUAL STUDIO 2003:
void * operator new( size_t cb )
{
    void *res = _nh_malloc( cb, 1 );
    RTCCALLBACK(_RTC_Allocate_hook, (res, cb, 0));
    return res;
}
Can anyone explain why there is a loop in the new operator? It looks to me like a humongous waste of processing time!
Thanks a lot!
Antoine
June Low
Thanks,
Ayman Shoukry
VC++ Team
SpurryMoses
I am actually interested in investigating such slow performance. Could you please post a sample exhibiting the problem? I will be more than happy to look at exactly what is happening.
Thanks,
Ayman Shoukry
VC++ Team
Erik Campo
Why does the existence of a loop mean "humongous waste of processing"? It doesn't, at all.
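To make that concrete: when the allocation succeeds, the loop body runs once and exits at the first break. Below is a minimal sketch of the same control flow, assuming malloc as a stand-in for _heap_alloc and a plain function pointer as a stand-in for _callnewh. Both substitutions, the name AllocateWithRetry and the iteration counter are illustrative only, not CRT code.

```cpp
#include <cstddef>
#include <cstdlib>

static int g_loopIterations = 0;   // illustration only: counts passes through the loop

// Hypothetical stand-in with the same shape as the VS2005 listing: try to
// allocate, and retry only if a handler reports that it freed some memory.
void* AllocateWithRetry(std::size_t cb, bool (*handler)(std::size_t))
{
    void* res;
    for (;;)
    {
        ++g_loopIterations;
        res = std::malloc(cb);          // stands in for _heap_alloc
        if (res)
            break;                      // common case: leave on the first pass
        if (!handler || !handler(cb))   // stands in for _callnewh
            break;                      // no handler, or it could not help
    }
    return res;
}
```

The loop only repeats after a failed allocation whose handler asks for another try, which is the out-of-memory path and not the path a benchmark normally measures.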
Paul Baudouin
> If you look at the code of malloc, it is actually worse than the code of new in 2005... there are more operations and it ends up with the same _heap_alloc in the end... basically, there is no performance gain by choosing one or the other.
Sorry, I was a bit unclear. By "malloc" I mean memory allocation on the heap (e.g. _heap_alloc). I don't believe there is any real difference between allocating memory with "new", "malloc" or even "_heap_alloc"...
What I meant to say was that you can, with some effort, allocate all the memory you need in one or a few chunks, and then hand it out a bit at a time in a method you call "operator new".
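A rough sketch of that idea, assuming fixed-size slots that are never freed one by one. ChunkPool and PooledNode are hypothetical names, and the chunk size of 4096 slots is an arbitrary choice for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical fixed-size pool: takes memory from the heap in large chunks and
// hands it out one slot at a time. Slots are not returned individually;
// everything is released when the pool is destroyed.
class ChunkPool
{
public:
    ChunkPool(std::size_t slotSize, std::size_t slotsPerChunk)
        : m_slotSize(RoundUp(slotSize)), m_slotsPerChunk(slotsPerChunk),
          m_usedInChunk(slotsPerChunk) {}

    ~ChunkPool()
    {
        for (std::size_t i = 0; i < m_chunks.size(); ++i)
            std::free(m_chunks[i]);
    }

    void* Allocate()
    {
        if (m_usedInChunk == m_slotsPerChunk)      // current chunk exhausted
        {
            void* chunk = std::malloc(m_slotSize * m_slotsPerChunk);
            if (!chunk)
                throw std::bad_alloc();
            m_chunks.push_back(chunk);
            m_usedInChunk = 0;
        }
        char* base = static_cast<char*>(m_chunks.back());
        return base + m_slotSize * m_usedInChunk++;
    }

    std::size_t ChunkCount() const { return m_chunks.size(); }

private:
    // Slots are multiples of 16 bytes so they keep the alignment malloc
    // provides (up to 16 bytes).
    static std::size_t RoundUp(std::size_t n) { return (n + 15) / 16 * 16; }

    std::size_t        m_slotSize;
    std::size_t        m_slotsPerChunk;
    std::size_t        m_usedInChunk;
    std::vector<void*> m_chunks;
};

// Hypothetical node type that routes its own `new` through the pool.
// The size argument is ignored, so this is only safe for this exact type.
struct PooledNode
{
    explicit PooledNode(int v) : value(v), next(NULL) {}
    int         value;
    PooledNode* next;

    static ChunkPool& Pool()
    {
        static ChunkPool pool(sizeof(PooledNode), 4096);
        return pool;
    }
    static void* operator new(std::size_t) { return Pool().Allocate(); }
    static void  operator delete(void*) {}   // memory is reclaimed with the pool
};
```

With this in place, a million `new PooledNode(i)` calls reach the heap a few hundred times instead of a million. The trade-off is that memory from deleted nodes is not reused until the pool goes away.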
Excuse me if I don't always make sense.
Larry D.
carb
If you look at the code of malloc, it is actually worse than the code of new in 2005... there are more operations and it ends up with the same _heap_alloc in the end... basically, there is no performance gain by choosing one or the other.
Antoine
Rivorus
Interesting.
That is a lot of "new"s.
Looks to me that malloc is taking its share of the ticks here. Is this the case for your other program, too?
My immediate guess would be that there is more code behind every malloc for some (security) reason.
Does it help to allocate more at a time?
Maybe overriding the new operator could be a possibility for your scenario? If you know how much memory you need, I guess one million malloc calls (and possibly more for your leaf objects) are unnecessarily brutal.
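If the number of nodes really is known up front, the simplest form of "more at a time" needs no custom operator new at all: put all nodes in one array and link them in place. This is a minimal sketch, assuming a plain node type stored by value; FlatNode and BuildList are hypothetical names, not part of the posted classes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical plain node; it is stored by value in one contiguous array.
struct FlatNode
{
    int       value;
    FlatNode* next;
    FlatNode* prev;
};

// Builds a doubly linked list of `count` nodes on top of a single array.
// The pointers stay valid only as long as `storage` is not resized afterwards.
FlatNode* BuildList(std::vector<FlatNode>& storage, std::size_t count)
{
    storage.assign(count, FlatNode());   // at most one heap allocation for all nodes
    for (std::size_t i = 0; i < count; ++i)
    {
        storage[i].value = static_cast<int>(i);
        storage[i].prev  = (i > 0)         ? &storage[i - 1] : NULL;
        storage[i].next  = (i + 1 < count) ? &storage[i + 1] : NULL;
    }
    return count ? &storage[0] : NULL;
}
```

This does not fit a list whose nodes have different derived types or that grows unpredictably, in which case a chunked pool is the closer match.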
Hope this helps