XmlSerializer or custom serialization?

Hi!

I'm currently not sure whether I should use XmlSerializer to save the properties of a class to an XML file or whether I should just write the file manually with XmlTextWriter.
My concern is the speed of deserialization. My program will load an undefined number of such files and create instances of the corresponding class at startup, so deserialization should take as little time as possible, even if that means writing more code.
If I have a class named Employee with the properties Name, Address, Age, and Salary, I'd deserialize it with an XmlReader as follows:


while (reader.Read()) {
    if (reader.NodeType == XmlNodeType.Element) {
        switch (reader.LocalName) {
            case "Name": m_Name = reader.ReadString(); break;
            case "Address": m_Address = reader.ReadString(); break;
            case "Age": m_Age = reader.ReadElementContentAsInt(); break;
            case "Salary": m_Salary = reader.ReadElementContentAsInt(); break;
            default: break;
        }
    }
}
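
A caveat about the loop above, offered as a hedged sketch rather than a definitive fix: ReadElementContentAsInt leaves the reader on the node after the closing tag, so the next reader.Read() can step over the following element when the file has no whitespace between elements. One way to avoid this is to let each ReadElementContentAs* call consume its element and to advance only with MoveToContent. The Employee class shape and the Load method below are assumptions made for illustration.

```csharp
using System.IO;
using System.Xml;

// Hypothetical Employee type; public fields keep the sketch short.
public class Employee
{
    public string Name;
    public string Address;
    public int Age;
    public int Salary;

    // Reads one root element and its child elements,
    // whether or not the XML is indented.
    public static Employee Load(TextReader input)
    {
        Employee e = new Employee();
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();     // position on the root element
            reader.ReadStartElement();  // step inside it

            // Each ReadElementContentAs* call consumes the whole element,
            // so the loop never calls Read() after one of them.
            while (reader.MoveToContent() == XmlNodeType.Element)
            {
                switch (reader.LocalName)
                {
                    case "Name": e.Name = reader.ReadElementContentAsString(); break;
                    case "Address": e.Address = reader.ReadElementContentAsString(); break;
                    case "Age": e.Age = reader.ReadElementContentAsInt(); break;
                    case "Salary": e.Salary = reader.ReadElementContentAsInt(); break;
                    default: reader.Skip(); break; // ignore unknown elements and their children
                }
            }
        }
        return e;
    }
}
```

Calling Employee.Load(new StringReader(xml)) returns the populated object; elements the switch does not know are skipped rather than treated as errors.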

 

This means that in the worst case I'll have four string comparisons for one element. My question is: does XmlSerializer use some other strategy for deserialization that might make it faster, or is it the same anyway?

Thanks in advance!




  • EricPaul

    Wouldn't comparing pointers be somewhat useless? Strings not known at compile time (this applies to the strings returned by reader.LocalName) are not added to the intern pool, so the pointers can't be the same.
    But apparently XmlSerializer also just compares each node name to a list of possible names, similar to my example above. Am I right?


  • Scott W.

    But "Name" and "Address" are in the pool (see the case statements), so IsInterned (which will be called once) will not return null for those values. XmlSerializer will probably use IsStartElement.
  • victoriak68

    OK, I think I get it now. Thanks for making this clear.
    To finally decide between XmlSerializer and custom serialization, I wrote a small test application that times XmlSerializer and my own method while each deserializes some data. In the end, my custom switch-case deserialization turned out to be slightly faster than XmlSerializer.
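
The test application itself is not shown in the thread. As a rough sketch of how such a comparison might be timed (an assumption, not the poster's actual code), a small Stopwatch helper with a warm-up call keeps one-time costs such as JIT compilation out of the measurement. The Timing class and Measure method are hypothetical names.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Runs the action once as a warm-up, then times 'iterations' further runs.
    public static TimeSpan Measure(Action action, int iterations)
    {
        action(); // warm-up run, not included in the measured time

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Each deserialization method would be wrapped in its own delegate and passed to Measure with the same iteration count. Whether the warm-up should be excluded depends on the question being asked: at program startup, the first call is part of the cost.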


  • JohnsonZhang

    Wow, there is an alarming amount of misinformation in this thread. The XmlReader does not use string interning. Instead, it has its own XmlNameTable object. So the fastest way to deserialize is to pre-atomize the strings you are looking for using this NameTable, for example:

    object atom1 = reader.NameTable.Add("Name");

    Then while reading you can do pointer comparison instead of string comparison:

    if ((object)reader.LocalName == atom1) { .... } else if ...

    Unfortunately you have to use if/else structure instead of switch since the atoms are not constants. 
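
Put together, the atomization approach described above might look like the following minimal sketch. It assumes a root element with Name and Age children; the AtomizedEmployeeReader class and its Load signature are hypothetical, and it relies on the reader returning element names atomized through its own NameTable.

```csharp
using System.IO;
using System.Xml;

public static class AtomizedEmployeeReader
{
    // Reads Name and Age from a single root element using reference comparison.
    public static void Load(TextReader input, out string name, out int age)
    {
        name = null;
        age = 0;

        using (XmlReader reader = XmlReader.Create(input))
        {
            // Add the names to the reader's own table once, before parsing.
            object nameAtom = reader.NameTable.Add("Name");
            object ageAtom = reader.NameTable.Add("Age");

            reader.MoveToContent();     // position on the root element
            reader.ReadStartElement();  // step inside it

            while (reader.MoveToContent() == XmlNodeType.Element)
            {
                // Typed as object so that == is a reference comparison,
                // not a character-by-character string comparison.
                object local = reader.LocalName;

                if (local == nameAtom) name = reader.ReadElementContentAsString();
                else if (local == ageAtom) age = reader.ReadElementContentAsInt();
                else reader.Skip(); // unknown element: skip it and its children
            }
        }
    }
}
```

For a handful of short element names the gain over a string switch is likely to be small, so it is worth measuring before committing to the extra code.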

    I believe the XmlSerializer in .NET 2.0 does all this, so it will be fast. There is, however, an initial hit on the first call to new XmlSerializer(typeof(X)), since it generates a serializer assembly at run time using reflection, which is slow. You can use the new sgen tool in VS 2005 to pre-build this serialization code and include it in your assembly. See the new "Generate serialization assembly" property on the C# Build settings page.
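
Independently of sgen, the start-up cost can be paid only once per type by constructing the serializer a single time and reusing it for every file. The sketch below assumes a simple Employee class with public fields; the EmployeeStore name and its methods are hypothetical.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical type; XmlSerializer requires a public type
// with a parameterless constructor.
public class Employee
{
    public string Name;
    public string Address;
    public int Age;
    public int Salary;
}

public static class EmployeeStore
{
    // Constructed once, so the expensive set-up happens a single time.
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(Employee));

    public static Employee Load(TextReader input)
    {
        return (Employee)Serializer.Deserialize(input);
    }

    public static void Save(TextWriter output, Employee employee)
    {
        Serializer.Serialize(output, employee);
    }
}
```

With this shape, loading many files at startup pays the construction cost once rather than once per file.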

  • Shailesh Saini

    When you say "undefined number", what do you think is the maximum number of files you'd be loading? Also, why are you storing all the data in separate files? Opening tons of individual files will probably take longer than deserializing each file, so it would be far better if all the data were stored in a single file.

    Josh



  • AllanP

    XmlSerializer creates objects during serialization, so it will be slower than using just XmlReader.

    The switch operator doesn't compare strings; it compares pointers, so it will look like four ifs that compare numbers (hint: String.IsInterned). XmlSerializer may use string comparison.

  • Matt_chrs

    Undefined means that the user can create any number of independent data sets (don't be fooled by that word combination; I have to use XML to save them to the hard disk, and a database is not possible). These sets are then saved in whatever location the user chooses. But because they are part of a certain collection, they have to be loaded at startup.


  • Ken Schall

    OK, but do they have to be saved to multiple files? Can't they be stored in a single file?

    I asked about the magnitude due to performance.  If you think the max is somewhere under 100 xml objects, it really doesn't matter which method you use, it should be pretty fast (less than a few seconds).

    Josh

  • CHowell

    tommazzo,
    I was reading the chapter about XML serialization in Dino Esposito's book, "Applied XML Programming for Microsoft .NET". He confirms the conclusion that you arrived at by trial and error.
    "...Each instantiation of the XmlSerializer class results in an ad hoc assembly being created and loaded. After that, the reading and writing performance you get from the XML Serializer is not different from that of other types of reading writing tools (here referring to the SOAP or Binary Serializer class implementations). The creation of the assembly takes several milliseconds -- probably several hundred milliseconds -- as compared to the one or two milliseconds that serializing a class might take. This means that using the XML Serializer taxes you for about half a second each time you instantiate the XmlSerializer class."
    The author goes on to explain the inner workings of the XmlSerializer. The condensed version is that the XmlSerializer class generates C# source code from the type information of your class, and a temporary assembly is created from the dynamically generated source. There are other features incorporated into the XmlSerializer class, and the author recommends using it for more complex object serialization. I'm leaving a lot of the explanation out; the author goes into considerable detail in describing how XML serialization works and demonstrates it through numerous code examples.

  • Steve Tyson

    I originally thought it would be easier to explain it with the employee example, but perhaps I should just try to portray the real situation.
    Each data set is a photo album, which has some properties of its own and, of course, also contains a collection of photos. That's why each data set has to be saved in its own file.
    What I'm wondering is whether XmlSerializer also just uses some kind of switch-case structure to determine which property the current node represents, or whether it has some faster method.

