XML deserialization problems with invalid hex characters

Hello,

In the document at [1], I am experiencing the problem described under the heading "Deserializing Invalid XML". I am receiving XML documents that contain escaped characters that are technically invalid in the XML spec.

I am using WSE 2.0 Service Pack 3, so I have no control over deserialization myself (or at least not without writing an input filter of some kind). I do have control over which strings are serialized (I also write the client).

Does anyone know of a solution to this problem?

Thanks.



  • wlsteevens

    Elena:

    Eucharisto (thank you) :)

    Unfortunately, I am not using .NET 2.0 (yet). I am not sure if WSE 2.0 allows for overriding the XML deserializer, without having to resort to writing an input filter of some kind.

    If anyone knows, I'd appreciate the input.


  • Moon-Sik Kim

    XML has plenty of invalid characters (for example, most control characters below 0x20; only tab, line feed and carriage return are allowed in that range). You have the following options:

    1) Inform the sender that they are generating invalid XML and that others cannot interoperate with them. They should ensure that all invalid characters are either removed or Base64 encoded before sending on the wire. For example if they are sending an image this needs to be Base64 encoded.

    2) If 1) is not possible, the invalid characters can be sent as character references, but you then need to turn off XmlTextReader.Normalization (as pointed out in the other reply in this thread from Elena). Using WSE 2.0 and ASP.NET, this is best achieved by creating an ASP.NET handler on the stream in advance of the WSE 2.0 handler. ASP.NET handlers can be specified in order of execution. In the handler, read and rewrite the stream using XmlTextReader(Stream input) and XmlWriter.WriteNode(XmlReader reader, bool defattr). You will have to determine where the invalid characters are, either by knowing which XML elements they occur within or by inspecting the value of every character. You are effectively "fixing up" the XML as the sender would, so that it can be successfully loaded by WSE 2.0 into the XmlDocument class (via an XmlReader).
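For option 1, a sender-side check could look roughly like the sketch below. It tests each code point against the Char production of the XML 1.0 specification and falls back to Base64 when a string is not safe to send as text. The class and method names (XmlCharHelper, IsSafeForXml, ToBase64) are made up for illustration, and encoding the text as UTF-8 before Base64 is an assumption that the receiver would have to share.

```csharp
using System;
using System.Text;

public static class XmlCharHelper
{
    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    public static bool IsValidXmlChar(int codePoint)
    {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
            || (codePoint >= 0x20 && codePoint <= 0xD7FF)
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    // True when every code point in the string may appear in an XML 1.0 document.
    public static bool IsSafeForXml(string value)
    {
        for (int i = 0; i < value.Length; i++)
        {
            int cp = value[i];
            // Combine a well-formed surrogate pair into a single code point.
            if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < value.Length
                && value[i + 1] >= 0xDC00 && value[i + 1] <= 0xDFFF)
            {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (value[i + 1] - 0xDC00);
                i++;
            }
            // A lone surrogate falls outside every valid range and is rejected here.
            if (!IsValidXmlChar(cp))
                return false;
        }
        return true;
    }

    // Assumption: the receiver decodes Base64 and then UTF-8 to recover the text.
    public static string ToBase64(string value)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(value));
    }
}
```

A client would call IsSafeForXml on each string before serializing it and send ToBase64(value) in a field that both sides agree is Base64 whenever the check fails.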

    There is no hook point in WSE 2.0 that allows you to get hold of the XML stream in advance. WSE 2.0 simply has an ASP.NET handler that loads the stream directly into an XmlDocument.
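The read-and-rewrite step from option 2 might be sketched as follows. This is only an illustration of the idea, not WSE or ASP.NET plumbing: it works on strings rather than the request stream, and SoapFixUp, Rewrite and Strip are hypothetical names. Because WriteNode copies nodes unchanged, the sketch walks the nodes by hand so that text can be cleaned on the way through. It drops the XML declaration, comments and processing instructions, and it does not clean attribute values.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class SoapFixUp
{
    // Drops characters that XML 1.0 does not allow. Surrogates are passed
    // through unchanged to keep the sketch short.
    static string Strip(string s)
    {
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            bool ok = c == 0x9 || c == 0xA || c == 0xD || (c >= 0x20 && c <= 0xFFFD);
            if (ok) sb.Append(c);
        }
        return sb.ToString();
    }

    public static string Rewrite(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        // Without this, a reference such as &#x1; makes the reader throw.
        reader.Normalization = false;

        StringWriter output = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(output);

        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                    writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                    writer.WriteAttributes(reader, false);
                    if (reader.IsEmptyElement)
                        writer.WriteEndElement();
                    break;
                case XmlNodeType.Text:
                case XmlNodeType.CDATA:
                    // The "fix up": remove invalid characters from text content.
                    writer.WriteString(Strip(reader.Value));
                    break;
                case XmlNodeType.Whitespace:
                case XmlNodeType.SignificantWhitespace:
                    writer.WriteWhitespace(reader.Value);
                    break;
                case XmlNodeType.EndElement:
                    writer.WriteFullEndElement();
                    break;
            }
        }
        writer.Flush();
        return output.ToString();
    }
}
```

Whether to strip the characters, as here, or to replace them with something the receiving code can recognise is a design choice that depends on what the data means to the service.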

    Thanks.  Mark
    WSE Program Manager



  • VASUSIVA

    Mark,

    Thanks for your answer.

    Number 1 would indeed be the solution to go with. However, it's *my* WSE 2.0 client that's generating the invalid XML!!!

    Shouldn't WSE 2.0 take care of that then?

    Thanks,

    SA.

  • *Sledge

    .NET Framework 2.0 lets you supply a custom XmlReader for reading SOAP messages. Add a GetReaderForMessage() override to the generated proxy class and configure the XmlTextReader there with reader.Normalization = false:

    protected override XmlReader GetReaderForMessage(SoapClientMessage message, int bufferSize) {
        // Pick the encoding from the Content-Type header. RequestResponseUtils appears to be
        // a framework-internal helper, so this line may need replacing in user code.
        Encoding enc = message.SoapVersion == SoapProtocolVersion.Soap12
            ? RequestResponseUtils.GetEncoding2(message.ContentType)
            : RequestResponseUtils.GetEncoding(message.ContentType);

        if (bufferSize < 512)
            bufferSize = 512;

        XmlTextReader reader;
        if (enc != null)
            reader = new XmlTextReader(new StreamReader(message.Stream, enc, true, bufferSize));
        else
            reader = new XmlTextReader(message.Stream);

        reader.ProhibitDtd = true;
        // false, so that character references to invalid characters are accepted
        reader.Normalization = false;
        reader.XmlResolver = null;
        return reader;
    }
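To see what the Normalization flag changes in isolation, a small stand-alone check such as the one below may help. It is a sketch that runs outside any web service proxy. NormalizationDemo and ReadText are hypothetical names, and the sample document is invented for the test.

```csharp
using System.IO;
using System.Xml;

public static class NormalizationDemo
{
    // Reads the text content of the root element with the given Normalization setting.
    public static string ReadText(string xml, bool normalization)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.Normalization = normalization;
        reader.MoveToContent();
        return reader.ReadElementString();
    }
}
```

With normalization set to false, ReadText("<a>x&#x1;y</a>", false) should hand back the text including the U+0001 character. With true, the same call is expected to fail with an XmlException, because the reader then range-checks numeric character references.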

