Text in XSLTransform

I am transforming one XML file into another using an XSLT file. I create an XslTransform and load the stylesheet before calling the Transform function. It seems to work fine, but there are a few problems with text: when the XML file that I am trying to transform includes special text such as the "£" sign, the transformation fails.
I assume that I need to set some parameter to support wchar_t or Unicode somewhere, but I can't seem to find the information needed. Can someone please help, or point me to where I can find more information on the matter?

Thanks in advance!



  • Ricas

    I am using Visual C++ 2003 and have installed the .NET 1.1 framework.

    How do I ensure that I am using UTF-16 when I create the tempxml.xml file using the ISAXXMLReader?


  • Doc_Brown AKA Neil

    Please read the following article about how to encode XML data correctly.

    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnxml/html/xmlencodings.asp


  • Heinz_Richards

    That's the usual trouble when producing XML by hand. It's recommended to always use an XML API to produce XML to avoid such problems. Most likely that's an encoding issue. Try opening C:\\temp\\tempXML.xml in Firefox; it has better XML well-formedness checking than IE. Alternatively, put it somewhere on the Web so we can tell you exactly what's wrong with it.



  • Nate F

    Can you open the source XML file in IE without any errors?

  • Skylark

    RE: so what is the best way?

    Why can't you leverage the System.Xml classes to read from and write to your source XML?

    I typically use the XmlDataDocument for all of my XML read/write scenarios.



  • Emanuele Greco

    Most likely the "special" XML you are talking about is malformed. Make sure that the encoding declaration in that document is right.



  • KlaasR

    Your document is marked as UTF-16 encoded. Make sure that any text within it is UTF-16 encoded too.
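
One quick way to see what a file actually contains is to look at its first bytes for a byte order mark (BOM). The following is a minimal sketch, not part of the original thread; `sniffBom` is a hypothetical helper name, and it only recognises the UTF-8 and UTF-16 marks (a UTF-32LE file would be misreported as UTF-16LE).

```cpp
#include <string>

// Hypothetical helper: report which byte order mark, if any, starts the data.
// Only UTF-8 and UTF-16 marks are recognised in this sketch.
std::string sniffBom(const std::string& bytes)
{
    const unsigned char b0 = bytes.size() > 0 ? static_cast<unsigned char>(bytes[0]) : 0;
    const unsigned char b1 = bytes.size() > 1 ? static_cast<unsigned char>(bytes[1]) : 0;
    const unsigned char b2 = bytes.size() > 2 ? static_cast<unsigned char>(bytes[2]) : 0;

    if (bytes.size() >= 3 && b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) return "UTF-8";
    if (bytes.size() >= 2 && b0 == 0xFF && b1 == 0xFE) return "UTF-16LE";
    if (bytes.size() >= 2 && b0 == 0xFE && b1 == 0xFF) return "UTF-16BE";
    return "no BOM";
}
```

If the result is "no BOM" and the XML declaration names no encoding, an XML parser will read the file as UTF-8, so a lone 0xA3 byte (the "£" in a Western European code page) is not valid input.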

  • Ricsi

    I have done a bit more investigation and minimised all my files. This is what I have found out so far -

    When just using the XslTransform, it is fine if the tempxml.xml file has &#163; instead of £, but replacing it with £ (editing in Notepad) causes the same error.

    When including the ISAXXMLReader, the &#163; is automatically converted to the symbol £, which causes the XslTransform to fail. I guess I can get round this by explicitly outputting the value &#163; in place of the £, but it is not ideal, as I would need to find each symbol that causes a crash and convert it.

    The following is the minimised source -

    myxslt.xml:

    <?xml version="1.0"?>
    <xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method='xml' version='1.0' encoding='UTF-16' indent='yes'/>

    <xsl:template match="/">

    <xsl:for-each select="descendant::*">
     <xsl:if test="self::mytext">
      <mytext>
       <text><xsl:value-of select="."/></text>
      </mytext>
     </xsl:if>
    </xsl:for-each>

    </xsl:template>
    </xsl:stylesheet>

     

    tempXML.xml:

    <?xml version="1.0"?>


    <mytext>text test here</mytext>

     

    newXML.xml:

    <?xml version="1.0" encoding="utf-16"?>
    <mytext>
      <text>text test here</text>
    </mytext>

     

    Putting a £ anywhere between the mytext tags causes the error, but it works fine with &#163;. Any ideas what I am doing wrong, or what I can do to avoid having to convert each symbol?

    Thanks for the help so far!
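
The "convert each symbol" workaround does not need a list of problem symbols if every non-ASCII character is written as a numeric character reference. The following is a rough sketch under the assumption that the char data is ISO-8859-1, where each byte value is also the Unicode code point (Windows-1252 differs for bytes 0x80 to 0x9F). `escapeLatin1ForXml` is a hypothetical name, and `std::to_string` needs a C++11 compiler, so it would have to be replaced with `sprintf` on Visual C++ 2003.

```cpp
#include <string>

// Hypothetical helper: escape raw ISO-8859-1 text for use as XML character data.
// Assumes one byte per character, so the byte value is the code point.
// Pass raw text only; text that already contains entities would be double-escaped.
std::string escapeLatin1ForXml(const std::string& latin1)
{
    std::string out;
    for (std::string::size_type i = 0; i < latin1.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(latin1[i]);
        switch (c) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;";  break;
        case '>': out += "&gt;";  break;
        default:
            if (c < 0x80) {
                out += static_cast<char>(c);   // plain ASCII passes through
            } else {
                // Non-ASCII byte: emit a decimal character reference such as &#163;
                out += "&#" + std::to_string(static_cast<unsigned>(c)) + ";";
            }
        }
    }
    return out;
}
```

The output is pure ASCII, so it reads the same whether the parser treats the file as UTF-8 or as a Western European code page.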


  • Andrei Romanenko - MSFT

    If I remove the £ then yes. But if it is inserted, then IE says there is an error where I have placed the £.
  • cope22

    I tried something similar and did NOT receive such an error. However, my test used the raw £ symbol and it did not get transformed to its entity equivalent. Are you using XSLT 1.0? It may be that your source XML contains a transformed £ symbol that XslTransform is choking on.

  • Myron_B

    OK - after resaving tempXML as Unicode in Notepad and retrying, it seems to work.

    So the problem is the SAXXMLReader - currently I am creating the tempXML file as a "FILE" and writing to it using fprintf with char data. I read somewhere that using wchar_t does not guarantee UTF-16 encoding, so what is the best way?
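
Saving as "Unicode" in Notepad produces UTF-16 little-endian with a byte order mark, and the same bytes can be produced directly. The following is a minimal sketch rather than a definitive answer, again assuming the char data is ISO-8859-1 so that each byte maps to one UTF-16 code unit. The function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: convert ISO-8859-1 text to UTF-16LE bytes, BOM first.
// Every Latin-1 character fits in one 16-bit unit: low byte = value, high byte = 0.
std::string latin1ToUtf16LeWithBom(const std::string& latin1)
{
    std::string out("\xFF\xFE", 2);            // UTF-16LE byte order mark
    for (std::string::size_type i = 0; i < latin1.size(); ++i) {
        out += latin1[i];                      // low byte
        out += '\0';                           // high byte
    }
    return out;
}

// Hypothetical helper: write the bytes unchanged. Binary mode ("wb") matters on
// Windows, where text mode would turn each 0x0A byte into 0x0D 0x0A.
bool writeBinaryFile(const char* path, const std::string& bytes)
{
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    const std::size_t written = std::fwrite(bytes.data(), 1, bytes.size(), f);
    const bool closed = (std::fclose(f) == 0);
    return written == bytes.size() && closed;
}
```

The interim document would be built as one string and then written with `writeBinaryFile(path, latin1ToUtf16LeWithBom(xmlText))`. The XML declaration should then either say encoding="UTF-16" or name no encoding at all.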


  • MickR

    Thanks - I have been investigating this further but have had no luck in solving it.

    I am currently using an XHTML file as my input and using ISAXXMLReader to edit it and create a new XML file, then using XslTransform to convert this XML file to the format that I want.

    The ISAXXMLReader basically reads the XHTML file and outputs the relevant XML using char and fprintf to create the interim XML doc. I assume this is the correct format, but when the "£" sign is in the doc to be transformed I get the following error -
    An unhandled exception of type 'System.Xml.XmlException' occurred in system.xml.dll

    Additional information: System error.

    This is the code I am using to transform -

    // Create a resolver with default credentials.
    XmlUrlResolver* resolver = new XmlUrlResolver();
    resolver->Credentials = System::Net::CredentialCache::DefaultCredentials;

    // Create the XslTransform object.
    XslTransform* xslt = new XslTransform();

    // Load the stylesheet.
    xslt->Load(S"myxslt.xsl", resolver);

    // Transform the file.
    xslt->Transform("C:\\temp\\tempXML.xml", "C:\\temp\\newXML.xml", resolver);

    Can anyone see anything obvious that I am doing wrong?
    The XSLT outputs UTF-16 and seems to work fine when the special characters are not used.

    Thanks!


  • JoeAaaa

    RE: When including the ISAXXMLReader, the &#163; will be automatically converted to the symbol £ which will cause the XSLTransform to fail. I guess I can get round this to explicitly output the value &#163; in place of the £ but it is not ideal as I would need to find each symbol that causes a crash and convert.

    That's strange, since my test used a raw £ symbol in the source XML file and XslTransform did not choke on it. I used .NET 2.0 for this test. What version of .NET are you using?



  • Mikeeee

    Thanks for the link. I have had a read, but can I use this to set the encoding? I am using SAX rather than the DOM.