Thursday, March 31, 2011

ASP.NET: Will Saving an XmlDocument to the Response.OutputStream honor the encoding?

i want to send the xml of an XmlDocument object to the HTTP client, but i'm concerned that the suggested solution might not honor the encoding that the Response has been set to use:

public void ProcessRequest(HttpContext context)
{
   XmlDocument doc = GetXmlToShow(context);

   context.Response.ContentType = "text/xml";
   context.Response.ContentEncoding = System.Text.Encoding.UTF8;
   context.Response.Cache.SetCacheability(HttpCacheability.NoCache);
   context.Response.Cache.SetAllowResponseInBrowserHistory(true);

   doc.Save(context.Response.OutputStream);

}

What if i changed the encoding to something else, Unicode for instance:

public void ProcessRequest(HttpContext context)
{
   XmlDocument doc = GetXmlToShow(context);

   context.Response.ContentType = "text/xml";
   context.Response.ContentEncoding = System.Text.Encoding.Unicode;
   context.Response.Cache.SetCacheability(HttpCacheability.NoCache);
   context.Response.Cache.SetAllowResponseInBrowserHistory(true);

   doc.Save(context.Response.OutputStream);
}

Will the Response.OutputStream translate the binary data that's being written to it on the fly, and make it Unicode?

Or is the Response.ContentEncoding just informative?

If the ContentEncoding is just informative, what content encoding will the following text strings come back in?

context.Response.ContentEncoding = System.Text.Encoding.Unicode;
context.Response.Write("Hello World");

context.Response.ContentEncoding = System.Text.Encoding.UTF8;
context.Response.Write("Hello World");

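// System.Text.Encoding has no UTF16 property (Encoding.Unicode above is UTF-16), so UTF32 is used here instead.
context.Response.ContentEncoding = System.Text.Encoding.UTF32;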
context.Response.Write("Hello World");

context.Response.ContentEncoding = System.Text.Encoding.ASCII;
context.Response.Write("Hello World");

context.Response.ContentEncoding = System.Text.Encoding.BigEndianUnicode;
context.Response.Write("Hello World");
From stackoverflow
  • First 2 links from google

    How to: Select an Encoding for ASP.NET Web Page Globalization: http://msdn.microsoft.com/en-us/library/hy4kkhe0.aspx

    globalization Element (ASP.NET Settings Schema): http://msdn.microsoft.com/en-us/library/hy4kkhe0.aspx

  • i found it.

    The answer is no: The XmlDocument will not honor the ContentEncoding of the response stream it's writing to.

    The encoding that the XmlDocument will use when saving to a stream depends on the encoding specified in the xml declaration node. e.g.:

    <?xml version="1.0" encoding="UTF-8"?>
    

    If "UTF-8" encoding is specified in the xml declaration, then Save(stream) will use UTF-8 encoding.

    If no encoding is specified, e.g.:

    <?xml version="1.0"?>
    

    or the xml declaration node is omitted entirely, then the XmlDocument will default to UTF-8 unicode encoding. (Reference)

    If an encoding attribute is not included, UTF-8 encoding is assumed when the document is written or saved out.
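The behaviour described above can be checked without a web server. The following is a minimal sketch, assuming a plain MemoryStream as a stand-in for Response.OutputStream; the class and method names (SaveToStreamDemo, SaveToBytes) are hypothetical and exist only for illustration.

```csharp
using System.IO;
using System.Xml;

public static class SaveToStreamDemo
{
    // Hypothetical helper: loads the XML text into an XmlDocument and
    // returns the raw bytes that Save(Stream) writes.
    public static byte[] SaveToBytes(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        using (MemoryStream stream = new MemoryStream())
        {
            // The stream carries no encoding information of its own, so the
            // document picks the encoding from its xml declaration (if any).
            doc.Save(stream);
            return stream.ToArray();
        }
    }
}
```

Calling SaveToBytes with a declaration of encoding="utf-16" should produce bytes that start with the UTF-16 byte order mark (0xFF 0xFE), while the same document without a declaration should come back as UTF-8 text.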

    Some common encoding strings, that you could also use in the xml declaration, are:

    • UTF-8
    • UTF-16
    • ISO-10646-UCS-2
    • ISO-10646-UCS-4
    • ISO-8859-1
    • ISO-8859-2
    • ISO-8859-3
    • ISO-8859-4
    • ISO-8859-5
    • ISO-8859-6
    • ISO-8859-7
    • ISO-8859-8
    • ISO-8859-9
    • ISO-2022-JP
    • Shift_JIS
    • EUC-JP

    Note: The encoding attribute is not case sensitive:

    Unlike most XML attributes, encoding attribute values are not case-sensitive. This is because encoding character names follow ISO and Internet Assigned Numbers Authority (IANA) standards.

    If you loaded your XML from a string or a file, and it did not contain an xml declaration node, you can manually add one to the XmlDocument using:

    // Create an XML declaration. 
    XmlDeclaration xmldecl;
    xmldecl = doc.CreateXmlDeclaration("1.0", null, null);
    xmldecl.Encoding="UTF-8";
    
    // Add the new node to the document.
    XmlElement root = doc.DocumentElement;
    doc.InsertBefore(xmldecl, root);
    

    If the XmlDocument does not have an xml declaration, or if the xml declaration does not have an encoding attribute, the saved document will not have one either.

    Note: If the XmlDocument is saving to a TextWriter, then the encoding that will be used is taken from the TextWriter object. Additionally, the xml declaration node encoding attribute (if present) will be replaced with the encoding of the TextWriter as the contents are written to the TextWriter. (Reference)

    The encoding on the TextWriter determines the encoding that is written out (The encoding of the XmlDeclaration node is replaced by the encoding of the TextWriter). If there was no encoding specified on the TextWriter, the XmlDocument is saved without an encoding attribute.
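A quick way to see the TextWriter rule in action is to save into a StringWriter, which is a TextWriter whose Encoding is UTF-16. This is only a sketch under that assumption; the names TextWriterDemo and SaveToText are hypothetical.

```csharp
using System.IO;
using System.Xml;

public static class TextWriterDemo
{
    // Hypothetical helper: saves the document through a StringWriter and
    // returns the text that Save(TextWriter) produced.
    public static string SaveToText(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        using (StringWriter writer = new StringWriter())
        {
            // StringWriter reports a UTF-16 Encoding, so the declaration
            // written out should name utf-16 rather than the original value.
            doc.Save(writer);
            return writer.ToString();
        }
    }
}
```

Passing in a document declared as encoding="UTF-8" should give back text whose declaration reads encoding="utf-16", which is the replacement described in the note above.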

    If saving to a string, the encoding used is determined by the xml declaration node's encoding attribute, if present.


    In my specific example, i am writing back to an Http client through ASP.NET. i want to set the Response.ContentEncoding to an appropriate value, and i need it to match what the XML itself will contain.

    The appropriate way to do this is to save the xml to the Response.Output, rather than the Response.OutputStream. The Response.Output is a TextWriter, whose Encoding value follows what you set for the Response.ContentEncoding.

    In other words:

    context.Response.ContentEncoding = System.Text.Encoding.ASCII;
    doc.Save(context.Response.Output);
    

    results in xml:

    <?xml version="1.0" encoding="us-ascii" ?> 
    <foo>Hello, world!</foo>
    

    while:

    context.Response.ContentEncoding = System.Text.Encoding.UTF8;
    doc.Save(context.Response.Output);
    

    results in xml:

    <?xml version="1.0" encoding="utf-8" ?> 
    <foo>Hello, world!</foo>
    

    and

    context.Response.ContentEncoding = System.Text.Encoding.Unicode;
    doc.Save(context.Response.Output);
    

    results in xml:

    <?xml version="1.0" encoding="utf-16" ?> 
    <foo>Hello, world!</foo>
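The three results above can be reproduced outside ASP.NET. The sketch below assumes that a StreamWriter created with a given Encoding behaves like Response.Output for this purpose; that substitution, and the names WriterEncodingDemo and SaveThroughWriter, are assumptions made for illustration only.

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class WriterEncodingDemo
{
    // Hypothetical helper: the StreamWriter stands in for Response.Output,
    // and the encoding argument stands in for Response.ContentEncoding.
    public static string SaveThroughWriter(string xml, Encoding encoding)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        using (MemoryStream stream = new MemoryStream())
        {
            using (StreamWriter writer = new StreamWriter(stream, encoding))
            {
                // Save(TextWriter) takes the encoding from the writer.
                doc.Save(writer);
            } // Disposing the writer flushes it into the stream.

            // Decode with the same encoding to inspect what was written.
            return encoding.GetString(stream.ToArray());
        }
    }
}
```

With Encoding.ASCII the returned text should contain encoding="us-ascii", and with Encoding.UTF8 or Encoding.Unicode it should contain encoding="utf-8" or encoding="utf-16" respectively, so the declaration and the bytes on the wire stay in agreement.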
    
