MHTML Archives

Subject: Issue of assumed charset
From: Jacob Palme <[log in to unmask]>
Reply-To: IETF working group on HTML in e-mail <[log in to unmask]>
Date: Fri, 3 May 2002 19:42:05 +0200

Two students at my university are writing a Master's thesis
in which they are developing MHTML-compliant software. They
have asked about the following text in the MHTML standard
(RFC 2557):

   "The charset parameter value "US-ASCII" SHOULD be used if the URI
   contains no octets outside of the 7-bit range. If such octets are
   present, the correct charset parameter value (derived e.g. from
   information about the HTML document the URI was found in) SHOULD be
   used. If this cannot be safely established, the value "UNKNOWN-8BIT"
   [RFC 1428] MUST be used."

My understanding of this is that the above clause is valid
if you have received a document via some transport
mechanism which does not tell you the charset.

However, if you have received a document via ordinary HTTP
download, and there is no charset indication in the HTTP
header, then the default charset is "ISO-8859-1" and not
"US-ASCII" or "UNKNOWN-8BIT". So the rule quoted above does
not apply to documents downloaded via HTTP before being
e-mailed.

Of course, you might check if all the characters are 7-bit,
and then e-mail it as "US-ASCII" instead of "ISO-8859-1",
but this should not be required.
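
To make the question concrete, here is a minimal sketch in Python of
the decision as I read it. The function name pick_charset and its
parameters known_charset and via_http are made up for illustration;
they come from neither RFC 2557 nor any library. The via_http branch
encodes my proposed interpretation, not something the RFC states.

```python
from typing import Optional


def pick_charset(octets: bytes,
                 known_charset: Optional[str] = None,
                 via_http: bool = False) -> str:
    """Pick a charset label following the quoted RFC 2557 clause.

    known_charset and via_http are hypothetical inputs: the first is a
    charset that was safely established from context, the second says
    the data arrived over HTTP without a charset indication.
    """
    # Only 7-bit octets: the clause says US-ASCII SHOULD be used.
    if all(b < 0x80 for b in octets):
        return "US-ASCII"
    # 8-bit octets present and the charset is safely established.
    if known_charset:
        return known_charset
    # Assumption (the interpretation asked about in this message):
    # an HTTP download without a charset defaults to ISO-8859-1.
    if via_http:
        return "ISO-8859-1"
    # Otherwise the clause says UNKNOWN-8BIT MUST be used.
    return "UNKNOWN-8BIT"
```

The 7-bit check comes first here only for simplicity. Under the
interpretation above, a sender could skip it for HTTP downloads and
label them "ISO-8859-1" directly.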

Is this right? If not, say so!

Jacob Palme <[log in to unmask]> (Stockholm University and KTH)
for more info see URL: