On Mon, 1 Jul 1996, Martin J Duerst wrote:
> Have any more specific things in this area been discussed or decided
> upon? Can a draft for the rewording of the affected section(s) be
> published in this group before going to last call, to assure that we
> really have a text that is sound under all aspects?
Here is the latest wording of this in my draft.
I will send out a new version of the full standard in about a week
from now. If I get your correction before July 9, I can incorporate
it in that text.
--- cut here ---
11. Encoding Considerations for HTML bodies
11.1 Character set issues
A mail user agent that wishes to send a content-type of HTML can simply do
so, as long as the normal data encoding issues are taken care of as
specified in RFC 1521 [MIME1]. However, at a basic level there are some
differences between HTML being transferred by HTTP and HTML being
transferred through Internet email. When transferred through HTTP, HTML by
default uses the document character set ISO-8859-1 [HTML2]. Within
electronic mail, the default character set is US-ASCII [MIME1].
The sending of HTML messages via MIME e-mail can be seen as two layers,
HTML markup and MIME encoding, between the displayed text and the transport:
Displayed text                    Displayed text
(e.g. with an                     (e.g. with an HTML viewer
HTML editor)                      or Web browser)
HTML markup                       HTML markup
MIME encoding ---- transport ---- MIME encoding
If the displayed text contains non-ASCII characters, these characters
might have to be rewritten if the transport (as is common in e-mail) is
set to handle only 7-bit characters.
This rewriting can be done either at the HTML layer (using "&" entity
references or numeric character references as defined in [HTML2] section
3.2.1) or at the MIME layer (using Content-Transfer-Encoding as defined in
[MIME1] section 5).
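As an illustration of the two rewriting methods, here is a minimal Python sketch. It is not part of the draft: the function names are hypothetical, the sample text is invented, and it assumes the markup only uses characters from ISO-8859-1, so that numeric character references coincide with the HTML 2.0 document character set.

```python
import quopri


def to_numeric_references(markup: str) -> str:
    """HTML layer: replace each non-ASCII character with a numeric
    character reference such as &#229; (hypothetical helper)."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in markup)


def to_quoted_printable(markup: str, charset: str = "iso-8859-1") -> bytes:
    """MIME layer: serialise the markup in the given charset, then apply
    the quoted-printable Content-Transfer-Encoding (hypothetical helper)."""
    return quopri.encodestring(markup.encode(charset))


# Sample markup with two non-ASCII letters (U+00E4 and U+00E5).
markup = "<P>H\u00e4lsningar fr\u00e5n Stockholm</P>"

html_layer = to_numeric_references(markup)  # pure ASCII markup
mime_layer = to_quoted_printable(markup)    # pure 7-bit bytes on the wire
```

Either result survives a 7-bit transport. The difference is where the receiver has to undo the rewriting: in the HTML viewer for the first, in the MIME decoder for the second.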
In sending a message containing non-ASCII characters, both of these rewriting
methods for non-ASCII characters MAY be used, and any mixture of them MAY
occur when sending the document via e-mail. Receiving mailers MUST be
capable of both decoding at the MIME layer and mapping at the HTML layer.
MIME decoding MUST take place before mapping at the HTML layer.
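The decoding order can be sketched in Python as follows. This is an assumption-laden illustration, not text from the draft: `decode_received_body` is a hypothetical name, the body is assumed to be quoted-printable in ISO-8859-1, and `html.unescape` stands in for the reference mapping that a real HTML viewer performs while parsing.

```python
import html
import quopri


def decode_received_body(body: bytes, charset: str = "iso-8859-1") -> str:
    """Undo both layers for a text fragment (hypothetical helper)."""
    # Step 1, MIME layer: undo the Content-Transfer-Encoding.
    markup = quopri.decodestring(body).decode(charset)
    # Step 2, HTML layer: map character references to characters.
    return html.unescape(markup)


# A mixed message: one letter is rewritten at the MIME layer (=E4), the
# other as a character reference whose "&" was itself encoded as =26.
body = b"H=E4lsningar fr=26#229;n"

correct = decode_received_body(body)

# Reversing the order leaves the reference unresolved, because the "&"
# is not yet visible when the HTML-layer mapping runs.
wrong_order = quopri.decodestring(
    html.unescape(body.decode("us-ascii")).encode("us-ascii")
).decode("iso-8859-1")
```

In this sketch `correct` contains the two intended letters, while `wrong_order` still contains the literal text `&#229;`.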
The charset parameter of the Content-Type header field should be us-ascii if
and only if the HTML markup contains only US-ASCII characters (even if the
displayed text contains non-ASCII characters).
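A small Python sketch of this rule, under stated assumptions: `choose_charset` is a hypothetical helper, and the fallback value of iso-8859-1 is only an example, since the rule itself does not say which charset to label non-ASCII markup with.

```python
def choose_charset(markup: str, fallback: str = "iso-8859-1") -> str:
    """Return "us-ascii" if and only if the markup itself is pure US-ASCII.

    The fallback charset is an assumption made for this example.
    """
    if all(ord(ch) < 128 for ch in markup):
        return "us-ascii"
    return fallback


# The displayed text contains a non-ASCII letter, but the markup does not.
with_reference = choose_charset("<P>fr&#229;n</P>")

# Here the markup itself carries the non-ASCII character.
with_raw_character = choose_charset("<P>fr\u00e5n</P>")
```

The first call yields us-ascii because the non-ASCII letter exists only as a character reference; the second yields the fallback.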
11.2 Line break characters
The MIME standard [MIME1] specifies that line breaks in the MIME encoding
(see figure 1) MUST be CRLF. The HTML standard [HTML2] specifies that line
breaks in HTML markup (see figure 2) may be either bare CRs, bare LFs or
CRLFs. To allow data integrity checks through checksums, MIME encoding of
line breaks SHOULD be such that after decoding, the line break
representation of the original HTML markup is returned.
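To illustrate why this matters for checksums, here is a Python sketch. It is an assumption-based example, not part of the draft: it uses base64 as the Content-Transfer-Encoding and SHA-256 as the checksum, neither of which is prescribed by the text above.

```python
import base64
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest; the choice of algorithm is an assumption."""
    return hashlib.sha256(data).hexdigest()


# HTML markup that mixes bare LF, bare CR and CRLF line breaks.
original = b"<HTML>\n<BODY>\rHello\r\n</BODY>\n</HTML>\n"

# base64 treats the markup as opaque bytes, so the original line break
# representation comes back unchanged after decoding.
decoded = base64.decodebytes(base64.encodebytes(original))
roundtrip_ok = checksum(decoded) == checksum(original)

# Rewriting every line break to CRLF before sending changes the bytes,
# and therefore the checksum computed on the original markup.
canonicalised = (
    original.replace(b"\r\n", b"\n").replace(b"\r", b"\n").replace(b"\n", b"\r\n")
)
canonical_ok = checksum(canonicalised) == checksum(original)
```

Here `roundtrip_ok` is true and `canonical_ok` is false, which is the situation the SHOULD above is meant to avoid.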
Jacob Palme (Stockholm University and KTH)
for more info see URL: http://www.dsv.su.se/~jpalme