
MHTML Archives


Subject: Revised text for draft-ietf-mhtml-spec
From: Jacob Palme <[log in to unmask]>
Reply-To: IETF working group on HTML in e-mail <[log in to unmask]>
Date: Wed, 24 Jul 1996 12:01:10 +0200
Content-Type: TEXT/PLAIN

JP: I have now revised the text according to proposals by Martin J Duerst,
Jay Levitt, Larry Masinter and Einar Stefferud. Here is the full text of
those sections of the document which have been changed:

New text:
-------- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---

11. Encoding Considerations for HTML bodies

11.1 Encoding layers

          Displayed text                        Displayed text
               |                                     ^
               V                                     |
         +-------------+                       +----------------+
         | HTML editor |                       | HTML viewer    |
         |             |                       | or Web browser |
         +-------------+                       +----------------+
             |                                       ^
             V                                       |
         HTML markup                             HTML markup
             |                                       ^
             V                                       |
  +---------+ +---------------+           +---------+ +---------------+
  | MIME    | | MIME content- |           | MIME    | | MIME content- |
  | encap-  | | transfer-     |           | encap-  | | transfer-     |
  | sulator | | encoder       |           | sulator | | encoder       |
  +---------+ +---------------+           +---------+ +---------------+
    |              |                            ^              ^
    V              V         +-----------+      |              |
MIME heading + MIME content->| Transport |->MIME heading + MIME content
                             +-----------+

                               Figure 1

Definitions (see Figure 1):

Displayed text   A visual representation of the intended text.

HTML markup      A sequence of octets formatted according to the
                 HTML specification [HTML2].

MIME content     A sequence of octets physically forwarded via e-mail,
                 which may use MIME content-transfer-encoding as
                 specified in [MIME1].

HTML editor      Software used to produce HTML markup.

MIME content-    Software used to encode and decode non-US-ASCII
transfer-encoder characters as specified in [MIME1].

HTML viewer      Software used to display HTML documents to recipients.

11.2 Character set issues

- HTML [HTML2] as an application of SGML [SGML] allows characters to be
  denoted by character entities as well as by numeric character references
  (e.g. "latin small letter a with acute" may be represented by "&aacute;"
  or "&#225;") in the HTML markup (see Figure 1).

- HTML documents, in common with other documents of MIME content-type
  text, can use various kinds of character encodings which are indicated
  by the value of the "charset" parameter in the MIME content-type header
  (MIME heading in Figure 1). For the exact meaning and use of the
  "charset" parameter, please see [MIME1 section 7.1.1]. Note that the
  "charset" parameter refers to the charset in the HTML markup (see Figure
  1), not to the charset in the displayed text (see Figure 1). Thus, if
  the HTML markup contains only US-ASCII characters, then the value of the
  charset parameter should be US-ASCII, even if the HTML markup contains
  entities which cause the displayed text to contain non-US-ASCII
  characters.

- Any documents including HTML documents that contain octet values outside
  the 7-bit range or that contain bare CRs or bare LFs need a content-
  transfer-encoding applied before transmission over certain transport
  protocols [MIME1, chapter 5] (MIME content in Figure 1).

The above three mechanisms are well defined and documented, and therefore
not further explained here. In sending a message, all the above-mentioned
mechanisms MAY be used, and any mixture of them MAY occur when sending the
document via e-mail. Receiving mailers (together with the Web browser they
may use to display the document) MUST be capable of handling any
combinations of these mechanisms.
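
The distinction between the charset of the HTML markup and the charset of
the displayed text can be illustrated with a small sketch. It is not part of
the draft text: the function name suggest_charset, the sample strings, and
the choice of UTF-8 with quoted-printable as the fallback are assumptions
made for illustration only, and the bare CR/LF case from the third bullet is
ignored here.

```python
import html


def suggest_charset(markup: str) -> tuple[str, str]:
    """Suggest (charset, content-transfer-encoding) for HTML markup.

    Hypothetical helper: it only inspects the markup octets, not the
    displayed text, and does not look for bare CRs or bare LFs.
    """
    try:
        markup.encode("us-ascii")
    except UnicodeEncodeError:
        # The markup itself contains non-US-ASCII characters, so a wider
        # charset and an encoding suitable for 7-bit transport are assumed.
        return ("utf-8", "quoted-printable")
    # Entities and numeric character references are plain US-ASCII markup.
    return ("us-ascii", "7bit")


entity_form = "<p>&aacute;</p>"   # character entity
numeric_form = "<p>&#225;</p>"    # numeric character reference
raw_form = "<p>\u00e1</p>"        # the character itself in the markup

for markup in (entity_form, numeric_form, raw_form):
    # html.unescape approximates what a viewer would display.
    print(repr(markup), suggest_charset(markup), repr(html.unescape(markup)))
```

All three forms produce the same displayed text, but only the third one
requires a charset other than US-ASCII in the MIME heading.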

Some transport mechanisms may specify a default "charset" parameter if
none is supplied [HTTP, MIME1]. Because the default differs for different
mechanisms, when HTML is transferred through mail, the charset parameter
SHOULD be included, rather than relying on the default.
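
As an illustration only (not part of the draft text), the following sketch
uses Python's standard email package to build a Text/HTML body with an
explicit charset parameter instead of relying on any default. The subject
line, the sample markup and the choice of ISO-8859-1 with quoted-printable
are assumptions made for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "HTML in e-mail"  # hypothetical subject

# The charset is stated explicitly, so the recipient does not have to
# guess which transport-specific default applies.
msg.set_content(
    "<p>caf\u00e9</p>",
    subtype="html",
    charset="iso-8859-1",
    cte="quoted-printable",
)

print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
print(msg.get_payload())
```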

Example of non-US-ASCII characters in HTML: See section 9.3 above.

11.3 Line break characters

The MIME standard [MIME1] specifies that line breaks in the MIME content
(see Figure 1) MUST be CRLF. The HTTP standard [HTTP] specifies that line
breaks in transported HTML markup (see Figure 1) may be either bare CRs,
bare LFs or CRLFs. To allow data integrity checks through checksums, MIME
content-transfer-encoding of line breaks SHOULD, if necessary, be used so
that after decoding, the line break representation of the original HTML
markup is returned.
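
A minimal sketch of this idea, assuming base64 is the chosen
content-transfer-encoding (the draft text does not prescribe one): the
original mix of bare LFs, bare CRs and CRLFs survives the round trip, so a
checksum over the decoded markup equals the checksum over the original.
The sample markup is invented for the example.

```python
import base64
import hashlib

# HTML markup as it might come from an HTTP server, with mixed line breaks.
original = b"<html>\n<body>\rHello\r\n</body></html>\n"

# Encoding hides the line breaks from the mail transport.
encoded = base64.encodebytes(original)

# Decoding returns the original line break representation unchanged.
decoded = base64.decodebytes(encoded)

print(decoded == original)
print(hashlib.md5(decoded).hexdigest() == hashlib.md5(original).hexdigest())
```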

Note that the mail Content-MD5 is defined over a canonical form with all
line breaks converted to CRLF, while the HTTP Content-MD5 is defined to
apply to the transmitted form. This means that the Content-MD5 HTTP
header may not be correct for Text/HTML that is retrieved from an HTTP
server and then sent via mail.
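
The mismatch can be sketched as follows. The helper names content_md5 and
canonicalize and the sample body are assumptions for illustration; the
header value is computed as the base64 encoding of the MD5 digest.

```python
import base64
import hashlib
import re


def content_md5(body: bytes) -> str:
    """Return a Content-MD5 style value: base64 of the MD5 digest."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def canonicalize(body: bytes) -> bytes:
    """Convert CRLF, bare CR and bare LF line breaks to CRLF."""
    return re.sub(rb"\r\n|\r|\n", b"\r\n", body)


# Body as transmitted by a hypothetical HTTP server, using bare LFs.
http_body = b"<html>\n<p>Hello</p>\n</html>\n"

http_md5 = content_md5(http_body)                # over the transmitted form
mail_md5 = content_md5(canonicalize(http_body))  # over the canonical form

print(http_md5)
print(mail_md5)
print(http_md5 == mail_md5)
```

The two values agree only when the transmitted form already uses CRLF for
every line break.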

--- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---

JP: I am a little frightened by the fact that no one has commented on the
decision at the editing meeting in Montreal to allow non-resolvable
relative URIs in Content-Location headers. See the text below. Even if
this is OK, I would like to hear that you think it is OK, so that I can be
sure that your silence does not just mean that you have not read or
considered this new text:

New text:
-------- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---

8.2 Use of the Content-Location header

If there is a Content-Base header, then the recipient MUST employ relative
to absolute resolution as defined in RFC 1808 [RELURL] of relative URIs in
both the HTML markup and the Content-Location header before matching a
hyperlink in the HTML markup to a Content-Location header. The same
applies if the Content-Location contains an absolute URL, and the HTML
markup contains a BASE element so that relative URL-s in the HTML markup
can be resolved.

If there is NO Content-Base header, and the Content-Location header
contains a relative URL, then NO relative to absolute resolution SHOULD be
performed (even if there is a BASE element in the HTML markup), and exact
textual match of the relative URL-s in the Content-Location and the HTML
markup is performed instead (after removal of LWSP introduced as described
in section 4.4 above).
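
The two cases above can be sketched as a matching function. This is one
possible reading, not a reference implementation: the function and
parameter names are invented, urljoin from the Python standard library
follows newer resolution rules than RFC 1808 (the results agree for the
simple cases shown), and the sketch assumes that hyperlinks are resolved
against the Content-Base header whenever one is present.

```python
import re
from urllib.parse import urljoin, urlsplit


def strip_lwsp(value: str) -> str:
    """Remove linear white space introduced by header folding."""
    return re.sub(r"[ \t\r\n]+", "", value)


def is_absolute(url: str) -> bool:
    """Treat a URL with a scheme as absolute."""
    return bool(urlsplit(url).scheme)


def link_matches(href, content_location, content_base=None, html_base=None):
    """Decide whether a hyperlink refers to a body part's Content-Location."""
    location = strip_lwsp(content_location)
    if content_base is not None:
        # Content-Base present: resolve both sides before comparing.
        base = strip_lwsp(content_base)
        return urljoin(base, href) == urljoin(base, location)
    if is_absolute(location) and html_base is not None:
        # Absolute Content-Location and a BASE element in the markup:
        # resolve the hyperlink against BASE before comparing.
        return urljoin(html_base, href) == location
    # No Content-Base and a relative Content-Location (or nothing to
    # resolve against): exact textual match, no resolution.
    return href == location


print(link_matches("images/logo.gif", "images/logo.gif"))
print(link_matches("logo.gif", "images/logo.gif",
                   html_base="http://www.example.com/images/"))
print(link_matches("images/logo.gif",
                   "http://www.example.com/images/logo.gif",
                   content_base="http://www.example.com/"))
```

In the second call the BASE element is deliberately ignored, because the
Content-Location is relative and there is no Content-Base header.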

The URI in the Content-Location header need not refer to an object which
is actually available globally for retrieval using this URI (after
resolution of relative URIs). However, URI-s in Content-Location headers
(if absolute, or resolvable to absolute URIs) SHOULD still be globally
unique.

------------------------------------------------------------------------
Jacob Palme <[log in to unmask]> (Stockholm University and KTH)
for more info see URL: http://www.dsv.su.se/~jpalme
