MHTML Archives


Subject: Version 02 of draft-ietf-mhtml-spec and draft-ietf-mhtml-info
From: Jacob Palme <[log in to unmask]>
Reply-To: IETF working group on HTML in e-mail <[log in to unmask]>
Date: Mon, 29 Jul 1996 10:02:53 +0200
Content-Type: MULTIPART/MIXED
Parts/Attachments: TEXT/PLAIN (176 lines), draft-ietf-mhtml-spec-02.txt (1 lines)


Report of revision of draft-ietf-mhtml-spec-01 into
draft-ietf-mhtml-spec-02:

The revised text is available at URL
ftp://ftp.dsv.su.se/users/jpalme/draft-ietf-mhtml-spec-02.txt
and also included as an attachment to this message.

A revised version of the informational accompanying document is
available at URL:
ftp://ftp.dsv.su.se/users/jpalme/draft-ietf-mhtml-info-02.txt.

As usual, all the relevant documents can be found from the home page
for our work at URL:

http://www.dsv.su.se/~jpalme/ietf/jp-ietf-home.html.

Making the document less HTML dependent
---------------------------------------

I have implemented most of the changes proposed by Steve Zilles, whose aim
was that the standard should also be applicable to link-containing formats
other than HTML, such as PDF or VRML. In a few cases, however, I have used
slightly weaker wording than Zilles proposed.

The abstract of draft-ietf-mhtml-spec-02 contains the following phrase
which was not part of Zilles' proposal: "Only HTML objects with such links
were fully considered in developing this standard, but the standard may
still be applicable also to other link-containing object types than HTML".

The introduction contains the following passage, which was also not part of
Zilles' proposal:

This version of this standard was based on full consideration only of the
needs for objects with links in the Text/HTML media type (as defined in
RFC 1866 [HTML2]), but the standard may still be applicable also to other
formats for sets of interlinked objects, linked by URIs. There is no
conformance requirement that implementations claiming conformance to this
standard are able to handle URI-s in other document formats than HTML.

Zilles wanted to remove the following paragraph:

The Text/HTML body MAY contain links to MIME body parts outside of the
Multipart/Related or in other messages, but such usage is discouraged.
Implementors are warned that many receiving mailers may not be able to
resolve such links.

Zilles' argument for removing this paragraph was:

> I do not believe this paragraph is well defined. First there may be
> multiple body parts of Content-Type: Text/HTML (because the referred to
> body parts may be of that Content-Type). Secondly, section 8.1 only seems
> to define how reference will work within a Multipart/Related body part so
> it would seem that any other kind of reference is undefined.

I do not fully agree with Zilles on this, but I have reworded the
paragraph in question to read as follows:

This standard does not cover the case where a multipart/related contains
links to MIME body parts outside of the current multipart/related or in
other MIME messages, even if methods similar to those described in this
standard are used. Implementors who provide such links are warned that
mailers implementing this standard may not be able to resolve such links.
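
As an illustration only, the following Python sketch shows what "resolving
links within the current multipart/related" could look like in a receiving
mailer. It assumes that links are written as cid: URLs and matched against
Content-ID headers, and that the root HTML part is the first part; both are
simplifying assumptions made here, not statements taken from the draft. The
helper name unresolved_cids and the example.invalid identifiers are
hypothetical.

```python
import re
from email.message import EmailMessage

# Build a multipart/related message: an HTML root part plus one inline part.
msg = EmailMessage()
msg["Subject"] = "MHTML sketch"
msg.set_content(
    "<html><body>\n"
    '<img src="cid:logo@example.invalid">\n'
    '<img src="cid:elsewhere@example.invalid">\n'
    "</body></html>\n",
    subtype="html",
)
msg.add_related(
    b"GIF89a",  # placeholder bytes, not a real image
    maintype="image",
    subtype="gif",
    cid="<logo@example.invalid>",
)


def unresolved_cids(related):
    """Return cid: references in the root HTML part that no sibling part satisfies.

    Assumption: the root part is the first part of the multipart/related.
    """
    parts = list(related.iter_parts())
    markup = parts[0].get_content()
    available = {
        str(p["Content-ID"]).strip("<>") for p in parts[1:] if p["Content-ID"]
    }
    referenced = set(re.findall(r"cid:([^\"'>\s]+)", markup))
    return sorted(referenced - available)


# The second link points outside the multipart/related, so it stays unresolved.
print(unresolved_cids(msg))
```

A mailer built along these lines resolves the first link and has nothing to
resolve the second one against, which is the situation the reworded
paragraph warns implementors about.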

The figure
----------

I have removed the figure from section 11, as ordered by the working group
chairman, Einar Stefferud. Personally, I still believe this figure would
make the text more understandable.

The value of the charset parameter
----------------------------------

Some people claim that we need not say anything about the value of the
charset parameter, since this is already specified in MIME. I do not agree
with this. MIME is rather ambiguous. Here is a quote from MIME (RFC 1521):

   The specification for any future subtypes of "text" must specify
   whether or not they will also utilize a "charset" parameter, and may
   possibly restrict its values as well.  When used with a particular
   body, the semantics of the "charset" parameter should be identical to
   those specified here for "text/plain", i.e., the body consists
   entirely of characters in the given charset.  In particular, definers
   of future text subtypes should pay close attention the the
   implications of multibyte character sets for their subtype
   definitions.

   This RFC specifies the definition of the charset parameter for the
   purposes of MIME to be a unique mapping of a byte stream to glyphs, a
   mapping which does not require external profiling information.

To me, this text is not clear.

(a) The text says 'The specification for any future subtypes of "text"
must specify whether or not they will also utilize a "charset" parameter,
and may possibly restrict its values as well'. This suggests to me that
the interpretation of the charset parameter depends on the content type,
and thus that we must specify how this is done for our content type,
text/html.

(b) The MIME text says "the body consists entirely of characters in the
given charset". It is not clear to me whether "character" in this text
refers to "octet in the HTML markup" or "character as displayed to the
user".

Thus, we must make this clear, by choosing one of the two alternatives:

Alternative 1: The charset is the charset of the HTML markup, rather than
the charset of the displayed text. Thus, for example, the string "&auml;"
has the charset US-ASCII and not the charset ISO 8859-1.

Alternative 2: The charset is the charset of the displayed text. In that
case, "&auml;" to my mind is neither US-ASCII nor ISO 8859-1 but rather a
third charset, since ISO 8859-1 specifies that the glyph denoted by &auml;
be denoted by a single octet, not by a series of octets.
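
The difference between the two alternatives can be made concrete with a
short Python sketch. Python and its html module are used here purely for
illustration and are not mentioned in the draft; the variable names are
hypothetical.

```python
import html

markup = "&auml;"                   # the entity as it appears in the HTML markup
displayed = html.unescape(markup)   # the character shown to the reader

# Alternative 1: the charset describes the markup. The six markup
# characters are all representable in US-ASCII.
markup_octets = markup.encode("us-ascii")

# Alternative 2: the charset describes the displayed text. The displayed
# character is a single octet in ISO 8859-1 and is not in US-ASCII at all.
displayed_octets = displayed.encode("iso-8859-1")

try:
    displayed.encode("us-ascii")
    displayed_is_ascii = True
except UnicodeEncodeError:
    displayed_is_ascii = False

print(len(markup_octets), len(displayed_octets), displayed_is_ascii)
```

The same body thus consists of six US-ASCII octets when viewed as markup,
while the text it displays needs a charset that goes beyond US-ASCII.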

All of this would be much clearer with the figure which I have been forced
to remove from the text, since that figure clearly shows the difference
between "HTML markup" and "displayed text".

To "solve" the problem by writing nothing about this is a bad solution,
since some implementors may then implement Alternative 1 and others
Alternative 2, which might cause interoperability problems.

The handling of line breaks
---------------------------

This is obviously a very controversial issue. It is always tempting in
such cases to leave out all text about the controversial issue. This is
however NOT a good way of resolving controversial issues in standards
development, since this will mean that different implementors will make
different assumptions and their systems may then not be able to
interoperate.

We all agree that all line breaks in the content-transfer-encoded text
must be CRLF. The point of contention is whether the HTML text before
content-transfer-encoding may contain bare LFs or bare CRs. Arguments for
allowing this are:

(a) it is very common in HTTP and it may be difficult to get implementors
to deviate from HTTP conventions on this

(b) keeping the original HTML text intact allows integrity checks with
checksums.

Arguments against this are:

MIME allows line breaks other than CRLF, but only in binary data; for
textual data, MIME requires line breaks to be CRLF. And the text of
RFC 1521 seems to indicate that this holds both before and after
content-transfer-encoding.
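
The tension between the checksum argument and the CRLF requirement can be
sketched as follows. This is a minimal Python illustration, not part of the
draft: the helper name to_crlf is hypothetical, and SHA-256 is an arbitrary
choice of checksum made only for this example.

```python
import hashlib
import re


def to_crlf(data: bytes) -> bytes:
    """Rewrite bare CR and bare LF as CRLF, leaving existing CRLF pairs alone."""
    return re.sub(rb"\r\n|\r|\n", b"\r\n", data)


# HTML text with bare LFs, as is common when it has been fetched over HTTP.
original = b"<html>\n<body>Hello</body>\n</html>\n"
canonical = to_crlf(original)

original_digest = hashlib.sha256(original).hexdigest()
canonical_digest = hashlib.sha256(canonical).hexdigest()

# Converting line breaks changes the octets, so a checksum computed over
# the original HTML text no longer matches the converted text.
print(original_digest == canonical_digest)
```

Converting to CRLF satisfies the MIME rule for textual data, but a receiver
can then no longer verify a checksum that was computed over the original
HTML text.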

The definitions section
-----------------------

The following new items have been added to chapter 2.2 Other terminology:

Displayed text        The text shown to the user reading a document with
                      a web browser. This may be different from the HTML
                      markup, see the definition of HTML markup below.

HTML markup           A file containing HTML encodings as specified in
                      [HTML] which may be different from the displayed
                      text which a person using a web browser sees. For
                      example, the HTML markup may contain "&lt;" where
                      the displayed text contains the character "<".

PDF                   Portable Document Format, see [PDF].

VRML                  Virtual Reality Markup Language.

------------------------------------------------------------------------
Jacob Palme <[log in to unmask]> (Stockholm University and KTH)
for more info see URL: http://www.dsv.su.se/~jpalme

