MHTML Archives

Subject: Re: New IETF draft for MHTML now ready
From: Steve Zilles <[log in to unmask]>
Reply-To: IETF working group on HTML in e-mail <[log in to unmask]>
Date: Tue, 23 Jul 1996 13:18:11 -0800
Content-Type: text/plain

The following was initially prepared following the IETF meeting in
Montreal, but was delayed to avoid suggesting too many changes to the then
current draft at the same time. The content of the message applies to the
.01 draft of MHTML recently distributed. This work is submitted as a
(hopefully) friendly amendment that is not intended to change the agreed
content, but only to broaden its applicability.

As noted in Jacob Palme's Summary of decisions at the Montreal MHTML IETF
meeting:

>The most important decision taken at the Montreal meeting was on the
>method for matching of URIs in HTML documents and in Content-Location
>statements. The Montreal meeting decided that this matching will be
>done after resolution of relative into absolute URIs, which is
>different from the draft which was input to the meeting.

Part of the discussion for this decision pointed out that rewriting of the
transmitted content, solely to allow transmission of the content, SHOULD be
avoided. The rule adopted means that senders never need to convert URIs
(and thus rewrite).

This decision means that the whole protocol is essentially media-type
independent. A User Agent, such as a Web Browser (of the future), which can
directly read a Multipart/related document (and can display the media-type
of the root body part) can locally cache the linked body parts using the
URIs in the Content-location or Content-id headers (and Content-base if
relative) for those parts. (For privacy, these cached parts should only be
accessible to parts in the document in which they arrived.) With such a
cache, no conversion of any body part containing links using URIs is
necessary. The cache would be used to resolve the URIs in the links.
Therefore, any media-type understood by the User Agent could be sent as the
root body part; the root need not be limited to Text/HTML, but since it is
likely that most User Agents will understand Text/HTML, objects of this
type can clearly be sent as the root body part. Today's Web Browser User
Agents, supported by helper applications and plug-ins, are capable of
displaying and handling links within a wide range of media-types,
including, Application/pdf, VRML?, ...
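
[As an illustration only, not proposed text: the caching idea above could be
sketched roughly as follows in Python. The sample message, the example.com
URIs and the function names build_part_cache and lookup are assumptions made
for this sketch. The cache is built per document, which is one way of keeping
cached parts visible only to the document in which they arrived. Letting a
part's own Content-Base override the enclosing one is also an assumption.]

```python
from email import message_from_string
from urllib.parse import urljoin

# Hypothetical message; the example.com URIs and the boundary are made up.
SAMPLE = """\
MIME-Version: 1.0
Content-Type: multipart/related; boundary="b1"; type="text/html"
Content-Base: http://www.example.com/doc/

--b1
Content-Type: text/html

<IMG SRC="images/logo.gif">
--b1
Content-Type: image/gif
Content-Location: images/logo.gif
Content-Transfer-Encoding: base64

R0lGODlhAQABAAAAACw=
--b1--
"""


def build_part_cache(related_msg):
    """Map URIs to the body parts of one multipart/related document."""
    cache = {}
    if not related_msg.is_multipart():
        return cache
    outer_base = related_msg.get("Content-Base", "")
    for part in related_msg.get_payload():
        # Assumption: a part's own Content-Base overrides the enclosing one.
        base = part.get("Content-Base", outer_base).strip()
        location = part.get("Content-Location")
        if location is not None:
            # Resolve relative locations to absolute URIs before storing.
            cache[urljoin(base, location.strip())] = part
        content_id = part.get("Content-ID")
        if content_id is not None:
            cache["cid:" + content_id.strip().strip("<>")] = part
    return cache


def lookup(cache, base, link):
    """Resolve a link found in a body part against this document's cache."""
    return cache.get(urljoin(base, link))


doc = message_from_string(SAMPLE)
cache = build_part_cache(doc)
logo = lookup(cache, doc["Content-Base"], "images/logo.gif")
```

Nothing in the sketch looks at the media type of the root body part, which is
the point of the argument: only the links and the headers matter.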

The prior paragraph does assume that a local Web Browser or equivalent is
available to the MUA for display of the Multipart/related content. (Here
"equivalent" means a display program that (a) can display the relevant
included media-types and (b) has a mechanism for caching and resolving URIs
that can be loaded with the information in the body parts.)

The suggested replacement text to make the above change follows. The
existing paragraph (or fragment) is identified by ".01.DRAFT" and the
replacement follows and is identified by "PROPOSED REPLACEMENT". Comments
that are not replacement text are placed in square brackets, such as,
"[This is a comment]".

[The general strategy of these changes is to replace specific references
to Text/HTML or HTML with generic references everywhere this seems the
correct thing to do. Some of this was already done in sections 4 and 5 of
the .01 draft and this work was used as the basis for the changes proposed
below. For example,
"Text/HTML body part" becomes "body part containing links" and
"The BASE element of HTML" becomes "a base specification, such as the BASE
element in HTML".
]
PROPOSED CHANGES (Identified by section):

Abstract

...

.01.DRAFT
This document describes a set of guidelines that will allow
conforming mail user agents to be able to send, deliver and display
these HTML objects. In addition it is hoped that these techniques will
also apply to the wider category of URI-enabled objects. In order to do
this, the document specifies the MIME content-headers "Content-Location"
and "Content-Base".

PROPOSED REPLACEMENT
This document describes a set of guidelines that will allow conforming mail
user agents to be able to send, deliver and display these objects, such as
HTML objects, that can contain links represented by URIs. In order to do
this, the document specifies the MIME content-headers "Content-Location"
and "Content-Base".

1. Introduction

.01.DRAFT
The HTML format is a very common format for documents in the Internet,
and there is an obvious need to be able to send documents in this format
in e-mail [RFC821=SMTP, RFC822]. The "text/html" media type is defined
in RFC 1866 [HTML2]. This document gives additional specifications on
how to use the text/html media type as a Content-Type in MIME [RFC
1521=MIME1] e-mail messages. HTML documents commonly include links to
other objects and resources, either embedded or directly accessible
through hypertext links. When mailing a HTML document, it is often
desirable to also mail all of the additional resources that are
referenced in it; those elements are necessary for the complete
interpretation of the HTML.

PROPOSED REPLACEMENT
There are a number of document formats, HTML, PDF and VRML for example,
which provide links using URLs for their resolution. There is an obvious
need to be able to send documents in these formats in e-mail [RFC821=SMTP,
RFC822]. This document gives additional specifications on how to use media
types that can contain URI based hyper-links as Content-Types in MIME [RFC
1521=MIME1] e-mail messages. Such links reference other objects and
resources. When mailing a hyper-linked object, it is often desirable to
also mail all of the additional resources that are referenced in it; those
elements are necessary for the complete interpretation of the object.

[PDF and VRML were added to provide other examples of objects with URL
references. Should these be added to the Glossary if people feel they
should stay in?]

.01.DRAFT
An alternative way for sending HTML documents in e-mail is to only send
the URL, and let the recipient look up the document using HTTP. That
method is described in [URLBODY] and is not described in this document.

PROPOSED REPLACEMENT
An alternative way for sending an HTML document or any object containing
URLs in e-mail is to only send the URL for the document, and let the
recipient look up the document using HTTP. That method is described in
[URLBODY] and is not described in this document.



3. Overview

.01.DRAFT
An aggregate HTML object is a MIME-encoded message that contains a root
document as well as other data that is required in order to represent
that document (inline pictures, style sheets, applets, etc.). Aggregate
HTML objects can also include additional elements that are linked to the
first object.  It is important to keep in mind the differing needs of
several audiences. Mail sending agents might send aggregate HTML objects
as an encoding of normal day-to-day electronic mail. Mail sending agents
might also send aggregate HTML objects when a user wishes to mail a
particular document from the web to someone else. Finally mail sending
agents might send aggregate HTML documents as automatic responders,
providing access to WWW resources for non-IP connected clients.

PROPOSED REPLACEMENT
An aggregate document is a MIME-encoded message that contains a root object
as well as other data that is required in order to represent that document
(inline pictures, style sheets, applets, etc.). Aggregate documents can
also include additional elements that are linked to the first object.  It
is important to keep in mind the differing needs of several audiences. Mail
sending agents might send aggregate documents as an encoding of normal
day-to-day electronic mail. Mail sending agents might also send aggregate
documents when a user wishes to mail a particular document from the web to
someone else. Finally mail sending agents might send aggregate documents as
automatic responders, providing access to WWW resources for non-IP
connected clients.

[The usage of "document" and "object" in the above paragraph was made
consistent: document is the whole collection of objects; an object is a
part of the document.]

.01.DRAFT
Mail receiving agents also have several differing needs. Some mail
receiving agents might be able to receive an aggregate HTML document and
display it just as any other text content type would be displayed.
Others might have to pass this aggregate HTML document to an HTML
browsing program, and provisions need to be made to make this possible.

PROPOSED REPLACEMENT
Mail receiving agents also have several differing needs. Some mail
receiving agents might be able to receive an aggregate document, such as an
HTML document, and display it just as any other text content type would be
displayed. Others might have to pass this aggregate document to a document
browsing program, and provisions need to be made to make this possible.

.01.DRAFT
Finally several other constraints on the problem arise. It is important
that it be possible for an HTML document to be signed and for it to be
able to be transmitted to a client and displayed with a minimum risk of
breaking the message integrity (MIC) check that is part of the
signature.

PROPOSED REPLACEMENT
Finally several other constraints on the problem arise. It is important
that it be possible for an aggregate document to be signed and for it to be
able to be transmitted to a client and displayed with a minimum risk of
breaking the message integrity (MIC) check that is part of the signature.

[What is to be signed: the document or individual objects or either?]

.01.DRAFT
6. Sending HTML documents without linked objects

If an HTML document is sent without other objects, to which it is
linked, it MAY be sent as a Text/HTML body part by itself. In this case,
Multipart/related need not be used.

PROPOSED REPLACEMENT
6. Sending documents without the objects to which they are linked

[This section is not about "sending documents without linked objects" but
is about "sending documents without the objects to which they are linked"]

If a document, such as an HTML document, is sent without other objects, to
which it is linked, it MAY be sent as a body part by itself. In this case,
Multipart/related need not be used.

7. Use of the Content-Type: Multipart/related

.01.DRAFT
The use of URI references creates some additional issues for aggregate
HTML objects. Normal URI references can of course be used, however it is
likely that many user agents may not be able to retrieve those objects
referred to. This document provides a means for these additional objects
to be transmitted with the HTML and for the links between these objects
to be properly resolved.

PROPOSED REPLACEMENT
[This paragraph seems to be a repetition of the Introduction and an
Overview and, therefore, it would seem that it could be deleted. The
content of this section begins in the next paragraph.]

.01.DRAFT
If a message contains one or more Text/HTML body parts and also contains
as separate body parts, data, to which hyperlinks (as defined in RFC
1866 [HTML2]) in the Text/HTML body parts refers, then this set of
objects SHOULD be sent within a Multipart/Related body part as defined
in [REL].

PROPOSED REPLACEMENT
If a message contains one or more MIME body parts containing links and also
contains, as separate body parts, data to which these links (as defined,
for example, in RFC 1866 [HTML2]) refer, then this whole set of body parts
(referring body parts and referred-to body parts) SHOULD be sent within a
Multipart/Related body part as defined in [REL].

.01.DRAFT
The root of the Multipart/related SHOULD be of the Content-Type:
Text/HTML. Use of the Content-Type: Multipart/Alternative, one of whose
parts is of Content-Type: Text/HTML, is also allowed, but implementors
are warned that many mail programs treat Multipart/Alternative as if it
had been Multipart/Mixed (even though MIME [MIME1] requires support for
Multipart/Alternative).

PROPOSED REPLACEMENT
The root of the Multipart/related SHOULD be the root object of the
aggregate document. The Content-Type of the root of the Multipart/related
SHOULD be the Content-Type of the root object; for example, for an HTML root
object, use Content-Type: Text/HTML. Use of the Content-Type:
Multipart/Alternative is also allowed, but implementors are warned that
many mail programs treat Multipart/Alternative as if it had been
Multipart/Mixed (even though MIME [MIME1] requires support for
Multipart/Alternative).
[Note the omission of the requirement that one part in a
Multipart/Alternative be of Content-Type: Text/HTML. This requirement does
not make sense for an arbitrary root object, and by the definition of
Multipart/Alternative the parts are "interchangeable" and appear in
order of increasing faithfulness, so the UA is left to choose the best
alternative it can display.]
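
[As an illustration only, not proposed text: a sender following the proposed
wording might assemble the Multipart/related roughly as below, using the
Python standard library. The helper name build_aggregate, the sample HTML
and the file name are assumptions for this sketch. It relies on the root
being the first body part and copies the root's Content-Type into the "type"
parameter, so nothing in it is specific to Text/HTML.]

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_aggregate(root_part, linked_parts):
    """Wrap a root object and the objects it links to in multipart/related.

    linked_parts is a list of (part, content_location) pairs.
    """
    # The "type" parameter names the media type of the root object.
    msg = MIMEMultipart("related", type=root_part.get_content_type())
    # The root object is attached first, so it is the root body part.
    msg.attach(root_part)
    for part, location in linked_parts:
        part["Content-Location"] = location
        msg.attach(part)
    return msg


# Hypothetical content: an HTML root with one inline image.
root = MIMEText('<IMG SRC="images/logo.gif">', "html")
image = MIMEImage(b"GIF89a", _subtype="gif")
aggregate = build_aggregate(root, [(image, "images/logo.gif")])
```

Any other root object could be passed in place of the MIMEText part without
changing build_aggregate.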

...

.01.DRAFT
The Text/HTML body MAY contain links to MIME body parts outside of the
Multipart/Related or in other messages, but such usage is discouraged.
Implementors are warned that many receiving mailers may not be able to
resolve such links.
[I do not believe this paragraph is well defined. First, there may be
multiple body parts of Content-Type: Text/HTML (because the referred-to
body parts may be of that Content-Type). Second, section 8.1 only seems
to define how references will work within a Multipart/Related body part, so
it would seem that any other kind of reference is undefined.]

PROPOSED REPLACEMENT
[remove this paragraph]

8. Format of Links to Other Body Parts

8.1 General principle

.01.DRAFT
A Text/HTML body part may contain hyperlinks to objects which are
included as other body parts in the same message and within the same
multipart/related content. Often such linked objects are meant to be
displayed inline to the reader of the main document. HTML version 2.0
[RFC 1866=HTML2] has only one way of specifying hyperlinks to such
inline embedded content, the IMG tag. New tags with this property are
however proposed in the ongoing development of HTML (example: applet,
frame).

PROPOSED REPLACEMENT
A body part, such as a Text/HTML body part, may contain links to objects
which are included as other body parts in the same message and within the
same multipart/related content. Often such linked objects are meant to be
displayed inline to the reader of the main document; for example, objects
referenced with the IMG tag in HTML [RFC 1866=HTML2].

.01.DRAFT
In order to send such messages, there is a need to indicate which other
body parts are referred to by the links in the Text/HTML body parts.
This is done in the following way: For each distinct URI in the
Text/HTML document, which refers to data which is sent in the same MIME
message, there SHOULD be a separate body part within the
multipart/related part of the message containing this data. Each such
body part SHOULD contain a Content-Location header (see section 8.2) or
a Content-ID header (see section 8.3).

PROPOSED REPLACEMENT
In order to send such messages, there is a need to indicate which other
body parts are referred to by the links in the body parts containing links.
For example, a body part of Content-Type: Text/HTML typically has links to
other objects. The referencing of other body parts is done in the following
way: For each body part containing links and each distinct URI within it
which refers to data which is sent in the same MIME message, there SHOULD
be a separate body part within the multipart/related part of the message
containing this data. Each such body part SHOULD contain a Content-Location
header (see section 8.2) or a Content-ID header (see section 8.3).
[Note that this version also handles the (fully intended) case where the
Multipart/Related object has more than one body part/HTML object with links
within it.]
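
[As an illustration only, not proposed text: a sender could check which
distinct URIs in its link-containing body parts are carried by a body part
and which are not, roughly as sketched below in Python. The regular
expression is a deliberately naive stand-in for a real HTML parser, other
media types would need their own link extractor, and the name
unbundled_links and the example.com base are assumptions for this sketch.]

```python
import re
from urllib.parse import urljoin

# Naive extraction of SRC/HREF attribute values from HTML text.
LINK_RE = re.compile(r'''(?:src|href)\s*=\s*["']([^"']+)["']''', re.IGNORECASE)


def unbundled_links(link_bodies, part_locations, base=""):
    """Return the distinct URIs linked from any body but carried by no part."""
    carried = {urljoin(base, location) for location in part_locations}
    wanted = set()
    for body in link_bodies:  # more than one body part may contain links
        for link in LINK_RE.findall(body):
            wanted.add(urljoin(base, link))
    return sorted(wanted - carried)


bodies = [
    '<IMG SRC="a.gif"><A HREF="b.html">b</A>',
    '<IMG SRC="a.gif">',
]
remaining = unbundled_links(bodies, ["a.gif"], base="http://www.example.com/")
```

Links left in the result are not an error; the receiver would simply have to
resolve them by other means.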


8.2 Use of the Content-Location header

.01.DRAFT
If there is a Content-Base header, then the recipient MUST employ
relative to absolute resolution as defined in RFC 1808 [RELURL] of URIs
in both the HTML markup and the Content-Location header before matching
a hyperlink in the HTML markup to a Content-Location header. The same
applies if the Content-Location contains an absolute URL, and the HTML
markup contains a BASE element so that relative URL-s in the HTML markup
can be resolved.

PROPOSED REPLACEMENT
If there is a Content-Base header, then the recipient MUST employ relative
to absolute resolution as defined in RFC 1808 [RELURL] of URIs in both the
body part containing the URI and the Content-Location header before
matching a hyperlink in the body part containing the URI to a
Content-Location header. The same applies if the Content-Location contains
an absolute URL, and the body part containing the URI contains a base
specification, such as the BASE element in HTML, so that relative URL-s in
the body part containing the URI can be resolved.

.01.DRAFT
If there is NO Content-Base header, and the Content-Location header
contains a relative URL, then NO relative to absolute resolution SHOULD
be performed (even if there is a BASE element in the HTML markup), and
exact textual match of the relative URL-s in the Content-Location and
the HTML markup is performed instead (after removal of LWSP introduced
as described in section 4.4 above).

PROPOSED REPLACEMENT
If there is NO Content-Base header, and the Content-Location header
contains a relative URI, then NO relative to absolute resolution SHOULD be
performed (even if there is a base specification, such as the BASE element
in HTML, in the body part containing the URI), and exact textual match of
the relative URI-s in the Content-Location and the body part containing the
URI is performed instead (after removal of LWSP introduced as described in
section 4.4 above).
[I have changed the URLs to URIs; is this correct?]
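
[As an illustration only, not proposed text: the two matching rules of this
section could be sketched in Python as below. The function names and the
example.com URIs are assumptions for this sketch. Two further assumptions
are made where the text is silent: a base specification inside the body part
takes precedence over Content-Base when resolving links in that body part,
and LWSP removal is done by deleting all whitespace.]

```python
import re
from urllib.parse import urljoin, urlparse


def strip_lwsp(value):
    """Remove linear white space, for example from a folded header value."""
    return re.sub(r"[ \t\r\n]+", "", value)


def link_matches(link, content_location, content_base=None, body_base=None):
    """Decide whether a link in a body part refers to a given Content-Location."""
    link = strip_lwsp(link)
    location = strip_lwsp(content_location)
    location_is_absolute = bool(urlparse(location).scheme)

    if content_base is None and not location_is_absolute:
        # No Content-Base and a relative Content-Location: exact textual
        # match, with no resolution even if the body part has its own base.
        return link == location

    # Otherwise resolve both sides to absolute URIs and compare.
    # Assumption: a base inside the body part wins over Content-Base.
    link_base = body_base if body_base is not None else (content_base or "")
    return urljoin(link_base, link) == urljoin(content_base or "", location)


base = "http://www.example.com/doc/"
same = link_matches("images/logo.gif", "images/logo.gif", content_base=base)
```

Because the comparison is made after resolution, the sender never has to
rewrite the URIs inside the body part.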


9 Examples

9.1 Example of a HTML body without included linked objects

.01.DRAFT
The first example is the simplest form of an HTML email message. This is
not an aggregate HTML object, but simply one by itself. This message
contains a hyperlink but does not provide the ability to resolve the
hyperlink. To resolve the hyperlink the receiving client would need
either IP access to the Internet, or an electronic mail web gateway.

PROPOSED REPLACEMENT
The first example is the simplest form of an HTML email message. This is
not an aggregate document, but simply a message with a single HTML body
part. This message contains a hyperlink but does not provide the ability to
resolve the hyperlink. To resolve the hyperlink the receiving client would
need either IP access to the Internet, or an electronic mail web gateway.
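
[As an illustration only, not proposed text: such a message could be built
as below with the Python standard library. The addresses, subject, link
target and the helper name simple_html_message are assumptions for this
sketch.]

```python
from email.mime.text import MIMEText


def simple_html_message(html, sender, recipient, subject):
    """Build a message whose only body part is Text/HTML."""
    msg = MIMEText(html, "html")
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    return msg


# Hypothetical content: one hyperlink, with no linked object sent along.
message = simple_html_message(
    '<A HREF="http://www.example.com/">An example link</A>',
    "sender@example.com",
    "recipient@example.com",
    "A simple example",
)
```

No Multipart/related wrapper and no Content-Location header are needed here,
because nothing the link refers to travels with the message.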


11. Encoding Considerations for HTML bodies

[Changes to this section have been omitted pending handling of the current
set of comments on character encodings. Message to follow.]

12. Security Considerations

...
.01.DRAFT
.....................Note that some caching HTML proxy servers may not
distinguish between cached objects from e-mail and HTTP, which may be a
security risk.

PROPOSED REPLACEMENT
.....................Note that some caching WWW proxy servers may not
distinguish between cached objects from e-mail and HTTP, which may be a
security risk.

.01.DRAFT
In addition, by allowing people to mail aggregate HTML objects, we are
opening the door to other potential security problems that until now
were only problems for WWW users.

PROPOSED REPLACEMENT
In addition, by allowing people to mail aggregate documents, we are
opening the door to other potential security problems that until now
were only problems for WWW users.

14. References

[If PDF is left in the Text, then the following reference is relevant.]
Ref.            Author, title
---------       --------------------------------------------------------

[PDF]           Bienz, Tim, Cohn, Richard, and Meehan, Jim, Portable
                Document Format Reference Manual, Version 1.1, Adobe
                Systems Incorporated, March 1, 1996.

If everyone agrees with the above changes, then I would suggest a small change
to the document title:

from: MIME E-mail Encapsulation of Aggregate HTML Documents (MHTML)

to: MIME E-mail Encapsulation of Aggregate Documents, such as HTML (MHTML)

        Steve Zilles
