MHTML Archives

Subject: Re: More on wrongly(?) formatted urls
From: Jacob Palme <[log in to unmask]>
Reply-To: IETF working group on HTML in e-mail <[log in to unmask]>
Date: Sun, 24 Aug 1997 14:31:20 +0200

At 00.22 -0700 97-08-24, Einar Stefferud wrote:
> In the interests of proper Internationalization, I have some specific
> suggestions for improvement of the grammar and semantics of the new
> proposed text...

Here is the new text after Einar's changes:

     Handling of URLs containing inappropriate characters

     Some URLs may contain characters that are inappropriate for an
     RFC 822 header, either because the URL itself has an incorrect
     syntax or the URL syntax standard has been changed to allow
     characters not previously allowed in MIME headers. To include
     such a URL in a mail header, an implementation can either (a)
     arrange so that the URL becomes correctly formatted or (b)
     encode the header using the encoding method described in RFC

     Method (a) MUST be applied to the URL both in Content-
     Location headers and in body text. It MUST NOT be reversed by
     receiving mailers before matching hyperlinks to body parts.

     Method (b) MUST NOT be applied to the URL in the HTML text and
     MUST be reversed by receiving clients before comparing
     hyperlinks in body text to URLs in Content-Location headers.

     Method (a) is not always easy. It MUST include cooperation
     with the author and the software used to produce the faulty
     URL. The encoding method of RFC 1738 can cause a correct URL
     to become faulty if not changed the right way. Changing the
     URL of documents already available on the Internet or an
     Intranet can invalidate existing links to the changed
     document. Changing the HTML body can also invalidate message
     integrity checks. For these reasons, this standard only
     recommends method (b).
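
To illustrate the RFC 1738 pitfall mentioned in the paragraph above, here is a minimal Python sketch. The function names `naive_fix` and `careful_fix` and the example.com URLs are made up for illustration; the sketch only shows how re-escaping an already correct URL can make it faulty, and is not a complete URL repair routine.

```python
from urllib.parse import quote

# Reserved characters that are left untouched (illustrative choice).
RESERVED = ":/?#[]@!$&'()*+,;="

def naive_fix(url: str) -> str:
    """Percent-encode everything outside the safe set, including '%' itself."""
    return quote(url, safe=RESERVED)

def careful_fix(url: str) -> str:
    """Same, but keep '%' so escapes that are already present survive.

    Simplification: a stray literal '%' that is not part of an escape
    would also be left alone, which is wrong for such URLs.
    """
    return quote(url, safe=RESERVED + "%")

correct = "http://example.com/my%20file.html"   # already correctly escaped
faulty = "http://example.com/my file.html"      # contains a raw space

print(naive_fix(correct))    # the existing %20 is escaped a second time
print(careful_fix(correct))  # left unchanged
print(careful_fix(faulty))   # the raw space becomes %20
```

The naive variant turns `%20` into `%2520`, so a URL that was correct no longer names the same document.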

     With method (b), the charset parameter value US-ASCII can be
     used, or, if the URL contains octets outside of the 7-bit
     range, "UNKNOWN-8BIT" [RFC 1428] or "UTF-8" may be appropriate.
     Note that for MHTML processing of Content-Location headers
     (matching URLs in body text to the URLs in those headers), the
     choice of character encoding need not be the "correct" one. It need
     only be a choice which, after reversal of the encoding by the
     receiving mailer, returns the same octet string as before the
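
The round trip described in the paragraph above can be sketched as follows. The RFC number for method (b) is cut off in the text, so this sketch assumes, and it is only an assumption, that the MIME encoded-word format (RFC 2047, "B" encoding) is meant. The names `encode_word`, `decode_word` and `same_url` and the example.com URL are hypothetical, and the sketch ignores the encoded-word length limit, so a long URL would need to be split into several words.

```python
import base64
import re

# Matches a single "B"-encoded word: =?charset?B?base64-text?=
_WORD = re.compile(r"=\?([^?]+)\?[Bb]\?([^?]*)\?=")

def encode_word(octets: bytes, charset: str = "UNKNOWN-8BIT") -> str:
    """Wrap raw URL octets in one encoded-word (assumed RFC 2047 form)."""
    return "=?%s?B?%s?=" % (charset, base64.b64encode(octets).decode("ascii"))

def decode_word(value: str) -> bytes:
    """Reverse encode_word; a plain ASCII value is returned as its octets."""
    m = _WORD.fullmatch(value)
    if m is None:
        return value.encode("ascii")
    # The charset label (group 1) is deliberately ignored: only octets matter.
    return base64.b64decode(m.group(2))

def same_url(header_value: str, href_octets: bytes) -> bool:
    """Compare a Content-Location value with a body hyperlink, octet by octet."""
    return decode_word(header_value) == href_octets

# A URL with one octet outside the 7-bit range; its real charset is unknown.
href = b"http://example.com/caf\xe9.html"

for label in ("UNKNOWN-8BIT", "UTF-8", "ISO-8859-1"):
    header = encode_word(href, label)
    print(label, same_url(header, href))
```

Whatever charset label the sender picks, the receiver gets back the same octet string, which is all the matching step needs.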

The only thing I am not sure I agree with in this text is the
statement at the end of the fourth paragraph: "For these reasons,
this standard only recommends method (b)".

It seems rather funny to say that a standard recommends only
keeping errors made in using another standard, and does not
recommend correcting those errors.

Possibly, we are seeing this from different viewpoints.

If you see it from the viewpoint of a mailer which is asked to
forward existing HTML text, choice (b) is the right one.

If, however, we see it from the viewpoint of an author who
wants to create and send correctly formatted messages, choice
(a) may be the right choice. And if this author is creating the
HTML text in conjunction with sending it, the problems with
choice (a) are not so serious. Telling the author "your HTML
is not quite correct, but will still probably work. Do you
want to correct it or send it as it is?" does not seem to me
to be something which we should refuse to recommend. Refusing
would encroach too much on the freedom of the software designer!

Jacob Palme <[log in to unmask]> (Stockholm University and KTH)
for more info see URL:
