At 00.22 -0700 97-08-24, Einar Stefferud wrote:
> In the interests of proper Internationalization, I have some specific
> suggestions for improvement of the grammar and semantics of the new
> proposed text...
Here is the new text after Einar's changes:
Handling of URLs containing inappropriate characters
Some URLs may contain characters that are inappropriate for an
RFC 822 header, either because the URL itself has an incorrect
syntax or the URL syntax standard has been changed to allow
characters not previously allowed in MIME headers. To include
such a URL in a mail header, an implementation can either (a)
arrange so that the URL becomes correctly formatted or (b)
encode the header using the encoding method described in RFC
Method (a) MUST be applied to the URL both in Content-
Location headers and in body text. It MUST NOT be reversed by
receiving mailers before matching hyperlinks to body parts.
Method (b) MUST NOT be applied to the URL in the HTML text and
MUST be reversed by receiving clients before comparing
hyperlinks in body text to URLs in Content-Location headers.
Method (a) is not always easy. It MUST include cooperation
with the author and the software used to produce the faulty
URL. The encoding method of RFC 1738 can cause a correct URL
to become faulty if not changed the right way. Changing the
URL of documents already available on the Internet or an
Intranet can invalidate existing links to the changed
document. Changing the HTML body can also invalidate message
integrity checks. For these reasons, this standard only
recommends method (b).
With method (b), the charset parameter value US-ASCII can be
used, or, if the URL contains octets outside of the 7-bit
range, "UNKNOWN-8BIT" [RFC 1428] or "UTF-8" may be appropriate.
Note that for the MHTML processing of (matching URLs in body
text to URL in) Content-Location headers the choice of
character encoding need not be the "correct" choice. It need
only be a choice which, after reversal of the encoding by the
receiving mailer, returns the same octet string as before the
encoding.
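
As a rough illustration of the round-trip property described for method (b), here is a minimal Python sketch. It assumes the header encoding meant is an RFC 2047 style encoded-word with the "B" (base64) encoding; the helper names and the example URL are hypothetical, and the sketch ignores the 75-character limit on encoded-words and header folding.

```python
import base64
import re

# One RFC 2047 style encoded-word using the "B" (base64) encoding.
# Assumption: the whole header value is a single encoded-word.
_ENCODED_WORD = re.compile(r"^=\?([^?\s]+)\?[Bb]\?([A-Za-z0-9+/=]*)\?=$")


def encode_url_octets(url_octets: bytes, charset: str = "UTF-8") -> str:
    """Sender side: wrap the raw URL octets in one encoded-word."""
    payload = base64.b64encode(url_octets).decode("ascii")
    return f"=?{charset}?B?{payload}?="


def decode_url_octets(header_value: str) -> bytes:
    """Receiver side: reverse the encoding and return the raw octets."""
    value = header_value.strip()
    match = _ENCODED_WORD.match(value)
    if match is None:
        # Not an encoded-word: treat the value as plain US-ASCII.
        return value.encode("ascii")
    # The charset label (group 1) is deliberately ignored here:
    # matching is done on octets, not on decoded characters.
    return base64.b64decode(match.group(2))


def matches(header_value: str, body_url_octets: bytes) -> bool:
    """Compare a Content-Location value with a URL found in the body."""
    return decode_url_octets(header_value) == body_url_octets


# Hypothetical URL with two octets outside the 7-bit range (ISO 8859-1).
raw = "http://www.example.com/bl\u00e5b\u00e4r.gif".encode("iso-8859-1")

for label in ("UNKNOWN-8BIT", "UTF-8"):
    encoded = encode_url_octets(raw, label)
    # Labelling these octets as UTF-8 is not "correct", yet the
    # round trip still returns the original octet string.
    assert matches(encoded, raw)
```

The charset label is carried along but never used for the comparison, which is why a "wrong" label does not break the matching of hyperlinks to body parts in this sketch.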
The only thing I am not sure I agree with in this text is the
statement at the end of the fourth paragraph: "For these reasons,
this standard only recommends method (b)".
It seems rather odd for a standard to recommend only the
preservation of errors in the use of another standard, and not
the correction of those errors.
Possibly, we are seeing this from different viewpoints.
If you see it from the viewpoint of a mailer which is asked to
forward existing HTML text, choice (b) is the right one.
If, however, we see it from the viewpoint of an author who
wants to create and send correctly formatted messages, choice
(a) may be the right choice. And if this author is creating the
HTML text in conjunction with sending it, the problems with
choice (a) are not so serious. Telling the author "your HTML
is not quite correct, but will still probably work. Do you
want to correct it or send it as it is?" does not seem to me
to be something which we should recommend *against*. That would
encroach too much on the freedom of the software designer!
Jacob Palme (Stockholm University and KTH)
for more info see URL: http://www.dsv.su.se/~jpalme