MHTML Archives

Subject: Re: More on wrongly(?) formatted urls
From: Jacob Palme <[log in to unmask]>
Reply-To: IETF working group on HTML in e-mail <[log in to unmask]>
Date: Mon, 25 Aug 1997 20:37:19 +0200

At 13.21 +0100 97-08-25, [log in to unmask] wrote:
> I would, therefore, argue that RFC 2110 bis should contain the
> following shorter and much more prescriptive text:-

Your text seems sounder in some respects than our previous text.
I am not sure, however, whether it covers everything we want to cover.

Below is a discussion of only one sentence in your text:

> A text/html root object may contain absolute or relative URLs that cannot
> be employed directly in MIME Content-base or Content location headers. This
> is because their direct employment would violate RFC 822 header syntax.

RFC 822 header syntax? Do you mean some kind of general header syntax
that is valid for all RFC 822 headers? In RFC 822, each header mostly
has its own syntax definition, so we can define these headers to be
whatever we want.

The problem of transporting URLs in RFC 822 headers is not the only
reason why certain characters are not allowed in them. RFC 1738 says:

   Characters can be unsafe for a number of reasons.  The space
   character is unsafe because significant spaces may disappear and
   insignificant spaces may be introduced when URLs are transcribed or
   typeset or subjected to the treatment of word-processing programs.
   The characters "<" and ">" are unsafe because they are used as the
   delimiters around URLs in free text; the quote mark (""") is used to
   delimit URLs in some systems.  The character "#" is unsafe and should
   always be encoded because it is used in World Wide Web and in other
   systems to delimit a URL from a fragment/anchor identifier that might
   follow it.  The character "%" is unsafe because it is used for
   encodings of other characters.  Other characters are unsafe because
   gateways and other transport agents are known to sometimes modify
   such characters. These characters are "{", "}", "|", "\", "^", "~",
   "[", "]", and "`".

   All unsafe characters must always be encoded within a URL.

Do we have a different definition of "unsafe" from RFC 1738, one that
allows characters in URLs which RFC 1738 does not allow?
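
To make the RFC 1738 rule concrete, here is a minimal sketch in Python
of what "all unsafe characters must always be encoded" could look like.
The names RFC1738_UNSAFE, find_unsafe and encode_unsafe are made up for
this illustration and come from no specification. The sketch covers
only the characters named in the passage quoted above. It also encodes
"%" and "#" unconditionally, so it is meant for a raw, not yet encoded
URL without a fragment; applied to an already encoded URL it would
encode it twice.

```python
# Unsafe characters exactly as listed in the RFC 1738 passage quoted above:
# space, < > " # % { } | \ ^ ~ [ ] and the backquote.
RFC1738_UNSAFE = set(' <>"#%{}|\\^~[]`')


def find_unsafe(url: str) -> list:
    """Return the distinct unsafe characters present in url, sorted."""
    return sorted({ch for ch in url if ch in RFC1738_UNSAFE})


def encode_unsafe(url: str) -> str:
    """Percent-encode every listed unsafe character in a raw URL.

    Assumption: url has not been encoded before, because "%" itself
    is rewritten to "%25".
    """
    out = []
    for ch in url:
        if ch in RFC1738_UNSAFE:
            out.append("%{:02X}".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    raw = "http://example.com/my file{1}.html"
    print(find_unsafe(raw))    # [' ', '{', '}']
    print(encode_unsafe(raw))  # http://example.com/my%20file%7B1%7D.html
```

A URL that passes through encode_unsafe unchanged contains none of the
characters in the quoted list, which is one way to test whether a URL
taken from a text/html body meets the RFC 1738 definition before it is
copied into a header.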

Jacob Palme <[log in to unmask]> (Stockholm University and KTH)
for more info see URL:
