Thanks Larry --
To resolve the CR vs LF vs CRLF issue, we should await text from
Dave Crocker, who has taken an assignment to explain it for us and for
the ASID WG, where the same issue arose with massive confusion, even
worse than in our MHTML case.
Dave -- Can you please give us a few words on this, and a date by
which you expect to have something for us to look at?
From Larry Masinter's message Wed, 3 Jul 1996 14:03:05 PDT:
}I think the only reasonable course is to say that HTML in mail MUST be
}labelled with a charset unless the document only uses US-ASCII.
}> 11.2 Line break characters
}> The MIME standard [MIME1] specifies that line breaks in the MIME encoding
}> (see figure 1) MUST be CRLF. The HTML standard [HTML2] specifies that line
}> breaks in HTML markup (see figure 2) may be either bare CRs, bare LFs or
}> CRLFs. To allow data integrity checks through checksums, MIME encoding of
}> line breaks SHOULD be such that after decoding, the line break
}> representation of the original HTML markup is returned.
}The "canonical form" of HTML is with CRLF for end of line, independent
}of the transport. However, HTML may be represented in non-canonical
}form (bare CR, bare LF) and some transports allow transmission of data
}in non-canonical form. Some data integrity checks apply to the
}canonical form, others to the transmitted form. The Mail content-MD5
}is defined to apply to the canonical form. The HTTP content-MD5 is
}defined to apply to the transmitted form.
}This may mean that the content-MD5 HTTP header may not be correct for
}text/html that is retrieved from an HTTP server and then sent via mail.
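
To make Larry's last point concrete, here is a minimal Python sketch.
It is not taken from either specification: the names to_canonical and
content_md5 and the sample HTML are made up for illustration, and it
assumes that Content-MD5 is the base64-encoded MD5 digest of the body.
It computes the digest once over the bytes as transmitted and once
after converting every line break to CRLF:

```python
import base64
import hashlib
import re


def to_canonical(body: bytes) -> bytes:
    """Rewrite CRLF, bare CR, and bare LF line breaks as CRLF.

    CRLF is listed first in the pattern so an existing CRLF pair is
    matched as one unit and is not doubled.
    """
    return re.sub(rb"\r\n|\r|\n", b"\r\n", body)


def content_md5(body: bytes) -> str:
    """Return the base64-encoded MD5 digest of body (assumed header value)."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


# Hypothetical HTML document stored with bare LF line breaks.
transmitted = b"<html>\n<body>Hello</body>\n</html>\n"
canonical = to_canonical(transmitted)

# Digest over the transmitted form (the HTTP definition quoted above).
http_style_md5 = content_md5(transmitted)
# Digest over the canonical form (the mail definition quoted above).
mail_style_md5 = content_md5(canonical)

print("transmitted form:", http_style_md5)
print("canonical form:  ", mail_style_md5)
print("digests match:   ", http_style_md5 == mail_style_md5)
```

Whenever the stored document uses bare LF or bare CR, the two values
differ, so a digest computed for HTTP cannot simply be copied into a
mail message; it has to be recomputed over the CRLF form.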