MIME and the Mangled Mail

Peter Routtier-Wone - Praxa

MIME (Multipurpose Internet Mail Extensions) is a protocol originally designed for mail. It has since been bent to many other purposes, but the protocol remains the same.

The essence of MIME is that it describes a stream, and the stream can itself contain MIME objects. In other words, MIME objects can be nested.

A MIME object begins with a header specifying its type (e.g. text/plain or image/jpeg) and, where the object contains other objects, a section boundary marker. This is followed by a stream of data.
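The nesting can be seen by walking a small message with Python's standard email package. This is only a sketch: the message text, the boundary string "XYZ" and the helper name list_types are made up for illustration, and the base64 payload is a truncated fragment rather than a real image.

```python
from email import message_from_string

# A hand-written two-part message. The boundary string "XYZ" is arbitrary.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hello\n"
    "--XYZ\n"
    "Content-Type: image/jpeg\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "/9j/4AAQ\n"
    "--XYZ--\n"
)


def list_types(raw):
    """Return the content type of every object in the message, outermost first."""
    msg = message_from_string(raw)
    # walk() yields the container itself, then each nested object in order.
    return [part.get_content_type() for part in msg.walk()]


print(list_types(RAW))
```

The outer object is the multipart container. The two inner objects each carry their own type header and are separated by the boundary marker.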

This is where it gets interesting. Some systems use 7-bit ASCII, and the eighth bit of every byte will be lost. In fact some systems only use six bits (upper case, numerals and a few punctuation marks).

For plain text this is irrelevant, because ASCII is a seven-bit code anyway, but for a binary image (an attached file, for example) this is catastrophic.
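The difference is easy to demonstrate. The sketch below imitates a seven-bit channel by clearing the top bit of every byte. The function name strip_high_bit is hypothetical, and the four binary bytes are the opening bytes of a typical JPEG (JFIF) file.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Simulate a 7-bit channel by clearing the top bit of every byte."""
    return bytes(b & 0x7F for b in data)


text = b"Hello, world"
binary = bytes([0xFF, 0xD8, 0xFF, 0xE0])  # opening bytes of a typical JPEG (JFIF) file

print(strip_high_bit(text) == text)       # True: plain ASCII survives untouched
print(strip_high_bit(binary).hex())       # 7f587f60: every byte has changed
```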

The solution is to remap the local character word into six-bit words, each sent as one printable character. This representation is known as base64; binhex is built on the same idea. The two high bits of each transmitted character are not used. For a PC this means mapping every three bytes into four characters. This may be wasteful on an 8/16/32-bit system, but it guarantees the stream will not be corrupted when it is routed through a very old server and the high bits are truncated.
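The three-bytes-into-four-characters mapping can be written out by hand. This is a minimal sketch for exactly three input bytes using the standard base64 alphabet. The names ALPHABET and encode_triple are illustrative, and padding of shorter input is left out.

```python
import base64
import string

# The standard base64 alphabet: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_triple(three: bytes) -> str:
    """Map exactly three 8-bit bytes onto four 6-bit characters."""
    if len(three) != 3:
        raise ValueError("expected exactly three bytes")
    # Pack the 24 bits into one integer, then peel off four 6-bit groups.
    n = (three[0] << 16) | (three[1] << 8) | three[2]
    return "".join(ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0))


print(encode_triple(b"Cat"))                      # Q2F0
print(base64.b64encode(b"Cat").decode("ascii"))   # Q2F0, from the standard library
```

Every output character comes from a 64-entry table, so none of them needs the eighth bit.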

Base64 is the native encoding scheme for MIME. Unfortunately, there are more flavours of binhex than you can poke a bent stick at, and the chances of your mail-reader correctly decoding all of them are quite low.

The solution is to use UUENCODE instead. UUENCODE solves exactly the same problem as binhex, but there are only a few flavours of UUENCODE and compatibility between them is high.
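For comparison, here is what one line of UUENCODE output looks like, produced with Python's standard binascii module. This shows a single data line only; a complete uuencoded file also has "begin" and "end" framing lines, which are not shown.

```python
import binascii

# One uuencoded line. b2a_uu accepts at most 45 input bytes per call.
line = binascii.b2a_uu(b"Cat")
print(line)                       # b'#0V%T\n'

# The first character ('#') records the byte count; the rest is the data.
print(binascii.a2b_uu(line))      # b'Cat'
```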

There is probably a setting in your mail-reader to select your preferred encode/decode scheme. If that doesn't work, remember that the effect of these strategies is to move all values into what you might regard as "readable" characters.

This means that you can cut and paste the embedded stream into a file of its own. Then you can use a separate decoder program on it at your leisure. The encoding schemes are well documented - even if there are myriad variations on binhex.

You could even write your own custom decoder in GWBASIC (or whatever is handy). Given what you now know, you will be able to obtain the details of odd versions by visually examining the formatting of the stream using edit or even edlin. All you have to do is turn each character back into its six-bit value and construct three bytes from every four characters.
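As a rough illustration of how little such a decoder needs, here is a hand-written sketch in Python rather than GWBASIC. It decodes one line of UUENCODE data, assuming the common layout in which the first character gives the byte count and each character is its six-bit value plus 32. The function name decode_uu_line is hypothetical, and the "begin" and "end" framing lines are not handled.

```python
def decode_uu_line(line: str) -> bytes:
    """Decode one line of uuencoded data (no 'begin'/'end' framing)."""
    line = line.rstrip("\r\n")
    if not line:
        return b""

    def six(ch: str) -> int:
        # Each character is a six-bit value plus 32; some encoders write '`' for zero.
        return (ord(ch) - 32) & 0x3F

    count = six(line[0])                 # first character holds the decoded byte count
    body = line[1:]
    body += " " * (-len(body) % 4)       # re-pad if trailing spaces were lost in transit

    out = bytearray()
    for i in range(0, len(body), 4):
        a, b, c, d = (six(ch) for ch in body[i:i + 4])
        # Reassemble three 8-bit bytes from four 6-bit values.
        out += bytes([
            (a << 2 | b >> 4) & 0xFF,
            (b << 4 | c >> 2) & 0xFF,
            (c << 6 | d) & 0xFF,
        ])
    return bytes(out[:count])            # drop padding bytes beyond the stated count


print(decode_uu_line("#0V%T"))           # b'Cat'
```

A variant flavour usually differs only in the character table or the line framing, so adapting the six() helper is often all that is required.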



Written by: Peter Routtier-Wone
November '96
