Send email with UTF-8 encoded mail headers in Perl

Get around the inherent 7bit limitations in email

Email was designed for 7-bit ASCII: headers may contain only ASCII characters, and a message body is only guaranteed to survive transport if it is 7-bit clean. In everyday life we don't notice these limitations because our mail user agents automatically work around them. When sending mail programmatically, you have to do the workarounds yourself. There are helper libraries, but you still need a basic understanding of what has to be done.

Here is how to do it in Perl.

use Mail::Sendmail; ## simple SMTP mailer (despite the name, it does not call the sendmail command)
use MIME::Base64; ## content-transfer-encoding
use MIME::Words qw(encode_mimewords); ## handle UTF-8 in mail headers

Base64 encodes binary (8-bit) data as 7-bit ASCII text. encode_base64 also breaks its output into lines of at most 76 characters, well below the 1000-character line limit (including the terminating CRLF) of the SMTP protocol used by the mail servers.
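
A minimal sketch of what that encoding step does, using only the core modules Encode and MIME::Base64. The sample text and the variable names are made up for illustration. It checks the line length of the Base64 output and confirms that decoding restores the original text.

```perl
use strict;
use warnings;
use utf8;                                  ## string literals below are character strings
use Encode qw(encode decode);
use MIME::Base64 qw(encode_base64 decode_base64);

## Illustrative text: long enough to need several Base64 lines.
my $text  = "Hello Wörld! " x 20;
my $bytes = encode('UTF-8', $text);        ## characters -> 8-bit UTF-8 bytes

my $b64   = encode_base64($bytes);         ## 7-bit ASCII, newline after every 76 characters
my @lines = split /\n/, $b64;

## Find the longest line in the encoded output.
my $longest = 0;
for my $line (@lines) {
    $longest = length($line) if length($line) > $longest;
}

## Decoding restores the original bytes, and from them the original characters.
my $roundtrip = decode('UTF-8', decode_base64($b64));

printf "%d lines, longest is %d characters\n", scalar(@lines), $longest;
print  "round trip ok\n" if $roundtrip eq $text;
```

Passing an empty string as the second argument, encode_base64($bytes, ''), suppresses the line breaks, which is useful for short values but not for a mail body.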

Configure Mail::Sendmail to use an SMTP server. Here I use "localhost", but you can set this to any SMTP server that allows you to send mail, e.g. "smtp.gmail.com".

unshift @{$Mail::Sendmail::mailcfg{'smtp'}}, 'localhost';

Compose the letter.

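use utf8; ## the string literals below are character strings, not bytes
my $subject = "utf8, even in the hæding";
my $message = "Hello Wörld!\n\n";
utf8::encode($subject); ## characters -> UTF-8 bytes, which encode_mimewords expects
utf8::encode($message); ## same for the body, before Base64 encoding
## $sender and $email must already hold the sender and recipient addresses
my %mail = ( 'from'    => $sender,
             'to'      => $email,
             'subject' => encode_mimewords($subject, Charset => 'utf-8', Encoding => 'B'),
             'Content-type' => 'text/plain; charset="utf-8"',
             'Content-Transfer-Encoding' => 'base64',
           );
$mail{message} = encode_base64($message);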
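
To see what ends up in the Subject header, the sketch below builds an RFC 2047 encoded-word by hand, using only core modules. It is not how MIME::Words is implemented, just an illustration of the =?charset?B?...?= form that encode_mimewords produces with Encoding => 'B'. The helper name encode_header_word is hypothetical, and it does no folding, so it only suits short values (a single encoded-word may be at most 75 characters long).

```perl
use strict;
use warnings;
use utf8;                                  ## the literal below is a character string
use Encode qw(encode decode);
use MIME::Base64 qw(encode_base64 decode_base64);

## Hypothetical helper: wrap one short string in a single encoded-word.
sub encode_header_word {
    my ($chars) = @_;
    my $bytes = encode('UTF-8', $chars);   ## characters -> UTF-8 bytes
    my $b64   = encode_base64($bytes, ''); ## '' as end-of-line: no line breaks
    return "=?UTF-8?B?$b64?=";
}

my $header = encode_header_word("hæding");
print "Subject: $header\n";                ## pure ASCII, safe in a mail header

## A mail reader reverses the steps to recover the original text.
my ($payload) = $header =~ /^=\?UTF-8\?B\?(.*)\?=$/;
my $decoded   = decode('UTF-8', decode_base64($payload));
```

encode_mimewords differs in that it encodes only the words that need it and leaves the plain ASCII parts of the subject readable.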

Send the letter.

sendmail(%mail) || print "Error: $Mail::Sendmail::error\n";

This lets you send a letter containing any Unicode character. For the recipient to see the letter displayed correctly, the font used to display the text must have a glyph for each of those characters. It is not your fault if a certain glyph is missing from the recipient's font, but don't expect everyone to have every possible Unicode glyph in their fonts.

Personal note: The font I normally use to read email does not include the glyphs used in Esperanto, as in Eĥoŝanĝo ĉiuĵaŭde. I would have liked to use xfonts-efont-unicode to get better Unicode coverage, but the 'a' glyph is just too ugly, sorry. I'll stick to the misc-fixed font instead of efonts-fixed.

Last modified: October 12, 2017