45. Content scanning at ACL time (2024)

The extension of Exim to include content scanning at ACL time, formerly known as “exiscan”, was originally implemented as a patch by Tom Kistner. The code was integrated into the main source for Exim release 4.50, and Tom continues to maintain it. Most of the wording of this chapter is taken from Tom’s specification.

It is also possible to scan the content of messages at other times. The local_scan() function (see chapter 46) allows for content scanning after all the ACLs have run. A transport filter can be used to scan messages at delivery time (see the transport_filter option, described in chapter 24).

If you want to include the ACL-time content-scanning features when you compile Exim, you need to arrange for WITH_CONTENT_SCAN to be defined in your Local/Makefile. When you do that, the Exim binary is built with:

Two additional ACLs (acl_smtp_mime and acl_not_smtp_mime) that are run for all MIME parts for SMTP and non-SMTP messages, respectively.
Additional ACL conditions and modifiers: decode, malware, mime_regex, regex, and spam. These can be used in the ACL that is run at the end of message reception (the acl_smtp_data ACL).
An additional control feature (“no_mbox_unspool”) that saves spooled copies of messages, or parts of messages, for debugging purposes.
Additional expansion variables that are set in the new ACL and by the new conditions.
Two new main configuration options: av_scanner and spamd_address.

Content-scanning is continually evolving, and new features are still being added. While such features are still unstable and liable to incompatible changes, they are made available in Exim by setting options whose names begin EXPERIMENTAL_ in Local/Makefile. Such features are not documented in this manual. You can find out about them by reading the file called doc/experimental.txt.

All the content-scanning facilities work on an MBOX copy of the message that is temporarily created in a file called:

<spool_directory>/scan/<message_id>/<message_id>.eml

The .eml extension is a friendly hint to virus scanners that they can expect an MBOX-like structure inside that file. The file is created when the first content scanning facility is called. Subsequent calls to content scanning conditions open the same file again. The directory is recursively removed when the acl_smtp_data ACL has finished running, unless

control = no_mbox_unspool

has been encountered. When the MIME ACL decodes files, they are put into the same directory by default.
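As a minimal sketch of how the control might be used while debugging, the following DATA ACL fragment keeps the spooled MBOX copy only for messages from one test client. The address 192.0.2.10 is a placeholder chosen for illustration and is not taken from this manual.

```exim
# DATA ACL fragment (hypothetical test host).
# The hosts condition is tested first, so the control is applied only
# to messages arriving from that client.
warn  hosts   = 192.0.2.10
      control = no_mbox_unspool
```

Files kept this way are not removed by Exim, so they need to be cleaned up by hand afterwards.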

1. Scanning for viruses

The malware ACL condition lets you connect virus scanner software to Exim. It supports a “generic” interface to scanners called via the shell, and specialized interfaces for “daemon” type virus scanners, which are resident in memory and thus are much faster.

Since message data needs to have arrived, the condition may be called only in the ACLs defined by acl_smtp_data, acl_smtp_data_prdr, acl_smtp_mime or acl_smtp_dkim.

A timeout of 2 minutes is applied to a scanner call (by default); if it expires then a defer action is taken.

You can set the av_scanner option in the main part of the configuration to specify which scanner to use, together with any additional options that are needed. The basic syntax is as follows:

av_scanner = <scanner-type>:<option1>:<option2>:[...]

If you do not set av_scanner, it defaults to

av_scanner = sophie:/var/run/sophie

If the value of av_scanner starts with a dollar character, it is expanded before use. The usual list-parsing of the content (see 6.20) applies. The following scanner types are supported in this release, though individual ones can be included or not at build time:

avast

This is the scanner daemon of Avast. It has been tested with Avast Core Security (currently at version 2.2.0). You can get a trial version at https://www.avast.com or for Linux at https://www.avast.com/linux-server-antivirus. This scanner type takes one option, which can be either a full path to a UNIX socket, or host and port specifiers separated by white space. The host may be a name or an IP address; the port is either a single number or a pair of numbers with a dash between. A list of options may follow. These options are interpreted on Exim’s side of the malware scanner, or are given on separate lines to the daemon as options before the main scan command.

If pass_unscanned is set, any files the Avast scanner can’t scan (e.g. decompression bombs, or invalid archives) are considered clean. Use with care.

For example:

av_scanner = avast:/var/run/avast/scan.sock:FLAGS -fullfiles:SENSITIVITY -pup
av_scanner = avast:/var/run/avast/scan.sock:pass_unscanned:FLAGS -fullfiles:SENSITIVITY -pup
av_scanner = avast:192.168.2.22 5036

If you omit the argument, the default path /var/run/avast/scan.sock is used. If you use a remote host, you need to make Exim’s spool directory available to it, as the scanner is passed a file path, not file contents. For information about available commands and their options you may use

$ socat UNIX:/var/run/avast/scan.sock STDIO:
    FLAGS
    SENSITIVITY
    PACK

If the scanner returns a temporary failure (e.g. license issues, or permission problems), the message is deferred and a paniclog entry is written. The usual defer_ok option is available.

aveserver

This is the scanner daemon of Kaspersky Version 5. You can get a trial version at https://www.kaspersky.com/. This scanner type takes one option, which is the path to the daemon’s UNIX socket. The default is shown in this example:

av_scanner = aveserver:/var/run/aveserver

clamd

This daemon-type scanner is GPL and free. You can get it at https://www.clamav.net/. Some older versions of clamd do not seem to unpack MIME containers, so it used to be recommended to unpack MIME attachments in the MIME ACL. This is no longer believed to be necessary.

The options are a list of server specifiers, which may be a UNIX socket specification, a TCP socket specification, or a (global) option.

A socket specification consists of a space-separated list. For a Unix socket the first element is a full path for the socket; for a TCP socket the first element is the IP address and the second a port number. Any further elements are per-server (non-global) options. These per-server options are supported:

retry=<timespec>    Retry on connect fail

The retry option specifies a time after which a single retry for a failed connect is made. The default is to not retry.

If a Unix socket file is specified, only one server is supported.

Examples:

av_scanner = clamd:/opt/clamd/socket
av_scanner = clamd:192.0.2.3 1234
av_scanner = clamd:192.0.2.3 1234:local
av_scanner = clamd:192.0.2.3 1234 retry=10s
av_scanner = clamd:192.0.2.3 1234 : 192.0.2.4 1234

If the value of av_scanner points to a UNIX socket file or contains the local option, then the ClamAV interface will pass a filename containing the data to be scanned, which should normally result in less I/O happening and be more efficient. Normally in the TCP case, the data is streamed to ClamAV as Exim does not assume that there is a common filesystem with the remote host.

The final example shows that multiple TCP targets can be specified. Exim will randomly use one for each incoming email (i.e. it load balances them). Note that only TCP targets may be used if specifying a list of scanners; a UNIX socket cannot be mixed in with TCP targets. If one of the servers becomes unavailable, Exim will try the remaining one(s) until it finds one that works. When a clamd server becomes unreachable, Exim will log a message. Exim does not keep track of scanner state between multiple messages, and the scanner selection is random, so the message will get logged in the mainlog for each email that the down scanner gets chosen first (message wrapped to be readable):

2013-10-09 14:30:39 1VTumd-0000Y8-BQ malware acl condition: clamd: connection to localhost, port 3310 failed (Connection refused)

If the option is unset, the default is /tmp/clamd. Thanks to David Saez for contributing the code for this scanner.

cmdline

This is the keyword for the generic command line scanner interface. It can be used to attach virus scanners that are invoked from the shell. This scanner type takes 3 mandatory options:

The full path and name of the scanner binary, with all command line options, and a placeholder (%s) for the directory to scan.
A regular expression to match against the STDOUT and STDERR output of the virus scanner. If the expression matches, a virus was found. You must make absolutely sure that this expression matches on “virus found”. This is called the “trigger” expression.
Another regular expression, containing exactly one pair of parentheses, to match the name of the virus found in the scanner’s output. This is called the “name” expression.

For example, Sophos Sweep reports a virus on a line like this:

Virus 'W32/Magistr-B' found in file ./those.bat

For the trigger expression, we can match the phrase “found in file”. For the name expression, we want to extract the W32/Magistr-B string, so we can match for the single quotes left and right of it. Altogether, this makes the configuration setting:

av_scanner = cmdline:\
             /path/to/sweep -ss -all -rec -archive %s:\
             found in file:'(.+)'

drweb

The DrWeb daemon scanner (https://www.sald.ru/) interface takes one option, either a full path to a UNIX socket, or host and port specifiers separated by white space. The host may be a name or an IP address; the port is either a single number or a pair of numbers with a dash between. For example:

av_scanner = drweb:/var/run/drwebd.sock
av_scanner = drweb:192.168.2.20 31337

If you omit the argument, the default path /usr/local/drweb/run/drwebd.sock is used. Thanks to Alex Miller for contributing the code for this scanner.

f-protd

The f-protd scanner is accessed via HTTP over TCP. One argument is taken, being a space-separated hostname and port number (or port-range). For example:

av_scanner = f-protd:localhost 10200-10204

If you omit the argument, the default values shown above are used.

f-prot6d

The f-prot6d scanner is accessed using the FPSCAND protocol over TCP. One argument is taken, being a space-separated hostname and port number. For example:

av_scanner = f-prot6d:localhost 10200

If you omit the argument, the default values shown above are used.

fsecure

The F-Secure daemon scanner (https://www.f-secure.com/) takes one argument which is the path to a UNIX socket. For example:

av_scanner = fsecure:/path/to/.fsav

If no argument is given, the default is /var/run/.fsav. Thanks to Johan Thelmen for contributing the code for this scanner.

kavdaemon

This is the scanner daemon of Kaspersky Version 4. This version of the Kaspersky scanner is outdated. Please upgrade (see aveserver above). This scanner type takes one option, which is the path to the daemon’s UNIX socket. For example:

av_scanner = kavdaemon:/opt/AVP/AvpCtl

The default path is /var/run/AvpCtl.

mksd

This was a daemon type scanner that was aimed mainly at Polish users, though some documentation was available in English. The history can be found at https://en.wikipedia.org/wiki/Mks_vir and this appears to be a candidate for removal from Exim, unless we are informed of other virus scanners which use the same protocol to integrate. The only option for this scanner type is the maximum number of processes used simultaneously to scan the attachments, provided that mksd has been run with at least the same number of child processes. For example:

av_scanner = mksd:2

You can safely omit this option (the default value is 1).

sock

This is a general-purpose way of talking to simple scanner daemons running on the local machine. There are four options: an address (which may be an IP address and port, or the path of a Unix socket), a commandline to send (may include a single %s which will be replaced with the path to the mail file to be scanned), an RE to trigger on from the returned data, and an RE to extract malware_name from the returned data. For example:

av_scanner = sock:127.0.0.1 6001:%s:(SPAM|VIRUS):(.*)$

Note that surrounding whitespace is stripped from each option, meaning there is no way to specify a trailing newline. The socket specifier and both regular-expressions are required. Default for the commandline is %s\n (note this does have a trailing newline); specify an empty element to get this.

sophie

Sophie is a daemon that uses Sophos’ libsavi library to scan for viruses. You can get Sophie at http://sophie.sourceforge.net/. The only option for this scanner type is the path to the UNIX socket that Sophie uses for client communication. For example:

av_scanner = sophie:/tmp/sophie

The default path is /var/run/sophie, so if you are using this, you can omit the option.

When av_scanner is correctly set, you can use the malware condition in the DATA ACL. Note: You cannot use the malware condition in the MIME ACL.

The av_scanner option is expanded each time malware is called. This makes it possible to use different scanners. See further below for an example. The malware condition caches its results, so when you use it multiple times for the same message, the actual scanning process is only carried out once. However, using expandable items in av_scanner disables this caching, in which case each use of the malware condition causes a new scan of the message.

The malware condition takes a right-hand argument that is expanded before use and taken as a list, slash-separated by default. The first element can then be one of

“true”, “*”, or “1”, in which case the message is scanned for viruses. The condition succeeds if a virus was found, and fails otherwise. This is the recommended usage.
“false” or “0” or an empty string, in which case no scanning is done and the condition fails immediately.
A regular expression, in which case the message is scanned for viruses. The condition succeeds if a virus is found and its name matches the regular expression. This allows you to take special actions on certain types of virus. Note that “/” characters in the RE must be doubled due to the list-processing, unless the separator is changed (in the usual way; see 6.21).

You can append a defer_ok element to the malware argument list to accept messages even if there is a problem with the virus scanner. Otherwise, such a problem causes the ACL to defer.

You can append a tmo=<val> element to the malware argument list to specify a non-default timeout. The default is two minutes. For example:

malware = * / defer_ok / tmo=10s

A timeout causes the ACL to defer.

When a connection is made to the scanner the expansion variable $callout_address is set to record the actual address used.

When a virus is found, the condition sets up an expansion variable called $malware_name that contains the name of the virus. You can use it in a message modifier that specifies the error returned to the sender, and/or in logging data.

Beware the interaction of Exim’s message_size_limit with any size limits imposed by your anti-virus scanner.

Here is a very simple scanning example:

deny malware = *
     message = This message contains malware ($malware_name)

The next example accepts messages when there is a problem with the scanner:

deny malware = */defer_ok
     message = This message contains malware ($malware_name)

The next example shows how to use an ACL variable to scan with both sophie and aveserver. It assumes you have set:

av_scanner = $acl_m0

in the main Exim configuration.

deny set acl_m0 = sophie
     malware = *
     message = This message contains malware ($malware_name)

deny set acl_m0 = aveserver
     malware = *
     message = This message contains malware ($malware_name)
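The regular-expression form of the malware argument can be sketched in the same way. The fragment below gives a different rejection message when the reported virus name matches a pattern, and then falls back to a general rule. The pattern “(?i)eicar” and the message texts are hypothetical; the names actually reported depend on the scanner in use.

```exim
# DATA ACL fragment. The pattern is an illustrative assumption.
# First statement: succeeds only if a virus is found AND its name matches.
deny  malware = (?i)eicar/defer_ok
      message = Test virus rejected ($malware_name)

# Second statement: any other virus.
deny  malware = */defer_ok
      message = This message contains malware ($malware_name)
```

Provided that av_scanner contains no expandable items, the result caching described above should mean that the message is scanned only once for the two statements.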

2. Scanning with SpamAssassin and Rspamd

The spam ACL condition calls SpamAssassin’s spamd daemon to get a spam score and a report for the message. Support is also provided for Rspamd.

For more information about installation and configuration of SpamAssassin or Rspamd refer to their respective websites at https://spamassassin.apache.org/ and https://www.rspamd.com/

SpamAssassin can be installed with CPAN by running:

perl -MCPAN -e 'install Mail::SpamAssassin'

SpamAssassin has its own set of configuration files. Please review its documentation to see how you can tweak it. The default installation should work nicely, however.

By default, SpamAssassin listens on 127.0.0.1, TCP port 783 and if you intend to use an instance running on the local host you do not need to set spamd_address. If you intend to use another host or port for SpamAssassin, you must set the spamd_address option in the global part of the Exim configuration as follows (example):

spamd_address = 192.168.99.45 783

The SpamAssassin protocol relies on a TCP half-close from the client. If your SpamAssassin client side is running a Linux system with an iptables firewall, consider setting net.netfilter.nf_conntrack_tcp_timeout_close_wait to at least the timeout Exim uses when waiting for a response from the SpamAssassin server (currently defaulting to 120s). With a lower value the Linux connection tracking may consider your half-closed connection as dead too soon.

To use Rspamd (which by default listens on all local addresses on TCP port 11333) you should add variant=rspamd after the address/port pair, for example:

spamd_address = 127.0.0.1 11333 variant=rspamd

As of version 2.60, SpamAssassin also supports communication over UNIX sockets. If you want to use these, supply spamd_address with an absolute filename instead of an address/port pair:

spamd_address = /var/run/spamd_socket

You can have multiple spamd servers to improve scalability. These can reside on other hardware reachable over the network. To specify multiple spamd servers, put multiple address/port pairs in the spamd_address option, separated with colons (the separator can be changed in the usual way; see 6.21):

spamd_address = 192.168.2.10 783 : \
                192.168.2.11 783 : \
                192.168.2.12 783

Up to 32 spamd servers are supported. When a server fails to respond to the connection attempt, all other servers are tried until one succeeds. If no server responds, the spam condition defers.

Unix and TCP socket specifications may be mixed in any order. Each element of the list is a list itself, space-separated by default and changeable in the usual way (6.21); take care to not double the separator.

For TCP socket specifications a host name or IP (v4 or v6, but subject to list-separator quoting rules) address can be used, and the port can be one or a dash-separated pair. In the latter case, the range is tried in strict order.

Elements after the first for Unix sockets, or second for TCP socket, are options. The supported options are:

pri=<priority>      Selection priority
weight=<value>      Selection bias
time=<start>-<end>  Use only between these times of day
retry=<timespec>    Retry on connect fail
tmo=<timespec>      Connection time limit
variant=rspamd      Use Rspamd rather than SpamAssassin protocol

The pri option specifies a priority for the server within the list, higher values being tried first. The default priority is 1.

The weight option specifies a selection bias. Within a priority set servers are queried in a random fashion, weighted by this value. The default value for selection bias is 1.

Time specifications for the time option are <hour>.<minute>.<second> in the local time zone; each element being one or more digits. Either the seconds or both minutes and seconds, plus the leading . characters, may be omitted and will be taken as zero.

Timeout specifications for the retry and tmo options are the usual Exim time interval standard, e.g. 20s or 1m.

The tmo option specifies an overall timeout for communication. The default value is two minutes.

The retry option specifies a time after which a single retry for a failed connect is made. The default is to not retry.
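The per-server options described above can be combined on one specification. The following is a sketch only; the addresses, socket path and values are placeholders chosen for illustration.

```exim
# Main configuration. All addresses and values are hypothetical.
# The two TCP servers share priority 2 and so are tried first, with the
# first one favoured three to one. The Unix socket at priority 1 is used
# only if neither of them can be reached.
spamd_address = 192.0.2.10 783 pri=2 weight=3 : \
                192.0.2.11 783 pri=2 weight=1 : \
                /var/run/spamd_socket pri=1 retry=5s tmo=30s
```

Each list element is itself space-separated, so options are simply appended after the port number or, for a Unix socket, after the path.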

The spamd_address variable is expanded before use if it starts with a dollar sign. In this case, the expansion may return a string that is used as the list so that multiple spamd servers can be the result of an expansion.

When a connection is made to the server the expansion variable $callout_address is set to record the actual address used.

3. Calling SpamAssassin from an Exim ACL

Here is a simple example of the use of the spam condition in a DATA ACL:

deny spam = joe
     message = This message was classified as SPAM

The right-hand side of the spam condition specifies a name. This is relevant if you have set up multiple SpamAssassin profiles. If you do not want to scan using a specific profile, but rather use the SpamAssassin system-wide default profile, you can scan for an unknown name, or simply use “nobody”. Rspamd does not use this setting. However, you must put something on the right-hand side.

The name allows you to use per-domain or per-user antispam profiles in principle, but this is not straightforward in practice, because a message may have multiple recipients, not necessarily all in the same domain. Because the spam condition has to be called from a DATA-time ACL in order to be able to read the contents of the message, the variables $local_part and $domain are not set. Careful enforcement of single-recipient messages (e.g. by responding with defer in the recipient ACL for all recipients after the first), or the use of PRDR, is needed to use this feature.
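One way the single-recipient approach might look is sketched below. It assumes that $recipients_count counts the recipients accepted so far, and that the local part of the single recipient is a usable profile name; both the mapping from recipient to profile and the message texts are illustrative assumptions, not part of this chapter.

```exim
# RCPT ACL fragment, placed before the statements that accept recipients:
# once one recipient has been accepted, defer all further ones.
defer  condition = ${if >{$recipients_count}{0}}
       message   = Please send one recipient per message

# DATA ACL fragment: with exactly one recipient, $recipients holds that
# single address. Using its local part as the profile name is an assumption.
deny   spam    = ${local_part:$recipients}
       message = This message was classified as SPAM
```

Deferring extra recipients makes the sending host retry them in separate transactions, which costs some delivery delay in exchange for per-user scanning.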

The right-hand side of the spam condition is expanded before being used, so you can put lookups or conditions there. When the right-hand side evaluates to “0” or “false”, no scanning is done and the condition fails immediately.

Scanning with SpamAssassin uses a lot of resources. If you scan every message, large ones may cause significant performance degradation. As most spam messages are quite small, it is recommended that you do not scan the big ones. For example:

deny condition = ${if < {$message_size}{10K}}
     spam = nobody
     message = This message was classified as SPAM

The spam condition returns true if the threshold specified in the user’s SpamAssassin profile has been matched or exceeded. If you want to use the spam condition for its side effects (see the variables below), you can make it always return “true” by appending :true to the username.

When the spam condition is run, it sets up a number of expansion variables. Except for $spam_report, these variables are saved with the received message so are available for use at delivery time.

$spam_score: The spam score of the message, for example, “3.4” or “30.5”. This is useful for inclusion in log or reject messages.
$spam_score_int: The spam score of the message, multiplied by ten, as an integer value. For example “34” or “305”. It may appear to disagree with $spam_score because $spam_score is rounded and $spam_score_int is truncated. The integer value is useful for numeric comparisons in conditions.
$spam_bar: A string consisting of a number of “+” or “-” characters, representing the integer part of the spam score value. A spam score of 4.4 would have a $spam_bar value of “++++”. This is useful for inclusion in warning headers, since MUAs can match on such strings. The maximum length of the spam bar is 50 characters.
$spam_report: A multiline text table, containing the full SpamAssassin report for the message. Useful for inclusion in headers or reject messages. This variable is only usable in a DATA-time ACL. Beware that SpamAssassin may return non-ASCII characters, especially when running in country-specific locales, which are not legal unencoded in headers.
$spam_action: For SpamAssassin either “reject” or “no action” depending on the spam score versus threshold. For Rspamd, the recommended action.
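As a brief sketch of how $spam_action might be used, the fragment below records the action in a header and rejects only when the value is “reject”. The header name X-Spam-Action is a hypothetical choice; with Rspamd, the other action names that can appear depend on the Rspamd configuration.

```exim
# DATA ACL fragment. ":true" forces the condition to succeed so that the
# variables are set up whatever the score.
warn  spam       = nobody:true
      add_header = X-Spam-Action: $spam_action

deny  spam       = nobody:true
      condition  = ${if eq{$spam_action}{reject}}
      message    = This message scored $spam_score spam points.
```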

The spam condition caches its results unless expansion in spamd_address was used. If you call it again with the same user name, it does not scan again, but rather returns the same values as before.

The spam condition returns DEFER if there is any error while running the message through SpamAssassin or if the expansion of spamd_address failed. If you want to treat DEFER as FAIL (to pass on to the next ACL statement block), append /defer_ok to the right-hand side of the spam condition, like this:

deny spam = joe/defer_ok
     message = This message was classified as SPAM

This causes messages to be accepted even if there is a problem with spamd.

Here is a longer, commented example of the use of the spam condition:

# put headers in all messages (no matter if spam or not)
warn  spam = nobody:true
      add_header = X-Spam-Score: $spam_score ($spam_bar)
      add_header = X-Spam-Report: $spam_report

# add second subject line with *SPAM* marker when message
# is over threshold
warn  spam = nobody
      add_header = Subject: *SPAM* $h_Subject:

# reject spam at high scores (> 12)
deny  spam = nobody:true
      condition = ${if >{$spam_score_int}{120}{1}{0}}
      message = This message scored $spam_score spam points.

4. Scanning MIME parts

The acl_smtp_mime global option specifies an ACL that is called once for each MIME part of an SMTP message, including multipart types, in the sequence of their position in the message. Similarly, the acl_not_smtp_mime option specifies an ACL that is used for the MIME parts of non-SMTP messages. These options may both refer to the same ACL if you want the same processing in both cases.

These ACLs are called (possibly many times) just before the acl_smtp_data ACL in the case of an SMTP message, or just before the acl_not_smtp ACL in the case of a non-SMTP message. However, a MIME ACL is called only if the message contains a Content-Type: header line. When a call to a MIME ACL does not yield “accept”, ACL processing is aborted and the appropriate result code is sent to the client. In the case of an SMTP message, the acl_smtp_data ACL is not called when this happens.
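A minimal sketch of the wiring follows. The ACL name acl_check_mime is an arbitrary choice, and the same ACL is used here for both SMTP and non-SMTP messages.

```exim
# Main configuration section
acl_smtp_mime     = acl_check_mime
acl_not_smtp_mime = acl_check_mime

# ACL section (after "begin acl")
acl_check_mime:
  # ... per-part checks go here ...

  # Finish with an explicit accept: a part that reaches the end of the
  # ACL without being accepted would be refused.
  accept
```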

You cannot use the malware or spam conditions in a MIME ACL; these can only be used in the DATA or non-SMTP ACLs. However, you can use the regex condition to match against the raw MIME part. You can also use the mime_regex condition to match against the decoded MIME part (see section 45.5).

At the start of a MIME ACL, a number of variables are set from the header information for the relevant MIME part. These are described below. The contents of the MIME part are not by default decoded into a disk file except for MIME parts whose content-type is “message/rfc822”. If you want to decode a MIME part into a disk file, you can use the decode condition. The general syntax is:

decode = [/<path>/]<filename>

The right hand side is expanded before use. After expansion, the value can be:

“0” or “false”, in which case no decoding is done.
The string “default”. In that case, the file is put in the temporary “default” directory <spool_directory>/scan/<message_id>/ with a sequential filename consisting of the message id and a sequence number. The full path and name is available in $mime_decoded_filename after decoding.
A full path name starting with a slash. If the full name is an existing directory, it is used as a replacement for the default directory. The filename is then sequentially assigned. If the path does not exist, it is used as the full path and filename.
If the string does not start with a slash, it is used as the filename, and the default path is then used.

The decode condition normally succeeds. It is only false for syntax errors or unusual circumstances such as memory shortages.

The variable $mime_filename will have the suggested name for the file. Note however that this might contain anything, and is very difficult to safely use as all or even part of the filename.

If you place files outside of the default path, they are not automatically unlinked.
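The “default” form can be sketched as follows. The ACL name and the log text are illustrative choices; the variables used are the ones described in this section.

```exim
# MIME ACL fragment. "default" decodes into the per-message scan
# directory, so the file is removed together with that directory.
acl_check_mime:
  warn  decode      = default
        log_message = part $mime_part_count decoded to $mime_decoded_filename ($mime_content_size KB)

  accept
```

Using the default location avoids building a path from $mime_filename, which is under the sender’s control.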

For RFC822 attachments (these are messages attached to messages, with a content-type of “message/rfc822”), the ACL is called again in the same manner as for the primary message, only that the $mime_is_rfc822 expansion variable is set (see below). Attached messages are always decoded to disk before being checked, and the files are unlinked once the check is done.

The MIME ACL supports the regex and mime_regex conditions. These can be used to match regular expressions against raw and decoded MIME parts, respectively. They are described in section 45.5.

The following list describes all expansion variables that are available in the MIME ACL:

$mime_anomaly_level

$mime_anomaly_text

If there are problems decoding, these variables contain information on the detected issue.

$mime_boundary

If the current part is a multipart (see $mime_is_multipart below), it should have a boundary string, which is stored in this variable. If the current part has no boundary parameter in the Content-Type: header, this variable contains the empty string.

$mime_charset

This variable contains the character set identifier, if one was found in the Content-Type: header. Examples for charset identifiers are:

us-ascii
gb2312 (Chinese)
iso-8859-1

Please note that this value is not normalized, so you should do matches case-insensitively.

$mime_content_description

This variable contains the normalized content of the Content-Description: header. It can contain a human-readable description of the part’s content. Some implementations repeat the filename for attachments here, but they are usually only used for display purposes.

$mime_content_disposition

This variable contains the normalized content of the Content-Disposition: header. You can expect strings like “attachment” or “inline” here.

$mime_content_id

This variable contains the normalized content of the Content-ID: header. This is a unique ID that can be used to reference a part from another part.

$mime_content_size

This variable is set only after the decode modifier (see above) has been successfully run. It contains the size of the decoded part in kilobytes. The size is always rounded up to full kilobytes, so only a completely empty part has a $mime_content_size of zero.

$mime_content_transfer_encoding

This variable contains the normalized content of the Content-transfer-encoding: header. This is a symbolic name for an encoding type. Typical values are “base64” and “quoted-printable”.

$mime_content_type

If the MIME part has a Content-Type: header, this variable contains its value, lowercased, and without any options (like “name” or “charset”). Here are some examples of popular MIME types, as they may appear in this variable:

text/plain
text/html
application/octet-stream
image/jpeg
audio/midi

If the MIME part has no Content-Type: header, this variable contains the empty string.

$mime_decoded_filename

This variable is set only after the decode modifier (see above) has been successfully run. It contains the full path and filename of the file containing the decoded data.

$mime_filename

This is perhaps the most important of the MIME variables. It contains a proposed filename for an attachment, if one was found in either the Content-Type: or Content-Disposition: headers. The filename will be RFC2047 or RFC2231 decoded, but no additional sanity checks are done. If no filename was found, this variable contains the empty string.
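A common use of $mime_filename is to refuse attachments by extension. The following sketch lowercases the proposed name and matches it against a list of extensions; the particular extensions and the message text are illustrative choices.

```exim
# MIME ACL fragment. \N ... \N protects the regular expression from
# string expansion, so backslashes and "$" can be written as they are.
deny  message   = Blacklisted file extension detected
      condition = ${if match \
                       {${lc:$mime_filename}} \
                       {\N(\.exe|\.pif|\.bat|\.scr|\.lnk|\.com)$\N} \
                       {1}{0}}
```

The name is only matched here, never used as a path, so the lack of sanity checking is not a problem for this test.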

$mime_is_coverletter

This variable attempts to differentiate the “cover letter” of an e-mail from attached data. It can be used to clamp down on flashy or unnecessarily encoded content in the cover letter, while not restricting attachments at all.

The variable contains 1 (true) for a MIME part believed to be part of the cover letter, and 0 (false) for an attachment. At present, the algorithm is as follows:

The outermost MIME part of a message is always a cover letter.
If a multipart/alternative or multipart/related MIME part is a cover letter, so are all MIME subparts within that multipart.
If any other multipart is a cover letter, the first subpart is a cover letter, and the rest are attachments.
All parts contained within an attachment multipart are attachments.

As an example, the following will ban “HTML mail” (including that sent with alternative plain text), while allowing HTML files to be attached. HTML coverletter mail attached to non-HTML coverletter mail will also be allowed:

deny !condition = $mime_is_rfc822
     condition = $mime_is_coverletter
     condition = ${if eq{$mime_content_type}{text/html}{1}{0}}
     message = HTML mail is not accepted here

$mime_is_multipart

This variable has the value 1 (true) when the current part has the main type “multipart”, for example, “multipart/alternative” or “multipart/mixed”. Since multipart entities only serve as containers for other parts, you may not want to carry out specific actions on them.

$mime_is_rfc822

This variable has the value 1 (true) if the current part is not a part of the checked message itself, but part of an attached message. Attached message decoding is fully recursive.

$mime_part_count

This variable is a counter that is raised for each processed MIME part. It starts at zero for the very first part (which is usually a multipart). The counter is per-message, so it is reset when processing RFC822 attachments (see $mime_is_rfc822). The counter stays set after acl_smtp_mime is complete, so you can use it in the DATA ACL to determine the number of MIME parts of a message. For non-MIME messages, this variable contains the value -1.
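Because the counter remains set after the MIME ACL has run, a DATA ACL can test it. The sketch below assumes that a MIME ACL is configured; the limit of 100 and the message text are arbitrary values chosen for illustration.

```exim
# DATA ACL fragment. Non-MIME messages have a count of -1 and so are
# never caught by this test.
deny  condition = ${if >{$mime_part_count}{100}}
      message   = Too many MIME parts
```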

5. Scanning with regular expressions

You can specify your own custom regular expression matches on the full body of the message, or on individual MIME parts.

The regex condition takes one or more regular expressions as arguments and matches them against the full message (when called in the DATA ACL) or a raw MIME part (when called in the MIME ACL). The regex condition matches linewise, with a maximum line length of 32K characters. That means you cannot have multiline matches with the regex condition.

The mime_regex condition can be called only in the MIME ACL. It matches up to 32K of decoded content (the whole content at once, not linewise). If the part has not been decoded with the decode modifier earlier in the ACL, it is decoded automatically when mime_regex is executed (using default path and filename values). If the decoded data is larger than 32K, only the first 32K characters are checked.

The regular expressions are passed as a colon-separated list. To include a literal colon, you must double it. Since the whole right-hand side string is expanded before being used, you must also escape dollar signs and backslashes with more backslashes, or use the \N facility to disable expansion. Here is a simple example that contains two regular expressions:

deny regex = [Mm]ortgage : URGENT BUSINESS PROPOSAL
     message = contains blacklisted regex ($regex_match_string)

The condition returns true if any one of the regular expressions matches. The $regex_match_string expansion variable is then set up and contains the matching regular expression. The expansion variables $regex1, $regex2, etc. are set to any substrings captured by the regular expression.
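The two quoting rules (doubling colons for the list, and protecting the expression from expansion) can be combined, as in the sketch below. The patterns are hypothetical and are shown only to illustrate the quoting.

```exim
# DATA ACL fragment with two list items.
# Item 1: \N ... \N disables expansion so \d and \. stay as written; the
#         colon of "http://" is doubled because ":" separates list items.
# Item 2: a plain pattern needing no quoting.
deny  regex   = \Nhttps?:://\d+\.\d+\.\d+\.\d+/\N : URGENT BUSINESS PROPOSAL
      message = contains blacklisted regex ($regex_match_string)
```

After list processing, the first item should reach the regular expression engine as https?://\d+\.\d+\.\d+\.\d+/ with a single colon.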

Warning: With large messages, these conditions can be fairly CPU-intensive.