What is a Postfix Policy Service?
Providing email services to a large number of customers requires a number of moving parts. Postfix is a tried and trusted SMTP server with a feature set that is already very large. Rather than add even more features for controlling email policy, Postfix provides a facility for obtaining email policy decisions from a service over a socket. In brief, the policy service receives a payload of metadata about each email, and returns a directive which tells the SMTP server what to do with that email. Generally, these directives amount to “OK”, “DUNNO” (which means OK if nothing else says different), and “REJECT”. As such, the policy server can determine which emails are forwarded and which are not.
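To make the exchange concrete, here is a minimal sketch of the framing Postfix uses for policy delegation: Postfix sends a block of name=value attribute lines terminated by a blank line, and the delegate replies with a single action= line followed by a blank line. This is an illustration, not CHAPPS's actual code; the attribute names come from Postfix's SMTPD_POLICY_README.

```python
def parse_policy_request(payload: str) -> dict:
    """Parse a Postfix policy request: name=value lines ended by a blank line."""
    attrs = {}
    for line in payload.splitlines():
        if not line:
            break  # a blank line terminates the request
        name, _, value = line.partition("=")
        attrs[name] = value
    return attrs


def policy_response(action: str) -> str:
    """Format a reply; Postfix expects 'action=...' followed by a blank line."""
    return f"action={action}\n\n"
```

A request such as `request=smtpd_access_policy` / `sender=alice@example.com` would be parsed into a dict, and a permissive delegate would answer with `policy_response("DUNNO")`.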
Policy services (or delegates) are the main way of enforcing outbound email quotas. Other sorts of decisions are also possible, such as controlling what apparent senders a user is allowed to masquerade as when sending email, enforcing SPF policies for inbound mail, greylisting, etc. Postfix is capable of some basic kinds of policy enforcement, but a delegate can make more complicated decisions, since the delegate can run arbitrary code. That said, we don’t want the code which runs on every policy decision to take very long to finish.
Cluebringer, also known as PolicyD or cbpolicyd, was one of these services; it has not been maintained in years. A newer one is called mtpolicyd. We used Cluebringer for many years, until it could not keep up with the volume of email we handle; we then attempted to switch to mtpolicyd, only to discover that it did not scale any better. For a small site, mtpolicyd might be easier to install and operate, since it would not need to scale.
Who is CHAPPS for?
If your organization provides email services for numerous, separate DNS domains, CHAPPS can help to enforce sending limits on a per-user basis. It can also enforce rules about which users can send email from which domains, and when no domain matches are found, can enforce specific whole-email masquerading rules. This is helpful in preventing users of the service from sending emails which appear to come from other users.
If the performance of other solutions for outbound quota enforcement has been a problem for your organization, this software may help things run more smoothly. When a single user sends a large number of emails in sequence, CHAPPS uses the database exactly once, to obtain policy data for that user; thereafter, on each send attempt, it checks and updates only Redis. Remaining quota information is available in real time from the REST API.
If your environment already contains a custom dashboard for managing your services, it might be extended to provide UI elements for configuring CHAPPS. The CHAPPS project provides a REST API for accessing and maintaining the policy config, and also for inspecting and clearing the Redis cache. It does not provide a front-end web application (UI). Each organization will have user and domain data organized in different ways, and will likely have different ways of establishing who should have what kind of authorization. CHAPPS tries to make it easy to update the email policy settings, without assuming anything about whatever data exists in its greater environment.
What does CHAPPS do differently?
Both PolicyD and mtpolicyd rely heavily, if not entirely, on an RDBMS to maintain state. Plenty of web applications use an RDBMS backend (MySQL, MariaDB, Postgres) to maintain their state. It is, after all, the “M” in LAMP, and it is one of the layers in the “full stack”. It is easy to fall into the canonical design pattern: simply stack logic to do the job atop a database adapter, and call it a day. But at a certain point, the database will have trouble processing all of the statements in a timely fashion. Generally, a table is locked during updates, and so other queries must wait; nearly half of the accesses involve updates, in order to remember how many emails have been sent. While there are ways to make a database more performant, it made more sense to me to design a policy service which uses a non-relational database (in this case, Redis) as a caching layer for policy configuration data. The fact that Redis can also operate as a message queue may come in handy down the road, as well.
In a nutshell, what happens is this: when a policy request is evaluated, CHAPPS first attempts to find the policy data in Redis. If it finds nothing there, it attempts to obtain the data it needs from the RDBMS, and once it receives a response, it caches that data in Redis for a day. Redis can automatically expire entries, so no grooming is required to expire the cached data; the expiry time is simply set at the same time as the cache entry. Emails are usually emitted in clusters, whether because a human being is checking and responding to their email, or because a job is running which sends email. After the first check of the day, subsequent checks for a user do not result in RDBMS queries.
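The lookup sequence just described is the classic cache-aside pattern. The sketch below illustrates it; the `policy:<user>` key scheme is hypothetical, and the in-memory `FakeCache` stands in for Redis, with `get`/`setex` methods mirroring the redis-py client. It is not CHAPPS's actual code.

```python
def get_policy_data(cache, db_lookup, user, ttl=86400):
    """Cache-aside: consult Redis first; on a miss, load from the RDBMS
    and cache the result with a 24-hour TTL so Redis expires it for us."""
    key = f"policy:{user}"       # hypothetical key scheme
    data = cache.get(key)
    if data is not None:
        return data              # cache hit: no RDBMS round trip
    data = db_lookup(user)       # the single RDBMS round trip per user per day
    cache.setex(key, ttl, data)  # mirrors redis-py's SETEX (TTL set with the value)
    return data


class FakeCache:
    """In-memory stand-in for Redis; TTLs are ignored for demonstration."""
    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def setex(self, key, ttl, value):
        self.store[key] = value
```

With a real Redis connection in place of `FakeCache`, a second lookup for the same user within the TTL never touches the relational database.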
Future versions may allow the cache TTL to be customized, but one day seems a good trade-off between speed and sensitivity to policy configuration updates (i.e. some dashboard or other utility has altered the contents of the relational database). Since the REST API provides the capability for the configuration manager to actively reset the cache, some sites might value speed so much that they want a much longer TTL. Quotas are evaluated on a 24-hour basis, and so control data inherited that TTL.
CHAPPS is in the early stages of its development, and so it supports only a few types of policies. On the outbound side, CHAPPS can enforce outbound quotas, as well as rules about apparent origin, i.e. the envelope-from address. Outbound quotas are implemented as a limit on transmission attempts; they are not based on the size of the messages. Quota usage is tracked by making timestamp entries in a list which can be truncated at any point in time to reflect precisely the preceding 24 hours, rather than being reset at midnight. CHAPPS can be configured to count each recipient of a multi-recipient email as a transmission. When that feature is in use, a margin of grace may be configured, to allow multi-recipient emails to be sent even if the transmission total would exceed the quota, as long as the overage is within the margin amount.
It is worth noting that rejected send attempts are counted in the transmission list — but this only happens when the quota policy evaluates the policy request. Sender domain authorization (described below) happens first, and when it fails, the quota policy is not consulted. It is advantageous to include rejected transmission attempts, as this automatically blocks spamming once the quota has been reached. Alternative methods of preventing spamming should also be used, but as a fallback, this feature helps to prevent massive spamming by compromised email accounts.
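The rolling 24-hour window, per-recipient counting, and margin of grace described above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not CHAPPS's implementation; in CHAPPS the timestamp list lives in Redis rather than in a Python list.

```python
import time


def check_quota(timestamps, quota, recipients=1, margin=0, now=None):
    """Rolling 24-hour quota check (illustrative sketch).
    `timestamps` holds one entry per counted transmission attempt.
    Returns (allowed, updated_timestamps)."""
    now = time.time() if now is None else now
    # Truncate the list to precisely the preceding 24 hours.
    recent = [t for t in timestamps if t > now - 86400]
    # The margin of grace applies only to multi-recipient emails.
    grace = margin if recipients > 1 else 0
    allowed = len(recent) + recipients <= quota + grace
    # Attempts are recorded even when rejected, so a compromised account
    # stays blocked once its quota is exhausted.
    recent.extend([now] * recipients)
    return allowed, recent
```

Note that because the window rolls rather than resetting at midnight, a user who exhausts a quota regains sending capacity gradually, as old timestamps age out.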
The policy module which controls apparent origin is called “Sender Domain Authorization”. CHAPPS consults a pair of associated tables to determine whether a particular user is allowed to send an email which appears to come from a particular domain, or if there are no domain matches, it looks to see if the user is allowed to send an email which appears to come from the particular email address in the sender field. Site operators are responsible for ensuring that email users have appropriate entries in the control tables. This is facilitated by the REST API.
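The lookup order for Sender Domain Authorization can be sketched as follows. The two dicts stand in for the pair of control tables, and the function is an illustration of the described behavior, not CHAPPS's actual code.

```python
def authorized_sender(user, mail_from, domain_auth, email_auth):
    """Sender Domain Authorization sketch: the domain table is consulted
    first; only when no domain entry matches is the whole-address table
    checked. `domain_auth` and `email_auth` stand in for the control tables."""
    domain = mail_from.rpartition("@")[2].lower()
    if domain in domain_auth.get(user, set()):
        return True  # user may masquerade as anyone in this domain
    # No domain match: fall back to whole-address authorization.
    return mail_from.lower() in email_auth.get(user, set())
```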
CHAPPS also supports some inbound features (Greylisting and SPF enforcement) but as of this writing those features are not quite complete. Both feature sets seem likely to grow over time, as well.
High Availability and Fault Tolerance
CHAPPS uses both Redis and an RDBMS (currently MySQL or MariaDB; support for other types is planned). For testing purposes or quick deployments, a single Redis server and a single DB server will suffice. However, such a deployment is full of single points of failure. Ideally, every tier is made highly available through clustering or load-balancing.
Simple load-balancing is sufficient to round-robin incoming client connections among two or more identically-configured Postfix servers, each running a CHAPPS instance locally. Those CHAPPS instances should also have identical configurations, which tell them how to connect to Redis and the RDBMS.
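For illustration, pointing Postfix at a locally-running policy delegate uses the check_policy_service restriction in main.cf, something like the fragment below. The port number and the surrounding restriction list here are placeholders, not CHAPPS's documented defaults; consult the CHAPPS and Postfix documentation for actual values.

```
smtpd_recipient_restrictions =
    permit_mynetworks,
    permit_sasl_authenticated,
    reject_unauth_destination,
    check_policy_service inet:127.0.0.1:10225
```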
For the best reliability and fault-tolerance, the Redis instance provided for CHAPPS should be a Sentinel cluster. Whether to run the Sentinel cluster on the actual email servers themselves is a decision left to the operators of any particular organization. Where possible, using three additional servers (hopefully virtual) to run the Redis/Sentinel instances on seems better, since that will not consume resources from the email servers themselves.
Similarly, the RDBMS should also be clustered for maximum fault tolerance. In production, we will be using a ring of three MariaDB servers in a multi-master replication ring, with MaxScale in front of it.
(FWIW, the network infrastructure is also redundant, since all units are virtual and all hypervisors are homed onto multiple backbone switches.)
What gets installed?
It is highly recommended to install CHAPPS via pip into a virtualenv. When CHAPPS is installed this way, it generates SystemD unit files, which appear in <venv>/chapps/install and invoke the various CHAPPS service scripts correctly within the virtual environment. (I have been using Python 3's venv library. If you use a different flavor of virtual environment and had to make adjustments, please open an issue to let me know what they were, or send a pull request.) The API requires extra dependencies (see below) and is launched by the service unit called chapps_rest_api.service. It is implemented with FastAPI, so if you already have a favorite way to run FastAPI applications, then you can certainly do it that way. Please note that it is not recommended to run the API service on any of the mail servers or Redis servers involved in serving email; it is for this reason that the dependencies are separated.
In the virtual environment's bin directory, there will be a number of scripts, which are invoked by the SystemD service units. There are currently service scripts which provide just outbound quota enforcement, just inbound greylisting, and one which provides both sender domain authorization and quota enforcement, called chapps_outbound_multi. There is no script to launch the API service, because that is handled entirely within the SystemD service unit definition.
When installing CHAPPS, if you want to run the API, please install it with the API extras, like so:
python -m pip install 'chapps[API]'
(The quotes keep shells such as zsh from interpreting the square brackets.)
This will ensure that all the dependencies for the API script are installed, along with all of the base dependencies. At the time of writing, two different low-level database adapter libraries end up being installed, because I did not start out using SQLAlchemy, and SQLAlchemy does not support the native MariaDB adapter library, preferring to use the MySQL one instead. Eventually the older code will be adapted to use SQLAlchemy, so that the entire codebase will be able to work with anything SQLAlchemy can connect to.
If you use it, please let me know about your experience
CHAPPS has been written to address a specific need, but I have tried to design it in a way which is extensible and which will be useful for others also. I have tried to document thoroughly, but I am very close to the software so my documentation may sometimes assume the forest while describing the trees in great detail. Critiques of the documentation are welcome.
I am also interested to know about issues getting the software installed and set up, the weird problems which arise in unforeseen circumstances, performance issues, etc. File issues or submit pull requests on the project's GitHub page, or comment below.
The outbound quota enforcement logic is based upon rolling-rate-limiter by Peter Hayes. I embellished it a bit by adding a suffix to the data part in order to represent multiple recipients in a single email. That routine is neat and slick, and I find its side-effects beneficial. See the INSTALLATION document for other acknowledgements.
No one could do anything like this without the entire community of open-source authors who share both their code and their experience with the community. As has been said before, we all stand on the shoulders of giants.