

Piler basics in a nutshell

The piler email archiver uses the following components:

  • mysql: stores the crucial metadata of the messages
  • sphinx: a search engine used by the GUI to return the search results
  • file system: stores the encrypted and compressed messages and attachments

How do emails get into the archive? Piler is an SMTP-speaking daemon, so you configure your email server to pass a copy of every email to it via SMTP. You don't need to create any system or virtual users or email addresses for the piler daemon, because it simply archives every email it receives.

When an email is received, it is parsed, disassembled, compressed, encrypted, and finally stored in the file system: one file for every email and attachment. The textual data is also written to the sph_index table. A periodic indexer job reads the sph_index table and updates the sphinx databases.
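The receive path described above can be pictured with a minimal Python sketch. It is an illustration only, not piler's actual code: the function names, the hash-based file names, and the choice of zlib are assumptions, and the encryption step is left out because the Python standard library has no cipher for it.

```python
import hashlib
import zlib
from email import message_from_bytes


def disassemble(raw: bytes) -> tuple[str, list[tuple[str, bytes]]]:
    """Split a raw message into its plain-text body and its attachments."""
    msg = message_from_bytes(raw)
    text_parts: list[str] = []
    attachments: list[tuple[str, bytes]] = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        payload = part.get_payload(decode=True) or b""
        filename = part.get_filename()
        if filename:
            attachments.append((filename, payload))
        elif part.get_content_type() == "text/plain":
            charset = part.get_content_charset() or "utf-8"
            text_parts.append(payload.decode(charset, errors="replace"))
    return "\n".join(text_parts), attachments


def pack(blob: bytes) -> bytes:
    """Compress a blob. A real archiver would also encrypt the result."""
    return zlib.compress(blob)


def archive(raw: bytes) -> tuple[dict[str, bytes], str]:
    """Return the blobs to store (one per message and attachment) and the text to index."""
    text, attachments = disassemble(raw)
    message_id = hashlib.sha256(raw).hexdigest()  # hypothetical naming scheme
    blobs = {message_id: pack(raw)}
    for number, (_name, payload) in enumerate(attachments, start=1):
        blobs[f"{message_id}.a{number}"] = pack(payload)
    return blobs, text
```

In this sketch the returned text stands in for what would be queued for the indexer, and each entry of the returned dictionary stands in for one file on disk.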

The GUI queries both sphinx and the mysql database to return the search results to the users.

Piler has built-in access control to prevent users from accessing other users' messages. Auditors can see every archived email. Piler parses the header and extracts the From:, To: and Cc: addresses (for From: it stores only the first email address, since some spammers put tons of addresses in the From: field). When a user searches for his emails, piler tries to match his email addresses against the addresses in the messages. To sum up, a regular user can see only the emails he sent or received.
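The matching rule can be sketched in a few lines of Python. The class and function names are hypothetical and the real check happens inside piler's own query logic, so treat this as an illustration of the rule rather than the implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivedMessage:
    sender: str  # the first address found in From:
    recipients: set[str] = field(default_factory=set)  # To: and Cc: addresses


def can_view(user_addresses: set[str], message: ArchivedMessage,
             is_auditor: bool = False) -> bool:
    """Auditors see everything; regular users only mail they sent or received."""
    if is_auditor:
        return True
    known = {address.lower() for address in user_addresses}
    participants = {message.sender.lower()}
    participants |= {address.lower() for address in message.recipients}
    return bool(known & participants)
```

For example, a user who owns jane@aaa.fu can view a message sent to jane@aaa.fu, while a user who owns only bill@aaa.fu cannot, unless that user is an auditor.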

This leads to a limitation: piler will hide an email from a user if he was (only) in the Bcc: field. The limitation has a side effect for external mailing lists: you have to maintain which users belong to which external mailing lists, otherwise users won't see these messages. Internal mailing lists are not a problem as long as piler can extract the membership information from openldap or Active Directory.

Fortunately, both Exchange and postfix (and probably some other MTAs, too) are able to add the envelope recipients to the email, which resolves the limitation mentioned above.

How to search

Users can access the archived emails using a browser. They have to log in using any of their known email addresses and their password. They can set the preferred language (currently English, Brazilian Portuguese, and Hungarian are supported), the page length, and the theme.

By default, users enter the search terms into a text field, and the web interface splits them into components, guesses the format of each component, and builds a search query. If you type 2012-01-31, it knows it's a date. If a component contains an @ sign, it is treated as an email address.
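A rough sketch of this guessing step, in Python. Only the two rules named above (dates and email addresses) are modelled. The function names and the fallback category "text" are assumptions, and the real web interface may recognise more formats.

```python
import re
from datetime import datetime


def classify_term(term: str) -> str:
    """Guess whether a search component is an email address, a date, or plain text."""
    if "@" in term:
        return "email"
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", term):
        try:
            datetime.strptime(term, "%Y-%m-%d")  # reject impossible dates
        except ValueError:
            return "text"
        return "date"
    return "text"


def classify_query(query: str) -> list[tuple[str, str]]:
    """Split the query on whitespace and classify every component."""
    return [(term, classify_term(term)) for term in query.split()]
```

With this sketch, the query `2012-01-31 jane@aaa.fu invoice` yields one date, one email address, and one plain-text term.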

If you need a finer search query, then click on the down arrow at the right corner of the search field, and a popup layer appears where you can specify the sender, recipient, subject, attachment, dates, etc. Then click on the “Search” button, and you get the search results.

There is another way to build a more detailed search query: you may specify labels, such as “from:”, “to:”, “subject:”, etc., each followed by a value. This is called expert search.

It's also possible to use wildcards while searching. If you are not sure about a word, it may be enough to enter the beginning of the word followed by an asterisk (*), e.g. encryp*, and sphinx will find the email if it contains, for instance, “encrypt”, “encrypted” or “encryption”. By default you need at least 6 characters before the *.
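The prefix rule can be illustrated with a small Python sketch. Sphinx does the real matching against its index; the function below only mimics the behaviour described above. Raising an error for a too-short prefix is a choice made for this sketch, not documented piler behaviour.

```python
MIN_PREFIX_LEN = 6  # the default mentioned above


def wildcard_matches(pattern: str, word: str,
                     min_prefix_len: int = MIN_PREFIX_LEN) -> bool:
    """Return True if word matches pattern, where pattern may end in '*'."""
    if not pattern.endswith("*"):
        return pattern.lower() == word.lower()
    prefix = pattern[:-1]
    if len(prefix) < min_prefix_len:
        # Assumption of this sketch: reject prefixes shorter than the minimum.
        raise ValueError(f"need at least {min_prefix_len} characters before '*'")
    return word.lower().startswith(prefix.lower())
```

Here `encryp*` matches “encrypt”, “encrypted” and “encryption”, but not “encode”.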

You may specify the following fields:

from:        sender address
to:          recipient address
subject:     subject of the message
body:        body of the message
date1:       from ('not before') date (YYYY-MM-DD 00:00:00)
date2:       to ('not after') date (YYYY-MM-DD 23:59:59)
size:        size of the message in bytes
direction:   direction of the message
d:           same as direction
attachment:  attachment type, possible values: word, excel, powerpoint, pdf, compressed, text, odf, image, audio, video, flash, other, any
a:           same as attachment

Expert search examples

Email from Gmail before 2012.02.29 00:00:00:

date2: 2012-02-28, from:

Email from Agent Smith:

from: Agent Smith

Email to someone in Big company after 2012.01.31:

date1: 2012-01-31, to:

Email from jane@aaa.fu OR bill@aaa.fu on 2012.02.15 having any kind of attachment:

date1:2012-02-15, date2:2012-02-15, from: jane@aaa.fu, bill@aaa.fu, attachment:any

Viagra spam bigger than 200 kB spoofing my email address as the sender, and having 'order', then 'now' in the body:

size:>.2M, subject: viagra OR cialis, body: order << now, from: my@email.address

Price list to jenny@aaa.fu, in pdf attachment(s) smaller than 150 kB:

direction: inbound, size:<150k, attachment: pdf, subject: price list, to: jenny@aaa.fu
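To make the structure of these expert queries concrete, here is a hedged Python sketch that splits such a query into labelled fields. It is not piler's parser. The function names are hypothetical, and treating k and M as decimal multipliers (1000 and 1000000) is an assumption inferred from the "200 kB" and "150 kB" examples above.

```python
import re

# Labels from the field list above; "d" and "a" are the documented aliases.
ALIASES = {"d": "direction", "a": "attachment"}
LABELS = {"from", "to", "subject", "body", "date1", "date2",
          "size", "direction", "attachment", *ALIASES}

LABEL_RE = re.compile(r"^\s*([a-z0-9]+)\s*:\s*(.*)$", re.IGNORECASE | re.DOTALL)
SIZE_RE = re.compile(r"^([<>]?)\s*(\d*\.?\d+)\s*([kKmM]?)$")


def parse_expert_query(query: str) -> dict[str, list[str]]:
    """Split a comma-separated expert query into {label: [values]}."""
    fields: dict[str, list[str]] = {}
    current = None
    for chunk in query.split(","):
        match = LABEL_RE.match(chunk)
        if match and match.group(1).lower() in LABELS:
            label = match.group(1).lower()
            current = ALIASES.get(label, label)
            value = match.group(2).strip()
        else:
            # No known label: the chunk continues the previous field.
            value = chunk.strip()
        if current is None or not value:
            continue
        fields.setdefault(current, []).append(value)
    return fields


def parse_size(value: str) -> tuple[str, int]:
    """Turn '>.2M' or '<150k' into an operator and a byte count."""
    match = SIZE_RE.match(value.strip())
    if not match:
        raise ValueError(f"unrecognised size: {value!r}")
    operator, number, unit = match.groups()
    factor = {"": 1, "k": 1_000, "m": 1_000_000}[unit.lower()]  # assumed decimal units
    return operator or "=", int(round(float(number) * factor))
```

A chunk without a known label, such as the second address in `from: jane@aaa.fu, bill@aaa.fu`, is appended to the preceding field, which is how the sketch models the OR between the two senders.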

Additional notes

If you hit the 'Search' button without entering any search criteria, piler returns the newest 1000 messages, split into pages. The search engine (sphinx) has a limit (1000 by default) on the number of returned results. It's possible to return more hits, but you have to edit sphinx.conf to do that.

You can use any sphinx operators, e.g. |, &, <<, for the subject and body fields.

The entered search phrases are in an implicit Boolean AND relation, e.g. cat dog means that the document has to contain both cat and dog.

Some examples:

cat dog = having cat and dog (order is not important)
cat OR dog = having cat or dog
cat | dog = having cat or dog
"cat dog" = having the expression "cat dog"
!dog = not having dog
-dog = not having dog
"cat dog"~10 = proximity search
cat << dog = before operator: cat has to precede dog
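The meaning of some of these operators can be illustrated with a small Python sketch that works on a plain list of words. Sphinx evaluates the operators against its index, so the functions below are hypothetical helpers that only mirror the semantics listed above. Proximity search is left out.

```python
import re


def words(text: str) -> list[str]:
    """Lower-case the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def has_all(text: str, *terms: str) -> bool:
    """cat dog: every term must be present, in any order."""
    found = set(words(text))
    return all(term.lower() in found for term in terms)


def has_any(text: str, *terms: str) -> bool:
    """cat | dog: at least one term must be present."""
    found = set(words(text))
    return any(term.lower() in found for term in terms)


def has_phrase(text: str, phrase: str) -> bool:
    """"cat dog": the words must appear next to each other, in this order."""
    doc, wanted = words(text), words(phrase)
    size = len(wanted)
    return size > 0 and any(
        doc[i:i + size] == wanted for i in range(len(doc) - size + 1)
    )


def before(text: str, first: str, second: str) -> bool:
    """cat << dog: some occurrence of first precedes some occurrence of second."""
    doc = words(text)
    first, second = first.lower(), second.lower()
    if first not in doc or second not in doc:
        return False
    first_pos = doc.index(first)
    last_pos = len(doc) - 1 - doc[::-1].index(second)
    return first_pos < last_pos
```

Negation (!dog or -dog) corresponds to `not has_all(text, "dog")` in this sketch.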

For more details, see sections 5.2 (Boolean query syntax) and 5.3 (Extended query syntax) on the sphinx search site.

Using the search results

Once you have search results, you can view any of the messages by clicking on its subject line. A popup window will come up showing the message. You can also download the given message as an EML file, or restore it to your mailbox via SMTP. You may assign tags to the email in the popup window.

It's also possible to download the search results from the current page as a zip file. To do so, click on the blue download icon.

Piler allows you to save the search criteria for later use by clicking on the “Save” button. If you have any saved searches, click on the “Load” button to list them, then click on a saved search to run it.
