FAQ

How to backup and restore the archive?

You have two options:

#1: Every day, export the previous day's emails, e.g.

cd /opt/backup
pilerexport -a 2017.09.18 -b 2017.09.18
tar cfz /path/to/backup-2017-09-18.tar.gz *
rm *

The restore procedure in this case is to deploy piler from scratch, then import all the saved emails.
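
The daily export can be wrapped in a small script run from cron. The following is a minimal sketch, not a tested backup tool: it assumes GNU date (for the -d option), reuses the /opt/backup and /path/to placeholders from the example above, and the function names yesterday_stamp and export_yesterday are made up for illustration.

```shell
#!/bin/sh
# Print yesterday's date as YYYY.MM.DD, the form used with pilerexport above.
# Assumes GNU date; other date implementations spell this differently.
yesterday_stamp() {
    date -d yesterday +%Y.%m.%d
}

# Export yesterday's emails, archive them, then clean up.
# The body runs in a subshell (note the parentheses) so the cd does not leak out.
export_yesterday() (
    day=$(yesterday_stamp) || exit 1
    cd /opt/backup || exit 1
    pilerexport -a "$day" -b "$day" || exit 1
    # Remove the exported files only if the tarball was written successfully.
    tar cfz "/path/to/backup-$(echo "$day" | tr . -).tar.gz" ./* && rm ./*
)
```

Calling export_yesterday once a day then produces one tarball per day, and the exported files are kept if tar fails.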

#2: Back up the piler data, the sphinx data, and the configs

Back up the following regularly:

  • /usr/local/etc/piler directory (piler and sphinx config files, including piler.key file)
  • /var/piler/store/00 directory contents (note that only the last directory in '00' changes)
  • /var/piler/sphinx/main[1-4]* files
  • piler mysql database
  • nginx/apache configs

The restore procedure is to deploy piler, then restore the previously saved files and directories mentioned above.
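
The list above could be turned into a script along these lines. This is a sketch under assumptions: the function names and the destination directory argument are hypothetical, mysqldump is assumed to be available, tar is assumed to accept a file list on standard input (-T -, as GNU tar does), and the nginx/apache config paths are left out because they differ between systems.

```shell
#!/bin/sh
# Print the piler paths to back up, one per line (taken from the list above).
piler_backup_paths() {
    printf '%s\n' /usr/local/etc/piler /var/piler/store/00
    # Expand the glob here so that only existing sphinx main index files are listed.
    for f in /var/piler/sphinx/main[1-4]*; do
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Dump the piler database and archive the files into the directory given as $1.
backup_piler() {
    dest=$1
    stamp=$(date +%Y-%m-%d)
    # Prompts for the password of the piler database user.
    mysqldump --single-transaction -u piler -p piler > "$dest/piler-db-$stamp.sql" || return 1
    piler_backup_paths | tar czf "$dest/piler-files-$stamp.tar.gz" -T -
}
```

Add your web server config directory to piler_backup_paths if you want it in the same tarball.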

mysql_stmt_execute error: *Incorrect string value: '\xF0\x9D\x91\x83\xF0\x9D…' for column 'body' at row 1* (errno: 1366)

MySQL's utf8 character set does NOT cover the complete UTF-8 code table, only the subset that can be encoded in at most 3 bytes. If an email contains a character or symbol that needs 4 bytes (such as the \xF0\x9D\x91\x83 sequence in the error above), then MySQL 5.7 or newer (or MariaDB 10.2 or newer) refuses to store it.

The solution is either to use an older MySQL version (e.g. 5.5, or MariaDB 10.1), which is more forgiving with such 'invalid' sequences (that is, valid UTF-8 that MySQL's utf8 character set does not support), or to switch to the utf8mb4 encoding, which supports the whole range of UTF-8 characters.

See http://stackoverflow.com/questions/10957238/incorrect-string-value-when-trying-to-insert-utf-8-into-mysql-via-jdbc for details.

Note that the epoll branch uses utf8mb4 by default, so this issue is fixed.

I can see mysql_stmt_execute error: *Out of range value for column 'retained' at row 1* errors in the mail log

See #813 fix on Bitbucket.

My older emails are gone!

Follow these instructions to figure out if you really lost your emails. Chances are that you haven't!

  1. Check the metadata table, and verify that your emails are there.
  2. Find some of your “lost” emails under the /var/piler/store/00 directory. The files are named after their metadata.piler_id value. If you can find them, then your emails are in the archive.
  3. Double check with the pilerget utility: if e.g. pilerget 4000000057950fd127ea028c005a30f2f951 returns the email (don't forget to use your *actual* piler_id!), then the email is in the archive.
  4. Check whether the older IDs (the same as in metadata.id) are present in the sphinx database. This is not a mysql database; sphinx has its own database, which can be accessed with mysql -h 127.0.0.1 -P 9306. For example: select * from main1,dailydelta1,delta1 where id=123 (again, use the actual id!)
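
Step 2 can be done with find. A minimal sketch, assuming only that the stored file name contains the piler_id; the function name find_stored_email is made up for illustration, and the optional second argument lets you point it at a different store root.

```shell
#!/bin/sh
# List stored files whose name contains the given piler_id.
# $1: piler_id taken from the metadata table
# $2: store root, defaults to the directory mentioned above
find_stored_email() {
    find "${2:-/var/piler/store/00}" -type f -name "*$1*"
}

# Usage (with your actual piler_id):
#   find_stored_email 4000000057950fd127ea028c005a30f2f951
```

If the command prints a path, the message file is still in the store.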

Why am I getting a “cannot write current directory!” message?

Most of the piler binaries are setuid/setgid to piler:piler in order to read/write the store directory. This means that these commands run as user/group piler. The error message above is self-explanatory: the piler user can't write to the current directory. Solution: cd /tmp or cd /var/tmp or cd /var/piler/imap, i.e. cd to a directory where piler has rwx permissions.

What's the 'master branch'?

The master branch is the development version of piler, kind of like a nightly build. It contains all the latest (and probably unstable) code. That doesn't mean the master branch is buggy crap, but it does mean the code is not in its final shape. Just as with any other software, unless you are told otherwise, please use a stable version.

What archive and retention rule do I need to archive every email, and keep them for 5 years?

By default piler archives everything it receives, so you don't need any rules for that. In fact, the archiving rules define what NOT to archive. Also, the default_retention_days value in piler.conf applies to all emails, unless it's overridden in the retention rules. So in this case just set default_retention_days=1827, restart piler, and you are fine.
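
The 1827 figure is 5 x 365 days plus 2 leap days. For other retention periods the same arithmetic can be sketched as follows; the helper name retention_days is made up, and the leap day count is a simple upper estimate (one per started block of four years), not a calendar-exact calculation.

```shell
#!/bin/sh
# Print the number of days to keep emails for the given number of years.
# Adds one leap day per started block of four years.
retention_days() {
    echo $(( $1 * 365 + ($1 + 3) / 4 ))
}

echo "default_retention_days=$(retention_days 5)"
```

For 5 years this prints default_retention_days=1827.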

I can see lots of duplicates in the archive

The duplicate detection is based on the Message-ID header field. If the piler daemon detects an already known message-id, then it discards the email. As you can see, the Message-ID header is crucial. The piler daemon may archive a single message with a missing message-id field, and discards the subsequent messages having no message-id.

The pilerimport utility, however, accepts messages even without message-id, because it assumes you really want to archive those messages. So be sure to remove any duplicates before importing them with pilerimport, otherwise you'll have duplicates in the archive.

The only negative side effect of such duplicates is the extra resources they require, e.g. disk space, memory, etc.
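
To spot duplicates before running pilerimport, you can compare the Message-ID headers of the files yourself. The following is a rough sketch, not part of piler: the function names are made up, it assumes one message per file with an .eml extension, and it does not handle a Message-ID value folded onto the next header line.

```shell
#!/bin/sh
# Print the Message-ID value of one EML file, looking only at the header block.
message_id() {
    awk '
        /^\r?$/ { exit }    # an empty line ends the header block
        tolower($0) ~ /^message-id:/ {
            sub(/^[^:]*:[ \t]*/, "")
            sub(/\r$/, "")
            print
            exit
        }
    ' "$1"
}

# Print every file in directory $1 whose Message-ID was already seen in an
# earlier file. Files without a Message-ID are skipped.
list_duplicates() {
    for f in "$1"/*.eml; do
        [ -f "$f" ] || continue
        id=$(message_id "$f")
        [ -n "$id" ] && printf '%s\t%s\n' "$id" "$f"
    done | sort | awk -F '\t' 'seen[$1]++ { print $2 }'
}
```

Review the output of list_duplicates before deleting anything, since two different messages may in rare cases share a Message-ID.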

Some internal email sending applications don't know how to create a valid email header, and omit the Message-ID. They should be fixed ASAP!

I don't have a /var partition, how to make the disk space projection work?

Set the following in config-site.php:

$config['DATA_PARTITION'] = '/';

I've just sent a few test emails, however no search results in the gui

The piler daemon processes emails as soon as they are received. However, emails are indexed only every 30 minutes, so it may take up to 30 minutes for an email to become visible in the GUI. You may adjust the indexing frequency by editing the cron jobs of piler.

I can't see any results, however the sphinx query in the maillog reports 0 hits, and >0 total found:

Aug 11 20:21:36 mailpiler piler-webui[10670]: sphinx query: 'SELECT id FROM main1,dailydelta1,delta1 WHERE MATCH('@(subject,body)  whatever I need') ORDER BY `sent` DESC LIMIT 0,1000 OPTION max_matches=1000' in 0.29 s, 0 hits, 25508 total found

You are using an outdated and unsupported version of sphinx, typically 2.0.x on Debian or Ubuntu. Make sure you have a recent 2.2.x version. You may download it from http://sphinxsearch.com/downloads/release/.

I want to see more hits!

Edit sphinx.conf, and set the max_matches variable, eg:

searchd {
   ...
   max_matches = 5000
}

Then restart searchd. Finally edit config-site.php, and set MAX_SEARCH_HITS to the same number:

$config['MAX_SEARCH_HITS'] = 5000;

Note that you should keep this number at a sane value: the bigger it is, the more resources both searchd and php need.

I forgot the admin@local password

Connect to the piler database, and set the hashed password manually, e.g.:

mysql> update user set password='$5$TXL7EX$s17XtxwbCs1MDAzuulF/STauTkH0h/KJGHudlNQt3R4' where uid=0;

The above hash sets the admin@local password to pilerrocks
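
If you would rather set a password of your own, you need a hash in the same $5$ (SHA-256 crypt) format. A minimal sketch, assuming OpenSSL 1.1.1 or newer is installed and that the web UI accepts any standard crypt hash of this type (the example above suggests this, but it is not confirmed here); the function name sha256_crypt is made up.

```shell
#!/bin/sh
# Print a SHA-256 crypt hash (the $5$ format shown above) for a password
# read from standard input, so it does not show up in the process list.
sha256_crypt() {
    openssl passwd -5 -stdin
}

# Usage (not run here):
#   hash=$(printf '%s\n' 'my new password' | sha256_crypt)
#   mysql -u piler -p piler -e "update user set password='$hash' where uid=0"
```

The salt is chosen at random, so the same password gives a different hash on every run.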

How can I delete everything from the archive?

Follow the procedure below to do so, and it will reset the archive as if it were freshly installed.

#1: Stop searchd and piler:

/etc/init.d/rc.searchd stop
/etc/init.d/rc.piler stop

#2: Remove stored emails:

rm -rf /var/piler/store/00/*

#3: Drop and recreate the piler database:

mysql> drop database piler;
mysql> create database piler character set utf8;
mysql -u piler -p piler < /path/to/piler-source-directory/util/db-mysql.sql

#4: Reset the sphinx index data

rm -rf /var/piler/sphinx/*
su - piler
indexer --all

#5: Start searchd and piler:

/etc/init.d/rc.searchd start
/etc/init.d/rc.piler start 

Qmail (using as a smarthost) rejects emails from piler saying: 451 See http://pobox.com/~djb/docs/smtplf.html

Set the following in config-site.php:

$config['EOL'] = "\r\n";

I can see only today's emails in the archive, and none of the earlier ones.

Some Linux distributions (notably Debian and Ubuntu) have a daily cron job that reindexes everything. Unfortunately this ruins the sphinx index files piler relies on. However, the older emails are not lost: you still have them, they have just disappeared from the sphinx index. To bring them back, perform the following steps.

1. Edit /etc/default/sphinxsearch, and set START="no".

I recommend using the init.d/rc.searchd script shipped with piler to start searchd. You may call it from /etc/rc.local. (Note that it starts searchd as user piler, so make sure /var/piler/sphinx has proper ownership.)

2. Reindex the old emails. They should appear once the next indexing run is done.

cd /tmp
reindex -a 

How to add support for CJK (Chinese, Japanese, and Korean) languages?

Fix the sphinx indexer settings, ie. add the following two lines to all “index” sections in sphinx.conf:

ngram_len = 1
ngram_chars = U+3000..U+2FA1F

Finally chdir to /tmp, then reindex everything as the piler user.

cd /tmp
su piler
reindex -a 

I can't use the OVA with proxmox (qemu, kvm), because the filesystem is read-only, and showing disk errors.

Convert the vmdk files to qcow2 format.
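
An OVA file is a tar archive, so the vmdk disk images can be extracted with tar and then converted with qemu-img. A minimal sketch, assuming qemu-img is installed; the function names, the piler.ova file name and the /tmp/piler-ova directory are placeholders.

```shell
#!/bin/sh
# Derive the qcow2 file name from a vmdk file name.
qcow2_name() {
    printf '%s\n' "${1%.vmdk}.qcow2"
}

# Convert every vmdk image in directory $1 to qcow2, next to the original.
convert_vmdk_dir() {
    for disk in "$1"/*.vmdk; do
        [ -f "$disk" ] || continue
        qemu-img convert -f vmdk -O qcow2 "$disk" "$(qcow2_name "$disk")" || return 1
    done
}

# Usage (not run here):
#   mkdir -p /tmp/piler-ova
#   tar xf piler.ova -C /tmp/piler-ova
#   convert_vmdk_dir /tmp/piler-ova
```

The resulting qcow2 files can then be attached to the virtual machine in place of the vmdk disks.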

I get a lot of “Format a4 is redefined”, and other messages when importing.

These warnings or errors come from the external helper utilities, e.g. pdftotext.

They are usually harmless, since piler is interested only in the textual part. In the worst case the given attachment is simply not indexed, and thus not searchable, but the email is still archived, and can be retrieved without any problem.

Outlook 2007 can't open a downloaded EML file.

Check out http://www.msoutlook.info/question/354 for a registry hack.

You get the following error: “Error: SQLSTATE[HY000] [2003] Can't connect to MySQL server on '127.0.0.1' (111) on database: sphinx”

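The web UI cannot connect to searchd (error 111 means the connection was refused). Start searchd, e.g. with /etc/init.d/rc.searchd start.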

I have Centos / Redhat, and when I click on the “Apply” button in the gui, it says: “add the following to /etc/sudoers: 'www-data all=nopasswd: /etc/init.d/rc.piler reload'“

Add the following to /etc/sudoers:

apache ALL=NOPASSWD: /etc/init.d/rc.piler reload
Defaults:%apache !requiretty 

I get the following warning in mysql logs:

[Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. REPLACE... SELECT is unsafe because the order in which rows are retrieved by the SELECT determines which (if any) rows are replaced. This order cannot be predicted and may differ on master and the slave. Statement: REPLACE INTO sph_counter (counter_id, max_doc_id) SELECT 1, MAX(id) FROM sph_index

Try setting the MIXED format for binlog_format in the my.cnf file:

binlog_format = MIXED 

mysql -u piler -p piler < ./util/db-mysql.sql gives me the following error: “ERROR 1071 (42000) at line 109: Specified key was too long; max key length is 1000 bytes”

Make sure you have enabled the InnoDB storage engine, and try to create innodb tables (not myisam!).

I have lots of email addresses (100+) in the search query, and the query in the mail log is truncated, not ending with LIMIT xx,yy

Increase the thread_stack variable, and restart the searchd daemon. Be sure to read http://sphinxsearch.com/docs/latest/conf-thread-stack.html!

After importing EML files extracted from a PST file, some emails have “From: 'Some Name' <MAILER-DEAMON>” (or the To: field is “redacted”) in the email header.

It's not a piler bug: the problematic emails have bogus From and/or To headers. The solution is to fix them BEFORE importing. If you realise it only after importing, then purge the archive if you can afford to (see the FAQ item above on how to delete everything from the archive), fix the emails, and import them again.

How much memory does sphinx require?

See the explanation at http://sphinxsearch.com/blog/2011/11/11/sphinx-memory-consumption/.
