Pablo Iranzo Gómez's blog

Aug 28, 2015

Filtering email with imapfilter

For some time, email filter management had not been scaling for me: I was using server-side filtering, so I had to deal with a web-based interface that was missing features like drag&drop reordering of rules, cloning, etc.

As I was already using offlineimap to sync the remote mail server to a local maildir folder, I had almost all the elements I needed.

After evaluating several options, imapfilter seemed a perfect fit, so I started with a small set of rules and began integrating it into my email workflow.

On my first attempts, I set up a pre-sync hook in offlineimap alongside the post-sync hook I already had:

presynchook  = time imapfilter
postsynchook = ~/.mutt/postsync-offlineimap.sh
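For reference, these hook lines live in the account section of ~/.offlineimaprc; a minimal sketch of the surrounding file, with hypothetical account and repository names:

```shell
# ~/.offlineimaprc (fragment) -- account and repository names are made up
[general]
accounts = Example

[Account Example]
localrepository = Local
remoterepository = Remote
presynchook  = time imapfilter
postsynchook = ~/.mutt/postsync-offlineimap.sh
```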

Initial attempts were not good at all: applying filters on the remote IMAP server was very time-consuming, and my usual 1-minute delay between checks was becoming a real 10-15 minute interval because of the filtering. This did not scale as I kept adding new rules.

After some tries, and as I already had all the email synced offline, I moved the filtering to run locally instead of server-side. But as imapfilter requires an IMAP server, I tricked dovecot into serving the local maildir folder via IMAP:

protocols = imap
mail_location = maildir:~/.maildir/FOLDER/:INBOX=~/.maildir/FOLDER/.INBOX/
auth_debug_passwords=yes

This also required changing my folder names to start with ".", so I had to adapt my mutt configuration as well:

set mask=".*"

and my mailfolders script:

set mbox_type=Maildir
set folder="~/.maildir/FOLDER"
set spoolfile="~/.maildir/FOLDER/.INBOX"

#mailboxes `echo -n "+ "; find ~/.cache/notmuch/mutt/results ~/.maildir/FOLDER -type d -not -name 'cur' -not -name 'new' -not -name 'tmp' -not -name '.notmuch' -not -name 'xapian' -not -name 'FOLDER' -printf "+'%f' "`

mailboxes `find ~/.maildir/FOLDER -type d -name cur -printf '%h '|tr " " "\n"|grep -v "^/home/iranzo/.maildir/FOLDER$"|sort|xargs echo`
#Store reply on current folder
folder-hook . 'set record="^"'
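The mailboxes line above discovers maildir folders by locating their cur subdirectories and printing each parent directory. The same idea in isolation, against a throwaway tree (folder names are made up):

```shell
# Build a tiny maildir-like tree and enumerate its folders the way the
# mailboxes line does: find each 'cur' dir and print its parent (%h).
tmp=$(mktemp -d)
mkdir -p "$tmp/FOLDER/.INBOX/cur" "$tmp/FOLDER/.Lists/cur" "$tmp/FOLDER/tmp"
find "$tmp/FOLDER" -type d -name cur -printf '%h\n' | sed "s|^$tmp/||" | sort
rm -rf "$tmp"
```

This prints FOLDER/.INBOX and FOLDER/.Lists; directories without a cur subfolder (like tmp) are skipped, which is what keeps non-mailbox directories out of mutt's sidebar.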

After this, I could start using imapfilter and working on my set of rules... but a first problem appeared: I started seeing some duplicated email, as I was cancelling and rerunning the script while debugging. So a new tool named IMAPdedup was introduced to 'dedup' my IMAP folders, driven by a small script:

#!/bin/bash
(
for folder in $(python ~/.bin/imapdedup.py -s localhost  -u iranzo    -w '$PASSWORD'  -m -c -v  -l)
do
    python ~/.bin/imapdedup.py -s localhost  -u iranzo    -w '$PASSWORD'  -m -c  "$folder"

done
) 2>&1|grep "will be marked as deleted"

This script takes care of listing all email folders on 'localhost' with my username and password (which can be scripted or gathered with external tools) and dedups email after each sync. It runs in my postsync-offlineimap.sh, together with the lbdb script for fetching new addresses, notmuch indexing, and a run of imapfilter after syncing (to catch the limited filtering I still do server-side).
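The core idea behind IMAPdedup can be illustrated outside IMAP: duplicates are messages sharing a Message-ID, and every copy past the first is a candidate for deletion. A standalone sketch with made-up messages:

```shell
# Create three sample messages, two sharing a Message-ID (made-up data),
# then print the IDs that appear more than once.
tmp=$(mktemp -d)
printf 'Message-ID: <a@example>\nSubject: one\n'  > "$tmp/msg1"
printf 'Message-ID: <a@example>\nSubject: copy\n' > "$tmp/msg2"
printf 'Message-ID: <b@example>\nSubject: two\n'  > "$tmp/msg3"
# uniq -d only prints lines that occur more than once
grep -h '^Message-ID:' "$tmp"/msg* | sort | uniq -d
rm -rf "$tmp"
```

This prints `Message-ID: <a@example>`; IMAPdedup does the equivalent over IMAP, marking the surplus copies as deleted.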

I still do some server-side filtering (4 rules) around a "Pending sort" (_pending) folder:

  • New support cases remain in INBOX
  • All emails from case updates, bugzilla, etc. go to _pending
  • All emails containing 'list' or 'bounces' in From go to _pending
  • All emails not addressing me directly in CC or To go to _pending

This more or less ensures a clean INBOX with most important things still there, and easier rule handling for email sorting.

So, after some tests, this is at the moment a simplified version of my filtering file:

---------------
--  Options  --
---------------

options.timeout = 30
options.subscribe = true
options.create = false

function offlineimap (key)
    local status
    local value
    status, value = pipe_from('grep -A2 ACCOUNT ~/.offlineimaprc | grep -v ^#|grep ' .. key .. '|cut -d= -f2')
    value = string.gsub(value, ' ', '')
    value = string.gsub(value, '\n', '')
    return value
end

----------------
--  Accounts  --
----------------

-- Connects to "imap1.mail.server", as user "user1" with "secret1" as
-- password.
EXAMPLE = IMAP {
    server = 'localhost',
    username = 'iranzo',
    password = '$PASSWORD',
    port = 143
}
-- My email
myuser = 'ranzo'

function mine(messages)
    email=messages:contain_cc(myuser)+messages:contain_to(myuser)+messages:contain_from(myuser)
    return email
end

function filter(messages,email,destination)
    messages:contain_from(email):move_messages(destination)
    messages:contain_to(email):move_messages(destination)
    messages:contain_cc(email):move_messages(destination)
    messages:contain_field('sender', email):move_messages(destination)
end

function deleteold(messages,days)
    todelete=messages:is_older(days)-mine(messages)
    todelete:move_messages(EXAMPLE['Trash'])
end


-- Define the msgs we're going to work on

-- Move sent messages to INBOX to later sorting
sent = EXAMPLE.Sent:select_all()
sent:move_messages(EXAMPLE['INBOX'])

inbox = EXAMPLE['INBOX']:select_all()
pending = EXAMPLE['INBOX/_pending']:select_all()
todos = pending + inbox

-- Mark as read messages sent from my user
todos:contain_from(myuser):is_recent():mark_seen()

-- Delete google calendar forwards
todos:contain_to('piranzo@gapps.example.com'):delete_messages()

-- Move all spam messages to Junk folder
spam = todos:contain_field('X-Spam-Score','*****')
spam:move_messages(EXAMPLE['Junk'])

-- Move Jive notifications
filter(todos,'jive-notify@example.com',EXAMPLE['INBOX/EXAMPLE/Customers/_jive'])

-- Filter EXAMPLEN
filter(todos,'dev-null@rhn.example.com',EXAMPLE['Trash'])

-- Filter PNT
filter(todos:contain_subject('[PNT] '),'noreply@example.com',EXAMPLE['Trash'])

-- Filter CPG (Customer Private Group)
filter(todos:contain_subject('Red Hat - Group '),'noreply@example.com',EXAMPLE['INBOX/EXAMPLE/Customers/Other/CPG'])

-- Remove month start reminders
todos:contain_subject('mailing list memberships reminder'):delete_messages()

-- Delete messages about New accounts created (RHN)
usercreated=todos:contain_subject('New Red Hat user account created')*todos:contain_from('noreply@example.com')
usercreated:delete_messages()

-- Search messages from CPG's
cpg = EXAMPLE['INBOX/EXAMPLE/Customers/Other/CPG']:select_all()
cpg:contain_subject('Cust1'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust1/CPG'])
cpg:contain_subject('Cust2'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust2/CPG'])
cpg:contain_subject('Cust3'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust3/CPG'])
cpg:contain_subject('Cust4'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust4/CPG'])

-- Move bugzilla messages
filter(todos:contain_subject('] New:'),'bugzilla@example.com',EXAMPLE['INBOX/EXAMPLE/Customers/_bugzilla/new'])
filter(todos,'bugzilla@example.com',EXAMPLE['INBOX/EXAMPLE/Customers/_bugzilla'])

-- Move all support messages to Other for later processing
filter(todos:contain_subject('(NEW) ('),'support@example.com',EXAMPLE['INBOX/EXAMPLE/Customers/_new'])
filter(todos:contain_subject('Case '),'support@example.com',EXAMPLE['INBOX/EXAMPLE/Customers/Other/cases'])

EXAMPLE['INBOX/EXAMPLE/Customers/_new']:is_seen():move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Other/cases'])

-- Restart the search only for messages in Other so new rules are also applied to them
support = EXAMPLE['INBOX/EXAMPLE/Customers/Other/cases']:select_all()

support:contain_subject('is about to breach its SLA'):delete_messages()
support:contain_subject('has breached its SLA'):delete_messages()
support:contain_subject(' has had no activity in '):delete_messages()

-- Here the process is customer after customer and mark as read messages from non-prio customers
support:contain_body('Cust1'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust1/cases'])
support:contain_body('Cust2'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust2/cases'])
support:contain_body('Cust3'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust3/cases'])
support:contain_body('Cust4'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust4/cases'])

-- For customers with common matching names, use a header field
support:contain_field('X-SFDC-X-Account-Number', 'XXXX'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust5/cases'])
support:contain_body('Customer         : COMMONNAME'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust6/cases'])

-- Non prio customers (mark updates as read)
cust7 = support:contain_body('WATCHINGCUST') + support:contain_body('Cust7')
cust7:mark_seen()
cust7:move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust7/cases'])

-- Filter other messages by domain
filter(todos,'todos.es', EXAMPLE['INBOX/EXAMPLE/Customers/Cust8'])

-- Process all remaining messages in INBOX + all read messages in pending-sort for mailing lists and move to lists folder
filter(todos,'list', EXAMPLE['INBOX/Lists'])
filter(todos,'bounces',EXAMPLE['INBOX/Lists'])

-- Add EXAMPLE lists, inbox and _pending and Fedora default bin for reprocessing in case a new list has been added
lists = todos + EXAMPLE['INBOX/Lists']:select_all() + EXAMPLE['INBOX/Lists/Fedora']:select_all()

-- Mailing lists

-- EXAMPLE
filter(lists,'outages-list',EXAMPLE['INBOX/Lists/EXAMPLE/general/outage'])
filter(lists,'announce-list',EXAMPLE['INBOX/Lists/EXAMPLE/general/announce'])

-- Fedora
filter(lists,'kickstart-list',EXAMPLE['INBOX/Lists/Fedora/kickstart'])
filter(lists,'ambassadors@lists.fedoraproject.org',EXAMPLE['INBOX/Lists/Fedora/Ambassador'])
filter(lists,'infrastructure@lists.fedoraproject.org',EXAMPLE['INBOX/Lists/Fedora/infra'])
filter(lists,'announce@lists.fedoraproject.org',EXAMPLE['INBOX/Lists/Fedora/announce'])
filter(lists,'lists.fedoraproject.org',EXAMPLE['INBOX/Lists/Fedora'])

-- OSP
filter(lists,'openstack@lists.openstack.org',EXAMPLE['INBOX/Lists/OpenStack'])
filter(lists,'openstack-es@lists.openstack.org',EXAMPLE['INBOX/Lists/OpenStack/es'])

-- Filter my messages not filtered back to INBOX
mios=pending:contain_from(myuser)
mios:move_messages(EXAMPLE['INBOX'])

-- move messages we're in BCC to INBOX for manual sorting
hidden = pending - mine(pending)
hidden:move_messages(EXAMPLE['INBOX'])

-- Start processing of messages older than:
maxage=60

-- Delete old messages from mailing lists
deleteold(EXAMPLE['INBOX/Lists/EXAMPLE/general/media'],maxage)
deleteold(EXAMPLE['INBOX/Lists/EXAMPLE/general/outage'],maxage)

-- delete old cases
maxage=180

-- for each in $(cat .imapfilter/config.lua|grep -i cases|tr " ,()" "\n"|grep cases|sort|uniq|grep -v ":" );do echo "deleteold($each,maxage)";done
deleteold(EXAMPLE['INBOX/EXAMPLE/Customers/Cust1/cases'],maxage)
deleteold(EXAMPLE['INBOX/EXAMPLE/Customers/Cust2/cases'],maxage)
deleteold(EXAMPLE['INBOX/EXAMPLE/Customers/Cust3/cases'],maxage)
deleteold(EXAMPLE['INBOX/EXAMPLE/Customers/Other/cases'],maxage)

deleteold(EXAMPLE['INBOX/EXAMPLE/Customers/_bugzilla'],maxage)

-- Empty trash every 7 days
maxage=7
deleteold(EXAMPLE['Trash'],maxage)

As this applies filtering twice, offlineimap might already be uploading part of your changes, making the next syncs faster, and it may shuffle some of your emails while it runs.

The point of feeding the already-filtered set through the filters again (CPG, cases, etc.) is that if a new customer is later considered worth a folder of its own, the existing messages will be picked up and moved accordingly automatically ;-)

Hope it helps, and happy filtering!


Mar 28, 2015

Install RHEL7/Centos/Fedora on a software raid device

Installing Linux on a RAID has lots of advantages: RAID1 offers protection against drive failures, while RAID0 combines the size of several drives to create a bigger space for files out of all the smaller disks we have.

There are several RAID level definitions, each with different uses depending on our needs and hardware availability.

For this setup I focused on using raid1 for the system disks (for greater redundancy/protection against failures) and raid0 (for combining several disks into a bigger space for non-important data).

Why (or why not) use software RAID

Pros

  • There's no proprietary data on the disks that would require that specific controller in case the hardware fails.
  • It can be performed on any system, disk combination, etc.

Cons

  • Dedicated HW RAID cards offload the CPU-intensive tasks of RAID calculation, etc. to their own processor, freeing the main CPU for system/user usage.
  • Dedicated cards may have fancier features that require no support from the operating system, as they are all implemented by the card itself and presented to the OS as a standard drive.

Performing the setup

As I was installing on an HP Microserver G8 recently, I first had to disable the advanced mode of the included controller so it behaved like a standard SATA one; once done, I was able to boot from my OS image (in this case the EL7 ISO).

Once the ISO booted in rescue mode, I could switch to the second console with ALT-F2 and start executing commands in the shell.

The first step is to set up partitioning; in this case I created two partitions: the first one for holding /boot, and the second one for the LVM physical volume where the other logical volumes will be defined later.

I chose this setup over others because mdadm offers transparent support for booting (grub supports booting from it) and an easy-to-manage setup.

For partitions, remember to allocate at least 500MB for /boot and as much as needed for your OS; for example, if only the base OS is expected to have RAID protection, a 20GB partition will be enough, leaving the remaining disk for a RAID0 device holding non-critical files.

For both partitions, set the type in fdisk to fd (Linux RAID autodetect), and partition the two drives we'll use for the initial setup with the same values, for example:

fdisk /dev/sda
n # for new partition
p # for primary
<ENTER> # for first sector
+500M # for size
t # for type
fd # for Linux RAID autodetect
n # new partition
p # primary
<ENTER>
+20G #for size
t #for type
2 # for select 2nd partition
fd # for Linux RAID autodetect
n # for new partition
p # for primary
<ENTER> # for first sector
<ENTER> # for remaining disk
t # for type
3 # for third partition
fd # for Linux RAID Autodetect
w # for Writing changes

And repeat that for /dev/sdb.

At this point, we'll have both sda and sdb with the same partitions defined: sd{a,b}1 with 500MB for /boot, sd{a,b}2 with 20GB for LVM, and the remaining disk (sd{a,b}3) for the RAID0 LVM.

Now it's time to create the RAID devices on top; for simplicity, I tend to use md0 for /boot, so let's start with it.

Creating the RAID devices with mdadm (Multiple Devices)

Let's create the RAID devices for each filesystem, starting with /boot:

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/sda3 /dev/sdb3

Now, check the status of the raid device creation by issuing:

cat /proc/mdstat

Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sda1[0] sdb1[1]
      534760 blocks level 1, 64k chunk, algorithm 2 [2/2] [UU]
            [==>..................]  recovery = 12.6% (37043392/292945152) finish=127.5min speed=33440K/sec
md1 : active raid1 sda2[0] sdb2[1]
      20534760 blocks level 1, 64k chunk, algorithm 2 [2/2] [UU]
            [=====>...............]  recovery = 25.9% (37043392/692945152) finish=627.5min speed=13440K/sec
...

When it finishes, all the devices will appear as synced, and we can start the installation of the operating system.
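From a script, the resync progress can be scraped out of /proc/mdstat; a minimal sketch that pulls the percentage from a progress line (the sample text below is taken from the output above):

```shell
# Extract the completion percentage from a /proc/mdstat progress line.
line='      [==>..................]  recovery = 12.6% (37043392/292945152) finish=127.5min speed=33440K/sec'
echo "$line" | grep -o '[0-9.]*%'
```

This prints `12.6%`; looping until no recovery line remains is one way to wait for the sync to finish before rebooting.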

What I did after this point was to reboot into the install media, so I could use the anaconda installer to select the filesystems manually, creating /boot on /dev/md0 and the physical volume for the operating system on /dev/md1.

Select manual partitioning during the installation to assign the above devices their intended usage; once the system is installed, create the additional physical volume on /dev/md2 and define the intended mountpoints, etc.

Enjoy!

Click to read and post comments

Mar 16, 2015

Podcasts with flexget and transmission

Some podcasts are available via RSS feeds, so you can get notified of new episodes; the best way I've found so far to automate this is the 'flexget' utility.

Flexget can download an RSS feed, fetch the .torrent files associated with its entries and store them locally, which makes a perfect fit for Transmission's watch folder: new torrents are automatically added to your download queue.

In order to do so, install flexget either via pip (pip install flexget) or using a package for your distribution, and create a configuration file similar to this:

cat ~/.flexget/config.yml

tasks:
  download-rss:
    rss: http://URL/TO/YOUR/PODCAST/FEED
    all_series: yes
    only_new: yes
    download: /media/watch/

On each invocation of flexget execute, it will access the RSS feed, search for new entries and store the relevant .torrent files in the /media/watch folder, from where Transmission will pick them up and add them to your downloading queue for automatic download.
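To keep this hands-off, flexget can be run periodically; a hypothetical crontab entry (the binary path and the 30-minute schedule are assumptions, adjust to taste):

```shell
# m h dom mon dow  command -- run flexget every 30 minutes
*/30 * * * * /usr/bin/flexget execute >> ~/.flexget/cron.log 2>&1
```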

posted at 21:45 · fedora

Dec 08, 2011

Synchronization/backup tools

Some time ago, I was looking for a way to simplify, on one hand, the synchronization of documents, photos, etc. between several computers, and on the other, the use of that copy on other machines as a 'backup' in case of a hardware failure on any of them.

I tried and evaluated several cross-platform tools; among others, these are the ones I tested most:

Currently I need to 'save/sync' about 35GB of data (mostly photos, and not exactly in RAW), which prompted these comparisons, preferably of services offering free space.

In summary, the open-source options (Syncany/Sparkleshare) are still in their infancy, although very promising; Syncany more so than Sparkleshare because of the variety of possible copy destinations (Sparkleshare needs a git repository, etc.).

Among the 'commercial' ones, Wuala offered the possibility of sharing local space in exchange for cloud space, letting you trade those unused GB for GB stored away from where your own computers live, guaranteeing access to the data in case of physical or environmental problems, etc. But in October 2011, with the release of the new version, they stopped offering that service, turning the earned space into a 'voucher' valid until October 2012... which means I have to replace the solution I had been using.

Dropbox, despite referrals and a '.edu' email account, doesn't offer more than 16GB outside of paid plans, which rules it out for now (besides not allowing you to specify an encryption key for your data, which on the other hand means that if someone had already uploaded the same file before, the upload is instantaneous).

SpiderOak offers less space (but starts at 5GB, and it's now possible to reach up to 50GB via referrals), has a yum repository for updates and can run both in 'GUI' mode and from the command line (on the downside, synchronization is triggered periodically and doesn't feel as fluid as Dropbox).

AeroFS: a newcomer, still in alpha, Java-based, cross-platform (EL6 or higher), which offers no online space out of the box, just synchronization between your machines: you can make changes on any of them and they sync with each other, even without Internet connectivity (Internet is only needed to register a new folder or a new machine, for syncing authorizations with the AeroFS server).

Currently AeroFS is the most promising one, since it doesn't limit space beyond what your machines hold, but with my document collection and 4 machines at once I've had problems with files renamed "ending with numbers in parentheses", slow syncs (using an interface via Hamachi instead of the local network interface), symbolic links not being synced, and destruction of hard-linked files (only one copy is kept). Still, it's the only one that currently covers my needs with respect to what Wuala did (except for not having mobile apps and, of course, not working if no machine holding the data is online).

How Syncany and AeroFS evolve between now and October will decide whether I pick one of these two or keep evaluating backup alternatives... for now I'm staying with Dropbox for quick things that don't need much privacy, Wuala for the rest, and I've added duplicity to back up to a NAS on my local network while I keep testing AeroFS and see how Syncany is coming along...


Trying new desktops

Years ago, with the arrival of KDE 3, I moved from KDE to GNOME: more 'spartan', but with better performance.

With Fedora 15 I moved to GNOME 3, quite 'innovative', somewhat strange, but I got used to it relatively well. However, with the arrival of Fedora 16 and version 3.2, I notice the machine (still the same one) suffering frequent hiccups and slowness.

One option would be to upgrade from 2GB of memory to 4GB, but that seems like throwing money away when the machine will need replacing within the next year (a dual P4 2.8GHz from 2004).

So I gave it a try, first with XFCE, which runs better but still feels slow to respond, and for now with LXDE, an environment very similar to GNOME 2.x: fast, with desktop icons, etc., but with a few drawbacks.

Better:

  • Speed
  • Desktop icons
  • Terminal speed, both in XFCE and LXDE compared to gnome-terminal, with very similar key bindings for new tabs, etc.

Worse:

  • The file manager doesn't sort alphabetically in a correct way
  • Some settings, such as the key combination to lock the session, must be configured with a text editor
  • I miss GNOME 3's 'exposé' now that I had gotten used to it
  • Sometimes Alt-F2 opens the run dialog behind the active window and you have to Alt-Tab to bring it to the front.

For now I'll keep LXDE on two machines and see whether I end up putting it on all of them or look for yet another alternative.

posted at 11:13 · fedora