Pablo Iranzo Gómez's blog

A bunch of unrelated data

oct 16, 2018

Contributing to OSP upstream a.k.a. Peer Review

Table of contents

  1. Introduction
  2. Upstream workflow
    1. Peer review
    2. CI tests (Verified +1)
    3. Code Review+2
    4. Workflow+1
    5. Cannot merge, please rebase
  3. How do we do it with Citellus?


In the article "Contributing to OpenStack" we did cover on how to prepare accounts and prepare your changes for submission upstream (and even how to find low hanging fruits to start contributing).

Here, we'll cover what happens behind the scene to get change published.

Upstream workflow

Peer review

Upstream contributions to OSP and other projects are based on Peer Review, that means that once a new set of code has been submitted, several steps for validation are required/happen before having it implemented.

The last command executed (git-review) on the submit sequence (in the prior article) will effectively submit the patch to the defined git review service (git-review -s does the required setup process) and will print an URL that can be used to access the review.

Each project might have a different review platform, but usually for OSP it's while for other projects it can be,, etc (this is defined in .gitreview file in the repository).

A sample .gitreview file looks like:


For a review example, we'll use one from gerrithub from Citellus project:

Here, we can see that we're on review 380646 and that's the link that allows us to check the changes submitted (the one printed when executing git-review).

CI tests (Verified +1)

Once a review has been submitted, usually the bots are the first ones to pick them and run the defined unit testing on the new changes, to ensure that it doesn't break anything (based on what is defined to be tested).

This is a critical point as:

  • Tests need to be defined if new code is added or modified to ensure that later updates doesn't break this new code without someone being aware.
  • Infrastructure should be able to test it (for example you might need some specific hardware to test a card or network configuration)
  • Environment should be sane so that prior runs doesn't affect the validation.

OSP CI can be checked at 'Zuul' where you can 'input' the number for your review and see how the different bots are running CI tests on it or if it's still queued.

If everything is OK, the bot will 'vote' your change as Verified +1 allowing others to see that it should not break anything based on the tests performed

In the case of OSP, there's also third-party CI's that can validate other changes by third party systems. For some of them, the votes are counting towards or against the proposed change, for others it's just a comment to take into account.

Even if sometimes you know that your code is right, there's a failure because of the infrastructure, in those cases, writing a new comment saying recheck, will schedule a new CI test run.

This is common usually during busy periods when it's harder for the scheduler to get available resources for the review validation. Also, sometimes there are errors in the configuration of CI that must be fixed in order to validate those changes.

Note: you can run some of the tests on your system to validate faster if you've issues by running tox this will setup virtual environment for tests to be run so it's easier to catch issues before upstream CI does (so it's always a good idea to run tox even before submitting the review with git-review to detect early errors).

This is however not always possible as some changes include requirements like testing upgrades, full environment deployments, etc that cannot be done without the required preparation steps or even the infrastructure.

Code Review+2

This is probably the 'longest' process, it requires peers to be added as 'reviewer' (you can get an idea on the names based on other reviews submitted for the same component) or they will pick up new reviews as the pop un on notification channels or pending queues.

On this, you must prepare mentally for everything... developers could suggest to use a different approach, or highlight other problems or just do some small nit comments to fixes like formating, spacing, var naming, etc.

After each comment/change suggested, repeat the workflow for submitting a new patchset, but make sure you're using the same review id (that's by keeping the commit id that is appended): this allows the Code Review platform to identify this change as an update to a prior one, and allow you for example to compare changes across versions, etc. (and also notify the prior reviewers of new changes).

Once reviewers are OK with your code, and with some 'Core' developers also agreeing, you'll see some voting happening (-2..+2) meaning they like the change in its actual form or not.

Once you get Code Review +2 and with the prior Verified +1 you're almost ready to get the change merged.


Ok, last step is to have someone with Workflow permissions to give a +1, this will 'seal' the change saying that everything is ok (as it had CR+2 and Verified+1) and change is valid...

This vote will trigger another build by CI, and when finished, the change will be merged into the code upstream, congratulations!

Cannot merge, please rebase

Sometimes, your change is doing changes on the same files that other programmers did on the code, so there's no way to automatically 'rebase' the change, in this case the bad news is that you need to:

git checkout master # to change to master branch
git pull # to push latest upstream changes
git checkout yourbranch # to get back to your change branch
git rebase master # to apply your changes on top of current master

After this step, it might be required to manually fix the code to solve the conflicts and follow instructions given by git to mark them as reviewed.

Once it's done, remember to do like with any patchset you submited afterwards:

git commit --amend # to commit the new changes on the same commit Id you used
git-review # to upload a new version of the patchset

This will start over the progress, but will, once completed to get the change merged.

How do we do it with Citellus?

In Citellus we've replicated more or less what we've upstream... even the use of tox.

Citellus does use (free service that hooks on github and allows to do PR)

We've setup a machine that runs Jenkins to do 'CI' on the tests we've defined (mostly for python wrapper and some tests) and what effectively does is to run tox, and also, we do use free Tier to repeat the same on other platform.

Tox is a tool that allows to define several commands that are executed inside python virtual environments, so without touching your system libraries, it can get installed new ones or removed just for the boundaries of that test, helping into running:

  • pep8 (python formating compliance)
  • py27 (python 2.7 environment test)
  • py35 (python 3.5 environment test)

The py tests are just to validate the code can run on both base python versions, and what they do is to run the defined unit testing scripts under each interpreter to validate.

For local test, you can run tox and it will go trough the different tests defined and report status... if everything is ok, it should be possible that your new code review passes also CI.

Jenkins will do the +1 on verified and 'core' reviewers will give +2 and 'merge' the change once validated.

Hope you enjoy!


Click to read and post comments

ago 28, 2015

Filtering email with imapfilter

Since some time ago, email filter management was not scaling for me as I was using server-side filtering, I had to deal with the web-based interface which was missing some elements like drag&drop reordering of rules, cloning, etc.

As I was already using offlineimap to sync from the remote mailserver to my system into a maildir folder, I had almost all the elements I needed.

After searching for several options imapfilter seemed to be a perfect fit, so I started with a small set of rules and start integration with my email process.

On my first attempts, I setup a pre-sync hook on offlineimap by using as well as the postsync hook I already had:

presynchook  = time imapfilter
postsynchook = ~/.mutt/

Initial attempts were not good at all, applying filters on the remote imapserver was very time consuming and my actual 1 minute delay after finishing one check was becoming a real 10-15 minute interval between checks because of the imapfiltering and this was not scaling as I was putting new rules.

After some tries, and as I already had all the email synced offline, moved filtering to be locally instead of server-side, but as imapfilter requires an imap server, I tricked dovecot into using the local folder to be offered via imap:

protocols = imap
mail_location = maildir:~/.maildir/FOLDER/:INBOX=~/.maildir/FOLDER/.INBOX/

This also required to change my foldernames to use "." in front of them, so I needed to change mutt configuration too for this:

set mask=".*"

and my mailfoders script:

set mbox_type=Maildir
set folder="~/.maildir/FOLDER"
set spoolfile="~/.maildir/FOLDER/.INBOX"

#mailboxes `echo -n "+ "; find ~/.cache/notmuch/mutt/results ~/.maildir/FOLDER -type d -not -name 'cur' -not -name 'new' -not -name 'tmp' -not -name '.notmuch' -not -name 'xapian' -not -name 'FOLDER' -printf "+'%f' "`

mailboxes `find ~/.maildir/FOLDER -type d -name cur -printf '%h '|tr " " "\n"|grep -v "^/home/iranzo/.maildir/FOLDER$"|sort|xargs echo`
#Store reply on current folder
folder-hook . 'set record="^"'

After this, I could start using imapfilter and start working on my set of rules... but first problem appeared, apparently I started having some duplicated email as I was cancelling and rerunning the script while debugging so a new tool was also introduced to 'dedup' my imap folder named IMAPdedup with a small script:

for folder in $(python ~/.bin/ -s localhost  -u iranzo    -w '$PASSWORD'  -m -c -v  -l)
    python ~/.bin/ -s localhost  -u iranzo    -w '$PASSWORD'  -m -c  "$folder"

) 2>&1|grep "will be marked as deleted"

This script was taking care of listing all email foders on 'localhost' with my username and password (can be scripted or use external tools to gather it) and dedup email after each sync (in my as well as lbdq script for fetchning new addresses, notmuch and running imapfilter after syncing (to cath the limited filtering I do sever-side)

I still do some server-side filtering (4 rules), to get on a "Pending sort" folder all email which is either:

  • New support cases remain at INBOX
  • All emails from case updates, bugzilla, etc to _pending
  • All emails containing 'list' or 'bounces' in from to _pending
  • All emails not containing me directly on CC or To, to _pending

This more or less ensures a clean INBOX with most important things still there, and easier rule handling for email sorting.

So, after some tests, this is at the moment a simplified version of my filtering file:

--  Options  --

options.timeout = 30
options.subscribe = true
options.create = false

function offlineimap (key)
    local status
    local value
    status, value = pipe_from('grep -A2 ACCOUNT ~/.offlineimaprc | grep -v ^#|grep '.. key ..'|cut -d= -f2')C
        value = string.gsub(value, ' ', '')
        value = string.gsub(value, '\n', '')
        return value

--  Accounts  --

-- Connects to "imap1.mail.server", as user "user1" with "secret1" as
-- password.
    server = 'localhost',
    username = 'iranzo',
    password = '$PASSWORD',
    port = 143
-- My email
myuser = 'ranzo'

function mine(messages)
    return email

function filter(messages,email,destination)
    messages:contain_field('sender', email):move_messages(destination)

function deleteold(messages,days)

-- Define the msgs we're going to work on

-- Move sent messages to INBOX to later sorting
sent = EXAMPLE.Sent:select_all()

inbox = EXAMPLE['INBOX']:select_all()
pending = EXAMPLE['INBOX/_pending']:select_all()
todos = pending + inbox

-- Mark as read messages sent from my user

-- Delete google calendar forwards

-- Move all spam messages to Junk folder
spam = todos:contain_field('X-Spam-Score','*****')

-- Move Jive notifications

-- Filter EXAMPLEN

-- Filter PNT
filter(todos:contain_subject('[PNT] '),'',EXAMPLE['Trash'])

-- Filter CPG (Customer Private Group)
filter(todos:contain_subject('Red Hat - Group '),'',EXAMPLE['INBOX/EXAMPLE/Customers/Other/CPG'])

-- Remove month start reminders
todos:contain_subject('mailing list memberships reminder'):delete_messages()

-- Delete messages about New accounts created (RHN)
usercreated=todos:contain_subject('New Red Hat user account created')*todos:contain_from('')

-- Search messages from CPG's
cpg = EXAMPLE['INBOX/EXAMPLE/Customers/Other/CPG']:select_all()

-- Move bugzilla messages
filter(todos:contain_subject('] New:'),'',EXAMPLE['INBOX/EXAMPLE/Customers/_bugzilla/new'])

-- Move all support messages to Other for later processing
filter(todos:contain_subject('(NEW) ('),'',EXAMPLE['INBOX/EXAMPLE/Customers/_new'])
filter(todos:contain_subject('Case '),'',EXAMPLE['INBOX/EXAMPLE/Customers/Other/cases'])


support = EXAMPLE['INBOX/EXAMPLE/Customers/Other/cases']:select_all()
-- Restart the search only for messages in Other to also process if we have new rules

support:contain_subject('is about to breach its SLA'):delete_messages()
support:contain_subject('has breached its SLA'):delete_messages()
support:contain_subject(' has had no activity in '):delete_messages()

-- Here the process is customer after customer and mark as read messages from non-prio customers

-- For customer swith common matching names, use header field
support:contain_field('X-SFDC-X-Account-Number', 'XXXX'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust5/cases'])
support:contain_body('Customer         : COMMONNAME'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust6/cases'])

-- Non prio customers (mark updates as read)
cust7 = support:contain_body('WATCHINGCUST') + support:contain_body('Cust7')

-- Filter other messages by domain
filter(todos,'', EXAMPLE['INBOX/EXAMPLE/Customers/Cust8'])

-- Process all remaining messages in INBOX + all read messages in pending-sort for mailing lists and move to lists folder
filter(todos,'list', EXAMPLE['INBOX/Lists'])

-- Add EXAMPLE lists, inbox and _pending and Fedora default bin for reprocessing in case a new list has been added
lists = todos + EXAMPLE['INBOX/Lists']:select_all() + EXAMPLE['INBOX/Lists/Fedora']:select_all()

-- Mailing lists


-- Fedora

-- OSP

-- Filter my messages not filtered back to INBOX

-- move messages we're in BCC to INBOX for manual sorting
hidden = pending - mine(pending)

-- Start processing of messages older than:

-- Delete old messages from mailing lists

-- delete old cases

-- for each in $(cat .imapfilter/config.lua|grep -i cases|tr " ,()" "\n"|grep cases|sort|uniq|grep -v ":" );do echo "deleteold($each,maxage)";done


-- Empty trash every 7 days

As this is applied filtering twice, offlineimap might be uploading part of your changes already, making it faster to next syncs, and suffle some of your emails while it runs.

The point of adding the already filtered set to be filtered again (CPG, cases, etc) is that if a new customer is consiredered to be filter on a folder of its own, the messages will be picked up and moved accordingly automatically ;-)

Hope it helps, and happy filtering!

Click to read and post comments

mar 28, 2015

Install RHEL7/Centos/Fedora on a software raid device

Table of contents

  1. Why or why not use a RAID via software
    1. Pros
    2. Cons
  2. Performing the setup
  3. Creating the raid devices with Multiple Devices mdadm

Installing Linux on a RAID has lot of advantages, from using RAID1 to enjoy protection against drive failures or RAID0 to combine the size of several drives to create bigger space for files with all the smaller disks we have.

There are several RAID level definitions and may have different uses depending on our needs and hardware availability.

For this, I focused on using raid1 for the system disks (for greater redundancy/protection against failures) and raid0 (for combining several disks to make bigger space available for non important data)..

Why or why not use a RAID via software


  • There's no propietary data on the disks that could require this specific controller in case the hardware fails.
  • Can be performed on any system, disk combination, etc


  • The use of dedicated HW RAID cards allows to offload the CPU intensive tasks for raid calculation, etc to the dedicated processor, freeing internal CPU for system/user usage.
  • Dedicated cards may have fancier features that require no support from the operating system as are all implemented by the card itself and presented to the OS as a standard drive.

Performing the setup

As I was installing on a HP Microserver G8 recently, I had to first disable the advanced mode for the included controller, so it behaved like a standard SATA one, once done, I was able to boot from my OS image (in this case EL7 iso).

Once the ISO is booted in rescue mode, I could switch to the second console with ALT-F2 so I could start executing commands on the shell.

First step is to setup partitioning, in this case I did two partitions, first one for holding /boot and the second one for setting up the LVM physical volume where the other Logical Volumes will be defined later.

I've elected this setup over others because mdadm allows transparent support for booting (grub supports booting form it) and easy to manage setup.

For partitions, remember to allocate at least 500mb for /boot and as much as needed for your SO, for example, if only base OS is expected to have RAID protection, having a 20Gb partition will be enough, leaving the remaining disk to be used for a RAID0 device for allocating non-critical files.

For both partitions, set type with fdisk to fd: Linux RAID autodetect, and setup the two drives we'll use for initial setup using the same values, for example:

fdisk /dev/sda
n # for new partition
p # for primary
<ENTER> # for first sector
+500M # for size
t # for type
fd # for Linux RAID autodetect
n # new partition
p # primary
+20G #for size
t #for type
2 # for select 2nd partition
fd # for Linux RAID autodetect
# n for new partition
p # for primary
<ENTER> # for first sector
<ENTER> # for remaining disk
t # for type
3 # for third partition
fd # for Linux RAID Autodetect
w # for Writing changes

And repeat that for /dev/sdb

At this point, we'll have both sda and sdb with the same partitions defined: sd{a,b}1 with 500Mb for /boot and sd{a,b}2 with 20Gb for LVM and the remaining disk for RAID0 LVM.

Now, it's time to create the raid device on top, for simplicity, I tend to use md0 for /boot, so let's start with it.

Creating the raid devices with Multiple Devices mdadm

Let's create the raid devices for each system, starting with /boot:

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/sda3 /dev/sdb3

Now, check the status of the raid device creation by issuing:

cat /proc/mdstat

Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sda1[0] sdb1[1]
      534760 blocks level 1, 64k chunk, algorithm 2 [2/2] [UU]
            [==>..................]  recovery = 12.6% (37043392/292945152) finish=127.5min speed=33440K/sec
md1 : active raid1 sda2[0] sdb2[1]
      20534760 blocks level 1, 64k chunk, algorithm 2 [2/2] [UU]
            [=====>...............]  recovery = 25.9% (37043392/692945152) finish=627.5min speed=13440K/sec

When it finishes, all the devices will appear as synced, and we can start the installation of the operating system.

What I did, after this point, is to reboot the install media, so I could use anaconda installer to select manually the filesystems, creating /boot on /dev/md0, then the Physical Volume on /dev/md1 for the operating system.

Select the manual partitioning during the installation to define above devices as their intended usage, and once it has been installed, create the additional Physical volume on /dev/md2 and define the intended mountpoints, etc.


Click to read and post comments

mar 16, 2015

Podcasts with flexget and transmission

Some podcasts are available via rss feeds, so you can get notified of new episodes, so the best way I've found so far to automate this procedure is to use the utility 'flexget'.

Flexget can download an rss feed and get the .torrent files associated to them and store locally, which makes a perfect fit for later using Transmission's watch folder, to automatically add them to your download queue.

In order to do so, install flexget either via pip (pip install flexget) or using a package for your distribution a create a configuration file similar to this:

cat ~/.flexget/config.yml

    rss: http://URL/TO/YOUR/PODCAST/FEED
    all_series: yes
    only_new: yes
    download: /media/watch/

At each invokation of flexget execute it will access the rss feed, search for new files and store the relevant .torrent files on the folder /media/watch from where transmission will pick up the new files and add them to your downloading queue for automatic download.

posted at 21:45  ·   ·  fedora  foss
Click to read and post comments

dic 08, 2011

Herramientas de sincronización/backup

Hace tiempo, estuve buscando una forma de simplificar por un lado, la sincronización de los documentos, fotos, etc entre diversos ordenadores y por otro utilizar esa copia en otros equipos a modo de 'backup' en caso de fallo hardware de algún equipo.

Estuve probando y evaluando diversas herramientas multiplataforma y junto a algunas más, las que más probé fueron estas:

Actualmente necesito 'salvar/sincronizar' unos 35Gb de datos (fotos la gran mayoría y no precisamente en RAW), lo que provocó estas comparativas, preferiblemente de servicios que ofrecieran espacio gratuito.

En resumen, de las opensource (Syncany/Sparkleshare), están todavía bastante en pañales, aunque prometen mucho, más syncany por la variedad de posibles destinos de copias que sparkleshare (necesidad de repositorio git, etc).

De las 'comerciales', Wuala ofrecía posibilidad de compartir espacio local a cambio de espacio en la nube que permitía compartir esos GB no utilizados por GB fuera de donde tenías tus ordenadores, garantizando que en caso de problemas fisicos, ambientales,etc, poder acceder a los datos, pero en Octubre de 2011, con la salida de la nueva versión, dejaron de ofrecer el servicio, convirtiendo el espacio conseguido en un 'vale' hasta Octubre de 2012... que ha provocado que tenga que reemplazar esa solución que venía utilizando.

Dropbox, a pesar de los referrals y de cuenta de email ".edu", no ofrece más de 16 Gb a no ser que sean planes de pago, que por ahora la deja fuera (además del hecho de no permitir especificar clave de cifrado de los datos, que por otro lado aporta que si alguien había subido el fichero antes, la subida es instantánea).

SpiderOak ofrece menos espacio (pero parte de 5Gb y ahora es posible subir hasta 50Gb por referrals), tiene repositorio para yum para las actualizaciones y se puede ejecutar tanto en modo 'GUI' como en línea de comandos (por contra la sincronización se lanza cada x tiempo, no parece tan fluida como Dropbox).

AeroFS: una recién llegada, que sigue en versión alpha, basada en Java, multiplataforma (EL6 o superior) y que no ofrece de 'salida' espacio online, simplemente la sincronización entre tus equipos, pudiendo hacer cambios en cualquiera de ellos y luego sincronizándolos entre sí, incluso sin conectividad a Internet (sólo hace falta internet para dar de alta una carpeta nueva en un equipo o dar de alta un equipo (para sincronización de autorizaciones con el servidor de Aerofs)).

Actualmente AeroFS es la que más promete por no limitar el espacio excepto al que dispongas en tus equipos, pero con mi colección de documentos y 4 equipos a la vez he tenido problemas de ficheros renombrados "terminados con números entre paréntesis", sincronizaciones lentas (utilizar un interfaz vía hamachi en lugar del interfaz de red local), no sincronización de enlaces simbólicos y destrucción de ficheros con enlaces duros (deja sólo una copia), pero es la única que cubre por ahora mis necesidades con respecto a lo que hacía wuala (a excepción de no tener aplicaciones para móvil y claro está, no funcionar si ningún equipo con los datos está conectado).

De como evolucionen syncany y aerofs de aquí a octubre, decidirá si escojo una de estas dos o sigo evaluando alternativas para los backups... por ahora sigo con Dropbox para cosas rápidas que no necesito excesiva privacidad y para el resto Wuala y he añadido duplicity para hacer backup a un NAS en mi red local a la espera de ir probando mejor AeroFS y ver cómo respira syncany...

Click to read and post comments
Next → Page 1 of 3