Pablo Iranzo Gómez's blog

nov 05, 2016

Unit testing for stampy

Since my prior post on Contributing to OpenStack, I have liked the idea of using automated tests to validate functionality and, specifically, the corner cases that can arise when playing with the code.

Most of the errors fixed so far in stampy were related to pieces of the code not properly handling UTF, some of the information returned, etc. Although the code has improved, ensuring that prior errors are not reintroduced when other changes are made started to become a priority.

To implement them, I made use of nose, which is executed with nosetests and is available on Fedora as 'python-nose'. To provide further automation, I've also relied on tox, again inspired by what OpenStack does.

Let's start with tox: once installed, a new configuration file is created for it, defining the different environments and configuration in a similar way to:

[tox]
minversion = 2.0
envlist = py27,pep8
skipsdist = True

[testenv]
passenv = CI TRAVIS TRAVIS_*
deps = -r{toxinidir}/requirements.txt
       -r{toxinidir}/test-requirements.txt
commands =
    /usr/bin/find . -type f -name "*.pyc" -delete
    nosetests

[testenv:pep8]
commands = flake8

[testenv:venv]
commands = {posargs}

[testenv:cover]
commands =
  coverage report

[flake8]
show-source = True

This file defines two environments: one for validating pep8 formatting of the Python code and another for running the tests on Python 2.7.

The environment definition for the tests also runs some commands, such as the aforementioned nosetests, to execute the defined unit tests.

The tox.ini above also mentions requirements.txt and test-requirements.txt, which define the Python packages required to validate the program. tox installs them automatically in a virtualenv, so the versions used for testing don't interfere with the system-wide ones.

As for the tests themselves, since nosetests discovers the tests to run automatically, I've created a new folder named tests/ and placed some files there, named in alphabetical order:

ls -l tests
total 28
-rw-r--r--. 1 iranzo iranzo  709 nov  5 16:58
-rw-r--r--. 1 iranzo iranzo  739 nov  3 09:56
-rw-r--r--. 1 iranzo iranzo  456 nov  3 23:53
-rw-r--r--. 1 iranzo iranzo  581 nov  3 09:56
-rw-r--r--. 1 iranzo iranzo 3544 nov  5 18:19
-rw-r--r--. 1 iranzo iranzo  477 nov  3 23:15
-rw-r--r--. 1 iranzo iranzo  230 nov  3 09:56

The first one, test_00-setup, contains the commands required to define the environment, since each validation run of tox should start from a fresh environment so that errors are not masked or overlooked.

#!/usr/bin/env python
# encoding: utf-8

from unittest import TestCase

from stampy.stampy import config, setconfig, createdb, dbsql

# Precreate DB for other operations to work

# Define configuration for tests
setconfig('token', '279488369:AAFqGVesZ-81n9sFafLQxUUCVO8_8L3JNEU')
setconfig('owner', 'iranzo')
setconfig('url', '')
setconfig('verbosity', 'DEBUG')

# Empty karma database in case it contained some leftover
dbsql('DELETE from karma')
dbsql('DELETE from quote')

class TestStampy(TestCase):
    def test_owner(self):
        self.assertEqual(config('owner'), 'iranzo')

This file creates the database if none exists and defines some sample values, like the DEBUG level, the URL for contacting the Telegram API servers, or even a token that can be used to test the functionality for sending messages.

Also, if the database already exists, it empties the karma and quote tables (and sets the sequence to 0 to simulate TRUNCATE, which is not available on sqlite).

A unit test is specified in a class that inherits from TestCase, imported from unittest. There, for each of the tests we want to perform, a new method ('def') is created and an assert is used inside it. For example, assertEqual validates that the function call returns the value provided as the second argument, failing otherwise.

From that point, the tests are again performed in alphabetical order, so be careful when naming each test, or include a sequence number in the names to get a top-to-bottom order that will probably be easier to understand.
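The alphabetical ordering of test methods can be checked with the standard library alone. This is a minimal sketch, assuming a throwaway test class whose names are illustrative and not part of stampy: unittest's loader returns the method names sorted, so a numeric prefix is a simple way to force a sequence.

```python
import unittest


class TestOrdering(unittest.TestCase):
    # Hypothetical tests: the numeric prefix forces the intended sequence,
    # regardless of the order in which the methods are written
    def test_02_update(self):
        self.assertEqual(1 + 1, 2)

    def test_01_create(self):
        self.assertTrue(True)

    def test_10_cleanup(self):
        self.assertIn('a', 'abc')


def ordered_names(case):
    # getTestCaseNames returns the test method names sorted alphabetically
    return list(unittest.TestLoader().getTestCaseNames(case))
```

Calling ordered_names(TestOrdering) lists the create, update and cleanup tests in that order, even though they were declared differently.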

For example, for karma changes we have:

#!/usr/bin/env python
# encoding: utf-8

from unittest import TestCase

from stampy.stampy import getkarma, updatekarma, putkarma

class TestStampy(TestCase):
    def test_putkarma(self):
        putkarma('patata', 0)
        self.assertEqual(getkarma('patata'), 0)

    def test_getkarma(self):
        self.assertEqual(getkarma('patata'), 0)

    def test_updatekarmaplus(self):
        updatekarma('patata', 2)
        self.assertEqual(getkarma('patata'), 2)

    def test_updatekarmarem(self):
        updatekarma('patata', -1)
        self.assertEqual(getkarma('patata'), 1)

This starts by putting a known karma on a word and validating it, verifies the query, updates the value by a positive number and, later, decreases it with a negative one.

For the aliases we use a similar approach, as we also play with the karma changes when an alias is defined:

#!/usr/bin/env python
# encoding: utf-8

from unittest import TestCase

from stampy.stampy import getkarma, putkarma, updatekarma, createalias, getalias, deletealias

class TestStampy(TestCase):

    def test_createalias(self):
        createalias('patata', 'creilla')
        self.assertEqual(getalias('patata'), 'creilla')

    def test_getalias(self):
        self.assertEqual(getalias('patata'), 'creilla')

    def test_increasealiaskarma(self):
        updatekarma('patata', 1)
        self.assertEqual(getkarma('patata'), 1)

        # Alias doesn't get increased as the 'aliases' modifications are in
        # process, not in the individual functions
        self.assertEqual(getkarma('creilla'), 0)

    def test_removealias(self):
        self.assertEqual(getkarma('creilla'), 0)

    def test_removekarma(self):
        putkarma('patata', 0)
        self.assertEqual(getkarma('patata'), 0)

Here an alias is created and verified, karma is increased on the word that has an alias, and then the karma of the aliased value is checked.

As noted in the example above, the individual functions for karma don't take aliases into consideration, so this must be handled by processing a message sent via process(messages), which has also been modified, along with other functions, to allow individual tests to be implemented for them.
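To illustrate the split between the low-level karma functions and the alias handling done at message-processing level, here is a minimal sketch. It does not reproduce stampy's code: the dictionaries stand in for the database tables, and resolve and process_karma are hypothetical names used only for this example.

```python
# In-memory stand-ins for the karma and alias tables (hypothetical)
karma = {}
aliases = {}


def resolve(word):
    # Follow the alias chain until a word without alias is found;
    # the 'seen' set guards against alias loops
    seen = set()
    while word in aliases and word not in seen:
        seen.add(word)
        word = aliases[word]
    return word


def process_karma(word, change):
    # Apply the change to the aliased target, not to the word as typed
    target = resolve(word)
    karma[target] = karma.get(target, 0) + change
    return target, karma[target]
```

With 'patata' defined as an alias of 'creilla', process_karma('patata', 1) updates the karma of 'creilla' and leaves 'patata' untouched, which is the behaviour a test at the process level would have to assert.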

This will surely end up requiring some more code rewriting so that the functions can be fully tested individually and as a whole, to ensure that the bot behaves as intended... and many more tests are yet to come.

To finish, an example of the execution of tox and the results it reports:

py27 installed: coverage==4.2,nose==1.3.7,prettytable==0.7.2
py27 runtests: PYTHONHASHSEED='604985980'
py27 runtests: commands[0] | /usr/bin/find . -type f -name *.pyc -delete
py27 runtests: commands[1] | nosetests
Ran 18 tests in 14.996s

pep8 installed: coverage==4.2,nose==1.3.7,prettytable==0.7.2
pep8 runtests: PYTHONHASHSEED='604985980'
pep8 runtests: commands[0] | flake8
WARNING:test command found but not installed in testenv
  cmd: /usr/bin/flake8
  env: /home/iranzo/DEVEL/private/stampython/.tox/pep8
Maybe you forgot to specify a dependency? See also the whitelist_externals envconfig setting.
__________________________________________________________________________ summary ___________________________________________________________________________
  py27: commands succeeded
  pep8: commands succeeded
  congratulations :)

If you're using a CI system, like 'Travis', which is also available to repos, a .travis.yml can be added to the repo to ensure those tests are performed automatically on each code push:

language: python
python:
    - 2.7

notifications:
    email: false

install:
    - pip install pep8
    - pip install misspellings
    - pip install nose

script:
    # Run pep8 on all .py files in all subfolders
    # (I ignore "E402: module level import not at top of file"
    # because of use case sys.path.append('..'); import <module>)
    - find . -name \*.py -exec pep8 --ignore=E402,E501 {} +
    - find . -name '*.py' | misspellings -f -
    - nosetests



jul 21, 2016

Contributing to OpenStack

Contributing to an open source project might take some time at the beginning; the good thing about OpenStack is that there are a lot of guides on how to start and collaborate.

What I did was look for a bug in the project tagged as low-hanging-fruit. This tag lets you browse a large list of bugs classified as easy, so they are the best place for newcomers to get familiar with the workflow.

I found an issue with weight, which is supposed to be an integer: a float was being converted to an integer (0.1 -> 0), which was considered invalid, and an error should be returned instead.

When I checked the Neutron-LBaaS code I found where the problem was: the value provided was being converted to an integer instead of being validated.
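The difference between converting and validating can be seen in a few lines. This is only a sketch: convert_weight and validate_weight are hypothetical names, and the convention of returning None when valid and a message otherwise is an assumption made for the example, not the code that was merged.

```python
def convert_weight(value):
    # Silent conversion: int() truncates towards zero, so 0.1 becomes 0
    return int(value)


def validate_weight(value):
    # Hypothetical validator: returns None when valid, a message otherwise.
    # bool is a subclass of int in Python, so it is rejected explicitly
    if isinstance(value, bool) or not isinstance(value, int):
        return "'%s' is not an integer" % (value,)
    return None
```

convert_weight(0.1) quietly hands back 0, while validate_weight(0.1) reports the problem so that an error can be returned to the caller.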

Before contributing you need to:

Submitting a change is quite easy:

# Select the project, 'neutron-lbaas' for me
git clone$each.git
cd $each
# This sets up git-review, getting the required hooks, etc.
git-review -s
# Create a new branch and switch to it, so we can keep our changes separate
git checkout -b new-branch

# Edit files with changes
git add $files
git commit -m "Descriptive message"
# Send upstream for review:
git-review

git-review will output a URL you can use to preview your change, and the hooks will automatically add a 'Change-Id' so that subsequent changes are linked to it.

NOTE: full reference is available at the Developer's Guide

The biggest issue started here:

  • In order not to require a new function to validate integers, I used the existing one for non-negative values, which already performs this test, but one of the reviewers suggested writing a dedicated function.
  • Functions were imported from neutron-lib, so I submitted a second change to the neutron-lib project.
  • As the change in neutron-lib couldn't be marked as a dependency, because neutron-lbaas builds against the version already published, I had to define an interim version of the function so that neutron-lbaas could use it in the meantime, and raise another bug to later remove this interim function once neutron-lib includes the validate_integer function.
  • As part of the comments on the neutron-lib review, it was found that it would be nice to validate values, so after some discussion I moved to using the internal validate_values.
  • Of course, validate_values just does data in valid_values, so it fails if data or valid_values are not comparable and doesn't do any conversion depending on the values themselves, so this spun off another review for improving the validate_values function.
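The failure mode described in the last bullet is easy to reproduce: a bare membership test raises TypeError when the right-hand side is not a container, or when the value is unhashable and the valid values are a set. The sketch below shows one possible way of guarding against it; it is an illustration under those assumptions and not the change proposed in the review.

```python
def validate_values(data, valid_values=None):
    # Hypothetical hardened check: returns None when valid, a message otherwise
    if valid_values is None:
        return None
    try:
        if data not in valid_values:
            return "'%s' is not in %s" % (data, valid_values)
    except TypeError:
        # e.g. valid_values is not a container, or data is unhashable
        # while valid_values is a set
        return "'%s' cannot be compared against %s" % (data, valid_values)
    return None
```

Note that no conversion is attempted here either: the string '1' is still reported as not being in [1, 2].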

At the moment, I'm trying to close the neutron-lib review so it uses the function already defined and get it merged, and then continue with the other steps, like removing the interim function in neutron-lbaas, working on enhancing validate_values and closing all the dependent Launchpad bugs I've created for tracking.

My experience so far is that it can sometimes be a bit difficult, as git-review is a collaborative environment where different opinions are shared with different approaches, some of them 'easier' and others 'pickier', like pointing out an 'extra space', etc.

Of course, all the code is checked by some automation engines when submitted, which validate that the code still builds, has no formatting errors, etc., but many of those checks can be executed locally by using tox, which allows running part of the tests like:

  • tox -e pep8
  • tox -e py27
  • tox -e coverage

These validate, respectively, pep8 formatting (line length, spaces around operators, docstring formatting, etc.) and run other sets of tests like the ones you define.

After each set of changes made to apply the feedback received, make sure to:

# Add the modified files to a commit

git add $files_modified

# Create the commit with the changes

git commit -m "whatever"

# This will show you the last two commits, so you can leave the first one and
# on the beginning of the second one,
# replace 'pick' for 'f' so the changes are merged with first one without
# keeping the commit message

git rebase -i HEAD~2

# Fix the commit message if needed (like fixing formatting,
# setting dependent commits, bugs it closes, etc)

git commit --amend

# Submit changes again for review

git-review


Also, keep in mind that, apart from submitting the code change, it is important to submit automated validation tests, which can be executed with tox -e py27 to check that the functions return the values we expect even if the input data is outside what it should be, or coverage tests, to validate that the code is covered (check tox.ini to see what is defined).

And last but not least, expect to get a lot of comments on more serious changes, like changes to stable libs, as many reviewers will come to ensure that everything looks good, and they might even discuss it in the regular meetings to ensure that the change, with the proposed approach, is a good fit for the product.


jun 03, 2016

New blog rendering engine: Pelican

As always, I'm not usually keen to write about things I do until I later realize they might be helpful for others. That's why in the past I decided to move the place where I was putting the information about what I did to GitHub and also take the opportunity to practice markdown for writing the entries.

At that time, I moved my old blog posts to markdown to be used in conjunction with Jekyll and to use Octopress as the engine rendering the contents into a static website. The setup and migration were not difficult, but still required using some Ruby, while I was more familiar with Python.

For some time I had been checking other platforms that follow the same approach of rendering markdown files, and I stuck with Pelican, which is included in the Fedora repos (python-pelican). Pelican offers similar behaviour, also having a server that allows you to quickly test new settings (plugins, themes, etc.) and to publish the resulting website to a hosting provider.

As I did with Jekyll+Octopress, I'm still using for it, and I'm in the process of adapting some changes like additional plugins and theme tweaks, and considering developing one of my own.


aug 28, 2015

Filtering email with imapfilter

For some time, email filter management had not been scaling for me: as I was using server-side filtering, I had to deal with a web-based interface which was missing features like drag & drop reordering of rules, cloning, etc.

As I was already using offlineimap to sync from the remote mailserver to my system into a maildir folder, I had almost all the elements I needed.

After searching through several options, imapfilter seemed to be a perfect fit, so I started with a small set of rules and began integrating it with my email process.

On my first attempts, I set up a pre-sync hook on offlineimap, alongside the postsync hook I already had:

presynchook  = time imapfilter
postsynchook = ~/.mutt/

The initial attempts were not good at all: applying filters on the remote IMAP server was very time consuming, and my 1 minute delay after finishing one check was becoming a real 10-15 minute interval between checks because of the IMAP filtering. This was not scaling as I added new rules.

After some tries, and as I already had all the email synced offline, I moved the filtering to run locally instead of server-side. As imapfilter requires an IMAP server, I tricked dovecot into offering the local folder via IMAP:

protocols = imap
mail_location = maildir:~/.maildir/FOLDER/:INBOX=~/.maildir/FOLDER/.INBOX/

This also required changing my folder names to have a "." in front of them, so I needed to change the mutt configuration too:

set mask=".*"

and my mailfolders script:

set mbox_type=Maildir
set folder="~/.maildir/FOLDER"
set spoolfile="~/.maildir/FOLDER/.INBOX"

#mailboxes `echo -n "+ "; find ~/.cache/notmuch/mutt/results ~/.maildir/FOLDER -type d -not -name 'cur' -not -name 'new' -not -name 'tmp' -not -name '.notmuch' -not -name 'xapian' -not -name 'FOLDER' -printf "+'%f' "`

mailboxes `find ~/.maildir/FOLDER -type d -name cur -printf '%h '|tr " " "\n"|grep -v "^/home/iranzo/.maildir/FOLDER$"|sort|xargs echo`
#Store reply on current folder
folder-hook . 'set record="^"'

After this, I could start using imapfilter and working on my set of rules... but then the first problem appeared: I started getting some duplicated email, as I was cancelling and rerunning the script while debugging. So a new tool named IMAPdedup was introduced to 'dedup' my IMAP folders, with a small script:

(
for folder in $(python ~/.bin/ -s localhost  -u iranzo    -w '$PASSWORD'  -m -c -v  -l); do
    python ~/.bin/ -s localhost  -u iranzo    -w '$PASSWORD'  -m -c  "$folder"
done
) 2>&1|grep "will be marked as deleted"

This script takes care of listing all email folders on 'localhost' with my username and password (this can be scripted, or external tools can be used to gather it) and dedups email after each sync (in my as well as lbdq script for fetching new addresses, notmuch and running imapfilter after syncing (to catch the limited filtering I do server-side)).
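As an illustration of what deduplicating means here, the following sketch flags messages whose Message-ID header has already been seen. It works on raw messages held in memory and is not IMAPdedup's implementation; find_duplicates is a hypothetical helper written for this example.

```python
from email import message_from_string


def find_duplicates(raw_messages):
    # Return the indexes of messages whose Message-ID was already seen
    seen = set()
    duplicates = []
    for index, raw in enumerate(raw_messages):
        message_id = message_from_string(raw).get('Message-ID')
        if message_id is None:
            # Without an identifier we cannot tell, so keep the message
            continue
        if message_id in seen:
            duplicates.append(index)
        else:
            seen.add(message_id)
    return duplicates
```

Only the later copies are reported, so the first occurrence of each message is always kept.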

I still do some server-side filtering (4 rules), to get on a "Pending sort" folder all email which is either:

  • New support cases remain at INBOX
  • All emails from case updates, bugzilla, etc to _pending
  • All emails containing 'list' or 'bounces' in from to _pending
  • All emails not containing me directly on CC or To, to _pending

This more or less ensures a clean INBOX with most important things still there, and easier rule handling for email sorting.

So, after some tests, this is at the moment a simplified version of my filtering file:

--  Options  --

options.timeout = 30
options.subscribe = true
options.create = false

function offlineimap (key)
    local status
    local value
    status, value = pipe_from('grep -A2 ACCOUNT ~/.offlineimaprc | grep -v ^#|grep '.. key ..'|cut -d= -f2')
    value = string.gsub(value, ' ', '')
    value = string.gsub(value, '\n', '')
    return value
end

--  Accounts  --

-- Connects to "localhost", as user "iranzo" with "$PASSWORD" as
-- password.
EXAMPLE = IMAP {
    server = 'localhost',
    username = 'iranzo',
    password = '$PASSWORD',
    port = 143
}
-- My email
myuser = 'ranzo'

function mine(messages)
    return email
end

function filter(messages,email,destination)
    messages:contain_field('sender', email):move_messages(destination)
end

function deleteold(messages,days)
end

-- Define the msgs we're going to work on

-- Move sent messages to INBOX to later sorting
sent = EXAMPLE.Sent:select_all()

inbox = EXAMPLE['INBOX']:select_all()
pending = EXAMPLE['INBOX/_pending']:select_all()
todos = pending + inbox

-- Mark as read messages sent from my user

-- Delete google calendar forwards

-- Move all spam messages to Junk folder
spam = todos:contain_field('X-Spam-Score','*****')

-- Move Jive notifications

-- Filter EXAMPLEN

-- Filter PNT
filter(todos:contain_subject('[PNT] '),'',EXAMPLE['Trash'])

-- Filter CPG (Customer Private Group)
filter(todos:contain_subject('Red Hat - Group '),'',EXAMPLE['INBOX/EXAMPLE/Customers/Other/CPG'])

-- Remove month start reminders
todos:contain_subject('mailing list memberships reminder'):delete_messages()

-- Delete messages about New accounts created (RHN)
usercreated=todos:contain_subject('New Red Hat user account created')*todos:contain_from('')

-- Search messages from CPG's
cpg = EXAMPLE['INBOX/EXAMPLE/Customers/Other/CPG']:select_all()

-- Move bugzilla messages
filter(todos:contain_subject('] New:'),'',EXAMPLE['INBOX/EXAMPLE/Customers/_bugzilla/new'])

-- Move all support messages to Other for later processing
filter(todos:contain_subject('(NEW) ('),'',EXAMPLE['INBOX/EXAMPLE/Customers/_new'])
filter(todos:contain_subject('Case '),'',EXAMPLE['INBOX/EXAMPLE/Customers/Other/cases'])


support = EXAMPLE['INBOX/EXAMPLE/Customers/Other/cases']:select_all()
-- Restart the search only for messages in Other to also process if we have new rules

support:contain_subject('is about to breach its SLA'):delete_messages()
support:contain_subject('has breached its SLA'):delete_messages()
support:contain_subject(' has had no activity in '):delete_messages()

-- Here the process is customer after customer and mark as read messages from non-prio customers

-- For customers with common matching names, use header field
support:contain_field('X-SFDC-X-Account-Number', 'XXXX'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust5/cases'])
support:contain_body('Customer         : COMMONNAME'):move_messages(EXAMPLE['INBOX/EXAMPLE/Customers/Cust6/cases'])

-- Non prio customers (mark updates as read)
cust7 = support:contain_body('WATCHINGCUST') + support:contain_body('Cust7')

-- Filter other messages by domain
filter(todos,'', EXAMPLE['INBOX/EXAMPLE/Customers/Cust8'])

-- Process all remaining messages in INBOX + all read messages in pending-sort for mailing lists and move to lists folder
filter(todos,'list', EXAMPLE['INBOX/Lists'])

-- Add EXAMPLE lists, inbox and _pending and Fedora default bin for reprocessing in case a new list has been added
lists = todos + EXAMPLE['INBOX/Lists']:select_all() + EXAMPLE['INBOX/Lists/Fedora']:select_all()

-- Mailing lists


-- Fedora

-- OSP

-- Filter my messages not filtered back to INBOX

-- move messages we're in BCC to INBOX for manual sorting
hidden = pending - mine(pending)

-- Start processing of messages older than:

-- Delete old messages from mailing lists

-- delete old cases

-- for each in $(cat .imapfilter/config.lua|grep -i cases|tr " ,()" "\n"|grep cases|sort|uniq|grep -v ":" );do echo "deleteold($each,maxage)";done


-- Empty trash every 7 days

As the filtering is applied twice, offlineimap might already be uploading part of your changes, making the next syncs faster, and it may shuffle some of your emails while it runs.

The point of adding the already filtered sets to be filtered again (CPG, cases, etc.) is that if a new customer is considered for filtering into a folder of its own, the messages will be picked up and moved accordingly automatically ;-)

Hope it helps, and happy filtering!


jul 17, 2015

RHEV-M with nested VM for OSP

For some time, I've been mostly dealing with OpenStack, requiring different releases for different tests, etc.

Virtualization, as provided by KVM, requires some CPU flags to get accelerated operations, vmx or svm depending on your processor architecture, but, of course, these are only provided on bare-metal.

In order to get more flexibility at the expense of performance, nestedvt allows exposing those flags to the VMs running on the hypervisor, so you can run another level of VMs inside those VMs (this starts to sound like the movie Inception).

The problem so far is that this required changes in the kernel and drivers to make it work, and it lacked stability, so this is something NOT SUPPORTED FOR PRODUCTION USE, but it makes perfect sense for demo environments, labs, etc., allowing you to maximize the use of your hardware for better flexibility at the cost of performance.

As I was using RHEV for managing my home lab, I hit the first issue: my hypervisors (HP Proliant G7 N54L) were using RHEL-6 as their operating system, and its support for nested virtualization was not very good. Luckily, RHEV-M 3.5 includes support for hypervisors running on RHEL-7, enabling the use of the latest features included in the kernel, networking stack, etc.

The first step was to redeploy the servers. It wasn't that hard, but it required some extra steps as I had another unsupported setup (servers were sharing local storage over NFS to provide Storage Domains to the environment, HIGHLY UNSUPPORTED), so I moved them from NFS to iSCSI provided by an external server and, with the help of the kickstart I use for other systems, I started the process.

Once two of the servers were migrated, the last one finished moving VMs from NFS to iSCSI; it then needed to be put into maintenance and the other two enabled (as a safety measure, RHEL-6 and RHEL-7 hosts cannot coexist in the same cluster in RHEV).

From here, I just needed to enable NestedVT on the environment.

NestedVT 'just' requires exposing the svm or vmx flag to the VM running directly on the bare-metal host, and we need to do that for every VM we start. On a normal system with libvirt, we can just edit the XML of the VM definition and define the CPU like this:

<cpu mode='custom' match='exact'>
    <model fallback='allow'>Opteron_G3</model>
    <feature policy='require' name='svm'/>
</cpu>

For RHEV, however, we don't have an XML we can edit, as it is created dynamically from the contents of the database for the VM (disks, NICs, name, etc.), but we have the VDSM hooks mechanism for doing this.

Hooks in vdsm are a powerful and dangerous tool, as they can modify in flight the XML used to create the VM, and they allow a lot of features to be implemented.
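To give an idea of what modifying the XML in flight involves, here is a standalone sketch that adds a required CPU feature to a libvirt-style domain definition. It is not the nestedvt hook and does not use VDSM's own helpers; require_cpu_feature is a hypothetical function that only relies on the Python standard library.

```python
import xml.etree.ElementTree as ET


def require_cpu_feature(domain_xml, feature):
    # Parse the domain XML and make sure <cpu> carries the requested feature
    root = ET.fromstring(domain_xml)
    cpu = root.find('cpu')
    if cpu is None:
        cpu = ET.SubElement(root, 'cpu')
    for existing in cpu.findall('feature'):
        if existing.get('name') == feature:
            # Already present: only enforce the policy
            existing.set('policy', 'require')
            break
    else:
        # Not present yet: append a new <feature> element
        ET.SubElement(cpu, 'feature', {'policy': 'require', 'name': feature})
    return ET.tostring(root, encoding='unicode')
```

The function is idempotent, so running it twice over the same definition still leaves a single feature entry for the flag.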

In the past, for example, those hooks could be used to provide DirectLUN support to RHEV, or a fixed BIOS serial number for VMs, while the product was still lacking the official feature. In this case, we'll use them to provide the CPU flags we need.

As you can imagine, this is something that has a lot of interested people behind it, and upstream we can find a repository with VDSM hooks.

In this case, the one we need is 'nestedvt', so we can proceed to install it on our hosts like this:

rpm -Uvh vdsm-hook-nestedvt-4.14.17-0.el7.noarch.rpm

You'll need to put the host into maintenance and activate it again for VDSM to refresh the installed hooks, and then start a new VM so that the hook injects the XML.

After it boots, egrep 'svm|vmx' /proc/cpuinfo should show the flags there.
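The same check can be scripted, for example to verify several guests. This is a small sketch, assuming the text has the usual /proc/cpuinfo layout with a 'flags' line per processor; virt_flags is a hypothetical helper name.

```python
def virt_flags(cpuinfo_text):
    # Collect vmx/svm from every 'flags' line of /proc/cpuinfo-style text
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(':')
        if key.strip() == 'flags':
            found.update(flag for flag in value.split() if flag in ('vmx', 'svm'))
    return sorted(found)
```

Inside the guest it could be fed with the contents of /proc/cpuinfo; an empty list means the virtualization extensions were not exposed to the VM.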

But wait...

RHEV also includes a security feature that makes it impossible for a VM to spy on communications meant for other VMs by preventing it from simulating other MACs, and this is enforced via libvirt filters on the interfaces.

To come to our rescue, another hook comes into play, this time macspoof, which allows disabling this security measure for a VM so it can run virtualization within.

First, let's repeat the procedure and install the hook on all of our hypervisors:

rpm -Uvh vdsm-hook-macspoof-4.14.17-0.el7.noarch.rpm

This will enable the hook in the system, but we also need to make the RHEV-M Engine aware of it, so we need to define a new Custom Property for VM's:

engine-config -s "UserDefinedVMProperties=macspoof=(true|false)"

This will ask us for the compatibility version (we'll choose 3.5) and enable a new true/false property for VMs that require this security measure lifted. Of course, we take this approach instead of disabling it for everyone in order to limit its use to just the VMs needing it, without losing the security benefits it provides.

As a side note, the macspoof plugin is available in the official repositories for RHEL7 hypervisors, so you can use it instead of the one from oVirt's repo.

Now when we create a new VM, for example to use with OpenStack, we can go to the custom properties for this VM, select 'macspoof' and set a value of 'true'. Once the VM is started, it will be able to see the processor extensions for virtualization and, at the same time, the VMs created within will be able to communicate with the outside world.

