Pablo Iranzo Gómez's blog

A bunch of unrelated data

oct 16, 2018

Contributing to OSP upstream a.k.a. Peer Review

Table of contents

  1. Introduction
  2. Upstream workflow
    1. Peer review
    2. CI tests (Verified +1)
    3. Code Review+2
    4. Workflow+1
    5. Cannot merge, please rebase
  3. How do we do it with Citellus?

Introduction

In the article "Contributing to OpenStack" we did cover on how to prepare accounts and prepare your changes for submission upstream (and even how to find low hanging fruits to start contributing).

Here we'll cover what happens behind the scenes to get a change published.

Upstream workflow

Peer review

Upstream contributions to OSP and other projects are based on peer review: once a new change has been submitted, several validation steps are required before it can be merged.

The last command executed (git-review) in the submit sequence (in the prior article) effectively submits the patch to the defined git review service (git-review -s does the required setup) and prints a URL that can be used to access the review.
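
As a quick reminder, the submit sequence from the prior article boils down to something like the following (branch name and file paths are just placeholders):

git checkout -b my-change        # work on a dedicated branch
git add path/to/changed/files    # stage your modifications
git commit                       # write the commit message (the hook appends a Change-Id)
git-review                       # send the change to the Gerrit server defined in .gitreview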

Each project might use a different review platform; for OSP it's usually https://review.openstack.org, while for other projects it can be https://gerrit.ovirt.org, https://gerrithub.io, etc. (this is defined in the .gitreview file in the repository).

A sample .gitreview file looks like:

[gerrit]
host=review.gerrithub.io
port=29418
project=citellusorg/citellus.git

As a review example, we'll use one from GerritHub for the Citellus project:

https://review.gerrithub.io/#/c/380646/

Here we can see that we're on review 380646, and that's the link (the one printed when executing git-review) that allows us to check the submitted changes.

CI tests (Verified +1)

Once a review has been submitted, the bots are usually the first ones to pick it up and run the defined unit tests on the new changes, to ensure that nothing breaks (based on what is defined to be tested).

This is a critical point as:

  • Tests need to be defined when new code is added or modified, to ensure that later updates don't break this new code without someone being aware.
  • The infrastructure should be able to test it (for example, you might need specific hardware to test a card or a network configuration).
  • The environment should be sane so that prior runs don't affect the validation.

OSP CI can be checked at 'Zuul' (http://zuul.openstack.org/), where you can enter the number of your review and see how the different bots are running CI tests on it, or whether it's still queued.

If everything is OK, the bot will 'vote' your change as Verified +1, allowing others to see that it should not break anything based on the tests performed.

In the case of OSP, there are also third-party CIs that can validate the change against third-party systems. For some of them the vote counts towards or against the proposed change; for others it's just a comment to take into account.

Sometimes, even if you know your code is right, there's a failure caused by the infrastructure; in those cases, writing a new comment saying recheck will schedule a new CI test run.

This usually happens during busy periods, when it's harder for the scheduler to get resources available for the review validation. Also, sometimes there are errors in the CI configuration that must be fixed before those changes can be validated.

Note: you can run some of the tests on your own system with tox to get faster feedback when you hit issues. Tox sets up virtual environments for the tests to run in, so it's easier to catch problems before upstream CI does (it's always a good idea to run tox even before submitting the review with git-review, to detect errors early).
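
For example, a quick local run could look like this (assuming tox isn't installed yet, and that the project defines a pep8 environment):

pip install --user tox   # install tox for your user
tox                      # run all the environments defined in tox.ini
tox -e pep8              # or run a single environment, e.g. the style checks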

This is, however, not always possible, as some changes involve requirements like upgrade testing, full environment deployments, etc., that cannot be covered without the required preparation steps or even the infrastructure.

Code Review+2

This is probably the 'longest' step: it requires peers to be added as 'reviewers' (you can get an idea of the names from other reviews submitted for the same component), or they will pick up new reviews as these pop up on notification channels or in pending queues.

Here you must be mentally prepared for everything... developers could suggest a different approach, highlight other problems, or just leave small nit comments about fixes like formatting, spacing, variable naming, etc.

After each suggested comment/change, repeat the workflow to submit a new patchset, but make sure you target the same review (by keeping the Change-Id that is appended to the commit message): this allows the code review platform to identify the change as an update to a prior one, lets you compare changes across patchsets, and also notifies the prior reviewers of new changes.
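
In practice, addressing review comments usually boils down to something like this (file paths are just examples):

git add path/to/fixed/file   # stage the fixes requested by the reviewers
git commit --amend           # amend the same commit, keeping its Change-Id
git-review                   # upload the new patchset to the same review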

Once reviewers are OK with your code, and some 'core' developers also agree, you'll see voting happening (from -2 to +2), meaning they either like the change in its current form or not.

Once you get Code Review +2, together with the prior Verified +1, you're almost ready to get the change merged.

Workflow+1

OK, the last step is to have someone with Workflow permissions give a +1; this 'seals' the change, saying that everything is OK (as it had CR+2 and Verified +1) and the change is valid...

This vote will trigger another CI build and, when it finishes, the change will be merged into the upstream code. Congratulations!

Cannot merge, please rebase

Sometimes your change touches the same files that other programmers have changed in the meantime, so there's no way to automatically 'rebase' the change; in this case the bad news is that you need to:

git checkout master # to change to master branch
git pull # to pull latest upstream changes
git checkout yourbranch # to get back to your change branch
git rebase master # to apply your changes on top of current master

After this step, you might need to manually fix the code to solve the conflicts and follow the instructions given by git to mark them as resolved.
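
A minimal sketch of that conflict-resolution step (the file name is just an example) would be:

# edit the conflicting file(s) until the conflict markers are gone
git add path/to/conflicting/file   # tell git the conflict is resolved
git rebase --continue              # let git finish applying your change on top of master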

Once it's done, remember to do the same as with any other patchset you submit afterwards:

git commit --amend # to commit the new changes on the same commit Id you used
git-review # to upload a new version of the patchset

This will restart the process but, once completed, will get the change merged.

How do we do it with Citellus?

In Citellus we've replicated more or less what we have upstream... even the use of tox.

Citellus uses https://gerrithub.io (a free service that hooks into GitHub and allows doing peer review).

We've set up a machine that runs Jenkins to do 'CI' on the tests we've defined (mostly for the Python wrapper and some tests); what it effectively does is run tox. We also use the https://travis-ci.org free tier to repeat the same on another platform.

Tox is a tool that allows defining several commands that are executed inside Python virtual environments, so that, without touching your system libraries, new ones can be installed or removed just within the boundaries of that test, helping to run:

  • pep8 (python formatting compliance)
  • py27 (python 2.7 environment test)
  • py35 (python 3.5 environment test)

The py tests just validate that the code can run on both base Python versions; what they do is run the defined unit-testing scripts under each interpreter.

For local testing, you can run tox and it will go through the different tests defined and report their status... if everything is OK, your new code review should have a good chance of passing CI as well.

Jenkins will do the +1 on Verified, and 'core' reviewers will give the +2 and 'merge' the change once validated.

Hope you enjoy!

Pablo


sep 25, 2018

Peru for syncing specific git repo files

Peru a repo syncer

Some upstream projects bundle together lots of files which might not be of interest, but the convenience of a git pull to get the latest updates makes you download the whole repo for just a handful of files or folders.

For example, this website uses Pelican to generate the web pages out of markdown files. Pelican has a rich set of plugins, but all of them live in the same folder of the git checkout.

Here is where peru comes into play. Peru (hosted at https://github.com/buildinspace/peru) comes in handy for this task.

You can install peru with pipsi:

pipsi install peru

It provides a command-line tool that uses a YAML file to define the repositories, like the one I use here:

imports:
    # Each imported plugin module is synced into the plugins/ folder of our project.
    sitemap: plugins/
    better_codeblock_line_numbering: plugins/
    better_figures_and_images: plugins/
    yuicompressor: plugins/

git module sitemap:
    url: git@github.com:getpelican/pelican-plugins.git
    pick: sitemap
    rev: ead70548ce2c78ed999273e265e3ebe13b747d83

git module yuicompressor:
    url: git@github.com:getpelican/pelican-plugins.git
    pick: yuicompressor
    rev: ead70548ce2c78ed999273e265e3ebe13b747d83

git module better_figures_and_images:
    url: git@github.com:getpelican/pelican-plugins.git
    pick: better_figures_and_images
    rev: ead70548ce2c78ed999273e265e3ebe13b747d83

git module better_codeblock_line_numbering:
    url: git@github.com:getpelican/pelican-plugins.git
    pick: better_codeblock_line_numbering
    rev: ead70548ce2c78ed999273e265e3ebe13b747d83

In this way, peru sync will download the specified folder from each of the given repos into the indicated destination, allowing you to integrate "other repos' files" into your own workflow.

For example, if you want to use the citellus repo at one specific point in time, integrated into your code, you could use:

imports:
    citellus: ./

git module citellus:
    url: git@github.com:citellusorg/citellus.git
    pick: citellusclient/plugins/
    rev: 1ee1c6a36f51e8a7c809d5162004fb57ee99b168

This will check out the citellus repo at that specific point in time and merge it into your current folder.
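
Once the peru.yaml file is in place, syncing is just a matter of running peru from the folder that contains it; a typical session (paths are illustrative) looks like:

cd ~/myproject   # the folder containing peru.yaml
peru sync        # fetch the modules into the destinations defined under 'imports'
peru reup        # optionally, bump the 'rev' fields to the latest upstream commits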

Enjoy!

Pablo


aug 17, 2017

Jenkins for running CI tests

Table of contents

  1. Why?
  2. Setup
    1. Tuning the OS
  3. Installing Jenkins
  4. Configure Jenkins
    1. Creating a Job
    2. Checking execution

Why?

While working on Citellus and Magui, it soon became evident that unit testing to validate the changes was a requirement.

Initially, using a .travis.yml file contained in the repo and the free service provided by https://travis-ci.org, we soon got our https://github.com repo reporting whether the builds succeeded or not.

When it was decided to move to https://gerrithub.io to work in a way more similar to what is done upstream, we improved on code commenting (peer review), but we lost the ability to run the tests in an automated way until the change was merged into GitHub.

After some research, it became more or less evident that another tool, like Jenkins, was required to automate the unit-testing process and report the status back to each individual review.

Setup

Some initial steps are required for integration:

  • Create an ssh keypair for Jenkins to use (see the example right after this list)
  • Create a GitHub account to be used by Jenkins and configure the above ssh keypair on it
  • Log into GerritHub with that account
  • Set up Jenkins and the build jobs
  • On the parent project, grant the Jenkins GitHub account permission to vote +1/-1 on Verified
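
For the first step, a dedicated keypair can be created with ssh-keygen; the file path and comment below are just examples:

ssh-keygen -t rsa -f ~/.ssh/jenkins_gerrithub -C "jenkins@ourdomain" -N ""
# the public key (~/.ssh/jenkins_gerrithub.pub) is the one to add to the
# GitHub/GerritHub account used by Jenkins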

In order to set up the Jenkins environment, a new VM was spawned on one of our RHV servers.

This VM was installed with:

  • 20 GB of HDD
  • 2 GB of RAM
  • 2 vCPUs
  • Red Hat Enterprise Linux 7 'base install'

Tuning the OS

RHEL7 provides a stable environment to run on, but at the same time it lacks some of the latest tools we use for the builds.

As a dirty hack, the system was altered in a way that is not recommended, but it helped to quickly check, as a proof of concept, whether the setup would work or not.

Once the OS was installed, the following commands (do not run these in production) were used:

pip install -U pip # to upgrade pip
pip install -U tox # to upgrade to the 2.x version

# Install python 3.5 on the system
yum -y install openssl-devel gcc
wget https://www.python.org/ftp/python/3.5.0/Python-3.5.0.tgz
tar xvzf Python-3.5.0.tgz
cd Python*
./configure

# This will install into an alternate folder on the system so as not to replace the system-wide python version
make altinstall

# this is required to later allow tox to find the command as 'jenkins' user
ln -s /usr/local/bin/python3.5 /usr/bin/

Installing Jenkins

The Jenkins installation is easier: there's a 'stable' repo for RHEL and the procedure is documented:

wget -O /etc/yum.repos.d/jenkins.repo http://pkg.jenkins-ci.org/redhat-stable/jenkins.repo
rpm --import https://jenkins-ci.org/redhat/jenkins-ci.org.key
yum install jenkins java
chkconfig jenkins on
service jenkins start
firewall-cmd --zone=public --add-port=8080/tcp --permanent
firewall-cmd --zone=public --add-service=http --permanent
firewall-cmd --reload

This will install and start Jenkins and open the firewall so it can be reached.

If you can get to the URL of your server on port 8080, you'll be presented with the initial procedure for setting up Jenkins.

Jenkins dashboard

During it, you'll be asked for a password stored in a file on disk, and you'll be prompted to create a user that we'll be using from now on to configure everything.

We'll also be offered the option to deploy the most common set of plugins; choose that option, and later we'll add the Gerrit plugin and Python support.

Configure Jenkins

Once we can log into Jenkins, we need to enter the administration area and install new plugins, in particular Gerrit Trigger.

Manage Jenkins

The above link details how to do most of the setup; in this case, for GerritHub, we required:

  • Hostname: our hostname
  • Frontend URL: https://review.gerrithub.io
  • SSH Port: 29418
  • Username: our-github-jenkins-user
  • SSH keyfile: path_to_private_sshkey

Gerrit trigger configuration

Once done, click on Test Connection to validate that it works.

At the time of this writing, the version reported by the plugin was 2.13.6-3044-g7e9c06d when connected to gerrithub.io.

Gerrit servers

Creating a Job

Now, we need to create a Job (first option in Jenkins list of jobs).

  • Name: Citellus
  • Discard older executions:
    • Max number of executions to keep: 10
  • Source code Origin: Git
    • URL: ssh://@review.gerrithub.io:29418/citellusorg/citellus
    • Credentials: jenkins (Created based on the ssh keypair defined above)
    • Branches to build: $GERRIT_BRANCH
    • Advanced
      • Refspec: $GERRIT_REFSPEC
    • Add additional behaviours
      • Strategy for choosing what to build:
        • Choosing strategy Gerrit Trigger
  • Triggers for launch:
    • Change Merged
    • Comment added with regexp: .recheck.
    • Patchset created
    • Ref Updated
    • Gerrit Project:
      • Type: plain
      • Pattern: citellusorg/citellus
    • Branches:
      • Type: Path
      • Pattern: **
  • Execute:
    • Python script:
import os
import tox

os.chdir(os.getenv('WORKSPACE'))

# environment is selected by ``TOXENV`` env variable
tox.cmdline()

Jenkins Job configuration

From this point, any new push (review) made against gerrit will trigger a Jenkins build (in this case, running tox). Additionally, a manual trigger of the job can be executed to validate the behavior.

Manual trigger

Checking execution

In our project, tox runs some unit tests on Python 2.7 and Python 3.5, as well as Python's pep8 style compliance.

Now Jenkins will build and post messages on the review, stating that the build has started and reporting its results, also setting the 'Verified' flag.

GerritHub comments by Jenkins

Enjoy having automated validation of new reviews before accepting them into your code!


jul 31, 2017

Magui for analysis of issues across several hosts.

Background

Citellus allows checking a sosreport against known problems identified by the provided tests.

This approach is easy to implement and easy to test, but it has limitations when a problem spans several hosts and only reveals itself when a general analysis is performed.

Magui tries to solve that by running the analysis functions inside citellus across a set of sosreports, unifying the data obtained per citellus plugin.

At the moment, Magui just does the grouping and visualization of the data; for example, give it a try with the seqno plugin of citellus to report the sequence number of the galera database:

[user@host folder]$ magui.py * -f seqno # (filtering for 'seqno' plugins).
{'/home/remote/piranzo/citellus/citellus/plugins/openstack/mysql/seqno.sh': {'ctrl0.localdomain': {'err': '08a94e67-bae0-11e6-8239-9a6188749d23:36117633\n',
                                                                                                   'out': '',
                                                                                                   'rc': 0},
                                                                             'ctrl1.localdomain': {'err': '08a94e67-bae0-11e6-8239-9a6188749d23:36117633\n',
                                                                                                   'out': '',
                                                                                                   'rc': 0},
                                                                             'ctrl2.localdomain': {'err': '08a94e67-bae0-11e6-8239-9a6188749d23:36117633\n',
                                                                                                   'out': '',
                                                                                                   'rc': 0}}}

Here we can see that the sequence number in the logs is the same for all the hosts.

The goal, once this has been discussed and determined, is to write plugins that take the raw data from citellus, apply logic on top of the raw data obtained by the increasing number of citellus plugins, and are able to detect issues like, for example:

  • galera seqno
  • cluster status
  • ntp synchronization across nodes
  • etc

Hope it's helpful for you! Pablo

PS: We've proposed this as a talk for the upcoming OpenStack Summit 2017 in Sydney, so if you want to see us there, don't forget to vote on https://www.openstack.org/summit/sydney-2017/vote-for-speakers#/19095


jul 26, 2017

Citellus: framework for detecting known issues in systems.

Table of contents

  1. Background
  2. Citellus
  3. Writing a new test
  4. How to debug?

Background

Since I became a Technical Account Manager for Cloud, and later a Software Maintenance Engineer for OpenStack, I've officially been part of Red Hat Support.

We usually diagnose issues based on data from the affected systems, sometimes from one system, and most of the time from several at once.

These might be controller nodes for OpenStack, computes running instances, IdM servers, etc.

In order to make it easier to grab the required information, we rely on sosreport.

Sosreport has a set of plugins for grabbing the required information from the system, ranging from networking configuration, installed packages, running services and processes to, for some components, API checks, database queries, etc.
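
For reference, gathering one of those is normally as simple as running sosreport on the affected system (options vary between versions, so take this as an illustrative invocation):

# run as root on the affected system
sosreport --batch   # non-interactive run; the resulting tarball ends up under /var/tmp (or /tmp on older versions)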

But that's all it does: data gathering and packaging into a tarball, nothing else.

In OpenStack we've already identified common issues, so we create kbases for them, ranging from covering documentation gaps to specific use cases or configuration options.

Many times a missing (though documented) configuration causes headaches and could be caught with a simple check, like the TTL in Ceilometer or the stonith configuration in Pacemaker.

Here is where Citellus comes into play.

Citellus

The Citellus project (https://github.com/citellusorg/citellus/), created by my colleague Robin, aims at creating a set of tests that can be executed against a live system or an uncompressed sosreport tarball (whether a test applies to one or the other depends on the test itself).

The philosophy behind it is very simple:

  • There's a wrapper, citellus.py, which allows selecting the plugins to use (or a folder containing plugins), the verbosity, etc., and a sosreport folder to act against.
  • The wrapper checks the plugins available (they can be anything executable on Linux, so bash, python, etc. can be used).
  • It then sets up some environment variables, like the path where the data can be found, and proceeds to execute the plugins, recording their output.
  • The plugins, on their side, determine whether:
    • the plugin should run or be skipped depending on whether it's a live system or a sosreport
    • the plugin should be skipped because a required file or package is missing
  • The plugins provide a return code:
    • $RC_OKAY for success
    • $RC_FAILED for failure
    • $RC_SKIPPED for skip
    • anything else for an undetermined error
  • The plugins provide 'stderr' with relevant messages:
    • the reason to be skipped
    • the reason for the failure
    • etc.
  • The wrapper then sorts the output and prints it based on the settings (grouping skipped and OK by default) and detailing failures.

You can check the provided plugins in the GitHub repo (and hopefully also collaborate by sending yours).

Our target is to keep plugins easy to write, so we can extend the plugin set as much as possible, highlighting where focus should be put first and, once typical issues are ruled out, moving on to deeper analysis.

Even if we've started with OpenStack plugins (that's what we do for a living), the software is open to checking whatever is there, and we've reached out to colleagues in other specialty areas for more feedback or contributions to make it even more useful.

As Citellus works with sosreports, it is easy to install it locally and try out new tests.
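
A minimal sketch of that local setup (the sosreport path is just an example) would be:

git clone https://github.com/citellusorg/citellus.git
cd citellus
./citellus.py /path/to/uncompressed/sosreport   # run the available plugins against that sosreport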

Writing a new test

Leading by example is probably easier, so let's illustrate how to create a basic plugin that checks whether a system is an RHV hosted-engine host:

#!/bin/bash

if [ "$CITELLUS_LIVE" = "0" ];  ## Checks if we're running live or not
then
        grep -q ovirt-hosted-engine-ha $CITELLUS_ROOT/installed-rpms ## checks package
        returncode=$?  # stores return code
        if [ "x$returncode" == "x0" ];
        then
            exit $RC_OKAY
        else
            echo "ovirt-hosted-engine is not installed" >&2 # outputs info
            exit $RC_FAILED # returns code to wrapper
        fi
else
        echo "Running on a live system, skipping" >&2
        exit $RC_SKIPPED
fi

The above example is a bit 'hacky', as we count on the wrapper not printing information when the return code is $RC_OKAY, so it should have another conditional deciding whether to write output or not.

How to debug?

The easiest way to do trial and error is to create a new folder for the plugins you want to test and use something like this:

[user@host mytests]$ ~/citellus/citellus.py /cases/01884438/sosreport-20170724-175510/ycrta02.rd1.rf1 ~/mytests/  [-d debug]


DEBUG:__main__:Additional parameters: ['/cases/sosreport-20170724-175510/hostname', '/home/remote/piranzo/mytests/']
DEBUG:__main__:Found plugins: ['/home/remote/piranzo/mytests/ovirt-engine.sh']
_________ .__  __         .__  .__                
\_   ___ \|__|/  |_  ____ |  | |  |  __ __  ______
/    \  \/|  \   __\/ __ \|  | |  | |  |  \/  ___/
\     \___|  ||  | \  ___/|  |_|  |_|  |  /\___ \
 \______  /__||__|  \___  >____/____/____//____  >
        \/              \/                     \/
found #1 tests at /home/remote/piranzo/mytests/
mode: fs snapshot /cases/sosreport-20170724-175510/hostname
DEBUG:__main__:Running plugin: /home/remote/piranzo/mytests/ovirt-engine.sh
# /home/remote/piranzo/mytests/ovirt-engine.sh: failed
    “ovirt-hosted-engine is not installed “

DEBUG:__main__:Plugin: /home/remote/piranzo/mytests/ovirt-engine.sh, output: {'text': u'\x1b[31mfailed\x1b[0m', 'rc': 1, 'err': '\xe2\x80\x9covirt-hosted-engine is not installed \xe2\x80\x9c\n', 'out': ''}

That debug information comes from the Python wrapper; if you need more detail inside your test, you can try set -x to have bash show more information about its progress, as in the snippet below.
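
For example, you could temporarily add this near the top of your plugin while developing it:

#!/bin/bash
set -x   # print each command as it is executed; remove it once the plugin behaves as expected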

Always keep in mind that all the functionality is based on return codes and the stderr message, to keep it simple.

Hope it's helpful for you! Pablo

PS: We've proposed this as a talk for the upcoming OpenStack Summit 2017 in Sydney, so if you want to see us there, don't forget to vote on https://www.openstack.org/summit/sydney-2017/vote-for-speakers#/19095
