What's new?
Over the last few weeks we've been coding and making several changes to Citellus and Magui.
Checking the latest logs or the list of issues opened and closed on github is probably not the easiest or best way to get up to date with the changes, so I'll try to compile a few here.
First of all, we're going to present it at Devconf.cz 2018, so stop by if you're attending :-)
Some of the changes include...
Citellus
- New functions for bash scripts!
- We've created a lot of functions to check different things:
- installed rpm
- rpm over specific version
- compare dates over X days
- regexp in file
- etc.
- These functions allow quicker plugin development.
- save/restore options so they can be loaded automatically for each execution
- Think of enabled filters, excluded, etc
- metadata added for plugins and returned as dictionary
- plugin has a unique ID for all installations based on plugin relative path and plugin name
- We use that ID in Magui to select the plugin data we'll be acting on
- plugin priority!
- Each plugin is assigned a number between 0 and 1000 that represents how likely it is to affect your environment, and you can also filter on it with --prio
- extended via 'extensions' to provide support for other plugins
- moved prior plugins to be the core extension
- ansible playbook support via the ansible-playbook command
- metadata plugins that just generate metadata (hostname, date for sosreport, etc)
- Web Interface!!
- David Valee Delisle did a great job preparing an HTML page that loads citellus.json and shows it graphically.
- Thanks to his work, we extended some other features like priority, categories, etc. that are calculated by citellus and consumed by citellus-www.
- The interface can also load magui.json (with ?json=magui.json) and show its output.
- We extended citellus to take --web to automatically create the json named citellus.json in the folder specified with -o and copy the citellus.html file there. So if you serve sosreports over http, you can point to citellus.html to see the graphical status! (check the latest image at the citellus website as www.png)
- Increased plugin count!
- We now have more than 119 plugins across different categories
- A new plugin in python, reboot.py, that checks for unexpected reboots
- Spectre/Meltdown security checks!
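The plugin ID and priority items above can be sketched in a few lines of python. This is only an illustration, not the Citellus implementation: the use of an MD5 digest, the function names, the example paths and the priority values are all assumptions, and so is the idea that --prio acts as a minimum threshold.

```python
import hashlib


def plugin_id(relative_path, name):
    """Return a stable ID derived from the plugin's relative path and name."""
    # Assumption: MD5 over "path/name"; any stable digest would serve here.
    key = "{}/{}".format(relative_path.strip("/"), name)
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def filter_by_priority(plugins, minimum):
    """Keep plugins whose priority (0-1000) is at least `minimum`."""
    return [p for p in plugins if p["priority"] >= minimum]


# Hypothetical plugin entries: paths and priorities are invented for the example.
plugins = [
    {"path": "core/openstack/mysql", "name": "seqno.sh", "priority": 800},
    {"path": "core/system", "name": "reboot.py", "priority": 300},
]
for plugin in plugins:
    plugin["id"] = plugin_id(plugin["path"], plugin["name"])

selected = filter_by_priority(plugins, 500)
print([p["name"] for p in selected])
```

Because the ID depends only on the relative path and the name, the same plugin gets the same ID on every installation, no matter where the checkout lives on disk.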
Magui
- If there's an existing citellus.json, Magui loads it to speed up processing across multiple sosreports.
- Magui can also use ansible-playbook to copy the citellus program to a remote host, run the command there and bring back the generated citellus.json, so you can quickly run citellus across several hosts without having to manually perform operations or generate sosreports.
- Moved prior data to two plugins:
- citellus-outputs: Citellus plugins output arranged by plugin and sosreport
- citellus-metadata: Outputs metadata gathered by metadata plugins in citellus arranged by plugin and sosreport
- First plugins that compare data received from citellus on a global level
- Plugins are written in python and use each plugin id to work only on the data they know how to process
- pipeline-yaml: Checks pipeline.yaml and warns if it differs across hosts
- seqno: Checks latest galera seqno on hosts
- release: Reports RHEL release across hosts and warns if it differs across hosts
- Enable quiet mode on the data received from citellus as well as local plugins, so only outputs with ERROR or with output that differs across sosreports are shown, even on magui plugins.
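As an illustration of what a comparison plugin such as release does, here is a minimal sketch that groups hosts by the value they reported and flags the case where they disagree. It is not the Magui code: the function names are made up, the hostnames and release values are invented, and it assumes each host's result is a dictionary with 'out', 'err' and 'rc' keys, with the interesting value in 'err'.

```python
from collections import defaultdict


def group_hosts_by_output(per_host):
    """Map each distinct reported value to the sorted list of hosts reporting it."""
    groups = defaultdict(list)
    for host, result in per_host.items():
        groups[result["err"].strip()].append(host)
    return {value: sorted(hosts) for value, hosts in groups.items()}


def differs_across_hosts(per_host):
    """Return True when the hosts do not all report the same value."""
    return len(group_hosts_by_output(per_host)) > 1


# Hypothetical results for a 'release' style check (hostnames and values invented).
release = {
    "ctrl0.localdomain": {"err": "7.4\n", "out": "", "rc": 0},
    "ctrl1.localdomain": {"err": "7.4\n", "out": "", "rc": 0},
    "ctrl2.localdomain": {"err": "7.3\n", "out": "", "rc": 0},
}

if differs_across_hosts(release):
    for value, hosts in sorted(group_hosts_by_output(release).items()):
        print("{}: {}".format(value, ", ".join(hosts)))
```

A quiet mode along the lines described above could reuse differs_across_hosts to decide which plugin results are worth showing.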
Wrap up!
As you can see, we've been busy trying to improve the plugins, the Citellus framework and Magui as well.
We've also been busy demonstrating its value to others, raising a lot of new issues and closing them with our commits (294 requests closed so far).
So, come and tell us what else you are missing or how we can improve it to suit your needs (or code it yourself and submit a review!)
Why?
While working on Citellus and Magui it soon became evident that unit testing to validate the changes was a requirement.
Initially, using a .travis.yml file contained in the repo and the free service provided by https://travis-ci.org, we soon had the https://github.com repo reporting whether the builds succeeded or not.
When it was decided to move to https://gerrithub.io to work in a way closer to what is done upstream, we improved on code commenting (peer review), but we lost the ability to run the tests automatically until the change was merged into github.
After some research, it became evident that another tool, such as Jenkins, was required to automate the UT process and report the status back to individual reviews.
Setup
Some initial steps are required for integration:
- Create an ssh keypair for jenkins to use
- Create a github account to be used by jenkins and configure the above ssh keypair on it
- Log in to gerrithub with that account
- Set up Jenkins and the build jobs
- On the parent project, grant the jenkins github account permission to vote +1/-1 on Verify
In order to set up the Jenkins environment, a new VM was spawned on one of our RHV servers.
This VM was installed with:
- 20 GB of HDD
- 2 GB of RAM
- 2 VCPU
- Red Hat Enterprise Linux 7 'base install'
Tuning the OS
RHEL7 provides a stable environment to run on, but at the same time it lacked some of the latest tools we use for the builds.
As a dirty hack, the system was altered in a way that is not recommended, but that helped to quickly check, as a proof of concept, whether it would work or not.
Once the OS was installed, some commands (do not run in production) were used:
pip install -U pip # to upgrade pip
pip install -U tox # To upgrade to 2.x version
# Install python 3.5 on the system
yum -y install openssl-devel gcc
wget https://www.python.org/ftp/python/3.5.0/Python-3.5.0.tgz
tar xvzf Python-3.5.0.tgz
cd Python*
./configure
# This will install in alternate folder in system not to replace user-wide python version
make altinstall
# this is required to later allow tox to find the command as 'jenkins' user
ln -s /usr/local/bin/python3.5 /usr/bin/
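Since the symlink exists only so that tox can find the interpreter as the 'jenkins' user, a quick check run as that user can confirm it. The sketch below is an assumption-laden helper, not part of the original setup: the function name is made up and it needs python 3.3 or later for shutil.which.

```python
import shutil


def missing_interpreters(names):
    """Return the interpreter names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# The two interpreters the tox run in this article relies on.
missing = missing_interpreters(["python2.7", "python3.5"])
if missing:
    print("Not found on PATH: {}".format(", ".join(missing)))
else:
    print("All interpreters found")
```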
Installing Jenkins
The jenkins installation is easier: there's a 'stable' repo for RHEL and the procedure is documented:
wget -O /etc/yum.repos.d/jenkins.repo http://pkg.jenkins-ci.org/redhat-stable/jenkins.repo
rpm --import https://jenkins-ci.org/redhat/jenkins-ci.org.key
yum install jenkins java
chkconfig jenkins on
service jenkins start
firewall-cmd --zone=public --add-port=8080/tcp --permanent
firewall-cmd --zone=public --add-service=http --permanent
firewall-cmd --reload
This will install and start jenkins and open the firewall to allow access to it.
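To confirm from another machine that the port is reachable before opening a browser, a small TCP check is enough. This is a generic sketch, not something from the Jenkins documentation; the function name is made up and "localhost" should be replaced by your server's hostname.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Replace "localhost" with the Jenkins server's hostname when running remotely.
print(port_is_open("localhost", 8080))
```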
If you can reach your server's URL on port 8080, you'll be presented with an initial procedure for setting up Jenkins.

During it, you'll be asked for a password stored in a file on disk, and you'll be prompted to create a user that we'll use from now on for configuration.
We'll also be offered the option to deploy the most common set of plugins; choose that option, and later we'll add the gerrit and Python plugins.
Configure Jenkins
Once we can log in to Jenkins, we need to enter the administration area and install a new plugin: Gerrit Trigger.

The Gerrit Trigger documentation details how to do most of the setup; in this case, for gerrithub, we required:
- Hostname: our hostname
- Frontend URL: https://review.gerrithub.io
- SSH Port: 29418
- Username: our-github-jenkins-user
- SSH keyfile: path_to_private_sshkey

Once done, click on Test Connection and validate if it worked.
At the time of this writing, the version reported by the plugin was 2.13.6-3044-g7e9c06d when connected to gerrithub.io.

Creating a Job
Now, we need to create a Job (first option in Jenkins list of jobs).
- Name: Citellus
- Discard older executions:
- Max number of executions to keep: 10
- Source code Origin: Git
- URL: ssh://@review.gerrithub.io:29418/zerodayz/citellus
- Credentials: jenkins (Created based on the ssh keypair defined above)
- Branches to build: $GERRIT_BRANCH
- Advanced
- Add additional behaviours
- Strategy for choosing what to build:
- Choosing strategy Gerrit Trigger
- Triggers for launch:
- Change Merged
- Comment added with regexp: .recheck.
- Patchset created
- Ref Updated
- Gerrit Project:
- Type: plain
- Pattern: zerodayz/citellus
- Branches:
- Execute:
import os
import tox
os.chdir(os.getenv('WORKSPACE'))
# environment is selected by ``TOXENV`` env variable
tox.cmdline()

From this point, any new push (review) made against gerrit will trigger a Jenkins build (in this case, running tox). Additionally, a manual trigger of the job can be executed to validate the behavior.
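The Execute script above relies on the WORKSPACE variable that Jenkins sets for each build, so it fails when run by hand outside Jenkins. A variant that can also be tried locally might look like the sketch below; it is an alternative under stated assumptions (the function names are made up, and tox is started through 'python -m tox' in a subprocess rather than through tox.cmdline()).

```python
import os
import subprocess
import sys


def resolve_workspace(environ):
    """Return the Jenkins workspace, or the current directory outside Jenkins."""
    return environ.get("WORKSPACE") or os.getcwd()


def run_tox(environ):
    """Run tox in the workspace and return its exit code."""
    workspace = resolve_workspace(environ)
    # The environment is passed through, so TOXENV still selects what to run.
    return subprocess.call([sys.executable, "-m", "tox"], cwd=workspace, env=dict(environ))
```

In a job, the script would end with sys.exit(run_tox(os.environ)) so that a failing tox run marks the build as failed.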

Checking execution
In our project, tox runs the unit tests on python 2.7 and python 3.5, and also checks python's PEP compliance.
Now, Jenkins will build and post messages on the review, stating that the build has started and later its results, and will also set the 'Verified' flag.
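The Gerrit Trigger plugin takes care of the voting itself, but for illustration, the same kind of vote can be expressed with Gerrit's 'gerrit review' ssh command. The sketch below only builds the remote command; the function names and the messages are assumptions, and the change and patchset numbers are invented.

```python
import shlex


def verified_vote(build_succeeded):
    """Map a build result to the value set on the Verified label."""
    return "+1" if build_succeeded else "-1"


def review_command(change, patchset, build_succeeded, message):
    """Build the remote 'gerrit review' command for one patchset."""
    return [
        "gerrit", "review",
        "--label", "Verified=" + verified_vote(build_succeeded),
        # The message is quoted because the remote side parses it as a shell word.
        "--message", shlex.quote(message),
        "{},{}".format(change, patchset),
    ]


# Invented change and patchset numbers, for illustration only.
print(" ".join(review_command(12345, 2, True, "Build Successful")))
```

The resulting words would be appended to an ssh call against port 29418, in the same way as for the connection test.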

Enjoy having automated validation of new reviews before accepting them into your code!
Background
Citellus allows checking a sosreport against known problems identified in the provided tests.
This approach is easy to implement and easy to test, but it has limitations when a problem spans several hosts and only reveals itself when a general analysis is performed.
Magui tries to solve that by running the analysis functions inside citellus across a set of sosreports, unifying the data obtained per citellus plugin.
At the moment, Magui just does the grouping and visualization of the data. For example, give it a try with the seqno plugin of citellus to report the sequence number in the galera database:
[user@host folder]$ magui.py * -f seqno # (filtering for ‘seqno’ plugins).
{'/home/remote/piranzo/citellus/citellus/plugins/openstack/mysql/seqno.sh': {'ctrl0.localdomain': {'err': '08a94e67-bae0-11e6-8239-9a6188749d23:36117633\n',
'out': '',
'rc': 0},
'ctrl1.localdomain': {'err': '08a94e67-bae0-11e6-8239-9a6188749d23:36117633\n',
'out': '',
'rc': 0},
'ctrl2.localdomain': {'err': '08a94e67-bae0-11e6-8239-9a6188749d23:36117633\n',
'out': '',
'rc': 0}}}
Here, we can see that the sequence number on the logs is the same for the hosts.
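That visual comparison is easy to automate. As a minimal sketch (not the actual Magui plugin), the code below parses the 'uuid:seqno' strings from the output above and lists the hosts whose sequence number is behind the highest one; the function names are made up, and reading the value from the 'err' key simply follows the sample output.

```python
def parse_seqno(raw):
    """Split a 'uuid:seqno' string into (uuid, seqno as int)."""
    uuid, _, seqno = raw.strip().rpartition(":")
    return uuid, int(seqno)


def lagging_hosts(per_host):
    """Return the hosts whose seqno is below the highest one seen."""
    seqnos = {host: parse_seqno(result["err"])[1] for host, result in per_host.items()}
    highest = max(seqnos.values())
    return sorted(host for host, value in seqnos.items() if value < highest)


# Per-host data copied from the magui output shown above.
value = "08a94e67-bae0-11e6-8239-9a6188749d23:36117633\n"
data = {
    "ctrl0.localdomain": {"err": value, "out": "", "rc": 0},
    "ctrl1.localdomain": {"err": value, "out": "", "rc": 0},
    "ctrl2.localdomain": {"err": value, "out": "", "rc": 0},
}

print(lagging_hosts(data))  # prints [] because all three hosts agree
```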
The goal, once this has been discussed and determined, is to write plugins that take the raw data obtained by the increasing number of citellus plugins, parse it and apply logic on top, so that they can detect issues like, for example:
- galera seqno
- cluster status
- ntp synchronization across nodes
- etc
Hope it's helpful for you!
Pablo
PS: We've proposed this as a talk for the upcoming OSP Summit 2017 in Sydney, so if you want to see us there, don't forget to vote on https://www.openstack.org/summit/sydney-2017/vote-for-speakers#/19095