Difference between revisions of "Developers:BuildBot"

From OctopusWiki
 
Revision as of 22:48, 30 June 2007

Generalities

BuildBot is a piece of software to automate software builds and tests. It operates a server that triggers jobs on a number of slaves. These slaves may run on the same machine as the server or on different ones. This makes it possible to compile and check on different architectures and operating systems.

We now have BuildBot running on www.tddft.org. Currently, it is configured for two cases:

  • It tracks trunk of the Subversion repository and on each commit, it triggers several slaves to do a quick configure and compilation (without optimizations) of the code. This way, we get an instant response if a commit broke something in a different environment.
  • Each night, the code is built in different settings (with a lot of debugging turned on) and the testsuite is run.

For a failing build, an e-mail is sent to octopus-notify@tddft.org as with the old nightly builds. This e-mail contains all output of the commands which have been run by the slave. In addition to this, there is also a so-called waterfall page in our Trac that lists all BuildBot activities and that can be used to have a look at the build and test results.

Installation of BuildBot

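To install BuildBot, the following packages are needed:

  • zope.interface-3.3.0.tar.gz (http://www.zope.org/Products/ZopeInterface/3.3.0/zope.interface-3.3.0.tar.gz),
  • Twisted-2.5.0.tar.bz2 (http://tmrc.mit.edu/mirror/twisted/Twisted/2.5/Twisted-2.5.0.tar.bz2), and
  • buildbot-0.7.5.tar.gz (http://downloads.sourceforge.net/buildbot/buildbot-0.7.5.tar.gz)

as well as Python, at least version 2.3.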

Either install this software via your distribution's package system or simply by hand: after unpacking, run

$ python ./setup.py bdist
$ python ./setup.py install --prefix=DIR

in the package's source directory. This will install the files in DIR in a hierarchy like the one usually found below /usr/lib/python. Note that it might be necessary to set the PYTHONPATH environment variable to DIR/lib/pythonX.Y/site-packages (no slash at the end) so that Python can find the libraries.
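
The PYTHONPATH value described above can be derived from the install prefix. The following is only a sketch: the helper name site_packages_dir and the prefix /opt/buildbot are made up for illustration, and the lib/pythonX.Y/site-packages layout is the one stated above.

```python
import posixpath
import sys


def site_packages_dir(prefix, version=None):
    """Return PREFIX/lib/pythonX.Y/site-packages without a trailing slash."""
    if version is None:
        # Default to the interpreter that runs this script.
        version = sys.version_info[:2]
    major, minor = version
    path = posixpath.join(prefix, "lib", "python%d.%d" % (major, minor), "site-packages")
    return path.rstrip("/")


# /opt/buildbot is a placeholder for DIR; (2, 4) selects Python 2.4.
print("PYTHONPATH=" + site_packages_dir("/opt/buildbot", (2, 4)))
```

The printed line can be used in a shell profile (with export) or in a crontab.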

Setup of the BuildBot master

The BuildBot master is installed in /home/lorenzen/opt/python on www.tddft.org and also runs under my account. Its configuration and data files are in /server/www/tddft.org/programs/octopus/buildbot-master so that all developers have access to the configuration and can add new slaves (see below).

I had to hack the Trac plugin (http://www.trac-hacks.org/attachment/ticket/339/TracBB-0.1.2.tar.gz) a little bit to make it work. There are still some glitches, but for simply browsing the waterfall it works.

Setup of a BuildBot slave

There are two steps to set up a new slave:

  • install BuildBot on the machine in question, and
  • add the appropriate entries in the master configuration.

Installing a slave

Run the steps to install BuildBot as described above. Then run

$ DIR/bin/buildbot create-slave BASEDIR MASTER NAME PASSWORD

where BASEDIR is the directory in which all builds of this slave take place, MASTER is www.tddft.org:8898 in our case, and NAME and PASSWORD are taken from the master configuration (see below).

As a last step, you have to start the slave:

$ DIR/bin/buildbot start BASEDIR

If you want to be sure that it is restarted when the machine reboots, the simplest thing is a crontab entry like

@reboot DIR/bin/buildbot start BASEDIR

but this only works if you have Vixie cron running, which is usually the case on Linux distributions but commonly not on System V (check out man 5 crontab).

Please also note that the environment set up by cron might be very different from your login environment. On the slaves I have set up, I had to include a line

PYTHONPATH=DIR/lib/python2.4/site-packages

in the crontab.

Registering a slave with the master

The entire configuration of the BuildBot master is done via the file master.cfg that lives in /server/www/tddft.org/programs/octopus/buildbot-master/. It is a Python file, so for those of you who, unlike me, know this language, it will be easy to understand what's going on.

The following steps have to be undertaken:

Add a slave entry to the c['bots'] list. The list looks like this at the moment:

c['bots'] = [("athos", "secret1"),   # AMD Opteron, Debian
             ("octopus", "secret2"), # AMD Opteron, Fedora
             ("pepita", "secret3"),  # Sparc, Solaris
             ("g22", "secret4")      # i386, Debian
             ]

After adding your new slave, it might look like

c['bots'] = [("athos", "secret1"),   # AMD Opteron, Debian
             ("octopus", "secret2"), # AMD Opteron, Fedora
             ("pepita", "secret3"),  # Sparc, Solaris
             ("g22", "secret4"),     # i386, Debian
             ("wheel", "secret5")    # Alpha, OSF1
             ]

with wheel being the slave's name and secret5 the password you give on the create-slave command line. The name of the slave should not contain dots (at least I had trouble with that) and thus should not be the fully qualified domain name but something else which is unique and understandable.
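
The naming advice can be checked mechanically before saving the file. This is a sketch only: check_bots is a hypothetical helper, not part of BuildBot or of master.cfg, and the empty-password check is an extra assumption.

```python
def check_bots(bots):
    """Return a list of problems found in a c['bots']-style list of (name, password) pairs."""
    problems = []
    seen = set()
    for name, password in bots:
        if "." in name:
            problems.append("slave name %r contains a dot" % name)
        if name in seen:
            problems.append("slave name %r is listed twice" % name)
        seen.add(name)
        if not password:
            problems.append("slave %r has an empty password" % name)
    return problems


# Same layout as the c['bots'] entries shown above.
bots = [("athos", "secret1"),
        ("octopus", "secret2"),
        ("wheel", "secret5")]
print(check_bots(bots))
```

An empty result means that no name contains a dot and no name is used twice.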

Next, you have to add a builder that knows how to compile (and, if you want, test) the code on the new slave. Go to the HOW TO BUILD THE CODE section of the file and add an entry like

x86_64_gfortran_build = octopusVpathBuild(
        var={"FC" : "/home/lorenzen/opt/gcc-4.2.0/bin/gfortran",
             "FCFLAGS" : "-Wall -I/home/lorenzen/opt/gfortran-4.2.0/netcdf/include"},
        flags=["--with-blas=-lblas -L/home/lorenzen/opt/gfortran-4.2.0/blas/lib",
               "--with-lapack=-llapack -L/home/lorenzen/opt/gfortran-4.2.0/lapack/lib",
               "--with-fft-lib=-lfftw3 -L/home/lorenzen/opt/gfortran-4.2.0/fftw/lib",
               "--with-sparskit=-lskit -L/home/lorenzen/opt/gfortran-4.2.0/sparskit/lib",
               "--with-arpack=-larpack -L/home/lorenzen/opt/gfortran-4.2.0/arpack/lib",
               "--with-netcdf=-lnetcdf -L/home/lorenzen/opt/gfortran-4.2.0/netcdf/lib",
               "--with-gsl-prefix=/home/lorenzen/opt/gsl",
               "--disable-gdlib"])

The name of the builder is x86_64_gfortran_build, and octopusVpathBuild creates a builder that uses the VPATH capabilities of make to find more bugs in the build system (octopusVpathBuild is actually just a Python function defined above the current section). The nomenclature for builders is arch_compiler_options_type with

  • arch being the target architecture like sparc, i386, ppc,
  • compiler the Fortran 90 compiler used like gfortran, ifort, nag, pgi, g95, sunf90,
  • options special build characteristics like mpich2, and
  • type the kind of build, build and test for the moment. build only compiles and links the code, test additionally runs the testsuite.
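
Because an architecture such as x86_64 contains an underscore itself, a builder name cannot be split by position alone. The sketch below uses the compiler token as the anchor instead. The function parse_builder_name and the two tuples are illustrative assumptions, filled with the compilers and build types listed above.

```python
KNOWN_COMPILERS = ("gfortran", "ifort", "nag", "pgi", "g95", "sunf90")
KNOWN_TYPES = ("build", "test")


def parse_builder_name(name):
    """Split 'arch_compiler[_options]_type' into its four parts."""
    tokens = name.split("_")
    if tokens[-1] not in KNOWN_TYPES:
        raise ValueError("unknown build type in %r" % name)
    # The first known compiler token separates the architecture (to its
    # left) from the options (between it and the trailing build type).
    for i, token in enumerate(tokens[:-1]):
        if token in KNOWN_COMPILERS:
            if i == 0:
                raise ValueError("missing architecture in %r" % name)
            return {"arch": "_".join(tokens[:i]),
                    "compiler": token,
                    "options": tokens[i + 1:-1],
                    "type": tokens[-1]}
    raise ValueError("no known compiler in %r" % name)


print(parse_builder_name("x86_64_gfortran_mpich2_test"))
```

A name without special build characteristics, such as sparc_sunf90_build, simply yields an empty options list.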

The two keyword arguments var and flags are for the configure invocation:

Setting the arguments to

var={"A" : "a", "B" : "b"},
flags=["--with-bar=/usr/lib/bar",
       "--without-foo"]

results in the configure line

$ A=a B=b ./configure --with-bar=/usr/lib/bar --without-foo
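
How var and flags map onto the command line can be sketched in a few lines of Python. This is not the code of octopusVpathBuild, which is not shown here; configure_line is a hypothetical stand-in, and quoting with shlex.quote is an assumption for values that contain spaces.

```python
import shlex


def configure_line(var, flags):
    """Build the shell command line for configure from var and flags."""
    # Environment assignments come first, in the order given in var.
    assignments = ["%s=%s" % (key, shlex.quote(value)) for key, value in var.items()]
    # Each flag becomes one quoted argument after ./configure.
    arguments = [shlex.quote(flag) for flag in flags]
    return " ".join(assignments + ["./configure"] + arguments)


print(configure_line({"A": "a", "B": "b"},
                     ["--with-bar=/usr/lib/bar", "--without-foo"]))
# prints: A=a B=b ./configure --with-bar=/usr/lib/bar --without-foo
```

Flags that contain a space, like the --with-blas entry in the builder above, end up in single quotes so that the shell passes them to configure as one argument.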

In order to run the testsuite, it might be sufficient to add the line

octopusAddTest(x86_64_gfortran_build)

This causes a make -C testsuite check to be issued after the compilation. For special builds and environments, this might not be enough, e.g. MPI builds which require additional setup.

For these cases, the testing step can be added by

x86_64_gfortran_mpich2_test.addStep(shell.ShellCommand,
                                    description="testing",
                                    descriptionDone="test",
                                    command="cd _build && COMMAND")

with COMMAND being a shell command line that does the required steps. The cd _build, which changes into the build tree, is necessary because a VPATH build is being performed. The exit code of COMMAND determines the test result. So, if you chain several commands with && or ; be sure that the correct exit code is returned, e.g. by

command="cd _build && do_preparations && do_tests; err=$?; do_cleanup; exit $err"
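
Why the exit status has to be saved before the cleanup runs can be tried out with a small script. It is a sketch that assumes a POSIX sh is available; false stands in for a failing test run and true for a cleanup step.

```python
import subprocess


def exit_status(command):
    """Run a command line through sh and return its exit status."""
    return subprocess.run(["sh", "-c", command]).returncode


# Without saving it, $? reflects the cleanup step and the failure is lost.
lost = exit_status("false; true; exit $?")
# Saving the status in a variable first preserves the test result.
kept = exit_status("false; err=$?; true; exit $err")
print(lost, kept)
# prints: 0 1
```

Since BuildBot only sees the exit status of the whole command line, the first variant would report a failing test run as a success.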

After specifying the build, you have to say in the AND WHERE section which slave shall perform the build. Add an entry

bot_BUILDNAME = {'name' : "BUILDNAME",
                 'slavename' : "SLAVE",
                 'builddir' : "BUILDNAME",
                 'factory' : BUILDNAME}

If several slaves are able to do the build, a list 'slavenames' : ["SLAVE1", "SLAVE2"] can be given instead of 'slavename' : "SLAVE". This allows for some load-balancing in the scheduler. The bot_BUILDNAME has to be added to the c['builders'] list:

c['builders'] = [bot_x86_64_gfortran_build,
                 bot_x86_64_ifort_build,
                 bot_x86_64_gfortran_mpich2_build,
                 bot_x86_64_ifort_mpich2_build,
                 bot_sparc_sunf90_build,
                 bot_x86_64_ifort_test,
                 bot_x86_64_ifort_mpich2_test,
                 bot_i386_gfortran_test,
                 bot_BUILDNAME
                 ]
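
The builder entry, with either one slave or several, could also be produced by a small helper. This is a sketch: make_builder does not exist in master.cfg, and the factory object below is a placeholder for a real build such as x86_64_gfortran_build.

```python
def make_builder(name, factory, slaves):
    """Return a builder dictionary in the layout of the bot_BUILDNAME entries."""
    builder = {"name": name, "builddir": name, "factory": factory}
    if isinstance(slaves, str):
        # A single slave name goes into 'slavename'.
        builder["slavename"] = slaves
    else:
        # Several candidate slaves go into 'slavenames'.
        builder["slavenames"] = list(slaves)
    return builder


factory = object()  # placeholder for the build defined earlier in master.cfg
builders = [make_builder("alpha_gfortran_build", factory, "wheel"),
            make_builder("x86_64_gfortran_build", factory, ["athos", "octopus"])]
print([builder["name"] for builder in builders])
```

The resulting dictionaries have the same keys as the hand-written entry, so they can be placed in the c['builders'] list in the same way.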

As a last step, the build has to be added to one or more schedulers. Schedulers are defined in the SCHEDULERS section and are responsible for actually triggering the builds. Currently, there is the buildonly scheduler, which triggers rather quick compilations on each commit, and the nightly scheduler, to which all BUILDNAME_test builds go. Simply add your build to the builderNames entry of the scheduler specification.
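
Which scheduler a builder belongs to can be read off its name. The sketch below assumes that every _test builder goes to the nightly scheduler, as stated above, and additionally assumes that every _build builder goes to buildonly. The function is hypothetical and does not use the BuildBot scheduler classes.

```python
def builder_names_by_scheduler(builder_names):
    """Group builder names by scheduler according to their type suffix."""
    schedule = {"buildonly": [], "nightly": []}
    for name in builder_names:
        if name.endswith("_test"):
            schedule["nightly"].append(name)
        elif name.endswith("_build"):
            schedule["buildonly"].append(name)
        else:
            raise ValueError("cannot tell the build type of %r" % name)
    return schedule


names = ["x86_64_gfortran_build", "sparc_sunf90_build", "i386_gfortran_test"]
print(builder_names_by_scheduler(names))
```

Each list corresponds to what would go into the builderNames entry of the respective scheduler.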

After saving the configuration file, it takes a little while until the master rereads its configuration and the new build appears on the waterfall page: up to an hour, because of the @hourly crontab entry I have set. If it does not appear, have a look at the twistd.log file in the master's directory; perhaps you rendered the Python file invalid. In such cases, the master simply ignores the change in configuration and continues with the old one.
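
A syntax error can be caught before the master rereads the file. The following sketch uses Python's built-in compile(). The helper config_syntax_error is hypothetical, it detects syntax errors only (not, for instance, misspelled names), and it should be run with the same Python version as the master.

```python
def config_syntax_error(source, filename="master.cfg"):
    """Return None if source compiles as Python, else a short error message."""
    try:
        # compile() parses the text without executing it.
        compile(source, filename, "exec")
    except SyntaxError as error:
        return "%s:%s: %s" % (filename, error.lineno, error.msg)
    return None


good = "c = {}\nc['bots'] = [('athos', 'secret1')]\n"
bad = "c = {}\nc['bots'] = [('athos', 'secret1')\n"  # closing bracket missing
print(config_syntax_error(good))
print(config_syntax_error(bad))
```

To check the real file, read its contents and pass them to the function before saving a change to the master's directory.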


List of slaves

We currently have the following slaves running:

Name     Architecture  OS                Location
g22      i386          Debian/GNU Linux  FU Berlin
pepita   sparc         Sun Solaris 10    TU Berlin
octopus  x86_64        Debian/GNU Linux  FU Berlin