Olympic Build and Packaging Pipelines

When setting up an automated continuous delivery pipeline for our current project, we decided to use RPMs and Yum – the native packaging and software updates platform of our staging, UAT and production environments – instead of more Ruby-esque solutions like Capistrano.

There were several reasons behind that, but by far the most important was the affinity the operations staff already had with RPMs and Yum: all of the system packages were already managed with them, with a good deal of auditing thrown in: has this file changed since we installed the package? Why?

Making a decent RPM out of a Rails (or Sinatra) project wasn’t very hard: a bit of head-scratching and a few passes through Maximum RPM later, we had something we could work with.

The main headaches were working out which packages were necessary to build the RPM (the BuildRequires part), figuring out how to reliably install all of the gem dependencies from bundler into the packaged RPM (bundle install --deployment helped), and which changes needed to be made to the application to rely solely on environment variables set by /etc/default/[app]. This way, we wouldn’t have any configuration that could vary across environments coming from the package itself.
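To picture that last point, here is a minimal sketch of an application that takes everything environment-specific from variables at start-up and refuses to boot if one is missing. It is written in Python rather than Ruby purely for illustration, and the variable names and the `load_settings` helper are hypothetical, not lifted from our project.

```python
import os


class MissingSetting(RuntimeError):
    """Raised when a required environment variable is not set."""


# Hypothetical variable names; in practice they would be exported by the
# application's file under /etc/default/.
REQUIRED = ("APP_DATABASE_URL", "APP_LOG_LEVEL")


def load_settings(environ=os.environ):
    """Return the required settings, failing loudly if any is missing."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise MissingSetting("missing settings: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED}
```

Failing at boot, instead of falling back to a default baked into the package, keeps the RPM identical across staging, UAT and production.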

The next step was setting up a marathon of tests for those RPMs. They already contained unit tested code, and we built more obstacles: functional and integration tests, performance micro-benchmarks, metrics analysis and some manual inspection in UAT. With each step weeding out bad candidates, only truly excellent builds could get to production.

To be able to easily visualize how excellent our builds were, we came up with a simple and effective naming scheme, in time for the Olympics: precious metals.

A unit tested RPM would start out in the “tin” repository. Another step in the pipeline gets triggered and deploys it to a smoke test machine. If that works, it gets promoted to the “bronze” repository. Functional and integration tests cause it to be promoted to “silver”, and so on through “gold” (ready to go) and “platinum” (in production already).
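The promotion rule itself is tiny. As a rough sketch (Python for illustration only; the `build` dictionary and both function names are made up for this example and are not our actual pipeline code), it could look like this:

```python
# Repository tiers in promotion order, as named in this post.
TIERS = ["tin", "bronze", "silver", "gold", "platinum"]


def next_tier(current):
    """Return the tier that follows `current` in the promotion order."""
    index = TIERS.index(current)  # raises ValueError for unknown tiers
    if index == len(TIERS) - 1:
        raise ValueError(current + " is already the top tier")
    return TIERS[index + 1]


def promote(build, stage_passed):
    """Return the build's new tier; a failed stage leaves it where it is."""
    if not stage_passed:
        return build["tier"]
    return next_tier(build["tier"])
```

For example, `promote({"name": "app-1.0-1", "tier": "tin"}, True)` gives `"bronze"`, while a failed smoke test leaves the build in `"tin"`.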

To the operations staff, this makes a lot of sense: the production machines fetch updates from “gold”, UAT environments look at “silver” and so on. It’s trivial to configure Yum to do that – chances are you already did it when setting up your distribution – and its output is very easy to read and understand when something goes wrong.
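To give an idea of how little configuration that takes, the sketch below renders a Yum repository stanza for a given environment. The host name is a placeholder, the `repo_stanza` helper is hypothetical, and signature checking is switched off only to keep the example short; a real setup would likely sign its packages.

```python
# Which repository each kind of machine follows. Both pairings come from
# this post; anything else would be a guess, so it is left out.
REPO_FOR = {"uat": "silver", "production": "gold"}

# Placeholder host; replace with wherever the repositories actually live.
BASE_URL = "http://yum.example.internal/{tier}/"


def repo_stanza(environment):
    """Render the contents of a Yum .repo file for the given environment."""
    tier = REPO_FOR[environment]
    lines = [
        "[{0}]".format(tier),
        "name=Application builds ({0})".format(tier),
        "baseurl=" + BASE_URL.format(tier=tier),
        "enabled=1",
        "gpgcheck=0",  # assumption for brevity only
    ]
    return "\n".join(lines) + "\n"
```

Dropping the output of `repo_stanza("production")` into a file under `/etc/yum.repos.d/` is all a production machine would need to start following “gold”.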

Looking back at the Capistrano days, my only regret is not having done this sooner!

Logging: a UI problem

Your logs are part of the UI. They are streams of interesting and actionable events that will be consumed by both machines and humans.

The most useful practice I’ve followed so far is to keep that in mind and act accordingly: understand the computer systems parsing, filtering and analyzing logs and talk to all the people who will be notified when something of interest happens. Watch what they do, and ask yourself “how could the output of my application be more helpful in this scenario?”

The parties interested in your application’s logs are usually in conflict: what’s interesting and actionable to developers and testers isn’t so important to production support engineers, and your SQL timing statements are probably just junk to the analytics tool looking for security issues.

In order to minimize that, whatever logging framework you’re using should be able to direct those streams of events with pre-defined (and hopefully, easily configurable) filters, and each type of environment or user should be able to have its own configuration.

Here are a few examples to illustrate the point:

During development, it makes sense to have every debug statement relevant to the module being worked on going to the same stream, while telling the framework to take it easy with all other modules. Events from other modules may be interesting, but they should be filtered out if they’re not actionable, as you’re not going to do anything with them. Changing the filter so you can look at different modules should take no more than a few seconds of work (but may require bouncing a server or two).

While running unit tests on a continuous integration set-up, it may make sense to disable verbose logging altogether: if your automated testing environment is sufficiently mature, at least one of the tests will break and you’ll be able to replay the failure on a development workstation to get at the details. In that kind of environment, not only do you want to be mindful of disk usage, but the events themselves are usually not very actionable anyway.

In production, leave that configuration to people experienced with support: talk to engineers who will get paged at 3am and rushed into a cab if a particular type of error happens, and get their input. They will tell you exactly what kinds of errors they’re interested in on your application in particular. Remember this is probably specific to the domain you’re working on, and that support engineers usually take care of more than one application, and more than one server.
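The three scenarios above boil down to one policy per environment. Here is a minimal sketch using Python’s standard `logging` module; the environment names, the `configure_logging` helper and the chosen thresholds are my own assumptions, not a recommendation for any particular framework.

```python
import logging


def configure_logging(environment, focus_module=None):
    """Apply a (hypothetical) per-environment logging policy."""
    root = logging.getLogger()
    if environment == "development":
        # Keep other modules quiet, but show everything from the module
        # currently being worked on.
        root.setLevel(logging.WARNING)
        if focus_module:
            logging.getLogger(focus_module).setLevel(logging.DEBUG)
    elif environment == "ci":
        # Failures get replayed on a workstation, so keep output and
        # disk usage low.
        root.setLevel(logging.ERROR)
    else:
        # Production thresholds should come from the support engineers;
        # WARNING is only a placeholder.
        root.setLevel(logging.WARNING)
```

Calling `configure_logging("development", "shop.checkout")` turns on debug output for the (made-up) `shop.checkout` module only; switching focus to another module is a one-argument change.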

A very common mistake I see (looking at you, JBoss!) is to flag as important the errors that developers should see (a NullPointerException, for example) but that production support people can’t do a thing about. Don’t wake them up unless there is something they can do to fix the problem, or you risk crying wolf too many times and having them filter out important, actionable notifications, like OutOfMemory errors, low disk space, etc.
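One way to keep the pager quiet is to put a filter in front of whatever handler does the paging. The sketch below uses Python’s standard `logging.Filter`; the list of actionable markers and the `make_pager_handler` helper are hypothetical, and the real list should come from the support engineers themselves.

```python
import logging

# Hypothetical markers; ask the people on call what belongs here.
ACTIONABLE = ("OutOfMemory", "low disk space")


class ActionableOnly(logging.Filter):
    """Let a record through only if support can act on it."""

    def filter(self, record):
        message = record.getMessage()
        return any(marker in message for marker in ACTIONABLE)


def make_pager_handler(handler):
    """Restrict a handler (say, one that pages on-call) to actionable errors."""
    handler.setLevel(logging.ERROR)
    handler.addFilter(ActionableOnly())
    return handler
```

Matching on message text is crude; a sturdier variant might tag records at the call site and filter on that tag, but the idea is the same: the NullPointerException still reaches the developers’ stream, it just doesn’t wake anybody up.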

Assando Times [pt_BR]

[Originally published in Brazilian Portuguese as a translation of the previous post, “Baking Teams”.]

I have a friend whose wife loves to cook, and he was put in charge of helping her one of these evenings.

He was short on patience and decided to take a few shortcuts to get the task over with as quickly as possible and go back to his code, videogame or whatever it was that had his attention at the time.

Instead of working fast and in small batches – or slowly, but in big batches – he threw all the eggs, flour and milk together in one go. He mixed everything too fast, while stirring too little. Clumps formed and, eventually, a whole bowl of terrible-looking batter had to be thrown away.

The secret, apparently, is to mix the dry and wet ingredients very carefully: if you know you are going to add more than the mixture can take, it is better to sift the dry ingredients first, cross your fingers and then mix and stir everything like mad: the more unbalanced the ratio of dry to wet, the greater the effort needed to keep everything homogeneous.

It is a delicate balance if you are dealing with a new recipe or unfamiliar utensils. In fact, a change in altitude or air humidity may be enough to tip the odds in favour of mediocre cakes.

While my friend was telling me this story, I suddenly realized that the same thing works for teams, too: by adding many people at once without room for everyone to absorb and share their points of view, you are heading towards clumps of knowledge and culture, which are really hard to dissolve. New domains, technologies and processes are extra variables, which make it very hard to predict whether you will get it right or not, and which tip the odds in favour of mediocre results.

But what if you had to do it? Or, rather: what if you decided that, given current constraints, the most sensible and desirable thing to do would be to add a lot of people to a team in a very short period of time? What would you do to keep it free of “clumps”?

Baking Teams

I have a friend whose wife loves baking, and he was tasked with helping her out one of these evenings. He was keen to cut corners and get it over with as soon as possible, so he could go back to his code, videogame or whatever it was that caught his fancy at the time.

Instead of working fast in small batches – or slowly in big ones – he mixed all the eggs, flour and milk together. Too fast, while stirring too little. It formed clumps, and eventually a whole bowl of terrible-looking dough had to be thrown away.

It turns out the secret is to mix wet and dry ingredients really carefully: if you know you’re going to add more than the mixture will take, you’d better sift the dry stuff first, hope for the best and then whisk and stir like mad. It’s a delicate balance if you’re dealing with a new recipe or unfamiliar tools. In fact, a change in altitude or air humidity could be enough to tip the odds in favour of mediocre cake.

As my friend told me this story, I suddenly realized the same thing works for teams, too: add too many people at once and, unless there’s room for everybody to absorb and share their points of view, you’re going to get knowledge and culture “clumps” that are really hard to dissolve. New domains, technologies and processes: more variables, making it really hard to predict whether you’ll get it right or not, and they all tip the scales towards mediocre outcomes.

But what if you had to? Or, instead: what if you decided that, given other constraints, it’d be desirable to add lots of people to a team in a very short time span? What would you do to keep it from “clumping”?

ThoughtWorks Brazil: The Dinner Table

In the next few weeks, I’ll be putting together a few posts that tell a little bit about the stories behind the scenes of opening the ThoughtWorks Brazil office. They’re entirely from my point of view, might contain an embarrassing moment or two and are absolutely not to be taken as the official voice of ThoughtWorks on any of these matters. I’m not a PR person ;)

Right before we started setting up the physical space of the new ThoughtWorks Brazil office, we hit a snag: the new building where we would be located wasn’t ready yet, so we got to borrow some space from the University temporarily.

When we got there, the sign near the door said “Open Source Labs”. I never found out what happened to the lab once we took their space, but I hope they’re alive and kicking somewhere else in the building. ThoughtWorks contributes a fair deal to OSS projects, and I felt bad for taking over some of that space, even if symbolically.

Overnight, the facilities management at PUCRS cleared the rooms and gave us the OK to put whatever furniture we wanted there.

- Hey Sid…
- Yeah, Carlos?
- The furniture guys just called. We are not going to have it until the middle of next month!
- It’s been delayed again!?
- Yeah… What do we do?
- Can we talk to someone from PUCRS and see if we can borrow some in the meantime?

So we did, and they graciously gave us access to a storage room right next to ours, full of used chairs and desks. They told us to pick whatever we needed. Most likely, this was where they stored furniture awaiting repairs, as most of it was broken, scratched or wobbly. It took us about an hour to measure everything and pick the items we needed that were in usable condition.

The facilities guys didn’t let us help in the move from that point on, which was sort of disappointing. Sid and I came prepared to handle all of the grunt work ourselves that morning – I even wore shoes!

Fast forward about a month: we were settled and already cracking on with our first two projects, even though our “real” furniture still hadn’t arrived. The steady growth of the office had us rubbing elbows and bumping into each other, so we pushed our luck with PUCRS to get another room. They gave us that storage room next door and cleared it out, but it needed a bit of work done. Builders were hired, and after fitting it with AC, ceiling tiles and a few other bits, we moved in.

Meanwhile, the office was still growing like crazy, and we needed another desk for workstations in the new room. Unfortunately, PUCRS didn’t have any we could borrow, but one of our developers did: Lipe Sabella had just moved from São Paulo to Porto Alegre and all his stuff was still in boxes at my apartment while he looked for a place. Frankly, his dinner table propped against a wall wasn’t doing my living room much service, so we arranged to get it delivered to the office, and now we had room for another 6 people!

- We’re going to have to protect it somehow. I’ve managed not to scratch it all these years!
- Some plastic, maybe?

Picture: Luiza, Rubem and Andrea tape down the most horrifying plastic cover we could find at the nearby mall.
