Merge a Git repository and its history into a subdirectory of a second Git repository

On more than one occasion, I have needed to merge a Git repository and its history into a subdirectory of a second Git repository.

In this post, I show how to merge a Git repo (hereafter “puppet-roles”) into the modules/ directory of a second Git repo (“control-repo”). After the merge, I show how to filter the history so that commands like git log, git blame and git show all work as expected and show a history as if the files in the subdirectory had always been there.
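As a rough sketch of the merge step only, the following uses the subtree-merge recipe described in Git's own documentation. It is one possible approach under stated assumptions, not necessarily the method used in the full post: the function name merge_into_subdir and the remote name "incoming" are hypothetical, and --allow-unrelated-histories assumes Git 2.9 or later. It does not rewrite history, so path-limited commands such as git log on the subdirectory will generally not show the pre-merge commits; that is the problem the filtering step is meant to address.

```shell
# merge_into_subdir SRC_REPO PREFIX BRANCH
# Run from inside the destination repository (e.g. control-repo).
# "incoming" is a hypothetical remote name chosen for this sketch.
merge_into_subdir() {
  src_repo=$1
  prefix=$2
  branch=$3

  # Fetch the source repository's commits into the destination repository.
  git remote add -f incoming "$src_repo" &&
  # Start a merge of the two unrelated histories without changing any files yet.
  git merge -s ours --no-commit --allow-unrelated-histories "incoming/$branch" &&
  # Read the source tree into the index and working tree under PREFIX/.
  git read-tree --prefix="$prefix/" -u "incoming/$branch" &&
  # Conclude the merge with a single merge commit.
  git commit -q -m "Merge $branch of $src_repo into $prefix/"
}
```

A call such as merge_into_subdir /path/to/puppet-roles modules/roles master (path and branch name are placeholders) would leave the puppet-roles files under modules/roles, with both histories reachable from the merge commit.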

Set up a test environment

We begin by cloning control-repo into /var/tmp as follows:

$ cd /var/tmp
$ git clone

Unit Testing a Bash Script with shUnit2

According to the docs, shUnit2:

…is a xUnit unit test framework for Bourne based shell scripts, and it is designed to work in a similar manner to JUnit, PyUnit, etc.. If you have ever had the desire to write a unit test for a shell script, shUnit2 can do the job.

In this post, I introduce the subject of unit testing shell scripts using Kate Ward’s shUnit2 unit testing framework, and show how I have used it to solve some testing problems.
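To give a flavour of what a shUnit2 test looks like, here is a minimal sketch. The add function is a hypothetical stand-in for whatever the script under test provides, and the sketch assumes shunit2 can be found on the PATH (as it is after a package install); otherwise it prints a message instead of running the tests.

```shell
#!/bin/sh

# Hypothetical function under test; in practice this would come from
# the script being tested.
add() {
  echo $(( $1 + $2 ))
}

# shUnit2 discovers and runs functions whose names begin with "test".
testAdd() {
  # assertEquals takes an optional message, the expected value and the actual value.
  assertEquals "2 + 3 should be 5" 5 "$(add 2 3)"
}

# Sourcing shunit2 runs the tests defined above.
# Assumption: shunit2 is on the PATH.
if command -v shunit2 >/dev/null 2>&1; then
  . shunit2
else
  echo "shunit2 not found on PATH; install it to run the tests" >&2
fi
```

The test functions must be defined before shunit2 is sourced, which is why the source line sits at the bottom of the file.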

Installing shUnit2

Install on a Mac

To install on a Mac, run:

$ brew install shunit2

The following files are installed:

$ find /usr/local/bin/shunit2 /usr/local/Cellar/shunit2
/usr/local/bin/shunit2
/usr/local/Cellar/shunit2
/usr/local/Cellar/shunit2/2.1.6
/usr/local/Cellar/shunit2/2.1.6/.brew
/usr/local/Cellar/shunit2/2.1.6/.brew/shunit2.rb
/usr/local/Cellar/shunit2/2.1.6/bin
/usr/local/Cellar/shunit2/2.1.6/bin/shunit2
/usr/local/Cellar/shunit2/2.1.6/INSTALL_RECEIPT.json

Using create_specs to refactor Puppet

After writing this it was pointed out to me that Corey Osman has written another tool that auto-generates Rspec code called Retrospec, which is also worth having a look at.

In this post I document a new method for complex Puppet code refactoring, which involves a simple tool that I wrote, create_specs.

I have been using this method for a while now; I find it easier than catalog-diff and consider it to be safer as well.

The tool create_specs automatically generates Rspec test cases to test all aspects of the compiled catalog that is passed to it as an input. Of course, most Puppet modules can compile an infinite number of catalogs, unless they are very simple. Therefore, to have confidence in a real refactoring effort, we would need to compile a representative set of these catalogs and apply the method I describe here to each of those. This will be out of scope for today, but it is trivial to extend the method.

Here, I provide a simple Puppet module that manages an NTP service in a single class, and then I refactor it to split the module into several classes. I then show how this method proves with certainty that the refactoring did not introduce bugs.

I assume the reader already understands how to set up Rspec-puppet; if not, have a look at my earlier post.

Sample code

The sample code is a simple Puppet class that installs and configures NTP.

(Note: all of the code for this blog post is available at Github here. The reader can step through the revision history to see the examples before and after the refactoring.)

class ntp (
  Array $servers,
) {
  package { 'ntp':
    ensure => installed,
  }
  file { '/etc/ntp.conf':
    content => template("${module_name}/ntp.conf.erb"),
    require => Package['ntp'],
  }
  service { 'ntp':
    ensure    => running,
    enable    => true,
    subscribe => File['/etc/ntp.conf'],
  }
}

Introducing programmatic editing of Hiera YAML files


If you have ever maintained a complicated, multi-team deployment of Hiera, you have probably seen data keys repeated in flagrant violation of the Don’t Repeat Yourself principle.

To an extent, this is avoidable. It is possible to declare variables in Hiera and look them up from elsewhere in Hiera by calling the hiera function from within Hiera. It is also possible to define aliases in order to look up complex data from elsewhere within Hiera.

Meanwhile, the hiera_hash function can eliminate the need to repeat Hash keys at multiple levels of the hierarchy, although Puppet 3’s automatic parameter lookup will not return merged hash lookups.

On the other hand, many Puppet users don’t know about these features, and even when they do, tight project deadlines tempt the best of us to take shortcuts.

Bulk updating of Hiera data

The problem that arises can be stated as follows: Given many Hiera files, possibly in separate Git repos and maintained in separate teams, how would you update a similar block of Hiera data in all of these files?

I spent several hours on a Friday afternoon writing a simple Ruby script to double-check that I’d manually updated ~ 10 YAML files with changes to what were essentially the same data keys, and I wondered if there is a better way.

Python and ruamel.yaml

To my surprise, I discovered that it is simply impossible to programmatically update human-edited YAML files in Ruby, because its YAML parser cannot preserve comments and formatting.

Mike Pastore states in a comment:

Most YAML libraries I’ve worked with don’t preserve formatting or comments. Some quick research turns up only one that does—and it’s for Python (ruamel.yaml). In my experience, YAML is great for human-friendly, machine-readable configuration files and not much else. It loses its allure the second you bring machine-writeability into the picture.

So to the Ruby community: someone needs to write a YAML parser that preserves commenting and formatting!

In the meantime, all power to Anthon van der Neut, who has forked the PyYAML project and solved a good 80% of the problem of preserving the commenting and formatting. He also proved to be incredibly helpful in answering questions about the parser on Stack Overflow, and in responding to bug reports.

I realised that a script that could execute snippets of arbitrary Python code on the YAML files in memory would provide a powerful and flexible interface for bulk editing of Hiera files. In the remainder of the post, I’ll show how various data editing – and viewing – problems can be solved using my new tool.

Installing the script

To install the script, just clone my Git repository and install the Python dependencies with pip:

$ git clone
$ cd hiera-bulk-edit
$ pip install -r requirements.txt

Verifying file contents in a puppet catalog

One of the most useful applications of Rspec-puppet I have found is in the verification of generated ERB file content. However, it is not always obvious how to actually do this.

I discovered the verify_contents method one day when pondering a question at (ref). An undocumented feature of the Puppetlabs_spec_helper, it is used in a few Forge modules to allow testers to say, “the catalog should contain a file X, whose contents should contain lines A, B, ..”. For example, in the Haproxy module here.

In this post I’m going to document how I’ve used the verify_contents method and improved upon it when testing ERB generated file content.

Basic usage

The basic usage of verify_contents is as follows:


require 'puppetlabs_spec_helper/module_spec_helper'