Tue, 19 Jul 2016

Continuous Delivery on your Laptop



An automated deployment system, or delivery pipeline, builds software and moves it through various environments, such as development, testing, staging, and production.

But what about testing and developing the delivery system itself? In which environment do you develop new features for the pipeline?

Start Small

When you are starting out you can likely get away with having just one environment for the delivery pipeline: the production environment.

It might shock you that you're supposed to develop anything in the production environment, but keep in mind that the delivery system is not crucial for running your production applications, "just" for updating them. If the pipeline is down, your services still work. And since you structure the pipeline to do the same jobs in both the testing and the production environments, each deployment is exercised in a test environment first.

A Testing Environment for the Delivery Pipeline?

If those arguments don't convince you, or you're at a point where developer productivity suffers immensely from an outage of the deployment system, you can consider creating a testing environment for the pipeline itself.

But pipelines in this testing environment should not be allowed to deploy to the actual production environment, and ideally shouldn't interfere with the application testing environment either. So you have to create at least a partial copy of your usual environments, just for testing the delivery pipeline.

This is only practical if you have automated basically all of the configuration and provisioning, and have access to some kind of cloud solution to provide you with the resources you need for this endeavour.

Creating a Playground

If you do decide that you need some playground or testing environment for your delivery pipeline, there are a few options at your disposal. But before you build one, you should be aware of how many (or few) resources such an environment consumes.

Resource Usage of a Continuous Delivery Playground

For a minimal playground that builds a system similar to the one discussed in earlier blog posts, you need

  • a machine on which you run the GoCD server
  • a machine on which you run a GoCD agent
  • a machine that acts as the testing environment
  • a machine that acts as the production environment

You can run the GoCD server and agent on the same machine if you wish, which reduces the footprint to three machines.

The machine on which the GoCD server runs should have between one and two gigabytes of memory, and one or two (virtual) CPUs. The agent machine should have about half a GB of memory, and one CPU. If you run both server and agent on the same machine, two GB of RAM and two virtual CPUs should do nicely.

The specifications of the remaining two machines mostly depend on the type of applications you deploy and run on them. For the deployment itself you just need an SSH server running, which is very modest in terms of memory and CPU usage. If you stick to the example applications discussed in this blog series, or similarly lightweight applications, half a GB of RAM and a single CPU per machine should be sufficient. You might get away with less RAM.

So in summary, the minimal specs are:

  • One VM with 2 GB RAM and 2 CPUs, for go-server and go-agent
  • Two VMs with 0.5 GB RAM and 1 CPU each, for the "testing" and the "production" environments.

In the idle state, the GoCD server periodically polls the git repos, and the GoCD agent polls the server for work.

When you are not using the playground, you can shut off those processes, or even the whole machines.
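
For example, on Debian-ish machines you could stop the GoCD services while the playground is idle; a sketch, assuming the init scripts shipped with the official go-server and go-agent packages:

# on the GoCD server machine
$ sudo service go-server stop

# on the GoCD agent machine
$ sudo service go-agent stop

Running the same commands with "start" instead of "stop" revives the playground.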

Approaches to Virtualization

These days, almost nobody buys dedicated server hardware to run such test machines directly on it. Instead there is usually a layer of virtualization involved, which both makes new operating system instances more readily available, and allows denser resource utilization.

Private Cloud

If you work in a company that has its own private cloud, for example an OpenStack installation, you could use that to create a few virtual machines.
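
If that's your situation, creating one of these VMs with the OpenStack command line client might look roughly like this (the image, flavor and key names are placeholders for whatever your installation provides):

$ openstack server create --image debian-8 --flavor m1.small \
      --key-name mykey go-server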

Public Cloud

Public cloud compute solutions, such as Amazon's EC2, Google's Compute Engine and Microsoft's Azure cloud offerings, allow you to create VM instances on demand, and be billed at an hourly rate. On all three services, you pay less than 0.10 USD per hour for an instance that can run the GoCD server[^pricedate].

[^pricedate]: Prices as of July 2016. I expect prices to go down over time, though the resource usage of the software might also increase.

Google Compute Engine even offers heavily discounted preemptible VMs. Those VMs are only available when the provider has excess resources, and can be shut down by the provider on relatively short notice (a few minutes). While this is generally not a good idea for an always-on production system, it can be a good fit for a cheap testing environment for a delivery pipeline.
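
If you want to experiment with that, the gcloud command line tool can create a preemptible instance; the machine type and zone below are just examples:

$ gcloud compute instances create go-server \
      --machine-type n1-standard-1 --zone europe-west1-d --preemptible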

Local Virtualization Solutions

If you have a somewhat decent workstation or laptop, you likely have sufficient resources to run some kind of virtualization software directly on it.

Instead of classical virtualization solutions, you could also use a containerization solution such as Docker, which provides enough isolation for testing a Continuous Delivery pipeline. The downside is that Docker is not meant for running several services in one container, and here you need at least an SSH server and the actual services that are being deployed. You could work around this by using Ansible's Docker connector instead of SSH, but then you make the testing playground quite dissimilar from the actual use case.
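
For the record, switching to Ansible's Docker connection plugin is mostly a matter of selecting a different connection type; a minimal sketch, with a hypothetical container name:

$ ansible all -i 'delivery-test-container,' -c docker -m ping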

So let's go with a more typical virtualization environment such as KVM or VirtualBox, and Vagrant as a layer above them to automate the networking and initial provisioning. For more on this approach, see the next section.

Continuous Delivery on your Laptop

My development setup looks like this: I have the GoCD server installed on my laptop running under Ubuntu, though running it under Windows or MacOS would certainly also work.

Then I have Vagrant installed, using the VirtualBox backend. I configure it to run three VMs for me: one for the GoCD agent, and one each as a testing and production machine. Finally there's an Ansible playbook that configures the three latter machines.

While running the Ansible playbook for configuring these three virtual machines requires internet connectivity, developing and testing the Continuous Delivery process does not.

If you want to use the same test setup, consider using the files from the playground directory of the deployment-utils repository, which will likely be kept more up-to-date than this blog post.

Network and Vagrant Setup

We'll use Vagrant with a private network, which allows you to talk to each of the virtual machines from your laptop or workstation, and vice versa.

I've added these lines to my /etc/hosts file. This isn't strictly necessary, but it makes it easier to talk to the VMs:

# Vagrant
172.28.128.1 go-server.local
172.28.128.3 testing.local
172.28.128.4 production.local
172.28.128.5 go-agent.local

And a few lines to my ~/.ssh/config file:

Host 172.28.128.* *.local
    User root
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
    LogLevel ERROR

Do not do this for production machines. This is only safe on a virtual network on a single machine, where you can be sure that no attacker is present, unless they have already compromised your machine.

That said, creating and destroying VMs is common in Vagrant land, and each time you create them anew, they will have new host keys. Without such a configuration, you'd spend a lot of time updating SSH key fingerprints.
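
If you'd rather not disable host key checking, the alternative is to delete the stale entry from your known_hosts file each time you recreate a VM, for example:

$ ssh-keygen -R testing.local

(and likewise for the other host names and their IP addresses).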

Then let's install Vagrant and VirtualBox:

$ sudo apt-get install -y vagrant virtualbox

To configure Vagrant, you need a Ruby script called Vagrantfile:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure(2) do |config|
  config.vm.box = "debian/contrib-jessie64"

  {
    'testing'    => "172.28.128.3",
    'production' => "172.28.128.4",
    'go-agent'   => "172.28.128.5",
  }.each do |name, ip|
    config.vm.define name do |instance|
        instance.vm.network "private_network", ip: ip
        instance.vm.hostname = name + '.local'
    end
  end

  config.vm.synced_folder '/datadisk/git', '/datadisk/git'

  config.vm.provision "shell" do |s|
    ssh_pub_key = File.readlines("#{Dir.home}/.ssh/id_rsa.pub").first.strip
    s.inline = <<-SHELL
      mkdir -p /root/.ssh
      echo #{ssh_pub_key} >> /root/.ssh/authorized_keys
    SHELL
  end
end

This builds three Vagrant VMs based on the debian/contrib-jessie64 box, which is mostly a pristine Debian Jessie VM, but also includes a file system driver that allows Vagrant to make directories from the host system available to the guest system.

I have a local directory /datadisk/git in which I keep a mirror of my git repositories, so that both the GoCD server and agent can access the git repositories without requiring internet access, and without needing another layer of authentication. The config.vm.synced_folder call in the Vagrantfile above makes this folder available to the guest machines.

Finally the code reads an SSH public key from the file ~/.ssh/id_rsa.pub and adds it to the root account on the guest machines. In the next step, an Ansible playbook will use this access to configure the VMs to make them ready for the delivery pipeline.

To spin up the VMs, type

$ vagrant up

in the folder containing the Vagrantfile. The first time you run this, it takes a bit longer because Vagrant needs to download the base image first.

Once that's finished, you can run vagrant status to check that everything works; the output should look like this:

$ vagrant status
Current machine states:

testing                   running (virtualbox)
production                running (virtualbox)
go-agent                  running (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.

And (on Debian-based Linux systems) you should be able to see the newly created, private network:

$ ip route | grep vboxnet
172.28.128.0/24 dev vboxnet1  proto kernel  scope link  src 172.28.128.1

You should now be able to log in to the VMs with ssh root@go-agent.local, and the same with testing.local and production.local as host names.

Ansible Configuration for the VMs

It's time to configure the Vagrant VMs. Here's an Ansible playbook that does this:

---
 - hosts: go-agent
   vars:
     go_server: 172.28.128.1
   tasks:
   - group: name=go system=yes
   - name: Make sure the go user has an SSH key
     user: name=go system=yes group=go generate_ssh_key=yes home=/var/go
   - name: Fetch the ssh public key, so we can later distribute it.
     fetch: src=/var/go/.ssh/id_rsa.pub dest=go-rsa.pub fail_on_missing=yes flat=yes
   - apt: package=apt-transport-https state=installed
   - apt_key: url=https://download.gocd.io/GOCD-GPG-KEY.asc state=present validate_certs=no
   - apt_repository: repo='deb https://download.gocd.io /' state=present
   - apt: update_cache=yes package={{item}} state=installed
     with_items:
      - go-agent
      - git

   - copy:
       src: files/guid.txt
       dest: /var/lib/go-agent/config/guid.txt
       owner: go
       group: go
   - lineinfile: dest=/etc/default/go-agent regexp=^GO_SERVER= line=GO_SERVER={{ go_server }}
   - service: name=go-agent enabled=yes state=started

 - hosts: aptly
   handlers:
    - name: restart lighttpd
      service: name=lighttpd state=restarted
   tasks:
     - apt: package={{item}} state=installed
       with_items:
        - ansible
        - aptly
        - build-essential
        - curl
        - devscripts
        - dh-systemd
        - dh-virtualenv
        - gnupg2
        - libjson-perl
        - python-setuptools
        - lighttpd
        - rng-tools
     - copy: src=files/key-control-file dest=/var/go/key-control-file
     - command: killall rngd
       ignore_errors: yes
       changed_when: False
     - command: rngd -r /dev/urandom
       changed_when: False
     - command: gpg --gen-key --batch /var/go/key-control-file
       args:
         creates: /var/go/.gnupg/pubring.gpg
       become_user: go
       become: true
       changed_when: False
     - shell: gpg --export --armor > /var/go/pubring.asc
       args:
         creates: /var/go/pubring.asc
       become_user: go
       become: true
     - fetch:
         src: /var/go/pubring.asc
          dest: deb-key.asc
         fail_on_missing: yes
         flat: yes
     - name: Bootstrap the aptly repos that will be configured on the `target` machines
       copy:
        src: ../add-package
        dest: /usr/local/bin/add-package
        mode: 0755
     - name: Download an example package to fill the repo with
       get_url:
        url: http://ftp.de.debian.org/debian/pool/main/b/bash/bash_4.3-11+b1_amd64.deb
        dest: /tmp/bash_4.3-11+b1_amd64.deb
     - command: /usr/local/bin/add-package {{item}} jessie /tmp/bash_4.3-11+b1_amd64.deb
       args:
           creates: /var/go/aptly/{{ item }}-jessie.conf
       with_items:
         - testing
         - production
       become_user: go
       become: true

     - name: Configure lighttpd to serve the aptly directories
       copy: src=files/lighttpd.conf dest=/etc/lighttpd/conf-enabled/30-aptly.conf
       notify:
         - restart lighttpd
     - service: name=lighttpd state=started enabled=yes

 - hosts: target
   tasks:
     - authorized_key:
        user: root
        key: "{{ lookup('file', 'go-rsa.pub') }}"
     - apt_key: data="{{ lookup('file', 'deb-key.asc') }}" state=present

 - hosts: production
   tasks:
     - apt_repository:
         repo: "deb http://{{hostvars['agent.local']['ansible_ssh_host'] }}/debian/production/jessie jessie main"
         state: present

 - hosts: testing
   tasks:
     - apt_repository:
         repo: "deb http://{{hostvars['agent.local']['ansible_ssh_host'] }}/debian/testing/jessie jessie main"
         state: present

 - hosts: go-agent
   tasks:
     - name: 'Checking SSH connectivity to {{item}}'
       become: True
       become_user: go
       command: ssh -o StrictHostKeyChecking=no root@"{{ hostvars[item]['ansible_ssh_host'] }}" true
       changed_when: false
       with_items: "{{ groups['target'] }}"

You also need a hosts or inventory file:

[all:vars]
ansible_ssh_user=root

[go-agent]
agent.local ansible_ssh_host=172.28.128.5

[aptly]
agent.local

[target]
testing.local ansible_ssh_host=172.28.128.3
production.local ansible_ssh_host=172.28.128.4

[testing]
testing.local

[production]
production.local

... and a small ansible.cfg file:

[defaults]
host_key_checking = False
inventory = hosts
pipelining=True

This does a whole lot of stuff:

  • Installs and configures the GoCD agent
    • copies a file with a fixed GUID to the configuration directory of the go-agent, so that when you tear down the machine and create it anew, the GoCD server will identify it as the same agent as before.
  • Gives the go user on the go-agent machine SSH access to the target hosts by
    • first making sure the go user has an SSH key
    • copying the public SSH key to the host machine
    • later distributing it to the target machine using the authorized_key module
  • Creates a GPG key pair for the go user
    • since GPG key creation needs a lot of entropy for random numbers, and VMs typically don't have much of it, first install rng-tools and use rngd to feed /dev/urandom into the kernel's entropy pool, trading randomness quality for speed. Again, this is something you shouldn't do in a production setting.
  • Copies the public key of said GPG key pair to the host machine, and then distributes it to the target machines using the apt_key module
  • Creates some aptly-based Debian repositories on the go-agent machine by
    • copying the add-package script from the same repository to the go-agent machine
    • running it with a dummy package, here bash, to actually create the repos
    • installing and configuring lighttpd to serve these packages over HTTP
    • configuring the target machines to use these repositories as a package source
  • Checks that the go user on the go-agent machine can indeed reach the other VMs via SSH
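
Assuming you saved the playbook as setup.yml in the same directory as the hosts file and ansible.cfg, you run it like this:

$ ansible-playbook setup.yml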

After running ansible-playbook setup.yml, your local GoCD server should have a new agent, which you have to activate in the web configuration and to which you should assign the appropriate resources (debian-jessie and aptly, if you follow the examples from this blog series).

Now when you clone your git repos to /datadisk/git/ (be sure to use git clone --mirror) and configure the pipelines on the GoCD server, you have a complete Continuous Delivery system running on one physical machine.
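
For example (the repository URL is just a placeholder for one of your own projects):

$ cd /datadisk/git
$ git clone --mirror https://example.com/your/project.git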


I'm writing a book on automating deployments. If this topic interests you, please sign up for the Automating Deployments newsletter. It will keep you informed about automating and continuous deployments. It also helps me to gauge interest in this project, and your feedback can shape the course it takes.
