Like many other Development Operations (DevOps) teams in software today, LucidLink uses Jenkins as our core platform for continuous integration and test.
And like many other DevOps teams, we find that managing and backing up the underlying storage is a challenge and a distraction.
Overview of our Jenkins storage challenge
Currently, LucidLink supports 5 different OSes, which results in 12 different builds, and over time our build artifacts have increased in size.
- At the time of writing, each build is roughly 1GB in size.
- We push a nightly release build for each of the supported platforms.
- We also manually create custom builds for individual feature development, and employ static and dynamic analysis tools.
- A new release build is triggered when branch builds are merged into the master.
The result is that every 2 to 3 days, we end up with around 30GB of archived artifacts.
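As a rough cross-check of the figures above, the arithmetic can be sketched in a few lines of shell. This assumes that all 12 builds are archived every night at about 1 GB each and ignores custom builds; the variable names are illustrative only.

```shell
#!/bin/sh
# Back-of-the-envelope growth estimate from the figures quoted above.
builds_per_night=12   # assumption: every one of the 12 builds is archived nightly
gb_per_build=1        # "roughly 1GB" per build

gb_per_night=$((builds_per_night * gb_per_build))
# 2.5 days written as 5/2 so the calculation stays in integer arithmetic
gb_per_2_5_days=$((gb_per_night * 5 / 2))
gb_per_30_days=$((gb_per_night * 30))

echo "per night:    ${gb_per_night} GB"
echo "per 2.5 days: ${gb_per_2_5_days} GB"
echo "per 30 days:  ${gb_per_30_days} GB"
```

Under these assumptions the 2.5-day figure comes out at 30 GB, in line with the number quoted above, and a month of nightly builds lands in the hundreds of gigabytes.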
We originally did our best to plan, estimate, and pre-allocate the storage that we thought we would need, as per best practices. We would often choose not to archive things that we deemed unnecessary in order to save storage space. However, hitting the limits of traditional storage is inevitable, and before long we were looking at the time- and resource-consuming task of migrating our master Jenkins instance in order to increase the storage.
Wouldn’t it be great if we could simply mount an object store and make use of its elasticity, durability and low cost instead? No solution out there was really that simple; each required significant changes to our existing infrastructure along with a lot of added complexity.
Luckily for us, LucidLink is in the business of building a distributed file system using object storage as a backend and supports all the OSes we need (including Linux)! And with our beta headed out the door, it was time to eat our own dog food for critical business applications. In fact, it was a version of this very challenge which sparked the idea for LucidLink in the first place.
What follows is the step-by-step process for implementing LucidLink in our very standard Jenkins instance.
High level view of LucidLink, as the file system for Jenkins
From the 30,000 ft level, implementing a LucidLink filespace is pretty simple. We leverage AWS S3 as the back-end storage, overlay a distributed, streaming architecture, sell it as a subscription service, and present it on your devices as a configurable mount point. Here is what you need:
- An AWS account. Make sure you have your access key and secret key handy for a one-time configuration.
- Subscribe to the LucidLink service on AWS Marketplace.
- Download the appropriate LucidLink client for your OS.
- Configure your application to use the LucidLink filespace, which will appear as local storage. (In actuality, it will be streamed on demand from S3.)
Linux ‘init’ systems
There are three major init systems in the Linux world: System V, Upstart and systemd. They are all completely different.
System V is the oldest and its scripts are stored in /etc/init.d/.
Upstart is something in the middle – it supersedes System V, but it’s already deprecated in favor of systemd. Upstart’s scripts are located in /etc/init/.
systemd is the newest init system.
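To see which of these init systems a given host is running, a heuristic check along the following lines may help. It is only a sketch: the detect_init function is a hypothetical helper, and it relies on common conventions (the /run/systemd/system directory for systemd, the output of initctl version for Upstart) rather than anything specific to our setup.

```shell
#!/bin/sh
# Heuristic init-system detection; detect_init is a hypothetical helper.
detect_init() {
    if [ -d /run/systemd/system ]; then
        # This directory exists when systemd is the running init system.
        echo systemd
    elif [ -x /sbin/initctl ] && /sbin/initctl version 2>/dev/null | grep -q upstart; then
        # Upstart's initctl reports something like "init (upstart 1.12.1)".
        echo upstart
    elif [ -d /etc/init.d ]; then
        # Fall back to System V when only its script directory is present.
        echo sysv
    else
        echo unknown
    fi
}

echo "init system: $(detect_init)"
```

Note that a host can ship /etc/init.d alongside Upstart or systemd for compatibility, which is why the System V check comes last.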
Our Jenkins master machine
Our Jenkins master machine runs Ubuntu 14.04.5 LTS, which supports two init systems: System V and Upstart.
Unfortunately, Jenkins’ service configuration is written for System V (/etc/init.d/jenkins).
Supporting System V is a bit hard and it’s not that flexible. For this reason, we implemented the Lucid service as an Upstart service.
Upstart services may depend on each other, but they cannot depend on System V services (nor can System V services depend on Upstart services). However, we need to start the Lucid service before starting Jenkins. To work around this, we first disable the autostart of Jenkins and then let the Lucid service manage the Jenkins service.
A few Upstart-related notes
- in general, you don’t want to use exec in the script/end script section;
- single line scripts may be replaced with a single exec line;
- at the moment, the mounted Lucid folder can be accessed only by the Linux user that owns the Lucid process (that’s why we use the start-stop-daemon command to start the Lucid process as the jenkins user; otherwise Jenkins won’t be able to see the content of the mounted folder);
- using setuid jenkins and setgid jenkins in an Upstart script will permanently change the user. This is like a property of the script and it cannot be changed multiple times. In our case, we also need to have root permissions – to be able to start/stop the Jenkins service (hence the start-stop-daemon command from the previous point);
- in general, the code in the script/end script section of an Upstart config file is executed by /bin/sh and not by /bin/bash. This is a significant difference (e.g. expressions like “if [[ …. ]]” cannot be used).
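The last note is easy to trip over, so here is a small sketch of how a bash-style pattern test might be rewritten for /bin/sh. The under_jenkins_home function and the pattern it matches are illustrative assumptions, not part of our service configuration.

```shell
#!/bin/sh
# POSIX sh has no [[ ... ]]; use [ ... ] for simple tests and case for patterns.
mount_point="/var/lib/jenkins/builds"

# bash-only form, which fails when /bin/sh is not bash:
#   if [[ "$mount_point" == /var/lib/jenkins/* ]]; then ...

# Portable equivalent: case performs the glob match (hypothetical helper).
under_jenkins_home() {
    case "$1" in
        /var/lib/jenkins/*) return 0 ;;
        *) return 1 ;;
    esac
}

if [ -n "$mount_point" ] && under_jenkins_home "$mount_point"; then
    echo "mount point is under the Jenkins home"
fi
```

On Ubuntu, /bin/sh is dash, so the commented-out bash form would fail there with a syntax error rather than quietly evaluating to false.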
Running LucidLink as a Linux service
We have a script to share upon request: lucid.conf, the actual service configuration. Note that this is an Upstart service configuration (and it also manages the jenkins service).
The script is pretty straightforward and self-explanatory. A short summary:
- write log messages to the service.log file;
- start Lucid on Linux boot;
- handle system reboot and shutdown;
- restart service in case of a Lucid crash;
- start Lucid daemon as Jenkins user;
- once the daemon is started, activate Lucid;
- start Jenkins service when Lucid is activated;
- stop Jenkins service before stopping Lucid.
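The ordering in this summary can be sketched as plain shell. This is not the contents of lucid.conf: it is a dry-run illustration that only prints commands by default. The install path of the Lucid binary, the filespace name myfilespace, the use of su for the second step, and the reading of "activate" as the link command are all assumptions.

```shell
#!/bin/sh
# Dry-run sketch of the start/stop ordering; DRY_RUN=1 only prints commands.
DRY_RUN="${DRY_RUN:-1}"
LUCID_BIN="${LUCID_BIN:-/var/lib/jenkins/lucid/Lucid}"  # assumed install path
LUCID_ROOT="/var/lib/jenkins/lucid/.lucid"
MOUNT_POINT="/var/lib/jenkins/builds"
FILESPACE="${FILESPACE:-myfilespace}"                   # hypothetical name

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_stack() {
    # Start the daemon as the jenkins user so Jenkins can read the mount.
    run start-stop-daemon --start --background --chuid jenkins \
        --exec "$LUCID_BIN" -- daemon --config-path "$LUCID_ROOT"
    # A real job would wait here until the daemon is ready.
    # Assumption: "activating" Lucid corresponds to the link command.
    run su -s /bin/sh -c "$LUCID_BIN link $FILESPACE --mount-point $MOUNT_POINT --root-path $LUCID_ROOT" jenkins
    # Jenkins starts only after the filespace is linked.
    run service jenkins start
}

stop_stack() {
    # Jenkins stops first so nothing writes to the mount while Lucid exits.
    run service jenkins stop
    run start-stop-daemon --stop --user jenkins --exec "$LUCID_BIN"
}

start_stack
stop_stack
```

The point of the sketch is the ordering: Jenkins is the last thing started and the first thing stopped, so it never sees an unmounted build directory.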
Lucid as Jenkins back-end
The Jenkins service is a System V service, and the host supports the Upstart init system.
- Change build directory:
Jenkins -> Manage Jenkins -> Configure System -> Home directory -> Advanced -> Build Record Root Directory set to
- Disable Jenkins’ autostart:
update-rc.d jenkins disable
- Copy lucid.conf into /etc/init/;
- Create directory /var/lib/jenkins/lucid/;
- Start Lucid daemon (just temporary, to be able to initialize the file system):
./Lucid daemon --config-path /var/lib/jenkins/lucid/.lucid &
- Initialize the file system:
./Lucid init-s3 --fs <name> --access-key XXXX --secret-key XXXX --https --s3 <region> --root-path /var/lib/jenkins/.lucid
- Link to file system:
./Lucid link <name> --mount-point /var/lib/jenkins/builds --root-path /var/lib/jenkins/lucid/.lucid
- Exit Lucid (it must be stopped before running the service).
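After the autostart step above, it can be worth confirming that Jenkins really will not start on its own at boot. The sketch below assumes the usual System V layout, where update-rc.d disable turns the S (start) links in /etc/rc2.d through /etc/rc5.d into K (kill) links. The start_links function is a hypothetical helper, and its second argument exists only so the check can be tried against a test directory.

```shell
#!/bin/sh
# Prints any remaining start links for a System V service.
# Empty output means the service will not be started at boot.
start_links() {
    service_name="$1"
    rc_root="${2:-/etc}"
    for link in "$rc_root"/rc[2-5].d/S[0-9][0-9]"$service_name"; do
        # An unmatched glob stays literal, so confirm the path exists.
        if [ -e "$link" ] || [ -L "$link" ]; then
            echo "$link"
        fi
    done
}

if [ -z "$(start_links jenkins)" ]; then
    echo "jenkins autostart is disabled"
else
    echo "jenkins still has start links:"
    start_links jenkins
fi
```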
Run the system
Note: make sure there’s no running Lucid process.
service jenkins stop && service lucid start
- Check Upstart logs for Lucid service: /var/log/upstart/lucid.log
- Check the custom service log file: /var/lib/jenkins/lucid/service.log
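The checks in this section can be wrapped in a small pre-flight and post-start script. This is a sketch that assumes the process is named Lucid (after the binary used in the commands above) and that the pgrep and mountpoint utilities are installed; the two function names are hypothetical.

```shell
#!/bin/sh
# Succeeds when no process with exactly this name is running.
no_process_named() {
    ! pgrep -x "$1" >/dev/null 2>&1
}

# Succeeds when the given directory is an active mount point.
is_mounted() {
    mountpoint -q "$1"
}

# Before "service lucid start": make sure no stray Lucid process is left over.
if no_process_named Lucid; then
    echo "no stray Lucid process; safe to start the service"
else
    echo "a Lucid process is still running; stop it first"
fi

# After the service has started: confirm the filespace is mounted.
if is_mounted /var/lib/jenkins/builds; then
    echo "filespace is mounted"
else
    echo "filespace is not mounted; check the logs:"
    echo "  /var/log/upstart/lucid.log"
    echo "  /var/lib/jenkins/lucid/service.log"
fi
```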
Summary and benefits
With no change to our workflow, and only a few minimal configuration tweaks, we now have highly durable and elastic storage where we can store and access our build artifacts. Some benefits include:
- The storage grows on demand, and we only pay for what we use.
- We can now easily (and cost-effectively) save all our build artifacts without worrying about capacity.
- All data is encrypted on the client; our IP is fully protected.
- The data is extremely durable, protected by S3’s underlying replication.
- We can access artifacts from anywhere, on any machine.
- Performance was not much of a concern before, and if anything it has improved.
This approach could be utilized for any process or utility within the DevOps space where you want to elegantly replace local storage (physical or EBS etc.) with object storage.