The Australian Government Linked Data Working Group (AGLDWG) provides a persistent identifier service for uniform resource identifiers (URIs) on the linked.data.gov.au domain. The provisioning of this service is the result of a community effort within the working group.

One key requirement of the AGLDWG's persistent identifier service is to make ontology resources available at the persistent linked.data.gov.au domain with Linked Data-enabled content negotiation. The AGLDWG currently meets this requirement, but the manual process of publishing and updating the ontologies is laborious and prone to human error. As a result, the ontology documentation available online becomes outdated.

Proposal

One way to improve the manual process of publishing ontologies is to add a continuous integration (CI) and continuous deployment (CD) pipeline to the publication workflow using Drone.

Drone is a CI/CD platform written in Go, a statically typed, compiled language, which helps make Drone fast and lightweight. Drone is a strong candidate because it is open-source software, can be self-hosted, and integrates with major Git providers such as GitHub, Bitbucket, and GitLab. Its pipeline configuration is also expressed in a simple YAML file.

Since the ontologies published on the linked.data.gov.au domain reside in different Git repositories owned by different government agencies, it is important that the CI/CD solution is easy to set up and does not require adding security credentials to each Git repository. Drone addresses this with a built-in cron-like scheduler (see cron), which can run a pipeline on a schedule such as hourly or daily. Because the ontologies are likely all in publicly accessible Git repositories, the Drone pipeline can simply clone each repository, then build and deploy it to the linked.data.gov.au domain.

Working implementation of the CI/CD described above:

High-level Overview Diagram

Overview diagram of the pipeline using Drone CI.

Implementation Details

  • VPS: DigitalOcean standard 1GB and 1vCPU Droplet
  • OS: Ubuntu 18.04 LTS
  • Web server: Nginx with HTTP Accept headers for content negotiation
  • CI/CD: Drone CI.
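
To illustrate what content negotiation on the Accept header involves, here is a minimal sketch in Python. It is not the Nginx configuration used on the server; it only mirrors the decision the web server has to make. The file name plot.html and the choice of HTML as the default are assumptions made for illustration.

```python
# Hypothetical mapping from media type to the file served for an ontology.
# Only plot.ttl is named in this post; plot.html is an assumed file name.
REPRESENTATIONS = {
    "text/turtle": "plot.ttl",
    "text/html": "plot.html",
}
DEFAULT_MEDIA_TYPE = "text/html"  # assumption: browsers get the HTML docs


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a media type without an explicit q-value has weight 1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_file(accept_header):
    """Pick the file to serve for a given Accept header."""
    for media_type, q in parse_accept(accept_header or ""):
        if q <= 0:
            continue
        if media_type in REPRESENTATIONS:
            return REPRESENTATIONS[media_type]
        if media_type == "*/*":
            return REPRESENTATIONS[DEFAULT_MEDIA_TYPE]
    return REPRESENTATIONS[DEFAULT_MEDIA_TYPE]
```

With this sketch, a client sending "Accept: text/turtle" would receive the RDF file, while a browser would receive the HTML documentation.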

Enabling Drone for CI/CD requires installing the Drone server and a Drone runner. The Drone server must be installed with the user credentials of at least one Git platform provider; in this instance, we have chosen GitHub. For the Drone runner we have chosen the Exec Runner, which runs as a daemon and executes commands on the host machine without isolation. This allows us to do things such as copying files to the web directory, which is required for deploying the documentation generated by pyLODE. The Drone Exec Runner also provides a read-only dashboard showing the status of each build pipeline. View it at https://drone-exec-runner.edmondchuc.com/ with the username drone and password agldwg.

With the connected GitHub account, we have created a repository called drone-cron. This repository is set up in Drone with the "cron" feature, set to run hourly. The repository also contains the Drone pipeline configuration file, .drone.yml, which defines the steps and commands used to deploy each ontology to the AGLDWG web server.

The steps to deploy the Plot ontology are as follows:

  • Clone the Plot ontology
  • Copy the RDF file to the web directory (in this case, plot.ttl)
  • Clone and run pyLODE to generate the HTML documentation
  • Clone and run pyLODE Inject to inject additional HTML such as logos and figures to the pyLODE HTML documentation
  • Copy the pyLODE .html and .css files to the web directory.
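
The steps above can be sketched in Python as follows. This is not the content of the actual .drone.yml file, which runs shell commands; it is a minimal illustration of the same sequence. The function names are hypothetical, and the pyLODE and pyLODE Inject steps are represented by a caller-supplied build_docs function because their command lines are not reproduced in this post.

```python
import shutil
import subprocess
from pathlib import Path


def clone_command(repo_url, dest):
    """Build a shallow clone command; a public repository needs no credentials."""
    return ["git", "clone", "--depth", "1", repo_url, str(dest)]


def copy_to_web_dir(src_dir, web_dir, patterns):
    """Copy files matching the glob patterns from src_dir into web_dir."""
    web_dir = Path(web_dir)
    web_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for src in sorted(Path(src_dir).glob(pattern)):
            shutil.copy2(src, web_dir / src.name)
            copied.append(src.name)
    return copied


def deploy(repo_url, work_dir, web_dir, build_docs):
    """Clone, publish the RDF file, build the docs, then publish the docs.

    build_docs is a hypothetical stand-in for the pyLODE and pyLODE Inject
    steps: it takes the checkout directory and returns the directory that
    holds the generated .html and .css files.
    """
    checkout = Path(work_dir) / "ontology"
    subprocess.run(clone_command(repo_url, checkout), check=True)
    copy_to_web_dir(checkout, web_dir, ["*.ttl"])
    docs_dir = build_docs(checkout)
    return copy_to_web_dir(docs_dir, web_dir, ["*.html", "*.css"])
```

Because the Exec Runner executes commands directly on the host, a copy into the web directory like the one in copy_to_web_dir is all that is needed to publish a file.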

The workflow is straightforward, but what is the pyLODE Inject tool?

"pyLODE Inject injects logos, links, paragraphs, and figures to a pyLODE document."

pyLODE Inject is a command-line application that injects additional HTML, such as organisation logos and figures (diagrams), into the documentation generated by pyLODE. For example, the author can change the content of the Plot ontology (in the RDF file) and embed new figures in the pyLODE documentation by editing the pyLODE Inject YAML configuration file. The changes are then reflected live on the web after the next hourly Drone pipeline execution. See pyLODE Inject configuration file.
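
The general idea of injecting HTML into a generated document can be sketched as below. This is not pyLODE Inject's implementation or its configuration format; the function names, the marker-based approach, and the file name overview.png are all assumptions made for illustration.

```python
from html import escape


def figure_html(src, caption):
    """Build an HTML figure element for an image with a caption."""
    return (
        f'<figure><img src="{escape(src, quote=True)}" '
        f'alt="{escape(caption, quote=True)}">'
        f"<figcaption>{escape(caption)}</figcaption></figure>"
    )


def inject_before(html, marker, fragment):
    """Insert fragment immediately before the first occurrence of marker."""
    index = html.find(marker)
    if index == -1:
        raise ValueError(f"marker not found: {marker!r}")
    return html[:index] + fragment + html[index:]


# Example: add a figure at the end of the body of a generated page.
page = "<body><h1>Plot</h1></body>"
updated = inject_before(page, "</body>", figure_html("overview.png", "Overview"))
```

Keeping the injected content in a separate configuration file means it survives each regeneration of the documentation, since the injection is simply re-applied on every pipeline run.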

We have successfully used Drone, pyLODE, and pyLODE Inject to create an automated deployment process for ontology documentation. We have designed a pipeline where the configuration for each tool is version-controlled, transparently available on GitHub, and easy to comprehend. The infrastructure will also allow us to integrate tests, such as ontology validation with SHACL. Most importantly, the infrastructure allows us to edit a source ontology in its respective Git repository and have the changes automatically published online at the linked.data.gov.au domain within the hour.