Building a new tagging infrastructure for GOV.UK
(This post was published first on Inside GOV.UK)
We’re changing how tagging works behind the scenes to allow us to create a single way of categorising content (‘taxonomy’) for all content on GOV.UK.
In October we laid out our plan to improve navigation on GOV.UK. It consists of 3 steps:
- Build a hierarchy of subjects on GOV.UK.
- Tag all content to the hierarchy.
- Use the hierarchy to support navigation, orientation and search.
Since then we’ve started on the first step of the taxonomy work by doing content audits and creating prototypes for navigation (which we’ll write more about soon).
In this blog post we’ll explain what we’ve been doing on the technical side to make the second step possible.
Different places for tagging
Content on GOV.UK is published using a number of different applications.
This means that in some publishing applications (eg Mainstream Publisher) you can tag to mainstream browse pages and in others only to topics. Applications often save the tags in their own database, which means that bulk-tagging pages from different applications is impossible. Merging and removing tags is also hard because for each change we want to make we have to go into the application itself to change the information.
A single place to tag all the things
Because we are working on a single taxonomy, we need to make sure that all publishing apps are talking to the same thing and that there’s a central place to store tags. To support this we’ve been working on implementing a new tagging architecture for GOV.UK. The guiding principles for this architecture have been:
- all content on GOV.UK should be taggable
- there should be one database for all tagging information
- content can be tagged in more than one publishing app
- tag names and URLs should be easy to change
There are over 30 applications that interact with the content of GOV.UK. To properly lay the foundations for the new taxonomy, we needed to update them all. Here’s what we did:
- added new functionality for tagging to the publishing API
- made sure all pages on GOV.UK are in a central place
- built Content Tagger, a replacement app for Panopticon (the application we used at first to keep track of all the content items on GOV.UK - the first commit was in July 2011)
- made sure we’re referring to tags and content using a unique ID (and not a URL) so that we can change the URLs easily
Because this work overlaps with the project to rebuild the GOV.UK publishing tools, we’ve worked together with the Publishing Platform team on a lot of this work.
When we’ve finished this work we can start building on this foundation. Upcoming work on the publishing tools includes improving tagging interfaces, allowing bulk tagging and supporting an alternative workflow for tagging.
It also provides the technical infrastructure to allow us to make changes to how content is organised and displayed to users.