Ode to Tags

I have always been interested in tags for taxonomy on the web. They’re so easy, so understandable, so flexible and intuitive. There are many names for competing/complementing organisation schemes: categories, collections, sets, albums, playlists, plain ol’ ‘lists’. And then you have the extra possibilities afforded by nesting.

Let’s examine the differences between some different organisational schemes (I’ll refer to them as ‘containers’ from now on):

Commonest name	Content-Container Relationship	Nesting?	Created ad-hoc?	Content must be contained?
Tags	One to many	Not conventionally (some syntaxes have been proposed, e.g. #tag.subtag or #tag/subtag, but none are implemented AFAIK)	Yep (give an article a tag, and that tag springs into existence)	No
Categories	One to one	Yep	No	Yep
Sets	One to many	No	No	No

The different names given to subtly different systems in different software render this table fairly inaccurate in real life, but it’s clear that the main differentiators are the content-container relationship type, the ability to nest containers and practical container management.

Soon after I started developing waterpigs.co.uk into a more complete publishing web app, I decided that I would only be using tags (alongside content-type) to organise my content. The benefits are obvious: very little management logic and extreme flexibility make them a first choice, despite the lack of nesting ability.

Tagging Policy

Whilst tags boast flexibility, in the hands of many people they lack focus. How many ‘tag clouds’ (bloody useless things they are) have you seen with plural/singular tags that referred to the same thing? Tags in different tenses? Worse, misspellings?

For tags to work as an organisational method unaided by a one-to-one, required container (e.g. category), you as the author must decide on a tagging policy. I recommend using the order-from-chaos design pattern if you’re unsure: tag things with whatever seems right, then pick out patterns that suit the way you organise things.

This is the basic tag usage that I decided on pretty much from the beginning:

web: related to the web
design: related to design
dev: related to development
music: related to music
lutherie: related to musical instrument construction
tech: related to technology as a subject area, e.g. comments on Apple, Google, etc

These were all tags I’d been using in the previous iteration of my publishing software. Notice they’re all subjects. Tags are great for organising by subject, and the level of specificity can be controlled by tag combinations, e.g.:

"web"

is something related to the web in general;

"web, dev"

is something related to web development; and

"web, dev, php"

is something related to PHP development for the web. I wouldn’t usually use tags to go any more specific than this unless I post a lot about a certain topic. The lovely thing about tags as opposed to nesting topics is that they’re so much more flexible than simply making a web category, then a dev subcategory… “ah, wait, will I need another dev category for non-web stuff? What’s the optimum nesting system? Do I need one-to-many categories? Isn’t that basically a tag? Etc Etc Etc…”.
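As a rough sketch of how tag combinations narrow things down, here is a filter that keeps only the posts carrying every requested tag. The Post class, the posts_tagged function and the sample posts are made up for illustration; this is not the code behind this site.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    # Hypothetical minimal post: a title plus an unordered set of tags.
    title: str
    tags: set = field(default_factory=set)


def posts_tagged(posts, *wanted):
    """Return the posts that carry every tag in `wanted`."""
    required = set(wanted)
    # `required <= post.tags` is a subset test: all wanted tags must be present.
    return [post for post in posts if required <= post.tags]


posts = [
    Post("CSS tricks", {"web", "design"}),
    Post("Routing in PHP", {"web", "dev", "php"}),
    Post("Shell scripting notes", {"dev"}),
]

web_titles = [post.title for post in posts_tagged(posts, "web")]
web_dev_titles = [post.title for post in posts_tagged(posts, "web", "dev")]
print(web_titles)
print(web_dev_titles)
```

Asking for more tags only ever shrinks the result, which is all the ‘nesting’ I need here, and the dev-but-not-web post never forces a second hierarchy into existence.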

Beyond the Basics

Machine Tags

Tags of the form namespace:predicate=value can be used to add extra structure and meaning to tags. A shorter form, e.g. predicate:value, could also be used when a triple is overkill.

These tags could be used to provide higher levels of specificity without polluting the highest level namespace (as such. Heh. Sorry about the pretentious language here). They can also be used to attach arbitrary metadata to an object, although I’m uncertain about how far that should be taken. Certainly they take longer to type in!

These tags could be used to provide sets or lists, e.g. set:setname, which differs from a generic subject area tag by providing higher specificity.
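To make the two forms concrete, here is a minimal parsing sketch. The MachineTag tuple, the parse_machine_tag function and the example tags are assumptions for illustration only; in particular, treating anything with both a colon and an equals sign as a full triple is my own choice, not a standard.

```python
from typing import NamedTuple, Optional


class MachineTag(NamedTuple):
    # namespace is None for the short predicate:value form.
    namespace: Optional[str]
    predicate: str
    value: str


def parse_machine_tag(tag):
    """Parse 'namespace:predicate=value' or 'predicate:value'.

    Returns None for plain tags such as 'web', or for malformed input.
    """
    head, colon, rest = tag.partition(":")
    if not colon or not head or not rest:
        return None  # no colon, or nothing on one side of it: not a machine tag
    predicate, equals, value = rest.partition("=")
    if equals:
        # Full triple: namespace:predicate=value
        if not predicate or not value:
            return None
        return MachineTag(head, predicate, value)
    # Short form: predicate:value
    return MachineTag(None, head, rest)


print(parse_machine_tag("geo:lat=51.5"))
print(parse_machine_tag("set:holiday-2012"))
print(parse_machine_tag("web"))
```

Plain subject tags fall through as None, so they can carry on being handled exactly as before.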

Interoperability

Unlike other containers, tags are fairly standardised and oft-implemented in data APIs. Sure, there are differences (space delimited vs comma delimited, double-quoted tags with spaces in them, etc.) but those differences can usually be resolved very easily.
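As a sketch of how little work that resolution can be, the two functions below read a comma-delimited string and a space-delimited, double-quoted string into the same list. The function names are hypothetical, and real APIs will have their own edge cases (escaped quotes, for one) that this ignores.

```python
import re

# Either a double-quoted run (captured without its quotes) or a bare run of non-spaces.
_SPACE_TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def parse_comma_tags(raw):
    """Split 'tag1, another tag' on commas, trimming surrounding whitespace."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def parse_space_tags(raw):
    """Split 'tag1 "another tag"' on spaces, keeping quoted tags whole."""
    return [quoted or bare for quoted, bare in _SPACE_TOKEN.findall(raw) if quoted or bare]


from_commas = parse_comma_tags("tag1, another tag, yet another tag")
from_spaces = parse_space_tags('tag1 "another tag" "yet another tag"')
print(from_commas)
print(from_spaces)
```

Once both inputs end up as the same plain list, everything downstream can stop caring which silo the tags came from.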

This means that tags (especially ones normalised to the minimal spec I propose below) can provide consistent containers that are pervasive across data stores. For example, my tag pages pull in content from my OpenPhoto server. I also tag things consistently on SoundCloud and YouTube, so I could pull those in too if I wanted to.

This is of immense value in the federated, distributed web we publish in. Not only can we keep track of ownership across silos, we can also maintain our containers.

Gimmicks

Yet another BIG plus is all the fun that can be had from tagging things intelligently!

Art Direction

Much as I admire different-design-for-each-post people, I have neither the time, inclination nor skill to do it myself. Tags to the rescue!

I expose an article’s tags in its <section> element as class names, e.g. something tagged with web, dev, php will be in <section class="clearfix t-web t-dev t-php">. I then use them to style some articles. Fun as it is, there are some downsides, and difficulties with the CSS involved, but that’s for another article.
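Building that class attribute is about as simple as it sounds. The sketch below is illustrative Python rather than the code this site actually runs; tag_classes and section_open_tag are hypothetical names, and it assumes tags have already been normalised so that none contain spaces (a space would split one tag into two classes).

```python
from html import escape


def tag_classes(tags, base=("clearfix",), prefix="t-"):
    """Build a class list: base classes first, then one prefixed class per tag."""
    return " ".join(list(base) + [prefix + tag for tag in tags])


def section_open_tag(tags):
    """Return the opening <section> tag with the article's tags as classes."""
    # Escape the attribute value so an odd character in a tag cannot break the markup.
    return '<section class="{}">'.format(escape(tag_classes(tags), quote=True))


opening = section_open_tag(["web", "dev", "php"])
print(opening)
```

The t- prefix keeps tag-derived classes from colliding with ordinary layout classes, so a post tagged clearfix stays harmless.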

Favouriting/Starring

I write and photograph rather a lot, and I use starring as a filtering mechanism (note: not much of this is implemented ATOW). Would I settle for a starred tag?

No.

It’s ★ all the way.

Not that I’m likely to stop there. There are tonnes of awesome Unicode characters to use!

Practicalities

TODO: Add this and an abstract as a ‘tagging’ section on blogging policy, retain this article as methodology discussion.

Normalise tags to the following spec:

Lower case
Hyphens instead of spaces
Limit punctuation
Comma delimit
(?) How about rooting the tags? Would this be too ambiguous?
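A minimal sketch of that spec might look like the following. Which punctuation to keep is my own assumption: I leave hyphens in, plus the colon, equals sign and full stop so machine tags survive, and non-ASCII symbols such as ★ are untouched. Rooting is left out, since it is still an open question. The function names are hypothetical.

```python
import re
import string

# Assumption: punctuation allowed to survive normalisation.
ALLOWED_PUNCTUATION = "-:=."
_STRIP_TABLE = str.maketrans(
    "", "", "".join(c for c in string.punctuation if c not in ALLOWED_PUNCTUATION)
)


def normalise_tag(tag):
    """Lower-case a tag, drop most ASCII punctuation, and hyphenate spaces."""
    tag = tag.strip().lower()
    tag = tag.translate(_STRIP_TABLE)   # limit punctuation
    tag = re.sub(r"\s+", "-", tag)      # hyphens instead of spaces
    return re.sub(r"-{2,}", "-", tag).strip("-")


def normalise_tags(raw):
    """Split a comma-delimited string into unique normalised tags, keeping order."""
    result = []
    for part in raw.split(","):
        tag = normalise_tag(part)
        if tag and tag not in result:
            result.append(tag)
    return result


print(normalise_tags("Web, web , PHP, Musical  Instruments, ★"))
```

Dropping duplicates after normalising means Web and web collapse into one tag, which takes care of at least the capitalisation flavour of tag cloud mess.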

Tagging UI

The UI with which tags are added to an object varies hugely across the web. Some differences to note:

Delimiters
- I argue that comma delimiters are the only sensible choice, especially when the option to have multi-word tags by using double quotes is available.
- E.g. tag1, another tag, yet another tag vs tag1 "another tag" "yet another tag". The second is much uglier and more difficult to type.