Tagging Practices

This page documents the tagging schemas currently in use on this website, as well as techniques I use when tagging my content online.

Schemas

I tend to use a small set of subject-based tags very often, typically because they're things I care about and talk about a lot. These include:

  • personal, family
  • web, design, dev, js, php, indieweb
  • music, lutherie, repair, gurdy

Tags as visible, data-classifying metadata

I often record arbitrary data using notes tagged in a particular way. Some of these schemes were started deliberately; others emerged as patterns over time, or were built on top of such patterns.

  • bookmark: usually contains a link I wish to store for future reference, along with some tags to classify it.
  • checkin: used to indicate a check-in to a venue, possibly with a photo and/or comment.
  • steps: I record the number of steps I take every day, usually in the format {num-steps} #steps today {sometimes a comment here} (example).
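A note in the steps format above could be parsed back into data with something like the following. This is only a sketch: the function name, the `StepsEntry` type and the exact regular expression are assumptions for illustration, not the code running on this site.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for notes shaped like "{num-steps} #steps today {comment}".
# Thousands separators ("10,250") are tolerated as an assumption.
STEPS_NOTE = re.compile(
    r"^\s*(\d[\d,]*)\s+#steps\s+today\b\s*(.*)$",
    re.IGNORECASE | re.DOTALL,
)


class StepsEntry(NamedTuple):
    steps: int
    comment: str


def parse_steps_note(text: str) -> Optional[StepsEntry]:
    """Return the step count and trailing comment, or None if the note
    does not follow the steps format."""
    match = STEPS_NOTE.match(text)
    if match is None:
        return None
    steps = int(match.group(1).replace(",", ""))
    return StepsEntry(steps, match.group(2).strip())
```

Notes which don't match the format return None, so a stray #steps hashtag in an unrelated note is not mistaken for a measurement.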

As of early 2013, I have started to tag some of my notes based on emotions. Typically I hashtag words within the note itself rather than adding separate, explicit tags. Because of this, I have standardised on the forms of emotion words which occur naturally in a sentence, typically the past-participle form, e.g. “stressed” instead of “stress”.
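Pulling inline hashtags out of a note's text can be sketched as below. The function name and the rule for what counts as a hashtag (a # not preceded by a word character, followed by word characters) are assumptions for illustration.

```python
import re
from typing import List

# A "#" directly after a word character (e.g. a URL fragment such as
# "page#section") is deliberately not treated as a hashtag.
HASHTAG = re.compile(r"(?<!\w)#(\w+)")


def inline_tags(note: str) -> List[str]:
    """Return the lowercased hashtags found in a note, in order of first
    appearance and without duplicates."""
    seen: List[str] = []
    for match in HASHTAG.finditer(note):
        tag = match.group(1).lower()
        if tag not in seen:
            seen.append(tag)
    return seen
```

Because the tag is whatever word was written in the sentence, the tense used in the prose becomes the tag, which is why settling on one form matters.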

Tips

If tags can be abused, they will be.

Reminder to self: don’t trust yourself to stick to tagging principles; build them with laziness and ignorance in mind.

Don't make it possible for two tags to contradict each other

For example, if you use tags to determine authorisation, don't have one tag which declares that a user can access content and another which declares that they can't, as a single object could end up carrying both.
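One way to rule out contradictions is to have only a granting tag and treat its absence as denial. The sketch below assumes a hypothetical `access:{username}` naming convention, which is not something this site necessarily uses.

```python
from typing import Iterable, Set

# Hypothetical convention: "access:alice" grants alice access.
# There is deliberately no matching "deny" tag.
ACCESS_PREFIX = "access:"


def allowed_users(tags: Iterable[str]) -> Set[str]:
    """Collect the usernames granted access by a set of tags."""
    return {
        tag[len(ACCESS_PREFIX):]
        for tag in tags
        if tag.startswith(ACCESS_PREFIX)
    }


def can_access(user: str, tags: Iterable[str]) -> bool:
    """Access is refused simply by the grant tag being absent, so no
    combination of tags can say both yes and no."""
    return user in allowed_users(tags)
```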

Figure out your stemming

Namely, pluralisation and tense. The number of tag clouds I’ve seen where misspellings and inconsistent tenses and plurals abound is ridiculous. Figure out what you're going for, then stick to it. Implement automated server-side stemming if you need to.
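A very small form of server-side normalisation is a lookup table mapping known variants onto one canonical tag. Note that this is an alias table rather than a real stemmer, and the entries in `CANONICAL` are made-up examples.

```python
from typing import Iterable, List

# Hypothetical alias table: variant on the left, canonical tag on the right.
CANONICAL = {
    "stress": "stressed",
    "bookmarks": "bookmark",
    "checkins": "checkin",
    "check-in": "checkin",
}


def normalise_tag(raw: str) -> str:
    """Trim whitespace, drop a leading '#', lowercase, then map known
    variants onto their canonical form."""
    tag = raw.strip().lstrip("#").lower()
    return CANONICAL.get(tag, tag)


def normalise_tags(raws: Iterable[str]) -> List[str]:
    """Normalise a list of tags, dropping empties and duplicates while
    keeping first-seen order."""
    result: List[str] = []
    for raw in raws:
        tag = normalise_tag(raw)
        if tag and tag not in result:
            result.append(tag)
    return result
```

Running every incoming tag through one function like this means the convention is enforced by code, not by remembering to follow it.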

Pseudotags: only actively tag things markup can't express

Tag automatically based on markup. For example, don't have a “quote” tag, as the presence of <blockquote> elements implies that. Instead, either auto-tag based on that element, or run queries which check for the presence of the element. These can be implemented as pseudotags.

Currently I have the following auto-tagging algorithms working:

  • If an object’s content contains a blockquote, tag with quote
  • If an object’s content contains an h-card, tag with mention
  • If an object has both lat and lon tags, tag with location
  • If an object has an inReplyTo uri, tag with reply
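The four rules above could be sketched roughly as follows. The object shape is an assumption: a dict with a `content` string of HTML, optional `lat` and `lon` values, and an optional `inReplyTo` URI. The names `pseudotags` and `_MarkupScanner` are hypothetical.

```python
from html.parser import HTMLParser
from typing import Set


class _MarkupScanner(HTMLParser):
    """Records whether the content contains a blockquote or an h-card."""

    def __init__(self) -> None:
        super().__init__()
        self.has_blockquote = False
        self.has_hcard = False

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self.has_blockquote = True
        # The class attribute may be missing or valueless (None).
        classes = (dict(attrs).get("class") or "").split()
        if "h-card" in classes:
            self.has_hcard = True


def pseudotags(obj: dict) -> Set[str]:
    """Derive tags from an object's markup and properties rather than
    relying on them being added by hand."""
    scanner = _MarkupScanner()
    scanner.feed(obj.get("content") or "")
    scanner.close()

    tags: Set[str] = set()
    if scanner.has_blockquote:
        tags.add("quote")
    if scanner.has_hcard:
        tags.add("mention")
    # Compare against None so a coordinate of 0 still counts.
    if obj.get("lat") is not None and obj.get("lon") is not None:
        tags.add("location")
    if obj.get("inReplyTo"):
        tags.add("reply")
    return tags
```

Since the tags are computed from the object every time, they can never drift out of sync with the markup the way hand-applied tags can.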