Last winter, I got fed up with how I do Internet, so I prototyped a browser extension that lets you create and view annotations within your social circle on any website. After I shared the alpha version with a friend, he mentioned Hypothesis, a team that has been working on the same problem for years and has already made strong headway towards open source web annotation.

At first I was frustrated.
I had spent many days enduring the torture of modern web development¹ when most of the problem was already solved. It's a niche problem, apparently; otherwise I'd have found a solution before feeling pressured to build one.

At the moment, annotation is mostly used by institutional knowledge workers (education, publishing, …). Almost none of my friends in tech or science use annotation for work, and nobody does so privately. This seems strange given that you can surf the web as usual, AND, for example:

  • see others' insights, revisions, reviews, critiques, perspectives when you look at some content
  • add value with highlights, reactions, comments
  • discover shared interests together with friends or whatever group
  • build a richly interlinked knowledge and social graph while at it
  • …

True, you could get something similar by stacking a few tools (maybe Twitter, Delicious, Evernote…), switching between apps, keeping track of context and references and so on… but having the input and display interfaces anchored to the actual segment of interest changes how we interact with it and reason about it quite a bit.

At least that’s what I thought when I started working on it. My question then became:

If it’s so awesome, why isn't it more common?

A shallow dive into the semantic web rabbit hole promptly surfaced a graveyard of failed attempts at web annotation. Two that jumped out: Google SideWiki and (Rap)Genius.

  • SideWiki got cancelled in 2011. I couldn’t find out why exactly². Google seemed to have all the ingredients to win big.
  • (Rap)Genius, after getting funded by a16z six years back, wanted to annotate more than just song lyrics, but didn’t. They still pay lip service to the idea and have portfolio cases where it was used, but the project seems to be a silent failure (you can’t even download a browser extension at this point). I guess many are wary of siloing their content inside another VC-funded media company.

There are many more, but Hypothesis stands out as the most reliable. Its open source funding structure makes it less prone to surveillance capitalism, sociopaths, the Chinese military or other yet-to-be-discovered corporate BS. You own your data, full stop.

After overcoming the bias I had for my own project, I’ve grown to like Hypothesis and embedded it site-wide. The UI and social features are underdeveloped, but that’s fixable. I hope they succeed. It wouldn’t make sense to get VC funding and develop a competitor that then has to silo data for competitive advantage and network effects. I’ve had a taste of that and it’s not healthy for science, society or the web. I’d rather build a new generation of apps on top of the open ecosystem.

The optimist in me predicts a rise of these kinds of applications in the next three years, with one of them having a few million users. The pessimist still wants a compelling reason why annotation has not taken off with knowledge workers so far… the tech was there over a decade ago.

Dreams about a semantic web with intelligent agents, rich context, provenance tracing, lateral search […] have been around for 30 years, but persistently fail to materialize. Currently, we still copy information mostly by value, not by reference.

Referencing entire websites is not granular enough to reuse, test, verify and improve on an idea along a single chain of work and revision. Most entities don’t have a Wikipedia page.

I feel we’re at a point in this giant internet project where many among us, especially developers, are throwing up (their hands in anger) and shouting: “We need to refactor!”

Getting annotation right is a fundamental piece of a better web and of working together in the future. To review, iterate, comment, tag, contextualize and version our ideas exactly where they live is how we build knowledge that lasts and gets better with use and time. Fingers crossed.


1. It’s a giant garbage fire of throw-away leaky abstractions and bloated, breaking dependencies. Satan recently switched torture providers and new arrivals to hell have to debug CI/CD pipelines of B2B apps.

2. There was some backlash from site owners who did not want their sites smeared by “graffiti” from trolls, but that’s usually not a show stopper.