The Linked Open Data approach to nurturing a next-generation Web is something we've been working with at the BBC for the past year and a half or so. It looks to be very promising indeed.
Sir Tim Berners-Lee describes Linked Open Data like this:
"It is about making links [between datasets], so that a person or machine can explore the web of data. With linked data, when you have some of it, you can find other, related, data."
Over the last few months we've continued to expand the amount of Linked Data that we publish on the BBC /programmes and /music sites, to provide additional detail about episodes of radio and TV programmes, and more links between the data exposed from each site. So, for example, we're now exposing segment data for Radio 2 & 6 Music programmes that links the artists in each segment to the relevant data on the /music website.
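To make the idea of linking across the two sites a little more concrete, here is a minimal Python sketch that treats the data as (subject, predicate, object) triples and follows a link from a programme segment to an artist record. Every URI and predicate name in it is a made-up placeholder chosen for illustration, not an identifier the /programmes or /music sites actually use.

```python
# Toy triples: every URI and predicate below is a hypothetical placeholder.
PROGRAMMES = [
    ("ex:episode/1", "ex:hasSegment", "ex:segment/1"),
    ("ex:segment/1", "ex:performer", "ex:artist/42"),
]
MUSIC = [
    ("ex:artist/42", "ex:name", "Example Artist"),
]


def objects(triples, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def artist_names_for_episode(episode):
    """Follow episode -> segment -> artist -> name across both datasets."""
    names = []
    for segment in objects(PROGRAMMES, episode, "ex:hasSegment"):
        for artist in objects(PROGRAMMES, segment, "ex:performer"):
            # The artist URI is the link that joins the two datasets.
            names.extend(objects(MUSIC, artist, "ex:name"))
    return names


print(artist_names_for_episode("ex:episode/1"))
```

The point of the sketch is only that a shared identifier (here the artist URI) lets you hop from one dataset into the other.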
This provides some really nice ways to navigate and mash-up the two websites. But we've also been wondering: what else could we do? What if there was a way to not only retrieve the data that underlies each page on the website, but also a way to run queries across the whole datasets? This would provide a way to do even more with the data, allowing it to be sliced, diced, queried and analysed in all kinds of new ways.
With this in mind, we've asked two companies who specialise in Linked Data technology (Talis & OpenLink) to start regularly crawling the BBC /programmes and /music websites to harvest all of the data and load it into their semantic web platforms. Both platforms allow you to search and query the BBC data in a number of different ways, including SPARQL -- the standard query language for semantic web data. If you're not familiar with SPARQL, the Talis folk have published a tutorial that uses some NASA data.
Talis & OpenLink are regularly crawling and updating the data, and we're working with them on ways to make sure it stays as up to date as possible, but for now expect it to lag a little behind the live data on our sites. The stores already contain metadata for over 300,000 radio and TV episodes, over 6,000 series, more than 4,000 album reviews, and additional data about thousands of music artists and albums. All of the BBC subject categories and programme genres are also included, so there are plenty of ways to query and slice up the data, whether you're interested in a particular type of programme, channel, artist, or person. Where our data links to other datasets, we can include some additional context -- so, for example, all of the music artist information can be queried from one source. And as we add more data to the /programmes and /music sites, it will all get added to the stores.
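As a rough illustration of what querying a store with SPARQL involves, the Python sketch below builds a SPARQL Protocol GET request by passing the query text in the `query` parameter. The endpoint URL is a placeholder, and the use of the Programmes Ontology namespace with `dc:title` is an assumption about how the data is modelled, so check both against the store you actually query.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: substitute the real SPARQL endpoint of the store you use.
ENDPOINT = "http://example.org/sparql"

# Assumed vocabulary: Programmes Ontology classes and dc:title for episode titles.
QUERY = """
PREFIX po: <http://purl.org/ontology/po/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?episode ?title
WHERE {
  ?episode a po:Episode ;
           dc:title ?title .
}
LIMIT 10
"""


def build_request_url(endpoint, query):
    """Encode a query as a SPARQL Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

Fetching that URL from a real endpoint (for example with `urllib.request.urlopen`) should return the matching rows; the sketch stops at building the request so that it runs without network access.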
The Talis Platform
The combined /programmes and /music data is in a store called "bbc-backstage", which is available through the Talis Platform API. The Talis developers have already put together a few example queries and demos that query the dataset. These show how to query the data using AJAX, for example fetching lists of music reviewers and their reviews, or analysing relationships between categories of TV programmes.
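For a sense of what handling the results of such a query might look like, here is a small Python sketch that groups reviews by reviewer from a response in the standard SPARQL JSON results layout. The variable names `reviewer` and `review` and all of the sample values are invented for illustration; they are not taken from the Talis examples or from real BBC data.

```python
import json
from collections import defaultdict

# Invented sample in the SPARQL JSON results layout; not real BBC data.
SAMPLE = """
{
  "head": {"vars": ["reviewer", "review"]},
  "results": {"bindings": [
    {"reviewer": {"type": "literal", "value": "Reviewer A"},
     "review": {"type": "uri", "value": "http://example.org/review/1"}},
    {"reviewer": {"type": "literal", "value": "Reviewer B"},
     "review": {"type": "uri", "value": "http://example.org/review/2"}},
    {"reviewer": {"type": "literal", "value": "Reviewer A"},
     "review": {"type": "uri", "value": "http://example.org/review/3"}}
  ]}
}
"""


def reviews_by_reviewer(payload):
    """Map each reviewer name to the list of review URIs in the result rows."""
    grouped = defaultdict(list)
    for row in json.loads(payload)["results"]["bindings"]:
        # Each bound variable is an object with "type" and "value" keys.
        grouped[row["reviewer"]["value"]].append(row["review"]["value"])
    return dict(grouped)


for name, reviews in reviews_by_reviewer(SAMPLE).items():
    print(name, len(reviews))
```

The same grouping logic would apply in browser-side JavaScript, since the JSON layout is the same whichever client reads it.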
The OpenLink Virtuoso Platform
The Virtuoso-hosted data can be found and queried via OpenLink's hosted service. In addition, the OpenLink-provided Linked Data space offers a query engine and REST API, alongside a collection of sample queries.
A richer BBC data API, based on Linked Data
This is a trial project that we're running for six months to explore what the Backstage community can do with BBC data when it's exposed through a richer API than we've been able to provide thus far. We're excited to see what you can create, and interested in the feedback you can provide -- so we can learn what works and what doesn't, and make changes. So please do keep us up to date through BBC Backstage.
Enjoy!