A brief technical overview of the BBC personalised mobile homepage
Most people I know are never more than a few feet away from their mobiles. They provide access to email, text and internet, including social networking sites. They're fashion accessories. They're becoming increasingly personal pieces of equipment and in the same way that junk mail landing on your doormat feels intrusive, so it is with mobile websites cluttering up your screen with unwanted content.
When it was released in April last year, the BBC personalised mobile homepage aimed to address this by allowing many aspects of the page to be personalised.
This post aims to explain some of the technologies we've developed to generate personalised mobile pages that we hope provide you with the best experience, regardless of the make and model of your device.
Rendering pages for different devices
The appearance of the mobile homepage changes depending on the device you use to view it. If you browse the mobile homepage using a touchscreen device such as an Android phone or an iPhone, you'll be presented with larger text and images to make it easier to navigate with your finger.
Non-touchscreen phones that use a trackball or buttons to navigate are presented with a more compact page. In addition, links to mobile iPlayer and other multimedia content are selectively displayed based on the phone's capabilities and whether it's connecting over WiFi or 3G.
In order to do this, the page content is defined using a device-independent XML representation. Each tag in the device-independent XML is then translated into a fragment of XHTML appropriate to the capabilities of the client handset. These XHTML fragments make up a library of common components, such as links, headings and list items. This approach means that the look of the site is maintained throughout and also that the entire design can be updated by simply changing the templates.
Here's an example. The following XML describes the Radio & Music topic:
<header editable="true" text="Radio &amp; Music" url="/mobile/radio/"/>
<now_on_air title="NOW ON AIR"/>
<list style="plainList">
  <channel_list-item channel_url="/mobile/radio/radio1/index2.shtml?region=london" channel_name="Radio 1" brand_url="b00pjl2g" brand_name="Greg James"/>
</list>
<list style="boldList">
  <list-item text="More stations and schedules" url="/mobile/customise/11"/>
</list>
<list style="audioList">
  <list-item demi="15" text="Podcasts" url="/mobile/radio/podcasts/index.shtml"/>
</list>
The XML above renders like this on an iPhone:
And like this on a Nokia 6331:
On the 6331 version the text and image sizes are reduced to account for the smaller screen and the podcast link is hidden.
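To make the translation step more concrete, here is a minimal Python sketch of turning device-independent tags into device-specific XHTML fragments. It is not our production code: the template strings, the device classes ("touch" and "compact"), the supports_audio flag and the rule that an audioList is hidden on handsets without audio support are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical template library: one XHTML fragment per tag, per device class.
TEMPLATES = {
    "touch": {
        "header": '<h2 class="large"><a href="{url}">{text}</a></h2>',
        "list-item": '<li class="large"><a href="{url}">{text}</a></li>',
    },
    "compact": {
        "header": '<h2><a href="{url}">{text}</a></h2>',
        "list-item": '<li><a href="{url}">{text}</a></li>',
    },
}

# Assumed rule: these list styles are only shown on audio-capable handsets.
AUDIO_STYLES = {"audioList"}


def fill(template, node):
    """Fill a fragment template from a tag's attributes, escaping the values."""
    return template.format(
        url=escape(node.get("url", ""), quote=True),
        text=escape(node.get("text", "")),
    )


def render(xml_source, device_class, supports_audio):
    """Translate device-independent XML into XHTML for one device class."""
    # The topic XML has several top-level tags, so wrap it in a single root.
    root = ET.fromstring("<page>" + xml_source + "</page>")
    templates = TEMPLATES[device_class]
    out = []
    for node in root:
        if node.tag == "header":
            out.append(fill(templates["header"], node))
        elif node.tag == "list":
            if node.get("style") in AUDIO_STYLES and not supports_audio:
                continue  # hide multimedia links the handset cannot use
            items = [fill(templates["list-item"], item)
                     for item in node if item.tag == "list-item"]
            out.append("<ul>" + "".join(items) + "</ul>")
        # Tags this sketch does not know about (e.g. now_on_air) are skipped.
    return "\n".join(out)


SOURCE = """
<header editable="true" text="Radio &amp; Music" url="/mobile/radio/"/>
<list style="boldList">
  <list-item text="More stations and schedules" url="/mobile/customise/11"/>
</list>
<list style="audioList">
  <list-item demi="15" text="Podcasts" url="/mobile/radio/podcasts/index.shtml"/>
</list>
"""

touch_page = render(SOURCE, "touch", supports_audio=True)
compact_page = render(SOURCE, "compact", supports_audio=False)
```

Because the page definition never mentions markup, changing the look of the whole site in this sketch means editing only the TEMPLATES table.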
Here's a simplified diagram of the flow during a mobile page request:
Page personalisation
Successfully navigating over 60 regional news areas, 17 sports categories, 181 football teams, 18 news topics, 9 radio stations and 6 TV channels on a mobile device requires some organisation. To this end, many of the topics on the mobile homepage can be personalised to show only the information you're interested in. When you personalise your page, the personalisation settings are stored in a cookie on your device. Because so many personalisation combinations are available, we had to design a cookie format that stores these settings compactly, to reduce the storage space consumed on the device, while remaining flexible enough to allow future development.
In the end we settled on the format shown below:
11_3_8_4___G9_10__CD11__CK12_14_15_16_
Each of these fragments represents a topic on the homepage and its personalisation settings. The position of the fragment in the cookie determines the order in which the topics appear in the page.
Topics that can be personalised contain extra information in their fragment that represents the personalisation state of the topic. For example, the fragment '10__CD' describes the 'Television' topic and can be split into three fields: '10', '__' and 'CD'. The '10' is the topic ID, the next two characters store the TV region as a 2-digit base 42 (b42) number, and the rest of the fragment stores the selected TV channels. In this case the channels are 'CD', which correspond to BBC1 and BBC2. Adding BBC3 to the page changes the fragment to '10__CDF'. Both the channels and the order in which they appear in the topic are stored. The formats of the other topics vary depending on the information to be stored and are outlined briefly below.
We don't use vowels in the configuration cookies, to avoid spelling unfortunate four-letter words. With this many combinations they're bound to occur.
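As a rough illustration of the fragment layout, the Python sketch below splits part of a cookie into fragments and pulls the Television fragment apart into its three fields. It assumes a fragment is a run of digits (the topic ID) followed by a run of non-digit characters (the settings), which works because the settings never contain digits. The function names are invented for this example, and the sketch is applied to the middle of the cookie only, since the post doesn't spell out how the leading characters should be read.

```python
import re

# Assumption: topic ID = run of digits, settings = the non-digits that follow.
FRAGMENT = re.compile(r"(\d+)(\D*)")


def split_fragments(cookie_part):
    """Return (topic_id, settings) pairs in the order they appear."""
    return FRAGMENT.findall(cookie_part)


def parse_television(fragment):
    """Split a Television fragment into topic ID, region and channel list."""
    match = FRAGMENT.fullmatch(fragment)
    if match is None or match.group(1) != "10":
        raise ValueError("not a Television fragment: %r" % fragment)
    settings = match.group(2)
    return {
        "topic_id": 10,
        "region": settings[:2],          # 2-character b42 region
        "channels": list(settings[2:]),  # one character per channel, in order
    }


# A portion taken from the middle of the example cookie.
fragments = split_fragments("9_10__CD11__CK12_")
television = parse_television("10__CD")
with_bbc3 = parse_television("10__CDF")
```

The order of the pairs returned by split_fragments is the order of the topics on the page, and the order of the characters in "channels" is the order of the channels within the topic.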
The topic IDs and their personalisation encodings are as follows:
1 Promo - none
3 News - 2 character b42 region + n character feed list
4 Weather - 3 character b42 region + 1 character b42 display format
8 Sport - 2 character b42 region + n character feed list
9 Entertainment - none
10 Television - 2 character b42 region + n character feed list
11 Radio & Music - 2 character b42 region + n character feed list
12 iPlayer - none
14 Featured Sites - none
15 Search - none
16 MyClub - 4 character club ID + 2 character b42 display format
The characters used for the base 42 encoding are:
'_CDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz'.
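The encoding itself is ordinary positional notation with 42 symbols. The Python sketch below shows one way to encode and decode fixed-width b42 values using the alphabet above. It assumes '_' stands for zero (its position in the alphabet) and that the most significant character comes first; the post doesn't state the character order, so treat that as a guess. The function names are made up for this example.

```python
ALPHABET = "_CDFGHJKLMNPQRSTVWXYZbcdfghjklmnpqrstvwxyz"
BASE = len(ALPHABET)  # 42


def b42_encode(value, width):
    """Encode a non-negative integer as exactly `width` b42 characters."""
    if not 0 <= value < BASE ** width:
        raise ValueError("value does not fit in %d b42 characters" % width)
    digits = []
    for _ in range(width):
        value, remainder = divmod(value, BASE)
        digits.append(ALPHABET[remainder])
    # Assumption: most significant character first.
    return "".join(reversed(digits))


def b42_decode(text):
    """Decode a b42 string back to an integer."""
    value = 0
    for char in text:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * BASE + ALPHABET.index(char)
    return value
```

Two characters give 42 x 42 = 1,764 possible values and three give 74,088, so a 2-character region field is far more compact than writing the region number out in decimal.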
If you're so inclined, you can play about with the configuration format to see how it works. Paste the following URL into your desktop browser, edit the configuration and see what happens.
/mobile/ps/11_3_DfCV8__GW4__DG9_10_DGD11__HM12_14_15_16__C___/?bookmark
You'll notice that you can't remove the promo or search topic. These are now permanently part of the page but still appear in the personalisation settings. They'll be removed in the future.
Scaling personalised applications
The system described so far has all the functionality required, but to cope with the high load it faces, caching must be used to reduce the work done by the servers. 'Caching' is the process of storing in memory a piece of data that takes time to render or download, so that the next time you need it you can simply look it up.
Non-personalised pages are relatively simple to cache, as each user sees the same page. But personalised pages, where each user has a personal view on to a page, require a little more thought; a user in Manchester doesn't want to see the weather for a user in Birmingham.
The problem we have is that the number of combinations of personalised pages is huge. The order of topics in the page alone gives us over 300,000 combinations. So what we've done is to cache the individual page topics separately, rather than the complete pages. When a client request is received for a particular topic order, the topics are simply retrieved from the cache in that order and concatenated to form the complete page. This immediately reduces the potential amount of data to be cached by a few orders of magnitude, but there's still the problem of the topic content.
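The idea of caching topics rather than whole pages can be sketched in a few lines of Python. This is an illustration only: TopicCache, build_page and render_topic are hypothetical names, the cache is a plain in-process dictionary, and the cookie fragment is used as the cache key, which is an assumption rather than a description of the real system.

```python
class TopicCache:
    """Caches rendered topics, keyed by their cookie fragment."""

    def __init__(self, render_topic):
        self._render_topic = render_topic
        self._store = {}
        self.renders = 0  # counts cache misses, for demonstration only

    def get(self, fragment):
        if fragment not in self._store:
            self._store[fragment] = self._render_topic(fragment)
            self.renders += 1
        return self._store[fragment]


def build_page(fragments, cache):
    """Fetch each topic in the requested order and concatenate them."""
    return "".join(cache.get(fragment) for fragment in fragments)


def render_topic(fragment):
    # Hypothetical stand-in for the expensive rendering step.
    return "<div>" + fragment + "</div>"


cache = TopicCache(render_topic)
page_a = build_page(["3_", "8_", "9_"], cache)
page_b = build_page(["9_", "3_", "8_"], cache)  # same topics, new order
```

The second page is built entirely from cached topics, so the order in which a user arranges the page no longer multiplies the amount of data that has to be cached.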
For example, the news component has over a million combinations, and that's before all the regional news feeds have been factored in. We can't cache them all. Luckily there are a couple of things on our side:
- Not all personalisation combinations are equally represented: It turns out that nearly 70% of all requests are for the same couple of dozen personalisation combinations. The remaining 30% still covers a large number of combinations, but the majority of the load can be effectively managed by caching.
- We don't want to cache the components forever: Many topics contain dynamic data such as news stories that need to be updated periodically. Caching components for just a few seconds is enough to considerably reduce the number of requests per second while not filling up the server cache. Perhaps in the future we could exploit the observation in point 1 and employ a more intelligent caching system where the more popular configurations are cached for longer.
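The second point, caching for only a few seconds, can be sketched as a small time-to-live (TTL) cache in Python. TTLCache and get_or_render are hypothetical names, the 5-second lifetime is an arbitrary example value, and the clock is passed in so that expiry can be demonstrated without waiting. A real deployment would more likely rely on the expiry times offered by a shared cache than on an in-process dictionary like this one.

```python
import time


class TTLCache:
    """Keeps each rendered topic for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry time, value)

    def get_or_render(self, key, render):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: serve the cached copy
        value = render()     # missing or expired: render again
        self._entries[key] = (now + self._ttl, value)
        return value


# Demonstration with a fake clock (all values here are illustrative).
fake_now = [0.0]
render_calls = []


def render_news():
    render_calls.append(1)
    return "news v%d" % len(render_calls)


news_cache = TTLCache(5, clock=lambda: fake_now[0])
first = news_cache.get_or_render("3_", render_news)
fake_now[0] = 4.0
second = news_cache.get_or_render("3_", render_news)  # within 5 seconds
fake_now[0] = 6.0
third = news_cache.get_or_render("3_", render_news)   # expired, re-rendered
```

Under heavy load, even a lifetime of a few seconds means a popular topic is rendered once per interval rather than once per request, while rarely requested combinations simply expire instead of filling up the cache.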
Conclusion
The mobile homepage is still under constant development and there are many aspects of the system that can be improved. Mobile development is still a relatively new discipline and presents its own unique problems. But it's by building novel systems that we develop the techniques to solve them, some of which may find uses beyond their original application.
I hope this has provided an insight into some of the work that we do here at BBC Mobile. With ever more powerful devices becoming available it's a very exciting field to be working in at the moment. I hope we can help to make it an exciting experience for you too.
Mark Longstaff-Tyrrell is a software engineer on the BBC mobile platform.
Comment number 1.
At 20th Jan 2010, JoeAD wrote: Interesting behind the scenes insight.
Now when will we be able to browse BBC blogs on a mobile device......
Comment number 2.
At 20th Jan 2010, drt wrote: Thanks for the interesting peek behind the scenes.
Can anyone explain why when I select Swindon as my location I get the weather for Salisbury, even though Swindon does have a mobile weather page?
Back in May last year I raised this with the BBC and was sent an email saying: "We've passed on this issue to our development team to investigate - there do appear to be some issues with the system that matches postcodes to forecasts.
I can't give you an exact timetable for a fix on this but please be assured that the matter is being looked at."
Any news on a fix, please? Thanks!
Comment number 3.
At 20th Jan 2010, Ed Lyons wrote: Having access to the most recent blog list that the main BBC home page has would be useful (ideally with a few more blogs on it)!
Comment number 4.
At 21st Jan 2010, Dr_Bean wrote: Why can I listen again to some radio on iPlayer via my iPhone, but not listen to radio live, even if I'm connected via WiFi?
Comment number 5.
At 21st Jan 2010, Nick Reynolds wrote: Hello - this is not a general post for queries about mobile. Can people stay on topic please.
Comment number 6.
At 21st Jan 2010, iainhubbard wrote: Nice post, but I didn't see anything about the technology behind all the numbers.
What tech do you use for the cache? memcache, database, open/closed source?
What language(s) is it developed in?
Does it run on a cluster or some big iron?
It would be nice to have a bit of insight into the technology that can cope with such demand.
Comment number 7.
At 21st Jan 2010, TV Licence fee payer against BBC censorship wrote: 5. At 11:28am on 21 Jan 2010, Nick Reynolds wrote:
"Hello - this is not a general post for queries about mobile. Can people stay on topic please."
Hi Nick, just a little observation: is this blog actually on-topic for this section? Well, I know it is; my point being, doesn't it actually fit better within the "Web Developer" section, with perhaps a link back here? It seems to be well 'over the top' (technically speaking) here; the XML code samples are leaving many people with glazed-over eyes, I suspect, and thus a total misunderstanding of what the topic actually is...
But that said, thanks for giving it space!
Comment number 8.
At 24th Jan 2010, John99 wrote: "as a 2 digit base 42 (b42) number ... don't use vowels in the configuration cookies to avoid spelling unfortunate four letter words."
I can understand both
- a cookie being shown in alphanumeric form rather than say hex
(it makes the cookie shorter)
- avoiding vowels (as quoted from blog)
Is this a common standard, or is it from a Douglas Adams sci-fi fan (the number 42)?
Comment number 9.
At 26th Jan 2010, Mark Longstaff-Tyrrell wrote: @Dr_Bean - we do not currently have the infrastructure to offer live streaming of television or radio content to iPhone or iPod Touch devices. We are looking into providing this functionality in a future version of iPlayer.
Comment number 10.
At 26th Jan 2010, Mark Longstaff-Tyrrell wrote: @drt - the mobile weather component is currently being updated, after which it will be able to resolve local areas. This should be live in 6 weeks or so.
Comment number 11.
At 26th Jan 2010, Mark Longstaff-Tyrrell wrote: @iainhubbard - we use memcached and MySQL, with software written in PHP and Java. The mobile homepage, like an increasing number of BBC sites, runs on the Forge platform. This is a collection of software and systems for developing and deploying large scale web applications. It's worthy of a blog post of its own, but in the meantime here's a link to some slides from a presentation about it last year (pdf):
Comment number 12.
At 27th Jan 2010, drt wrote: Thanks Mark for your reply, looking forward to the update.