BBC Weather: changes to technical architecture
The BBC Weather website has recently been relaunched after a public beta. Peter Deslandes has blogged about the public beta and response, and Mel Seyer has explained the UX journey.
The technical architecture for the BBC Weather site in a simplified diagram
I work as a Technical Architect with the Weather team, and this post talks about some of the changes we've made to the architecture of the site to ensure the new site stays reliable, performant and able to scale to the traffic levels that BBC Weather attracts.
Weather as a service
The previous version of the BBC Weather website ran on a dedicated two-tier architecture, with business and presentation logic wrapped up in a PHP front-end that communicated directly with a MySQL database. These machines sat behind the BBC News Apache mod_cache head-end servers, which provided the scaling necessary for Weather traffic spikes.
For the new site we've moved to a three-tier architecture, keeping presentation logic in a PHP front-end but moving business logic and DB access down to a Java/Spring mid-tier service layer. This layer presents HTTPS APIs back to the PHP front-end for data reads, and to the BBC Weather Centre's data ingest system for writes.
This move is part of the BBC's wider strategy to move to a service-oriented architecture (SOA) to increase cross-product interoperability and data reuse. The diagram above gives an idea of how the tiers relate to each other in Weather's case (in practice this arrangement is replicated over two data centres).
The mid-tier REST API makes data available to the presentation layer as JSON, providing separate feeds for different page components. We set different Cache-Control: max-age headers for different feeds according to how frequently the data is updated, and combine this with HTTP revalidation so that requests made after a feed's max-age has expired stay lightweight.
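As a rough illustration of per-feed cache lifetimes combined with revalidation, here is a minimal Python sketch. It is not the Weather service's code: the feed names, the lifetimes and the use of ETags as the validator are all assumptions made for the example.

```python
import hashlib
from typing import Dict, Optional, Tuple

# Hypothetical feed names and lifetimes in seconds; the real values are not given in this post.
FEED_MAX_AGE = {"observations": 300, "forecast": 900, "warnings": 60}


def make_etag(body: bytes) -> str:
    """Derive a validator from the response body (assumed scheme: truncated SHA-256)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(feed: str, body: bytes,
            if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a request to one feed."""
    etag = make_etag(body)
    headers = {
        "Cache-Control": f"max-age={FEED_MAX_AGE[feed]}",
        "ETag": etag,
    }
    if if_none_match == etag:
        # The data has not changed, so the client's copy is revalidated
        # with a 304 and no body is sent again.
        return 304, headers, b""
    return 200, headers, body
```

A client whose cached copy has passed its max-age resends the stored ETag in If-None-Match; if the feed is unchanged it receives only headers, which is what keeps the follow-up request lightweight.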
Scaling
The new Weather site now runs on the same dynamic platform that hosts the BBC Homepage and iPlayer. To protect this shared platform from spikes in Weather traffic (up to six million users per day when it snows, and around two million on a typical day) and provide a responsive user experience we've introduced a multi-tier caching strategy:
- Varnish caches fully rendered pages in front of the PHP tier;
- a cache within the PHP tier stores JSON and reduces calls to the Java service layer;
- Apache in front of the Java service layer caches data requested by the PHP tier and other clients;
- a cache within the Java tier holds database and third-party service responses.
Weather pages leaving the PHP tier will typically carry 'Cache-Control: public, max-age=180, stale-while-revalidate=30' to enable caching in Varnish (and beyond) for UK users. The total caching time across the three tiers is around 10 minutes, so that staff at the Weather Centre can make edits to the data and the public see them on the live environment soon after. As well as protecting the shared platform from load spikes, the front-end Varnish caches also provide users with a highly responsive experience.
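To make the header above concrete, the following Python sketch shows how a cache might interpret it. This is a simplified reading of the directives, not Varnish's actual logic; the function names are made up for the example.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header into a {directive: value} dictionary."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            # Numeric directives become ints; flags such as "public" become True.
            directives[name.lower()] = int(arg) if arg.isdigit() else True
    return directives


def cache_state(header: str, age_seconds: int) -> str:
    """Classify a cached response of the given age against its header."""
    d = parse_cache_control(header)
    max_age = d.get("max-age", 0)
    swr = d.get("stale-while-revalidate", 0)
    if age_seconds < max_age:
        return "fresh"  # serve straight from cache
    if age_seconds < max_age + swr:
        # Serve the stale copy immediately and refresh it in the background.
        return "stale-while-revalidate"
    return "expired"  # must go back to the origin before serving


HEADER = "public, max-age=180, stale-while-revalidate=30"
print(cache_state(HEADER, 100))
print(cache_state(HEADER, 195))
print(cache_state(HEADER, 240))
```

With the values from the post, a page is served directly for three minutes, may be served stale for a further 30 seconds while a refresh is in flight, and after that is fetched again from the PHP tier.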
When the shared platform is under very high load (for example during high-profile news events or when it snows) we fail over to a CDN, forcing pages to be cached outside of the BBC's servers. In previous versions of the Weather website this led to a loss of personalisation (favourite locations stored in a cookie), but in this release we handle personalisation on the client side, making follow-up requests from the browser once the basic page has loaded and the favourite locations cookie has been read. Both the basic page and the location data fragments are cacheable in the CDN.
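The reason this arrangement stays cacheable is that no response depends on who is asking: the cookie is only used to decide which shared fragments to request. The Python sketch below illustrates that logic only (on the live site it runs in the browser). The cookie format and the fragment path are hypothetical and are not taken from the real site.

```python
from typing import List
from urllib.parse import unquote


def favourite_ids(cookie_value: str) -> List[str]:
    """Parse an assumed cookie format: percent-encoded, comma-separated geoIDs."""
    ids: List[str] = []
    for token in unquote(cookie_value).split(","):
        token = token.strip()
        # Ignore anything that is not a numeric ID, and drop duplicates.
        if token.isdigit() and token not in ids:
            ids.append(token)
    return ids


def fragment_urls(cookie_value: str) -> List[str]:
    """Build one follow-up request per favourite location.

    Each URL depends only on the geoID, never on the user, so the same
    cached fragment can be served to everyone who has that favourite.
    The path below is an assumption for illustration.
    """
    return [f"/weather/fragments/{geo_id}" for geo_id in favourite_ids(cookie_value)]
```

Two users with different favourites receive the same basic page and then fetch different combinations of the same shared fragments.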
Location data
The previous BBC Weather site used its own location gazetteer, one of several location datasets around the BBC. For the new site we've moved to using a new mid-tier service for location data. This new service (called Locator) lets front-end products use a REST API to get the data that the BBC has associated with a place (weather forecast ID, TV and radio region, local news area, etc). This service is now used by the new Weather site and the new BBC Homepage, and will soon be used by other BBC products that need to associate data with a place.
Locator draws its gazetteer from the open GeoNames dataset. As a result the URL for a location on the new BBC Weather website will be /weather/:geoID, so for example the location with geoID 2655984 has the BBC weather forecast page URL www.bbc.co.uk/weather/2655984. I was cautious about using these geoIDs as they aren't web-scale identifiers, plus anyone who's tried it will tell you that managing third-party IDs can be a headache. But for the 20,000 or so populated places that the BBC provides forecast data for they will make it easier for us to integrate with other BBC (and non-BBC) services, using the geoID as a link. For example, the BBC's semantic data publishing platform used in the World Cup will use geoIDs as identifiers for locations in its event models.
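The /weather/:geoID pattern can be sketched in a few lines of Python. The helper names here are invented for the example and say nothing about how the PHP front-end actually routes requests.

```python
import re
from typing import Optional

# Matches paths of the form /weather/<numeric geoID> and nothing else.
_WEATHER_PATH = re.compile(r"/weather/(\d+)")


def forecast_path(geo_id: int) -> str:
    """Build the forecast page path for a geoID."""
    if geo_id <= 0:
        raise ValueError("geoID must be a positive integer")
    return f"/weather/{geo_id}"


def geo_id_from_path(path: str) -> Optional[int]:
    """Extract the geoID from a forecast page path, or None if it does not match."""
    match = _WEATHER_PATH.fullmatch(path)
    return int(match.group(1)) if match else None


print(forecast_path(2655984))
print(geo_id_from_path("/weather/2655984"))
```

Because the identifier in the URL is the same one other services hold for the place, a consumer can go from any geoID to a forecast page without a lookup table.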
Behaviour Driven Development
Another key aspect of the new site development was a focus on code quality. We were fortunate to have had a Developer In Test embedded in the team for the first few months of the project, who helped get the developers up to speed with the practices of Behaviour Driven Development (BDD) and worked with the Product Owner to describe the requirements as testable features and scenarios.
We used Hudson for continuous integration, running unit tests on code commits from our integration environment. Cucumber tests ran at scheduled intervals on our testing environment, so we had a good chance of catching integration bugs quickly. Code review happened through a combination of frequent pair-programming and scheduled review sessions, where developers talked through what they had been working on with each other.
I have no doubt that these practices helped the project to deliver to schedule and specification, and have provided a more stable, maintainable and extendable codebase.
Jeremy Tarling is Technical Architect, BBC Weather, BBC News & Knowledge
Comment number 1.
At 14th Dec 2011, Jason wrote: Really interesting article Jeremy. It's great how the BBC share their experiences with techniques such as BDD.
Now that the project is complete, will the team disband and go on to work on other areas; or will they stay together and support further enhancements to the site?
I was also wondering whether the final architecture you describe evolved during the course of the project or was it mainly decided up front?
Comment number 2.
At 14th Dec 2011, tinman wrote: Nice gobbildygook in the blog above. However the new weather page now has less info and is less easy to use and looks cluttered. With too much white space which gives you a headache. The only reason to go there now is watch the video forecast. It is really worrying that this type of poor design now seems to be permeating through all the BBC web sites. I think a new design team has to be found before the BBC's web presence goes further down the tubes.
Comment number 3.
At 15th Dec 2011, lucas42 wrote: Interesting post. Out of curiosity, are the Java services available externally for developers to play with? Or are they only for BBC sites using Forge?
Also, you mentioned that you're using Hudson. Most people I've talked to recently seem to have switched to Jenkins (including where I work). Is there a particular reason you chose Hudson over Jenkins? (To be honest, I don't really know much about their differences other than the whole Oracle vs Open source community disagreement)
Comment number 4.
At 15th Dec 2011, MartinLA wrote: Really interesting to learn that big organizations like BBC uses BDD in their development. I've never used Cucumber for Java - only for Ruby apps.
Comment number 5.
At 15th Dec 2011, glowin wrote: I'm not exactly sure what the purpose of this blog is apart from providing a shopping list of technologies used. Maybe some justification about why the methodologies were used would be more appropriate? I would certainly be more interested about the “whys” rather than a focus on the “whats”.
Comment number 6.
At 15th Dec 2011, Ian McDonald wrote:@tinman9898 and @glowin
Thanks for your comments.
The technical architecture will be of more interest to other professionals in the field. The design was the subject of Melanie Seyer's blog post, and is off-topic here.
Cheers,
Ian
Comment number 7.
At 15th Dec 2011, jeremytarling wrote: Thanks for the comments.
> 1. At 21:38 14th Dec 2011, Jason wrote:
> Now that the project is complete, will the team disband and go on to work on
> other areas; or will they stay together and support further enhancements to the
> site? I was also wondering whether the final architecture you describe evolved
> during the course of the project or was it mainly decided up front?
Hi Jason, no the team will stay together to work through the backlog of features, there's a lot still to do. Some fundamental architectural decisions were made up front but we tried to avoid too much up-front design, preferring early performance testing to guide the process (e.g. trying different caching set-ups and timings).
> 3. At 00:38 15th Dec 2011, lucas42 wrote:
> Interesting post. Out of curiosity, are the Java services available externally for
> developers to play with? Or are they only for BBC sites using Forge?
> Also, you mentioned that you're using Hudson. Most people I've talked to recently
> seem to have switched to Jenkins (including where I work). Is there a particular
> reason you chose Hudson over Jenkins?
Hi lucas42, we are currently working on making the feeds available externally too, initially as RSS to replicate the previous site's functionality but hopefully in other formats too. The choice of CI tool was made by my colleagues in the Platform Engineering team, they may wish to comment on that.
> 5. At 10:25 15th Dec 2011, glowin wrote:
> I'm not exactly sure what the purpose of this blog is apart from providing a
> shopping list of technologies used. Maybe some justification about why the
> methodologies where used would be more appropriate? I would certainly be more
> interested about the “whys” rather than a focus on the “whats”.
Hi glowin, the purpose was to provide some information about how the architecture differs from the previous site, and some insight into the BBC's technical direction around SOA. If the Editor's ok with it I'd be happy to write a follow-up piece around the decision-making process we went through.
Comment number 8.
At 15th Dec 2011, glowin wrote: @Ian McDonald
Although Melanie Seyer's blog post is very informative it doesn't go into any detail about the implementation choices such as SOA and the justifications behind why they were chosen, which I had hoped that this blog would have done. In my opinion this amounts to this blog post being pretty pointless for that reason.
Comment number 9.
At 16th Dec 2011, dneilsen49 wrote: Interesting post, good to learn the technology choices the BBC makes in its approach to scaling.
You mention integrating with other services via GeoID, does the BBC plan to make weather forecast data available as Linked Data?
Comment number 10.
At 16th Dec 2011, mantolama wrote: This comment was removed because the moderators found it broke the house rules.
Comment number 11.
At 19th Dec 2011, jeremytarling wrote: Hi dnelisen49
> You mention integrating with other services via GeoID, does the BBC plan
> to make weather forecast data available as Linked Data?
We'll certainly look at this; perhaps not full RDF representations of forecast data but something like a minimal set of RDFa or microformat data that links (for example) a five-day forecast fragment for a given location to the associated Geonames URI.