#bbcblackout outage maintenance work
Following on from the BBC Online technical failure on 29 March, I want to make people aware of further work currently scheduled to take place in the early hours of tomorrow morning (Tuesday 5th April), and explain what will happen.
Between the hours of 0200-0500 (the quietest time for BBC Online), the equipment that failed on Tuesday night will be replaced, and if you happen to be on the BBC website at that time, you may experience some disruption in service.
Here's what's happening, in case you want more details.
Part of the network will be shut down to enable the faulty equipment to be replaced, and this will result in some route re-convergence as the core network works out the best path from the hosting centres to the internet. This will happen twice: once when the equipment is shut down, and a second time when full resilience is restored. Each re-convergence could take a few minutes, and while this is happening you may experience some interruption to service.
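To illustrate what re-convergence means, here is a minimal sketch in Python. The topology, node names and link costs are entirely hypothetical and do not describe the real network; a plain shortest-path search stands in for whatever routing protocol is actually in use. It shows the best path from the hosting centre to the internet changing when one piece of equipment is shut down, and changing back once it is restored.

```python
import heapq

def best_path(links, source, target, down=frozenset()):
    """Return (cost, path) for the cheapest route, ignoring nodes in `down`.

    Uses Dijkstra's shortest-path search; returns None if no route exists.
    """
    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in links.get(node, {}).items():
            if neighbour not in visited and neighbour not in down:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None

# Hypothetical topology: two core routers between the hosting centre and the edge.
LINKS = {
    "hosting": {"core_a": 1, "core_b": 1},
    "core_a": {"hosting": 1, "edge": 1, "core_b": 1},
    "core_b": {"hosting": 1, "edge": 3, "core_a": 1},
    "edge": {"core_a": 1, "core_b": 3, "internet": 1},
    "internet": {"edge": 1},
}

# Normal operation: traffic takes the cheaper route through core_a.
before = best_path(LINKS, "hosting", "internet")
# First re-convergence: core_a is shut down, so traffic moves to core_b.
during = best_path(LINKS, "hosting", "internet", down={"core_a"})
# Second re-convergence: core_a is back, so traffic returns to the original route.
after = best_path(LINKS, "hosting", "internet")

print(before)
print(during)
print(after)
```

In a real network each router has to learn about the change and recompute its routes independently, which is why each re-convergence can take a few minutes rather than happening instantly as it does in this sketch.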
Despite some , the technical teams within the BBC and Siemens worked very well on Tuesday night to rectify the situation swiftly and effectively. There are incident management processes in place to handle failures such as this one, and while we hope not to have to use them too often (!), this was a good example of close collaborative working.
Richard Cooper is Controller, Digital Distribution, BBC Future Media.
Comment number 1.
At 4th Apr 2011, Squirrel wrote: Thanks for the "heads-up" but you can't really talk about "resilience" if a single point of failure can bring down the whole internet-facing web site.
Comment number 2.
At 4th Apr 2011, dennisjunior1 wrote: Richard,
thanks for the great "Heads-Up".......
Comment number 3.
At 5th Apr 2011, Debbie Rockford wrote: I hope this maintenance work includes the commenting system - it seems to hang each time I post a comment or try to preview one before submitting it. I've tried it on a couple of browsers (IE/Mozilla) and on different computers over the last 2 days with the same issue (I just get the round busy-processing signal which never ends - sometimes the comment goes through, other times it just freezes!)
Comment number 4.
At 11th Apr 2011, Riz wrote: It is all good to hear that the "technical teams within the BBC and Siemens worked very well on Tuesday night to rectify the situation swiftly and effectively" but what is becoming obvious is that bbc.co.uk is not a resilient website.
One pipe, one router and one set of servers all in one building. THAT is not what you call a resilient system. Especially when the whole world is talking about the cloud and using multiple data centres.
It all boils down to cost, we understand, but please do not tell us that "there are incident management processes in place to handle failures such as this one" because this will happen again and you have no backup, no way to protect it.
Equipment WILL fail.