Jowett.net down

Posted: Fri Apr 27, 2012 11:16 am
by Forumadmin
Some of you will have noticed that Jowett.net was down for most of yesterday. I detected the failure at about 9 a.m. and it was still not up at 3.30 p.m. As I went out last night and arrived back home at 01.30, I did not check the system thoroughly, although I did notice something wrong on the train coming home.

It appears the hosting company has managed to lose about 3 or 4 days' postings, even though I am paying for a daily back-up service. Please can you all try to remember what went missing and repost. Note that I do transfer all changes every night to another server with another hosting company, but it will be a pain to now try to recover all those posts back into the current JowettTalk. If I had not been out I could have locked the board until it had been checked. Perhaps I will have to design an automated system that does that.

Needless to say I was onto the hosting company and they replied at 10:58.
Dear customer,

We rebooted the server and ran FSCK.
It will be online soon.
We are very sorry for this unexpected outage.
We are doing our best to bring the server back online ASAP.
Thank you,
Yaroslav
Then at 19:12
Hi,

Unfortunately FSCK is still running
%80 were accomplished
Sorry but we can't speed up the system process
Please wait several hours more

Thank you,
Daniel
Then today at 01:23 after I complained when on the train
These server problems are not a common but extraordinary thing. I assure you our techs are doing their best to resolve the problem asap.

Please be patient and wait for a while.
Thank you,
Maxim
Not good enough!! I await their response to my latest complaint.

Re: Jowett.net down

Posted: Fri Apr 27, 2012 10:47 pm
by ian Howell
I don't know if this ties in with your findings but I made a couple of posts that seem to have disappeared AFTER they had been accepted and 'published' in the relevant forum. In other words, they were erased from the 'memory', rather than my simply being unable to access the site. Strange, well to me anyway.

Also I thought that only about three forums (fora?) were inaccessible: Jowett Talk, Joining the Club and Search Jowett Net.

At the time I assumed that the Administrator was active in our interests as usual!

Re: Jowett.net down

Posted: Sat Apr 28, 2012 3:25 am
by Keith Andrews
The odd down time on a server I can accept, and that tends to have at least 24 hr notice... apparently that didn't happen.
Normal good industry practice BEFORE ANY upgrade OR maintenance is a FULL backup of the server, files and databases.
Normal good practice is that a server, its files and databases are backed up at least every 24 hrs, usually every 12 hrs.

As I have said many times here and privately, I place 95% of my industry WELL below politicians and used car salesmen... they are full of techno-speak and BS their way through stuff that, if a storeman did the equivalent, would get him fired on the spot.

The trouble is they get away with it.

Re: Jowett.net down

Posted: Sat Apr 28, 2012 5:20 am
by Andrew Henshall
Keith,

There was a new topic raised by a David Kemp of Brisbane on Thursday 26 April regarding using a Javelin as a daily driver, that has disappeared. I can't even find David as a forum member now.

I posted a longish reply, as the Jowett Car Club of Australia will be able to help David, and we might gain a new member. His post looked like this (I snipped his post into an email I sent to the JCCA President).

I saved my draft of my reply, but it too has disappeared. Can you help by finding David or his post?
Attachment: D_Kemp Post.pdf (12.45 KiB)
Kind regards,

Andrew Henshall
Victorian State Representative
Jowett Car Club of Australia

Re: Jowett.net down

Posted: Sat Apr 28, 2012 10:21 am
by Forumadmin
Thanks Andrew, there are several days' posts missing. David Kemp joined in 2008 and is still showing in the user list. I have emailed David.

I will try to recover what I can. But I was caught out this time. Normally when this happens I save off the previous back-up. I was out that day and did not catch it in time.

The challenge in any back-up scheme is accommodating various scenarios. In this case the hosting company seems to have tried a number of times to recover, but I am waiting for an explanation.

I may try to modify my back-up scheme to copy changes to the database off-site so that I can replay to a certain point. Up till now I thought the most we would lose was one day of updates.

Re: Jowett.net down

Posted: Sun Apr 29, 2012 12:32 am
by Forumadmin
I have clocked up 65 mails so far with the hosting company and have tonight enhanced the back-up on jowett.org to save a couple of days of the databases for the JowettTalk and JowettGallery. I could add more, but this covers the majority of scenarios. Unfortunately this particular situation caught me out as I did not realise how many things could 'go wrong' at the same time.

1. The RAID 5 disk subsystem that hosts jowett.net did not repair itself when a disk failed or (more likely) the operators failed to see or act on the warning messages before the failure of a second disk 'crashed' the system.
2. The operators failed to rebuild the system properly.
3. The daily back-up the hosting company was supposed to use was 3 days out of date.
4. The rebuild replaced the back-up I make on jowett.net every day (at 4 am GMT) with an older one.
5. When the system was restored I failed (as I was out) to halt the back-up to jowett.org, which overwrote the 'good' back-up with the now out-of-date state of jowett.net. (This situation is now covered with a back-up of a back-up.)
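
The overwrite in step 5 could be guarded against by comparing the newest post on each side before mirroring. A minimal sketch in Python, assuming the newest-post timestamps have already been read from each database; the function name and the example timestamps are hypothetical, not part of the actual scheme:

```python
from datetime import datetime


def safe_to_mirror(live_newest_post: datetime, backup_newest_post: datetime) -> bool:
    """Return True only when the live site is at least as current as the back-up."""
    return live_newest_post >= backup_newest_post


# Hypothetical timestamps: the restored live site is three days behind the back-up.
live = datetime(2012, 4, 23, 4, 0)
backup = datetime(2012, 4, 26, 4, 0)

if safe_to_mirror(live, backup):
    print("mirror live site to off-site back-up")
else:
    print("hold: live site is older than the back-up, keep the back-up")
```

With a check like this the nightly job would hold rather than copy a rolled-back live site over a newer back-up.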

Such a sequence of events (and the timing of each step) is indeed extraordinary. Note that the Gallery files always remain intact, it is just the database references to them might get corrupt. So the archive itself is always safe.

There are ways to improve the resilience of the system, but there is always a compromise between protecting the back-up from corruption and recovery time should the live database fail. One way is log shipping, which allows you to roll back to any point, but this can take many hours to recover. My recovery point objective is 1 day and my recovery time objective is 4 hours from when the hosting company resumes service or, if there is no likelihood of the hosting company rebuilding the system, 2 days to get jowett.org up to speed. The latter I will be improving one day by automating the switch-over between sites.
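
To illustrate the idea of replaying shipped changes up to a chosen point, here is a minimal sketch in Python. It assumes the change log is simply a list of (time, post id, text) entries applied on top of a base snapshot; a real database log is far richer, and all names here are hypothetical:

```python
from datetime import datetime
from typing import Dict, List, Tuple

Change = Tuple[datetime, str, str]  # (when, post id, post text)


def replay_to_point(snapshot: Dict[str, str], log: List[Change],
                    cutoff: datetime) -> Dict[str, str]:
    """Apply logged changes to a copy of the snapshot, stopping at the cut-off time."""
    restored = dict(snapshot)  # never modify the base back-up itself
    for when, post_id, text in sorted(log, key=lambda change: change[0]):
        if when > cutoff:
            break  # everything later than the chosen point is left out
        restored[post_id] = text
    return restored


# Hypothetical data: a snapshot plus two later posts, replayed to midday on the 26th.
snapshot = {"p1": "old post"}
log = [
    (datetime(2012, 4, 26, 9, 0), "p2", "posted on the 26th"),
    (datetime(2012, 4, 27, 9, 0), "p3", "posted on the 27th"),
]
restored = replay_to_point(snapshot, log, datetime(2012, 4, 26, 12, 0))
```

The longer the log since the last snapshot, the longer the replay takes, which is the recovery-time compromise mentioned above.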

The next time the live database fails, a decision will have to be made whether to use the one restored by the hosting company or the one from the jowett.org back-up. Either could be corrupt and either could be the more up to date. Forty-five years of using and designing computer systems has taught me that there is no 100% reliable system, even if you spend £1M.
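
That decision can be reduced to a simple rule: discard any copy that fails an integrity check, then take the one with the newest post. A minimal sketch in Python, assuming both facts have already been established for each copy by some other means; the class and field names are hypothetical:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class DatabaseCopy:
    name: str
    passes_integrity_check: bool
    newest_post: datetime


def choose_copy(copies: List[DatabaseCopy]) -> Optional[DatabaseCopy]:
    """Pick the intact copy with the newest post, or None if every copy is corrupt."""
    intact = [c for c in copies if c.passes_integrity_check]
    if not intact:
        return None
    return max(intact, key=lambda c: c.newest_post)


# Hypothetical example: the host's restore is intact but three days stale.
host = DatabaseCopy("hosting company restore", True, datetime(2012, 4, 23))
offsite = DatabaseCopy("jowett.org back-up", True, datetime(2012, 4, 26))
best = choose_copy([host, offsite])
```

In practice the integrity check is the hard part; this only shows how the two questions (corrupt? more up to date?) combine.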

Re: Jowett.net down

Posted: Sat May 05, 2012 12:37 pm
by Forumadmin
I have finished the conversation with the jowett.net hosting company and as a result have set up an off-site back-up to jowett.org twice a day, at 04:00 and 16:00 hrs GMT. I have also set up a rotating daily back-up of the databases on jowett.org. This gives a week of off-site back-ups to choose from in the event of multiple restorations of service, which is what happened last week: recent back-ups were overwritten and three days of transactions were consequently lost. I have also done a full back-up to my home computer. This is now approaching 70,000 files and 3 GB, most of which is the Gallery.
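
A rotating daily back-up that keeps a week of copies can be as simple as naming each day's dump after the weekday, so that each file is overwritten only by the same day a week later. A minimal sketch in Python; the file naming and the database name are assumptions for illustration, not the actual scheme on jowett.org:

```python
from datetime import datetime

DAY_NAMES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]


def rotation_slot(database: str, now: datetime) -> str:
    """Return the back-up file name for today's slot in a seven-day rotation."""
    return f"{database}-{DAY_NAMES[now.weekday()]}.sql"


# Hypothetical database name; 5 May 2012 was a Saturday.
print(rotation_slot("jowetttalk", datetime(2012, 5, 5, 4, 0)))  # jowetttalk-sat.sql
```

Because a bad restore can only overwrite today's slot, the other six days remain available to fall back on.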

This scheme does not provide complete reliability but I will not detail why, as that might give even more info to a potential hacker or errant administrator on jowett.net. Note that the website has fairly strong protection, the databases are also strongly protected, and the Gallery files are very well protected from hacking. It is most important that security (confidentiality, integrity and availability) is considered as we move to more reliance on the site for subscriptions, spares ordering and the like, which are planned.

The reason for this post is to ask you to note if you have any problems around the times of the back-up.

Re: Jowett.net down

Posted: Sun May 06, 2012 6:31 am
by Keith Andrews
Just as well you posted that.. I had forgotten about my own servers, web sites, mail servers etc...
Checked the main backup server partition. Oops, full :shock: as of 3 days ago. Deleted nearly 100 gig of 2x daily backups.
I don't 'just backup' databases, I do it the easy way: just backup the whole damn partition the servers are on.. files, databases, everything.
That is the 1st time in nearly 10 yrs that has happened.
Advantage of hosting on a local machine... no FTP vulnerabilities, and everything is done on the local network... and no bloody hosting companies to worry about, and cheaper.
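
A full back-up partition like this can be caught earlier with a small check run before each back-up. A minimal sketch in Python using the standard library's shutil.disk_usage; the 10% free-space threshold and the function names are assumptions for illustration:

```python
import shutil


def has_headroom(total_bytes: int, free_bytes: int,
                 min_free_fraction: float = 0.10) -> bool:
    """Return True if at least the given fraction of the partition is still free."""
    return free_bytes >= total_bytes * min_free_fraction


def backup_partition_ok(path: str, min_free_fraction: float = 0.10) -> bool:
    """Check the partition that holds `path` before writing another back-up to it."""
    usage = shutil.disk_usage(path)  # named tuple with total, used and free bytes
    return has_headroom(usage.total, usage.free, min_free_fraction)


if not backup_partition_ok("."):
    print("warning: back-up partition is nearly full")
```

Pointing it at the back-up partition instead of "." and mailing the warning would flag the problem on the day it starts rather than three days later.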

Re: Jowett.net down

Posted: Sun May 06, 2012 9:39 am
by Keith Clements
There are many vulnerabilities on having a club website at home.
1. There is usually a single point of failure of the site in the network connection to the Internet. That extends to the provider of that connection as well and the ongoing connection to the core of the Internet.
2. The site itself may only have rudimentary protection against fire, flood, theft, earthquake.
3. There is usually a single point of failure in the administration of the site for both daily and longer term support.
4. All aspects of site administration cannot be kept up to date and secure by one person.
5. Usually the bandwidth and proximity to the core of the Internet affect performance.


Those negatives may be counterbalanced by more control on the security of the site itself, but the true cost of providing the service at home is actually quite high if you consider cost of power, space, refresh of hardware and software, and the admin's time.

I have been considering justifying the cost of our own server (or, even better, virtual server) provided by the hosting company rather than sharing the server with others, thus removing the main advantage of a home server. Virtual servers provide increased resilience against breakdown and fast recovery. As I have said before, I am closely watching 'cloud' services that shift the argument to a whole new dimension.

Re: Jowett.net down

Posted: Sun May 06, 2012 10:44 pm
by Keith Andrews
1. There is usually a single point of failure of the site in the network connection to the Internet. That extends to the provider of that connection as well and the ongoing connection to the core of the Internet.
In the time we have had BW in NZ.. going back to the 90s, I have had 3 periods totalling 2 1/2 days with the internet down, and several days of intermittent connections, which were so intermittent they were not of any great concern... the issue was a capacitor in the router.
2. The site itself may only have rudimentary protection against fire, flood, theft, earthquake.
And some host sitting in an industrial office has more? If a tidal wave goes through here or a volcano erupts, what is the difference to the office block down the road?.. or in Christchurch, or Japan, or San Francisco, or Texas?... At least I can grab a hard drive and have better reason to do so....
3. There is usually a single point of failure in the administration of the site for both daily and longer term support.
If you own and admin the whole lot, why does one then need support? You are it, and if anything comes up... Google. No different to a host, who doesn't Google and just blames everything on a load of BS.
4. All aspects of site administration cannot be kept up to date and secure by one person.
Rubbish. Once set up, KNOWING FULL BACKUPS are made, if anything breaks (server motherboard, hacked, whatever) everything is back online in around 10 to 30 mins... And one has full access to server logs, operating system logs etc, therefore avoiding dealing with idiots full of BS at the hosting company.
5. Usually the bandwidth and proximity to the core of the Internet affects performance
I have no idea what providers are like in the UK, but my internet is in the top 5 per cent of the world and it is only a basic plan.. plus I don't have a string of servers and BS to go through... And I'm still on copper.

None of these in practice are negatives.. If they were, I would have moved to a hosting company along with the other web sites I admin for other people... The only sort of down side is I can't host everyone's web sites, mail servers etc because of limitations on BW and hardware... but I have no intention to. I run my sites and a couple of local club sites... that's it.
I have been considering justifying the cost of our own server
My 1st server, years ago (days of Win 95), was an obsolete P2 gaming machine, then an obsolete P3 gaming server which I still use for general email/surfing, like now. Then a P4, an old gaming machine.. I have personally built all these machines, hence the reliability... They are just old machines lying around doing nothing... All the software is open source: Apache, Perl, SquirrelMail, PHP, MySQL, Netpbm, hMailServer etc... Cost? Well, the only costs are left-over bandwidth that I don't use but pay for anyway, and domain registration each year... cost to the club, from memory, about $US 50.

And clouds.. too much 'faith' in their servers, their security, their connections. Basically clouds are cheaper because our IT industry got way too greedy in its charges for small business/school server administration, and clouds stepped into the 'breach', taking all that data offshore... Basically the same principle as a huge multinational company running its international branches from central server(s)... except they are still FULLY in charge of their OWN data.
Yep, cloud is the hot 'in' thing these days. Just because the propaganda/marketing is hot and it is cost effective (a very bottom-line issue in a world recession).. doesn't make it the best route (other than financially) to take in the long term.

A big issue in the ChCh earthquakes was the day-to-day operation of companies that used hosts and clouds based in the city: a lot of their data could not be accessed for weeks and months, and some was even destroyed because there was no off-site backup.

All the customers I admined, I had up-to-date backups of the operating systems of workstations, servers, server OS etc...
So if a main web/mail/Active Directory/ISA server, whatever, went down or got blown up by a volcano... build new machines and simply throw the hard drives in, and basically ready to go in 12 to 24 hrs... do that with a cloud? Nope.

Running one's own modest server for personal and club hosting gives full control, ease of control and updates, knowledge of the how, where, when and quality of the admin... and access to ALL logs.