Coming Soon… Updated Status Pages!

TCH-Bill | General

As some of you know, last week TCH was the victim of a DDoS attack on two different days. We have learned quite a bit about the attacks and how to fight them in the future. I think our biggest lessons came in the area of customer support during an outage. What I learned from this past attack is that customers start calling us about 10 minutes into an outage. As everyone knows, we simply have not had a major outage at TCH in several years. The last network outage that lasted more than 15 minutes was at least 3 years ago.

Enough of my rambling. Everyone who hosts at TCH knows about our budget hosting with excellent uptime and services. Even with the two outages of last week, our network uptime is still 99.982%. In fact, TotalChoice is ranked #1 by Hyperspin for hosts with more than 50 servers monitored. That is the number one uptime of all monitored hosting companies. This is quite a feat for a budget hosting company. I am really proud of TCH and its staff.
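For anyone curious what a figure like 99.982% means in actual minutes, here is a rough sketch of the arithmetic. The post does not say what period the percentage was measured over, so the 365-day window below is purely an assumption, and the function name is made up for illustration.

```python
def downtime_minutes(uptime_percent: float, window_days: float) -> float:
    """Return the downtime, in minutes, implied by an uptime percentage."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


if __name__ == "__main__":
    # ASSUMPTION: a 365-day measuring window; the post does not state one.
    print(round(downtime_minutes(99.982, 365), 1))
```

Over an assumed year, that works out to roughly an hour and a half of total downtime.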

Whilst our uptime is excellent, we have learned something from our short periods of downtime. We need to move our network status page and blog to another data center. Currently, if the TCH data center goes offline, so do our status pages. This is going to change, and we are in the process of preparing to do just that. I have a friend who owns and operates another data center on the West Coast. He was looking to do the same and wanted to host his status pages off network. So we have struck a deal: a server-for-server type of barter.

We expect to roll out these changes in the next 48-72 hours.  I will be sending out a mass email to our client base informing them of this after we have done a bit of testing on it to make sure everything is working as it should.

Happy Hosting!



Joomla, DL() & You

TCH-Ryan | General

I am not going to beat around the bush on this: the last couple of days have been a little hectic here at TCH as we work to deal with a series of web application vulnerabilities that attackers are taking advantage of. The purpose of this post is to explain a bit about what is going on, how these attacks affect you, and what we have done to prevent further abuse.

The first thing we need to understand is what is being attacked. As the post subject implies, it is primarily Joomla: the software has had a series of 9 vulnerabilities released since the 1st of September, around which a number of more in-depth attacks have formed. The intended purpose of most of these attacks is to taint web sites with injected JavaScript. That code takes advantage of a number of client-side browser vulnerabilities that, if not patched or stopped by an antivirus, can cause further issues for web site visitors.

Now, at a glance you might be thinking that if someone fails to patch web site software then it is their own problem, so how does this affect me? That is where the dl() function comes into play. The dl() function is essentially a dynamic loader for PHP modules or 3rd party extensions. To simplify this a bit, the dl() function, when enabled, allows anyone to add extensible features onto PHP. Generally these are all well-intentioned features, but if someone so desires they can create a dynamically loadable module with malicious intent.

The scenario we are looking at is that attackers gain entry to vulnerable web sites, primarily through Joomla, and then upload a series of malicious scripts, including a dynamically loadable module for PHP that, once enabled through dl(), has the ability to inject JavaScript code into pages. The code usually ends up placed before the body tags and executes its payload on a visitor's first visit to a site. A cookie is then set that expires after 2 hours, after which the payload executes again on a new visit.

This attack, though it had far-reaching implications, only affected 4 servers on our network (denver, dantooine, alderaan, chewbacca), and on those servers only about half the sites, or in some cases fewer, were being tainted by the attack. As alarming as this situation is, we need to stress that no content was actually modified on any sites except the compromised Joomla sites themselves.

The way we have come to deal with this situation is a layered approach. First and foremost, we have made increased efforts to identify compromised sites on our servers and suspend or remove them. The next step was to cut off the enabling function of the attack, which is the dl() function. This function was actually something we used to disable on servers for its malicious implications, but over time that procedure was phased out in the interest of allowing users to install custom dynamically loadable modules, such as ionCube, from their home directories. However, now that ionCube is standard on all servers, there are few other commonly installed packages that depend on dl(). php.net has even gone as far as to declare dl() deprecated as of PHP 5.3.
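For those running their own servers, a quick way to see whether dl() is switched off is to look at php.ini. The sketch below is not our internal tooling, and this post does not say which setting we used. It simply checks the two standard php.ini directives that can turn dl() off, enable_dl and disable_functions. The function name dl_is_blocked is made up for illustration.

```python
def dl_is_blocked(ini_text: str) -> bool:
    """Return True if the given php.ini text turns dl() off.

    Checks two standard php.ini directives:
      enable_dl = Off              (switches dynamic loading off)
      disable_functions = ...,dl   (lists dl among the disabled functions)
    """
    enable_dl = "1"  # PHP's documented default for enable_dl
    disabled = set()
    for raw in ini_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        value = value.strip('"').lower()
        if key == "enable_dl":
            enable_dl = value
        elif key == "disable_functions":
            disabled = {name.strip() for name in value.split(",") if name.strip()}
    return enable_dl in {"0", "off", "false", "no", "none"} or "dl" in disabled


if __name__ == "__main__":
    sample = "memory_limit = 128M\ndisable_functions = exec,dl\n"
    print(dl_is_blocked(sample))
```

Keep in mind this only reads the text you hand it. A real server may load several ini files, so treat the result as a hint rather than proof.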

With dl() disabled on servers, the effects were immediate and all reports of tainted sites stopped. When I say stopped, I do not mean that lightly. We literally sat around all evening bashing the F5 key on our keyboards trying to get the JavaScript injections to reappear on sites; between myself, Bill and Dick we must have done over 6 hours of combined keyboard kungfu in this effort. It was a great relief that we were no longer seeing reports or issues first hand, but it was still not quite enough for us to be confident that we had done enough.

We are continuing to be extra vigilant with compromise assessment on the servers to prevent any further malicious content from being injected into sites. In addition to this, we have started to use suPHP on some servers as a basis for new PHP security standards. Essentially, suPHP makes PHP code run as the user who owns it instead of as the web server, but it goes beyond that by enforcing strict permissions on content: nothing may run with permissions above mode 755 (such as world-writable data), and executed content must be owned by the user. This might seem problematic; however, since the code is now executing as the user, there is no longer a need for data to be set to mode 777 (world writable) or for its ownership to be set to the web server user, which reduces support issues and vastly increases security. So far we have only rolled out the suPHP changes to about 6 servers, but the support issues they have generated are minimal compared to the advantages. In the future we will be looking to roll this change out to more servers on a slow but steady basis.
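If you want to check your own account ahead of a suPHP rollout, something along these lines can flag content that would likely trip the rules described above. This is only a rough approximation of the rule as stated in this post (nothing above mode 755, everything owned by the user), not suPHP's actual checks, and the names exceeds_755 and audit_tree are made up for illustration.

```python
import os
import stat


def exceeds_755(mode: int) -> bool:
    """True if the permission bits grant anything beyond rwxr-xr-x (0o755)."""
    return bool(stat.S_IMODE(mode) & ~0o755)


def audit_tree(root: str, expected_uid: int) -> list:
    """Walk root and list (path, reason) pairs for entries that break the rule."""
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)
            if stat.S_ISLNK(info.st_mode):
                continue  # symlink permission bits are not meaningful
            if exceeds_755(info.st_mode):
                findings.append((path, "mode above 755"))
            if info.st_uid != expected_uid:
                findings.append((path, "not owned by user"))
    return findings


if __name__ == "__main__":
    # ASSUMPTION: audit the current directory as the current user.
    for path, reason in audit_tree(".", os.getuid()):
        print(f"{reason}: {path}")
```

Anything it reports as "mode above 755" can usually be fixed by setting the file or directory to 755 or lower.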

That is where we are at. If you have any questions or concerns regarding this post or the topics discussed, please feel free to comment or head to the TCH forums for further dialog.

Emergency Server Updates

TCH-Ryan | General

For the last 3 days we have been tracking a local root vulnerability in the Linux kernel, the core element of all Linux operating systems. This vulnerability is unprecedented in scope, affecting Linux versions going back as far as 8 years, which prompted extra consideration in how we handle it.

Here at TCH we operate a network that is dominated by Linux, so to say we took this matter very seriously would be an understatement. After evaluating the threat this vulnerability poses to our network, dedicated servers, and shared/reseller clients, we decided that waiting any longer on an upstream update was not reasonable. Originally there was an estimate of Saturday 1900 GMT for upstream updates, but this fell through, prompting us to take action. Beyond the lack of a reliable upstream update, this vulnerability is being actively exploited in the wild, with attack code publicly available on many security and underground web sites.

At this moment, we are rolling out to all Linux servers on our network an updated kernel version that will close this vulnerability while maintaining version compatibility with future upstream software updates. This effort in retaining version support will allow our dedicated clients in addition to our own support team to resume normal update practices with tools such as ‘yum’ or ‘apt-get’ and not have to worry about conflicting versions against our in-house kernel update.

Please do not be alarmed if you experience a temporary outage on dedicated, shared or reseller servers. We thank everyone for understanding the urgency of this matter. If you have any questions or comments, please feel free to submit a help desk ticket at https://www.tchhelp.com.

UPDATE: Aug 18, 2009
We will be conducting reboots again this evening to push out a revised version of last night's kernel that corrects issues with the r1backup agent, local firewall services and the network driver on certain servers. In addition, this new kernel revision is binary compatible with CentOS/RHEL 4 kernels, since it was built from the same kernel source tree as the standard kernels.

Authorize.net Outage

TCH-Dick | General

The payment portal Authorize.net, which is used by TotalChoice Hosting and many of our clients, has been offline for hours, meaning that its merchants have been unable to process credit card payments through their web sites. Your account will not be affected by this incident, and we will be extending our payment grace period to ensure your sites stay online.

Update: Authorize.net is now back online and you will be able to make payments. We will update you if this changes or any other issues arise.

Network Issue

TCH-Dick | General

At 3:34 PM EST, we began to see intermittent packet loss on some of our network due to an inbound Denial of Service attack. We then experienced a core router crash while working to mitigate the attack. This resulted in a widespread network outage until 3:46 PM EST, when we were able to switch over to our redundant core router. We continued to experience intermittent outages until 4:03 PM EST, at which time all services were returned to normal.
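For reference, the durations implied by the times above can be worked out with a few lines of Python. This is just a sketch of the arithmetic using the three timestamps from this post; the helper name minutes_between is made up for illustration.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> int:
    """Whole minutes between two same-day times written like '3:34 PM'."""
    fmt = "%I:%M %p"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


if __name__ == "__main__":
    # First packet loss until the switch to the redundant core router.
    print(minutes_between("3:34 PM", "3:46 PM"))
    # First packet loss until all services were back to normal.
    print(minutes_between("3:34 PM", "4:03 PM"))
```

That is 12 minutes from the first packet loss to the router switchover and 29 minutes from start to finish.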

We are sorry for any inconvenience this caused you.