Helping a colleague this week, we uncovered some odd behavior with a site whose performance he was analyzing. Upon first glance, it was clear that this site had a performance issue – they had HTTP persistence disabled. Immediate red flag in the areas of network overhead and geographic latency.

Further digging exposed something more sinister. It seems that HTTP persistence was only disabled for browsers with MSIE in the user-agent string. Even if the user-agent string was just MSIE, HTTP persistence was off.
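
One way to reproduce this kind of observation is to request the same page with different user-agent strings and check whether the server leaves the connection open. The following is a minimal Python sketch, not the tooling we actually used; the function names `is_persistent` and `probe` are illustrative, and it assumes plain HTTP on port 80.

```python
import http.client
from typing import Optional


def is_persistent(http_version: int, connection_header: Optional[str]) -> bool:
    """Decide from a response whether the connection stays open for reuse.

    http_version is 10 or 11, as reported by http.client.
    """
    tokens = {t.strip().lower() for t in (connection_header or "").split(",")}
    if "close" in tokens:
        return False
    if http_version >= 11:
        return True  # HTTP/1.1 connections are persistent unless told otherwise
    return "keep-alive" in tokens  # HTTP/1.0 needs an explicit opt-in


def probe(host: str, user_agent: str, path: str = "/") -> bool:
    """Send one GET with the given user-agent and report persistence."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("GET", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        resp.read()  # drain the body before inspecting the connection state
        return is_persistent(resp.version, resp.getheader("Connection"))
    finally:
        conn.close()
```

Calling `probe("www.example.com", "MSIE")` and then `probe("www.example.com", "Mozilla/5.0")` (the host is a placeholder) and getting False for the first and True for the second would match what we saw.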

The customer was very forthcoming and sent us their standard httpd.conf file. This showed no sign of the standard (and frustrating) global disabling of persistence for Internet Explorer.

Finally, it came to us. The customer had provided a simple network diagram, and there, just before packets hit the Internet, was a Layer 7 firewall. How did we know the Layer 7 firewall was the likely cause? Because this device was also the one that provided compression for the content going out to customers.

A Layer 7 firewall happily rewrites HTTP headers to reflect the nature of the compressed content (Content-Length or Transfer-Encoding: chunked) and to add the gzip flag (Content-Encoding: gzip). Since this device was already doing this, it was pretty clear to us that it also had a rule that disabled HTTP persistence for anything with MSIE in the user-agent string.

This was a fine example of the complexity of the modern Web application infrastructure. In effect, there were two groups with different ideas of how Internet Explorer should be handled at the network layer, and neither of them seems to have talked to the other.

When you have a Web performance problem, indulge in a thought experiment. Create an imaginary incoming Web request and see if you can follow it through every system it touches in your infrastructure. Put it on a whiteboard, a mindmap, whatever works.

Then invite the system architects and network engineers in and get them to fill in the gaps.

No doubt that will lead to the “ah ha!” moment. If nothing else, it’s a good excuse to put pizza on the company card. But I have no doubt that you will walk away with a better understanding of your systems, which will make it easier for you to talk to all the people responsible for keeping your systems running.

TAKEAWAY: Even if the part of the Web application you work on is working fine, it may be affected by other components that are not tuned or configured for performance. Get to know the entire application at a high level.

The title is a question I ask because I hear so many different views and perspectives about HTTP compression from the people I work with, colleagues and customers alike.

There appears to be no absolute statement about the compression capabilities of all current (or in-use) browsers anywhere on the Web.

My standard line is: If your customers are using modern browsers, compress all text content: HTML (dynamic and static), CSS, XML, and JavaScript. If you find that a subset of your customers have challenges with compression (I suggest using a cross-browser testing tool to determine this before your customers do), write very explicit regular expressions into your Web server or compression device configuration to filter the user-agent string in a targeted, not a global, way.

For example, last week I was on a call with a customer who had disabled compression for all versions of Internet Explorer 6, as the Windows XP pre-SP2 version (which they say you could not easily identify) did not handle it well. My immediate response (in my head, not out loud) was that if you had customers using Windows XP pre-SP2, those machines were likely pwned by the Russian Mob. I find it very odd that an organization would disable HTTP compression for all Internet Explorer 6 visitors for the benefit of a very small number of ancient Windows XP installations.
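
A targeted filter of this kind might look like the following Python sketch. It rests on an assumption that should be verified against real traffic: that Internet Explorer 6 on Windows XP SP2 and later adds an `SV1` token to its user-agent string, so an IE 6 / Windows NT 5.1 string without that token is treated as pre-SP2. The function name and patterns are illustrative, and the same idea would need translating into the configuration syntax of whatever Web server or compression device is in use.

```python
import re

# IE 6.x reporting Windows XP (NT 5.1) in the same user-agent string.
IE6_ON_XP = re.compile(r"\bMSIE 6\.[^;]*;.*\bWindows NT 5\.1\b")

# Assumption: XP SP2 and later builds of IE 6 include this token.
SV1_TOKEN = re.compile(r"\bSV1\b")


def skip_compression(user_agent: str) -> bool:
    """Return True only for what looks like IE 6 on pre-SP2 Windows XP."""
    if not IE6_ON_XP.search(user_agent):
        return False  # every other browser gets compressed content
    return SV1_TOKEN.search(user_agent) is None
```

Everything that does not match, including IE 6 on XP SP2 and later versions of Internet Explorer, still receives compressed content.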

Feedback from readers, experts, and browser manufacturers that would allow me to compile a list of compatible browsers, and any known issues or restrictions with browsers, would go a long way to resolving this ongoing debate.

UPDATE: Aaron Peters pointed me in the direction of BrowserScope which has an extensive (exhaustive?) list of browsers and their capabilities. If you are seeking the final word, this is a good place to start, as it tests real browsers being used by real people in the real world.

Ken Burns’ tale of the US National Parks reminds me of a heritage that I have, for most of my life, taken for granted. It was in another country, but it is a heritage that I have assumed will always be there.

I grew up amongst the Canadian Rocky Mountain Parks. Dead center amongst them you might say. Within a two-hour drive, there were six spectacular parks – Yoho, Banff, Jasper, Kootenay, Glacier, and Mt. Revelstoke.

All of these parks played a part in my childhood, adolescence, and young adult life. It has been nearly 20 years since I spent any time in these parks, but the experiences I had there have shaped how I see the world around me. But only now can I really appreciate what these parks mean to us all, in all places.

The parks are a powerful reminder of the transitory effect that man has. Each of them contains some amount of ruins as a visible reminder of man’s failed attempts to exploit and tame the parks. The carcasses of hotels, remains of viaducts, the skeletons of towns litter these refuges.

A part of that failed heritage is something I carry with me, as I am descended from one of the last groups of permanent residents of an industrial town in a Canadian National Park: my grandfather lived for a time in the now abandoned town of Bankhead, Alberta. My family took me to this place as a child and told me that ‘Grandpa lived here’, a concept I could not understand, as I was in a National Park, wasn’t I? I had no idea of the conflict over what it meant to be a Canadian National Park at the time, as I saw them as the refuges and preserves they had become.

Growing up amongst these special places has left me with a certain jaded perspective on beauty in the world. Yosemite does not awe me the way it does others, as I was raised surrounded by beauty comparable to Yosemite, and perhaps exceeding it. But now I give my unrestrained thanks to those who made the effort to preserve, protect, and conserve these places.

Within the gently protective walls of the Canadian Mountain Parks, I have seen the sublime and the ridiculous. The commercial and the ethereal. Untouched wilderness and unabashed capitalism. And despite protests on both sides, it is clear that they work together, for without the treasure and largesse of one type of visitor, the other would not have a place to go.

Banff is the greatest eyesore to those who see the parks as the preserve of untrammeled wilderness. However, if Banff had not existed, the desire and initiative needed to protect the other parks would not have gained ground. So a commercial pit keeps the wilderness protected, a balance that we can accept in a day of far greater compromises.

So though the idea of a National Park may have originated in the US, Canada has done well to develop the idea on its own terms. Only now that I am many thousands of miles removed from them can I appreciate what they have done to shape me. These memories leave me breathless in the realization of the great privilege I have taken for granted for all of these years.

I just did a quick experiment to validate my hunch, and it’s true – WP Super Cache can cut your HTML load time in half in your WP deployment. Just check out the GrabPERF Measurement that backs this up.


Steve Souders is the current king of Web performance gurus. His mantra, which is sound and can be borne out by empirical evidence, is that 80% of performance issues occur between the Web server and the Web browser. He offers a fantastically detailed methodology for approaching these issues.

But fixing the 80% of performance issues that occur on the front-end of a Web site doesn’t fix the 80% of the problems that occur in the company that created the Web site.

Huh? Well, as Inigo Montoya would say, let me explain.

The front-end of a Web site is the final product of a process, (hopefully) shaped by a vision, developed by a company delivering a service or product. It’s the process, that 80% of Web site development that is not Web site development, that let a Web site with high response times and poor user experience get out the door in the first place.

Shouldn’t the main concern of any organization be to understand why the process for creating, managing, and measuring Web sites is such that after expending substantial effort and treasure to create a Web site, it has to be fixed because of performance issues detected only after the process is complete?

Souders’ 80% will fix the immediate problem, and the Web site will end up being measurably faster in a short period of time. The caveat to the technical fix is that unless you can step back and determine how a Web site that needed to be fixed was released in the first place, there is a strong likelihood that the old habits will appear again.

Yahoo! and Google are organizations that are fanatically focused on performance. So, in some respects, it’s understandable how someone (like Steve Souders) who comes out of a performance culture can see all issues as technical issues. I started out in a technical environment, and when I locked myself in that silo, every Web performance issue had a technical solution.

I’ve talked about culture and web performance before, but the message bears repeating. A web performance problem can be fixed with a technical solution. But patching the hole in the dike doesn’t stop you from eventually having to look at why the dike got a hole in the first place.

Solving Web performance problems starts with not tolerating them in the first place. Focusing on solving the technical 80% of Web performance leaves the other 80% of the problem, the culture and processes that originally created the performance issues, untouched.

The GrabPERF database server failed sometime early this morning. The hosting facility is working to install a new machine, and then will begin the long process of restoring from backups and memory.

Updates will be posted here.

UPDATE – Sep 4 2009 22:00 GMT: The database listener is up and data is flowing into the database and can be viewed in the GrabPERF interface. However, I have lost all of the management scripts that aggregate and drop data. These will be critical as the new database server has a substantially smaller drive. There is a larger attached drive, and I will try and mount the data there.

It will likely take more time than I have at the moment to maintain and restore GrabPERF to its pre-existing state. You can expect serious outages and changes to the system in the next few weeks.

[Whining removed. Self-inflicted injuries are always the hardest to bear.]

UPDATE – Sep 5 2009 03:30 GMT: The Database is back up, and absorbing data. Attempts to move it to the larger drive on the system failed, so the entire database is running on an 11GB partition. <GULP>.

The two most vital maintenance scripts are also running the way they should be. I had to rewrite those from very old archives.

Status: Good, but not where I would like it. I will work with Technorati to see if there is something that I’m missing in trying to use the larger partition. Likely it comes down to my own lame-o linux admin skillz.

I want to thank the ops team from Technorati for spending time on this today. They did an amazing job of finding a machine for this database to live on in record time.

I have also learned the hard lesson of backups. May I not have to learn it again.

UPDATE – Sep 5 2009 04:00 GMT: Thanks again to Jerry Huff at Technorati. He pointed out that if I use a symbolic link, I can move the db files over to the large partition with no problem. Storage is no longer an issue.

[And why, you ask, is Tara Hunt (@missrogue) on this post? Hey, when I asked Tagaroo for Technorati images, this is what it gave me. It was a bit of a shock after 8 hours of mind-stretching recovery work, but hey, ask and ye shall receive.]

UPDATE – Sep 7 2009 01:00 GMT: Seems that I got myself into trouble by using the default MySQL configuration that came with the CentOS distro. As a result, I ran out of database connections! Something that I have chided others for, I did myself.

The symptom appeared when I reactivated my logging database, which runs against the same MySQL installation, just in a separate database. It started to use up the default pool of connections (100) and the agents couldn’t report in.
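
The arithmetic behind this failure is simple enough to sketch. In the Python below, the client names and peak connection counts are made-up illustrations, not GrabPERF's real figures; only the limit of 100 comes from the default described above.

```python
def connection_headroom(max_connections: int, peak_by_client: dict) -> int:
    """Return spare connections at peak; a negative value means clients get refused."""
    return max_connections - sum(peak_by_client.values())


# Hypothetical peak usage for the two workloads sharing one MySQL installation.
peaks = {"measurement_agents": 80, "logging_database": 30}

print(connection_headroom(100, peaks))  # -10: ten connections short at peak
```

A negative result means the limit needs to be raised, or the per-client usage trimmed, before the second workload is switched on.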

This has been resolved and everything is back to normal.

One of the traditional areas of frustration for Operations and Development teams in the Web world is that their performance, Web performance, is measured from the outside-in.

The resistance of this camp is strong, and they will appear without warning, even from amongst the most enlightened of companies.

How can they be recognized?

You will hear their battle-cry, their mantra, their fundamental belief that their application, their infrastructure is a misunderstood victim. That if they could only get their one idea across, the whole of the company would be enlightened.

The fundamental tenet of this group is simple and short.

How can we manage the Internet?

The obvious fallacy of this argument is clear to any Web performance professional or business analyst: Customers get to our business across the Internet, not via psychic modem. In order to keep close tabs on the experience of our customers, the site, application, code must be measured from the outside-in.

In order to prevent making enemies and perpetuating already ossified corporate silos, take the initiative. Gently steer the discussion in a new direction by making this incredibly vast problem into one everyone in the company can understand. By adding a single word to the initial question, the fearful and reactive perspective can be dramatically shifted to one that could make the members of this camp see the light.

Make the question:

How can we manage for the Internet?

Now the focus of the discussion is proactive – is there something we are missing that could reduce the problems and/or prevent them from ever happening?

Taking the all-encompassing and awe-inspiring challenge that is the Internet and turning it into a Boy Scout moment may reinvigorate the internal conversation, and give people a sense of purpose. Now they will be galvanized to consider whether everything in their power is being done to prevent performance issues before bits hit the Internet.

Effective Web performance hinges on taking the obvious challenges that face all Web sites, and turning them into solutions that mitigate these challenges as much as possible. So, in the next team meeting, the next time you hear someone say that it’s just the Internet, ask what can still be done to manage the application more effectively for the Internet.

When it comes to the T-Mobile Dash 3G, I have some simple advice.

Don’t.

The longer I have this phone, the more of a clunker it becomes. My list of complaints includes:

  • In the last 24 hours, the battery has started to drain for no apparent reason – and yes, WiFi, Bluetooth, and background apps are all off. There appears to be no reason or logic behind this. The phone drained itself sitting on my bedside table last night, supposedly doing its standby routine
  • Windows Mobile 6.1 is underpowered and ancient. There is a lack of (Social Media) apps for the Windows Mobile platform. All development seems to be focused on the iPhone, Android, and Blackberry platforms. And with Windows Mobile 6.5/7.0 delayed or underwhelming, it’s not going to get any better anytime soon
  • It has weird behavior with bluetooth headsets. For every call, you have to manually tell the phone that “Hey! How about setting this to handsfree?”
  • I got it for Active Sync, but frankly I could do better by hacking my way to near-Active Sync using Google Sync and routing my work email through a GMail account
  • 3G Maybe. The T-Mobile 3G network is definitely not developed outside of major metro centers. I spend most of my time in EDGE mode, so the upgrade I thought I was getting isn’t really there

Overall, this phone gets a monstrous thumbs-down from me. But, I’m stuck with it. I can’t afford to replace it, and the more I handle it, the crankier I get. I’m to the point that I may drag my old, underpowered, EDGE Blackberry 8100 out of the drawer and stop using the Dash 3G altogether.

Buyer’s Remorse is sometimes hard to swallow. Looks like I have to swallow it for another 23 months.

A hallway conversation this morning brought up a very interesting point about the relationship between Web performance measurements and Content Delivery Networks (CDNs). When choosing between a Web performance measurement solution and a CDN, which service should come first?

Companies facing dire and obvious Web performance issues will want immediate results, leading them to fall into the CDN-First camp. Deploying a CDN will have a positive effect on response times, increase user satisfaction, and may even increase customer conversions, in the short term.

In six months, deeper questions may start to be asked. A core question that will need to be answered by CDN-First organizations will be “Are we using the CDN effectively and efficiently?”

A company that makes the leap to CDN deployment without assessing the overall performance environment of their Web site may be faced with a situation where they can’t tell if they need more, less, or different CDN strategies in order to continue to succeed.

Because of the buyer’s remorse that can result from the leap directly to a CDN, I highly recommend the Measurement-First approach when selecting a CDN.

To help you become an advocate for the Measurement-First approach, come to the table during the CDN discussions and ask three questions. The answers will allow your organization to make the best and most appropriate CDN decision.

1. Is the CDN necessary?

In most cases, the answer to this is a resounding yes. But what can happen with a sudden shift to the CDN is that an organization overlooks those things that they can do themselves to gain some initial performance improvements.

Baselining the existing site before deploying a CDN will allow items and elements that need to be improved to be clearly identified. In some cases, an organization can fix some of these on their own to improve performance before investing in a CDN. In other cases, measuring the performance of a site may clearly indicate that third-party content is responsible for the performance issues, which would likely not be fixed by a CDN deployment.

A Measurement-First policy helps clearly identify the geographies that have the worst performance before deploying the CDN. If performance in the US is acceptable, while performance in Europe or Asia-Pacific is intolerable, then the CDN deployment may initially be targeted to respond to the greatest pain first.

Understanding the current performance of your existing site can reduce the cost of the initial deployment and maximize the long-term effectiveness of the deployment.
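
As a rough illustration of what a baseline can tell you, the Python sketch below groups response-time samples by region and lists the regions whose median exceeds a threshold, worst first. The sample data, the region labels, and the three-second threshold are all hypothetical; real input would come from whatever measurement service you use.

```python
from statistics import median


def worst_regions(samples, threshold_s):
    """Return (region, median_seconds) pairs above the threshold, slowest first.

    samples is an iterable of (region, response_time_in_seconds) pairs.
    """
    by_region = {}
    for region, seconds in samples:
        by_region.setdefault(region, []).append(seconds)
    medians = {region: median(times) for region, times in by_region.items()}
    slow = [(region, m) for region, m in medians.items() if m > threshold_s]
    return sorted(slow, key=lambda pair: pair[1], reverse=True)


# Hypothetical baseline measurements taken before any CDN is deployed.
baseline = [
    ("US", 1.2), ("US", 1.4), ("US", 1.3),
    ("Europe", 3.8), ("Europe", 4.0), ("Europe", 4.2),
    ("Asia-Pacific", 5.0), ("Asia-Pacific", 6.0), ("Asia-Pacific", 7.0),
]

print(worst_regions(baseline, threshold_s=3.0))
# [('Asia-Pacific', 6.0), ('Europe', 4.0)]
```

With numbers like these, the US would not need to be part of the first phase of a CDN rollout at all.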

2. Which CDN is best for us?

For a complex modern Web site, content comes in many different shapes, sizes, and formats. The thing is, so do CDNs. As I’ve discussed before, understanding what the CDNs vying for your business do and do well is as critical as the process of vetting their effectiveness compared to delivering the site yourself. The performance boost given to your site by a CDN may vary by region, leading your team to select one CDN for Europe and another for the Asia-Pacific region.

CDN performance can also vary based on the content you are asking them to accelerate. One CDN may be good at streaming media, while another may be better at static content (JS, CSS, Images, etc.), while yet another is better at accelerating the delivery of dynamic content.

Choose your CDN(s) based on what you need them to deliver. In some cases, one size does not fit all.

3. Is the CDN delivering?

This may look like a question for after the purchase has been completed and the solution deployed, but you will never know if the solution is working effectively unless you have a baseline of your performance before the deployment, and from your origin servers after deployment.

Measuring the performance of the CDNs under all conditions and from all perspectives (Datacenter, Last Mile, and from within the Browser) doesn’t stop with the selection of a CDN(s). It becomes even more critical once the CDN solution(s) is rolled into production in order to ensure that the level of service that was promised during the sales cycle is delivered once you become a customer.

Constantly validate the performance of the CDN-accelerated site with the performance of the non-accelerated origin site. Have regular meetings with, and channels of communication into, your CDN(s) to discuss not only existing performance, but how changes you and/or the CDN provider are planning may affect performance in the future.
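
The comparison itself can be as simple as the Python sketch below, which reports the fractional reduction in median response time between origin and CDN-accelerated measurements of the same page. The function name and the sample timings are illustrative assumptions, and median is just one reasonable choice of summary statistic.

```python
from statistics import median


def cdn_improvement(origin_times, cdn_times):
    """Return the fractional drop in median response time delivered by the CDN.

    0.5 means the CDN halves the median; a negative value means the
    CDN-accelerated site is slower than the origin.
    """
    origin = median(origin_times)
    accelerated = median(cdn_times)
    if origin <= 0:
        raise ValueError("origin response times must be positive")
    return (origin - accelerated) / origin


# Hypothetical measurements, in seconds, taken over the same period.
origin_samples = [4.0, 5.0, 6.0]
cdn_samples = [2.0, 2.5, 3.0]

print(cdn_improvement(origin_samples, cdn_samples))  # 0.5
```

Tracking this figure per region and per content type over time gives you something concrete to bring to those regular meetings.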

Takeaway

CDNs are a critical component for any Web business that wants to scale and deliver services to a national or global audience. But selecting a CDN should come after you have a very strong understanding of the current performance of your own Web site.

After you have measured and identified the things you can do to improve your own performance, your team will have greater insight into the areas of your site where the services of a CDN(s) can have the greatest impact.

The Measurement-First approach to selecting a CDN will ensure that you select a set of services that exactly meets the unique performance challenges of your site.

For those who have been following our experiences with Gutter Helmet over the last five years (collected articles here), it’s time for an update.

About three weeks ago, we were contacted by the regional Gutter Helmet franchisee for New England, trying to find out what they could do to make us happy.

Why would they do this?

Well, if you do a Google search for “gutter helmet”, one of my blog posts detailing our negative experience back in 2005 is the third unpaid item on the list. It seems that they have gotten wise to the effect my simple blog posts were having on their reputation and brand.

Just for reference, I went back and looked in my logs for as long as I have them. I get on average 500 distinct page views a month for my Gutter Helmet posts and this number goes up substantially during peak gutter installation season.

“Gutter Helmet” is the number one search term coming into my site. And what people see when they get to my site is likely not the message the Gutter Helmet folks want to get across.

[Editor's Note: As someone who writes mainly about Web performance and social media, this traffic trend is disturbing. But it also shows that you never know what will resonate with your audience until you write it!]

This past Friday, Gutter Helmet sent someone out to repair the situation. He found incorrectly installed (or missing) flashing and improperly installed helmet on both sides of the house. In the end, the entire helmet installation was replaced, and new flashing was installed.

Until we get Fall rains and the Winter snow returns, we won’t really know how successful this new install was, so stay glued to your browsers for further updates.

About this blog

Stephen Pierzchala is a 10-year veteran of the Web performance field who also writes on topics that interest his non-linear world-view.

Contact

stephen@pierzchala.com

+1 (508) 410-3865