January 2004 – a crank's progress

the cost of optimization

So between using MTOptimizeHTML and mod_gzip, my server is taking a beating.

last pid: 17424;  load averages:  3.96,  2.94,  1.84    up 9+08:17:32  21:19:44
93 processes:  5 running, 88 sleeping
CPU states: 99.2% user,  0.0% nice,  0.0% system,  0.8% interrupt,  0.0% idle
Mem: 168M Active, 17M Inact, 41M Wired, 10M Cache, 35M Buf, 12M Free
Swap: 1027M Total, 440M Used, 587M Free, 42% Inuse

  PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
11208 www       62    0   301M 44564K RUN     13:08 24.07% 24.07% httpd
11134 www       62    0 85152K 12724K RUN      5:57 24.07% 24.07% httpd
16618 www       63    0 83352K 49232K RUN      1:13 23.78% 23.78% httpd
17038 www       62    0 82584K 44184K RUN      1:55 23.63% 23.63% httpd
  335 mysql      2    0 29320K   376K poll    28:02  0.00%  0.00% mysqld

The top 5 processes are chewing all my CPU time and for long periods of time (see the graphic: the red line is at 100% of CPU and ideally, utilization doesn’t exceed that).

The load average, as displayed by uptime(1), is over 5 now: that’s 4 too many. As you can see, it’s all httpd processes (Apache with an embedded mod_perl interpreter). The mysql process is dormant, for all intents and purposes.
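The "4 too many" arithmetic can be sketched in a few lines of Python. This is only an illustration, assuming a single-CPU box where a load average of 1.0 means the processor is fully busy. The helper name excess_load is made up for this example, and os.getloadavg() is only available on Unix-like systems.

```python
import os


def excess_load(load_avg: float, cpus: int = 1) -> float:
    """Return how far a load average exceeds the CPU count (0.0 if it does not).

    Assumption: one runnable process per CPU is the ideal ceiling.
    """
    return max(0.0, load_avg - cpus)


if __name__ == "__main__":
    try:
        # 1-, 5- and 15-minute load averages, the same figures uptime(1) prints.
        one_min, five_min, fifteen_min = os.getloadavg()
    except OSError:
        print("load average is not available on this system")
    else:
        cpus = os.cpu_count() or 1
        print(f"1-minute load {one_min:.2f} on {cpus} CPU(s): "
              f"{excess_load(one_min, cpus):.2f} over the ideal")
```

With a load of 5 on one CPU, excess_load(5.0) comes out to 4.0, which is the "4 too many" above.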

I have to make the call here on how much I want to pay for optimization and what kind makes the most sense. mod_gzip seems like a very elegant solution, with its load distributed over the full day, while the MTOptimizeHTML hit takes place at every rebuild, i.e., every post or comment.

on monetizing weblogs

ongoing — You Can Get Paid For This

Tim Bray analyzes his recent foray into Google AdSense. His experiences mirror my own (some ads are worth more than others was one thing I learned early on).

He’s doing a little (!!) better on this than I am, but then I don’t get linked from the tech section of CNN.com either.

Also, he’s dead right that ads on a page with a single piece do better than ads on a page with multiple, different pieces. I shoulda remembered that from my startup days and the follow-on experiences with WayPath: you can’t determine contextual relevance against an assortment of different items.

But as he says, there’s some organic growth at work here, so while I may never do quite as well as he’s doing (I’m covering my cable modem bill this month – yay!), there’s some upside.

another robot

Welcome to PubSub.com

Found this in my logfiles just now . . .

PubSub Concepts provides real-time, content based publish and subscribe systems at internet scale. This site is a Beta version of our home page, which will provide a PubSub interface for weblogs and other information sources.

[ . . . ]

PubSub.com reads over 100,000 weblogs in real time, and generates new feeds containing information specific to particular issues.

This chart shows what people are talking about – in all the weblogs and RSS feeds we monitor, how many people are talking about each candidate.

This page has more information: if it was me, I’d put it first, since it has more than one datapoint and a lot of RSS feeds for the infojunkies amongst us.

orkut as Google’s data-mining/personalization Trojan Horse?

Jeremy Zawodny’s blog: Why Google needs Orkut:

“Then, one day down the road, they quietly decide to “better integrate” Orkut with Google and start redirecting all Orkut requests to orkut.google.com.

Bingo!

Suddenly they’re able to set a *.google.com cookie that contains a bit of identifying data (such as your Orkut id) and that would greatly enhance their ability to mine useful and profitable data from the combination of your profile and daily searches.”

This is no conspiracy theory: I think he may be close to the truth of it.

Of course, dropping your stored cookies would be enough to break this, so it’s not clear it’s all that invasive or predatory.

[Posted with ecto]

funny if it weren’t sad

AndrewBlog: In Safari:

“Mac’s are beginning to grow up. They are coming to be associated with power and ease of use.”

Whippersnappers like this probably think the Mac OS is a poor copy of Win95 . . . . instead of t’other way round.

[Posted with ecto]

now with mod_gzip

SourceForge.net: Project Info – mod_gzip:

“mod_gzip is an Internet Content Acceleration module for the popular Apache Web Server. It compresses the contents delivered to the client.”

The first thing to look at in the weightwatching game is on-the-fly file compression: easily done. My index.html page went from 103,791 bytes to a more svelte 24,931, a savings of more than 75%. Since all modern browsers (i.e., those able to converse in HTTP/1.1) can work with content-encoding-enabled servers, this was simplicity itself.

Simply add the mod_gzip module, and swipe someone else’s configuration 😉 :

mod_gzip_on yes
mod_gzip_dechunk yes
mod_gzip_can_negotiate yes
mod_gzip_keep_workfiles no
mod_gzip_temp_dir /tmp

mod_gzip_min_http 1000

mod_gzip_minimum_file_size 300
mod_gzip_maximum_file_size 0
mod_gzip_maximum_inmem_size 100000

mod_gzip_command_version modgzip_info
mod_gzip_add_header_count yes

mod_gzip_item_include file \.htm$
mod_gzip_item_include file \.html$

I suppose I could add .css and .js files to the list of compressible files. I’ll have to check and see how many errors are generated by this (does anyone not use a modern-enough browser??).
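To check the numbers, or to estimate what adding .css and .js would buy before touching the server config, something like this rough Python sketch would do. It uses the standard gzip module rather than mod_gzip itself, so the byte counts will only approximate what Apache sends. The function names are made up for this example.

```python
import gzip


def percent_saved(original_size: int, compressed_size: int) -> float:
    """Percentage of bytes saved by compression."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100.0 * (original_size - compressed_size) / original_size


def gzip_percent_saved(data: bytes, level: int = 6) -> float:
    """Gzip the data in memory and report the percentage saved.

    Assumption: compression level 6 is used here as a middle-of-the-road
    default; mod_gzip's own settings may differ.
    """
    compressed = gzip.compress(data, compresslevel=level)
    return percent_saved(len(data), len(compressed))


if __name__ == "__main__":
    # The index.html figures quoted above.
    print(f"index.html: {percent_saved(103791, 24931):.1f}% saved")

    # A stand-in for a repetitive HTML page; swap in a real file's bytes,
    # e.g. open("index.html", "rb").read(), to test actual content.
    sample = b"<p>another weblog entry</p>\n" * 500
    print(f"sample markup: {gzip_percent_saved(sample):.1f}% saved")
```

The first figure works out to just under 76%, in line with the "more than 75%" above.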

[Posted with ecto]

these social networks

Marc’s Voice:

“So I apparently broke some rules on Orkut and have been banned. that is – my account is in ‘jail’.”

I took a tour through the wonderland that is Orkut myself last night: I feel too old to be there. All the listings indicate your relationship status (married, single, committed, open) which is something I rarely care about in a casual or more sustained relationship.

But more to the point, I think Scoble sums it up:

As a business person, my blog is far far far more useful than filling out a form of made up BS. Why? Because you can’t fake a blog for a long time if you use it for business purposes. If you lack integrity. If you are a jerk. If you are dishonest. It’ll be found out here.

What’s the old Bill Cosby joke about cocaine? He asked what was so great about it, and someone told him “it intensifies your personality.” Upon which point he pondered, “but what if you’re an asshole?”

Seriously, I can’t see the point of any of these: I think your network is an extension of what you do and who you are. If you contribute code to or write user docs for a project, your network/community stems from that involvement. If you’re an online commentator with some clue or insight on a topic (Daring Fireball or Freedom to Tinker come to mind), your network is drawn from your readers.

I’m not sure you can build one to order.

[Posted with ecto]

weightwatching

Web Page Speed Report – WebSiteOptimization.com: “Analysis and Recommendations

A report on the index page of this weblog . . .

TOTAL_OBJECTS – Warning! The total number of objects on this page is 23 – consider reducing this to a more reasonable number. Combine, refine, and optimize your external objects. Replace graphic rollovers with CSS rollovers to speed display and minimize HTTP requests.
TOTAL_IMAGES – Warning! The total number of images on this page is 20, consider reducing this to a more reasonable number. Combine, refine, and optimize your graphics. Replace graphic rollovers with CSS rollovers to speed display and minimize HTTP requests.
TOTAL_CSS – Congratulations, the total number of external CSS files on this page is 1. Because external CSS files must be in the HEAD of your HTML document, they must load first before any BODY content displays. Although they are cached, CSS files slow down the initial display of your page.
TOTAL_SIZE – Warning! The total size of this page is 187665 bytes, which will load in 42.00 seconds on a 56Kbps modem. Consider reducing total page size to less than 30K to achieve sub eight second response times on 56K connections. Pages over 100K exceed most attention thresholds at 56Kbps, even with feedback. Consider contacting us about our optimization services.
TOTAL_SCRIPT – Congratulations, the total number of external script files on this page is 1. External scripts are less reliably cached than CSS files so consider combining scripts into one, or even embedding them into high-traffic pages.
HTML_SIZE – Warning! The total size of this HTML page is 115134 bytes, which is over 100K! Consider optimizing your HTML and eliminating unnecessary content and features.
IMAGES_SIZE – Warning! The total size of your images is 65522 bytes, which is over 30K. Consider optimizing your images for size, combining them, and replacing graphic rollovers with CSS.
SCRIPT_SIZE – Caution. The total size of your scripts is 2287 bytes, which is above 1160 bytes and less than 4K. Consider optimizing your scripts and eliminating features to reduce this to a more reasonable size.
CSS_SIZE – Caution. The total size of your external CSS is 5169 bytes, which is above 1160 bytes and less than 8K. For external files, try to keep them less than 1160 bytes to fit within one higher-speed TCP-IP packet (or an approximate multiple thereof). Consider optimizing your CSS and eliminating features to reduce this to a more reasonable size.”

Some valid points here. Need to give this some thought . . .

[Posted with ecto]

I’m a 40 percenter

You are 40% geek

You are a geek liaison, which means you go both ways. You can hang out with normal people or you can hang out with geeks which means you often have geeks as friends and/or have a job where you have to mediate between geeks and normal people. This is an important role and one of which you should be proud. In fact, you can make a good deal of money as a translator.

A new career opportunity . . . . ? Actually, I’ve done this (still do) and it doesn’t feel like work. Isn’t that what you’re supposed to look for?

[Posted with ecto]

be prepared

From Kevin Kelly’s Cool Tools list:

Abstract:
Individual preparedness is an important element of our nation’s strategy for homeland security. This report adopts a scenario-driven approach that provides a rigorous way to identify actions – linked specifically to terrorist attacks – individuals can take to protect their health and safety. The result is an individual’s strategy across four types of terrorist attacks – chemical, radiological, nuclear, and biological – consisting of overarching goals and simple and directive response and preparatory actions. The actions are appropriate regardless of likelihood of an attack, scale of attack, or government alert level; designed to be sensitive to potential variations; and defined in terms of simple rules that should be easy for individuals to adopt.

Click for the Quick Guide.
Full Version PDF

You don’t have to be paranoid to print these out and review them. Do you?