housecleaning, weblog edition

These terribly slow posting times for new posts and comments have been bugging me long enough. The first tip I read was to remove all your plugins. Well, I don’t use many, so I decommissioned the WayPath related content plugin to see if that helped and it cut the time in half. Hmm, needs work.

I also found a plugin that optimizes the HTML and lightens the page load. The docs are at the site: I’m just using a very basic configuration. This line goes before the opening <html> tag and the magic happens when you rebuild your pages.

<MTOptimizeHTML lowercasetags="1" emptytags="b i span center" comments="1" entities="1">

I’ve also enabled related entry display on individual entry pages, thanks to the information I gleaned from here.

So the pages should load faster and display more meaningful stuff (assuming you find any of this meaningful).

[Posted with ecto]

where do you want to go today? Hope you’re a good typist

833786 – Steps that you can take to help identify and to help protect yourself from deceptive (spoofed) Web sites and malicious hyperlinks:

“The most effective step that you can take to help protect yourself from malicious hyperlinks is not to click them. Rather, type the URL of your intended destination in the address bar yourself. By manually typing the URL in the address bar, you can verify the information that Internet Explorer uses to access the destination Web site. To do so, type the URL in the Address bar, and then press ENTER.”

Take a look at the URL for the above-quoted page:
http://support.microsoft.com/default.aspx?scid=kb;%5Bln%5D;833786
Do you really want to type that in by hand? Weren’t hypertext links specifically devised to abstract away this kind of arcane gibberish?

I saw on some website or other the truism that MSFT’s lasting gift to computing is the assumption that computers are unreliable. Now we have this great step backward . . . . .

If you need any more reasons to use Mozilla or any other browser, I can’t help you.


on training computer software

James Seng, developer of the spam blocking “captcha” plugin for movabletype, also has a Bayesian filter plugin that tags potential spam comments.

[Image: needs training]

This seems to work a little aggressively: I would think a post where the word “spam” appears would have a low likelihood of being spam . . . .
I guess it’s still in training mode if the comments are being published, even if they are labelled as spam.
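For what it’s worth, here is a toy sketch of how a Bayesian filter arrives at a score. This is not James Seng’s plugin code: the function names, the add-one smoothing, and the four training comments are all invented for illustration. The point is only that the score depends entirely on which words the filter has seen under each label, so a filter with very little training can swing hard on a single word.

```python
import math
from collections import Counter

def train(examples):
    """Count token occurrences per label from (text, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in examples:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def spam_probability(text, counts, totals):
    """Naive Bayes with add-one smoothing; returns P(spam | text)."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    spam_tokens = sum(counts["spam"].values())
    ham_tokens = sum(counts["ham"].values())
    # Start from the prior odds, then add the evidence from each token.
    log_odds = math.log(totals["spam"] / totals["ham"])
    for token in text.lower().split():
        p_spam = (counts["spam"][token] + 1) / (spam_tokens + len(vocab))
        p_ham = (counts["ham"][token] + 1) / (ham_tokens + len(vocab))
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical training set, made up for this example.
EXAMPLES = [
    ("cheap pills buy now", "spam"),
    ("buy cheap watches now", "spam"),
    ("nice post about bikes", "ham"),
    ("this comment looks like spam to me", "ham"),
]

counts, totals = train(EXAMPLES)
print(spam_probability("buy cheap pills", counts, totals))
print(spam_probability("nice post", counts, totals))
```

With this tiny data set the first comment scores above one half and the second below it. More training examples would move those numbers around considerably, which is presumably what the training mode is for.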


clarification/amplification on spam comment interdiction

In response to this:

I like the second of the two, since it’s not exclusionary. How hard would it be to defeat? And what countermeasures could be written into it (present the string as HTML entities that have to be decoded by a parser? present the word reversed? don’t use real words at all? make the letter position the result of a simple equation [what letter is in the 2^2 position in the string uiwplkg?]?)

a friend writes:

None of those countermeasures would be effective against a computer parser; most of that stuff doesn’t even matter to a computer, like whether it’s a real word or if there’s an expression to evaluate. That’s all stuff that a computer is really good at.
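A short Python sketch bears this out. The function name and its arguments are invented for illustration, and it only handles the specific tricks floated above (HTML entities, a reversed word, and a position written as a^b), but each one is undone in a line or two.

```python
import html
import re

def solve(challenge_string, position_expr, reversed_word=False):
    """Undo each proposed countermeasure, then pick the requested letter."""
    word = html.unescape(challenge_string)      # decode HTML entities
    if reversed_word:
        word = word[::-1]                       # un-reverse the word
    # Evaluate a tiny "a^b" expression; anything else is read as a plain number.
    match = re.fullmatch(r"(\d+)\^(\d+)", position_expr)
    if match:
        position = int(match.group(1)) ** int(match.group(2))
    else:
        position = int(position_expr)
    return word[position - 1]                   # positions are 1-based

# The example from the question: the letter in the 2^2 position of "uiwplkg".
print(solve("uiwplkg", "2^2"))
# The same string written as numeric HTML entities.
print(solve("&#117;&#105;&#119;&#112;&#108;&#107;&#103;", "2^2"))
# The same string presented backwards.
print(solve("gklpwiu", "2^2", reversed_word=True))
```

All three calls land on the same letter, and whether the string is a real word never enters into it.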

On the other hand, you could describe the operation to be performed in such a way that it’s hard to get the gist without fully grokking the English:

“In an attempt to verify that you are a living, breathing human being and not a mindless computer program, and not having the time or resources to arrange a Turing Test, we would like you to enter, in the blank below, the answer indicated by the following paragraph.

The previous paragraph contains words of several parts of speech: prepositions, articles, nouns, verbs, pronouns, adjectives. Locate the first word which belongs in that last category and enter the letter which appears in it twice.

Enter answer here:[ ]”

Of course, that’s a bad example because it excludes people who are ignorant of the intricacies of English parts of speech (when is a verb form an adjective?). But it’s the right kind of example, I feel. The idea is to make the statement of the problem as hard to parse as possible. Avoid using digits; spell out numbers and require them to be spelled out. Pull together various parts of the text with references that are unambiguous but not computationally precise. That sort of thing. And of course, there has to be a very, very large set of potential problems and answers so that it doesn’t boil down to capturing all of the questions and memorizing the correct responses with no grokkage required at all.

So the specifications are randomness (to create a large problem set) and a high degree of difficulty in parsing, enough to defeat a simple parser.
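Here is a rough sketch of the randomness half of that, with the position spelled out rather than given as a digit. Everything in it is an assumption for illustration: the word pool, the wording of the question, and the function names. It does nothing about the hard-to-parse half, since a fixed question template like this one is still easy pickings for a script.

```python
import random

ORDINALS = ["first", "second", "third", "fourth", "fifth", "sixth"]

# Hypothetical word pool; a real deployment would need a far larger one.
WORDS = ["campfire", "butterfly", "pedals", "cadence", "weblog", "station"]

def make_challenge(rng):
    """Return (question, answer) with the letter position spelled out."""
    word = rng.choice(WORDS)
    index = rng.randrange(min(len(word), len(ORDINALS)))
    question = (
        f"Enter the {ORDINALS[index]} letter of the word "
        f"that follows this colon: {word}"
    )
    return question, word[index]

def check(answer, response):
    """Compare the response to the answer, ignoring case and surrounding spaces."""
    return response.strip().lower() == answer

def problem_count():
    """Number of distinct (word, position) pairs this generator can produce."""
    return sum(min(len(word), len(ORDINALS)) for word in WORDS)

question, answer = make_challenge(random.Random())
print(question)
print(problem_count())
```

Six words and six positions give only 36 distinct problems, few enough to capture and memorize in an afternoon, which is exactly the failure described above. The pool has to be enormous, or the questions have to be generated rather than picked.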


learning to spin

Took another turn on the bike today, another 15 or so miles to Tracy Owen Station in Kenmore and back. Pretty uneventful, though the replacement pedals I just got were harder to get into than the old ones. I lost some time on the outbound leg messing with the left one.

For today’s adventure, I decided to forgo speed and concentrate on cadence, or spinning. I find a cadence of 85-90 rpm to be pretty comfortable, so I tried to keep it there and just use the gears to keep me on track. That worked pretty well: I was pretty tired when I was done, so it must have done me some good. My speed on the outbound was about 17 mph and on the return into the wind, about 14. My cadence was not as good in the last couple of miles, but it gives me something to work on.


thoughts on comment spam/prevention strategies

There’s a lot of effort going into how to prevent comment spam in movabletype and other weblogs. The key seems to be finding something that approximates a crude Turing test: the post request must meet some challenge that only a human can meet.

There are a few coping strategies in the field. One is a “captcha” engine that creates a gif of a number string and requires the numbers to be keyed as a kind of authentication: the gif is somewhat obscured, making it a problem for the sight-impaired and, we suppose, OCR software.

Another idea, not yet fielded, is a challenge-response where the question is something like “what is the X letter of word Y?” The argument against this seems to be that a parser could be written to sort that out . . . I suppose so.
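To see how much work that parser would be, here is a minimal sketch. The question wording and the function names are my own assumptions, not taken from any existing plugin.

```python
import re

def ask(position, word):
    """Build the question text for a 1-based letter position."""
    return f"What is letter {position} of the word {word}?"

def expected(position, word):
    """The answer a human is supposed to type."""
    return word[position - 1]

def bot_answer(question):
    """What a spammer's script could do: pull out the number and the word."""
    match = re.search(r"letter (\d+) of the word (\w+)", question)
    return match.group(2)[int(match.group(1)) - 1]

question = ask(3, "weblog")
print(question)
print(expected(3, "weblog"), bot_answer(question))
```

One regular expression is enough as long as the question always follows the same template, so the argument holds for the plain version of the idea.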

I like the second of the two, since it’s not exclusionary. How hard would it be to defeat? And what countermeasures could be written into it (present the string as HTML entities that have to be decoded by a parser? present the word reversed? don’t use real words at all? make the letter position the result of a simple equation [what letter is in the 2^2 position in the string uiwplkg?]?)

With the understanding that no scheme is perfect, what makes the bar sufficiently high as to dissuade all but the morally bankrupt with a lot of time on their hands?


producers, not consumers, or how will you use Garageband?

TeledyN: Where have all the Listeners Gone:

“Right on, people: Make music, not war; keep on rockin’ in the freeworld!”

I haven’t read the piece he links to (go there for the link: I’m not stealing all his thunder), but the bottom line is that musical instrument sales are up in the midst of the RIAA’s sales slump.

Could it be that people are so fed up that they’d rather make their own music?

This has been one of my hopes with the emergence of broadband, increasingly powerful home computers, and now sophisticated software: that we would see a resurgence of creativity. There are a lot of word processing programs to crank out the Great American Novel/Play/Screenplay, weblog publishing tools and services, art and design tools, photo-editing software, etc. Perhaps we’re getting to a point where we can all share our creative ideas around a softly flickering electronic campfire . . . .


United Artists, redux

Wired News: Just Say ‘No’ to Record Labels:

“CANNES, France — Rock veterans Peter Gabriel and Brian Eno are launching a provocative new musicians’ alliance that would cut against the industry grain by letting artists sell their music online instead of only through record labels.”
[ . . . ]
“I’m an artist who works incredibly slowly,” Gabriel said. “If some of those (songs) could be made available, you don’t have to be so trapped into this old way of being confined only by the album cycle.”

The former Genesis singer and world music promoter is interested in putting multiple versions of the same song online. He’s also looking forward to being able to hear unfinished music from other artists.

“We tend at the moment … to try to find a moment when a song is right. You stick the pin in the butterfly and put it in the box and you sell the box,” he said. “Music is actually a living thing that evolves.”

Find more here: the NPR clip is worth listening to. And there’s a reference to music-based United Artists here as well.
