The Good Life... a weblog about life, technology, and the Opera web browser

Posts from the “Meta” Category

Behind My Site Relaunch

As part of my site relaunch, I did some major behind-the-scenes work on the software used to run this site. I had been using Drupal 4.6, which was released in April 2005. Since then, Drupal has had three major releases, the most recent in February. In addition to using the core Drupal software, I use several third-party add-ons (called modules). Since modules usually take some time to become compatible with new core releases, I decided to base my site on Drupal 5, which works with a large number of modules.

When I first launched my site using Drupal, I used the core blog module, since I figured it made sense to blog using the blog module (a fair conclusion). However, since I was the only user blogging, I found the extra bits of UI added by the blog module got in the way rather than helped. As part of the upgrade and based on Dag Wieërs' advice, I decided to ditch the blog module and use the generic story module. It's been a seamless transition so far.

I enjoy the dialogue that sometimes occurs in post comments, so I wanted to see if upgrading could improve the usability of comments. Previously, I allowed site visitors to register at the site, which gave them a couple additional benefits, such as e-mail notification of new comments and posts. Unfortunately, the vast majority of these registrations were used to post spam and very few registered users actually used the e-mail notification, so I decided to disallow new registrations and delete all old registrations. The site now automatically remembers commenter information and I added the ability for users to receive e-mail notification of new comments to a specific post if they've commented on it. Additionally, I added several new RSS feeds: all comments, comments to specific categories, and comments to specific posts.

As if that weren't enough, I also changed the way comment subjects work. By default, Drupal will take a chunk of the comment body and use it as the comment subject if a commenter doesn't fill in a subject. In practice, this leads to a bunch of duplicate text between the comment subjects and bodies. Instead, my site will use the post title as the default comment subject. The alternative was disabling comment subjects, but some of the other functionality I added requires comment subjects, so this ended up being a good compromise.
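For illustration, the fallback described above could be sketched in Python as follows. This is only a sketch of the idea, not the module's actual code (Drupal modules are written in PHP), and the function name default_comment_subject is hypothetical.

```python
def default_comment_subject(subject: str, post_title: str) -> str:
    """Return the commenter's own subject, or fall back to the post title.

    Hypothetical helper: it mirrors the behaviour described in the post,
    not the real comment_subject module.
    """
    cleaned = subject.strip()
    # An empty or whitespace-only subject counts as "not filled in".
    return cleaned if cleaned else post_title


if __name__ == "__main__":
    print(default_comment_subject("", "Behind My Site Relaunch"))
    print(default_comment_subject("  Nice write-up ", "Behind My Site Relaunch"))
```

Falling back to the post title means every comment still has a subject for the features that need one, without repeating the start of the comment body.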

Spam is one of the big concerns for anyone allowing comments on their site. There's a great third-party spam module for Drupal that I used on my old site. That caught a large amount of spam, but I'd like to prevent spam from being posted in the first place. So, in addition to the spam module, I decided to require commenters to enter a CAPTCHA. Instead of using the CAPTCHA module's default arithmetic CAPTCHA, I decided to use the reCAPTCHA service. reCAPTCHA, a project of Carnegie Mellon University, uses words that couldn't be understood by OCR software when books were scanned. So, the CAPTCHAs that are solved on my site are helping to archive books digitally. It's a really neat idea and I'm glad I can be a part of the project. As an additional measure, I'm using the commentcloser module to close comments on all posts a set amount of time after their publication.
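The time-based closing rule can be sketched in a few lines of Python. This is a rough illustration rather than the commentcloser module's code; the function name comments_open and the 30-day window are assumptions, since the post doesn't say which period the site uses.

```python
from datetime import datetime, timedelta

# Assumed window for illustration only; the real setting is not given.
COMMENT_WINDOW = timedelta(days=30)


def comments_open(published: datetime, now: datetime,
                  window: timedelta = COMMENT_WINDOW) -> bool:
    """Return True while a post is still young enough to accept comments."""
    return now - published < window


if __name__ == "__main__":
    published = datetime(2008, 3, 2)
    print(comments_open(published, datetime(2008, 3, 20)))  # inside the window
    print(comments_open(published, datetime(2008, 6, 1)))   # past the window
```

Since spam bots tend to hit old posts as well as new ones, a check like this removes most of the targets without affecting active discussions.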

I decided the relaunch was a good time to give posts more recognizable URLs, rather than just using "/node/1234". I wanted the URLs to be as meaningful as possible, so using the pathauto module, they are now title- and date-based, i.e. "archive/2008/03/02/title". Many sites that use this format don't have true path hierarchies for the URL, meaning that visiting "archive/2008/03/02/" would return a "Page Not Found" error. I'm a big fan of path hierarchies, so I decided to use the archive module to help out. By default, the archive module uses the "all" keyword in the URL to determine which types of content to display. Since I only have one type of content I want displayed in the archives, I hacked the archive module to remove the keyword handling, so it automatically handles the path hierarchies for me. Now post URLs are meaningful and allow easy access to archives.
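To make the URL scheme concrete, here is a small Python sketch of building a date- and title-based path and listing the parent paths that a true hierarchy would also need to serve. It is not how pathauto or the archive module are implemented; slugify, post_path, and archive_prefixes are hypothetical names, and the slug rules (lowercase, hyphens for anything non-alphanumeric) are an assumption.

```python
import re
from datetime import date


def slugify(title: str) -> str:
    """Lowercase the title and replace runs of other characters with hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def post_path(title: str, published: date) -> str:
    """Build a path of the form archive/YYYY/MM/DD/title."""
    return f"archive/{published:%Y/%m/%d}/{slugify(title)}"


def archive_prefixes(path: str) -> list[str]:
    """List every parent path, each of which should resolve to an archive page."""
    parts = path.split("/")
    return ["/".join(parts[:i]) for i in range(1, len(parts))]


if __name__ == "__main__":
    path = post_path("Behind My Site Relaunch", date(2008, 3, 2))
    print(path)
    for prefix in archive_prefixes(path):
        print(prefix)
```

Each prefix (archive, archive/2008, archive/2008/03, archive/2008/03/02) corresponds to an archive listing, which is the part the archive module takes care of.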

Of course, I would be committing the cardinal sin of web sites if I changed all my URLs without also making the content accessible from the old URLs. Drupal has built-in functionality for this using path aliases. If I used this functionality alone, my content would be available at both the old and the new URLs. Unfortunately, Google dislikes such duplicate content. Global redirect to the rescue! This module uses redirects to forward the reader to the new URL rather than allowing content to be accessible at both URLs.
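The redirect logic amounts to a lookup: if the requested path is an old internal path that has an alias, answer with a 301 pointing at the alias; otherwise serve the page as usual. The Python sketch below is only an illustration of that idea, not the globalredirect module's code. The resolve function and the ALIASES table are hypothetical, and the two paths are the examples used earlier in this post.

```python
from typing import Dict, Optional, Tuple

# Hypothetical alias table: old internal path -> new aliased path.
ALIASES: Dict[str, str] = {"node/1234": "archive/2008/03/02/title"}


def resolve(path: str, aliases: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (status, location) for a requested path.

    Old paths that have an alias get a permanent redirect (301) to the
    alias; anything else is served directly (200, no redirect target).
    """
    normalized = path.strip("/")
    if normalized in aliases:
        return 301, "/" + aliases[normalized]
    return 200, None


if __name__ == "__main__":
    print(resolve("/node/1234", ALIASES))
    print(resolve("/archive/2008/03/02/title", ALIASES))
```

Because the old URL only ever answers with a redirect, search engines see a single copy of the content, at the new URL.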

Another new addition to the site is a copyright notice. Previously, my site didn't indicate how the content was copyrighted. Though I never saw evidence of misuse, I figured I would rather be safe than sorry.

Here's a list of the modules I found essential for this site; these modules should be seriously considered for any Drupal-based blog:

  • archive: date-based post archives
  • captcha: anti-spam measure: require users to enter a captcha to post a comment
  • comment_info: remember anonymous commenter information
  • comment_notify: e-mail notification of new comments to previous commenters
  • comment_subject: default comment subjects based on node title
  • commentcloser: anti-spam measure: close comments after a set period of time
  • commentmail: e-mail notification about new comments for site administrators
  • commentrss: provide various RSS feeds for comments
  • copyright: add copyright notices
  • elf: add an image indicating off-site links in posts
  • globalredirect: uses 301 redirects to stop duplicate content from arising when path module is enabled
  • pathauto: creates memorable URLs
  • pathfilter: filter for referencing internal paths, e.g. previous posts
  • quote: node and comment quoting functionality
  • recaptcha: anti-spam measure: use the reCAPTCHA service for CAPTCHAs
  • spam: anti-spam measure: apply Bayesian filtering to submitted comments
  • xmlsitemap: provide a sitemap for search engine spiders to improve search engine ranking

I'd be happy to give additional information about my site setup as requested.

Welcome to The Good Life

For the past couple months, I've been working on upgrading my site to a newer version of Drupal and switching to a new theme. Tonight, I've finally launched the updated site. Some time soon, I'll write more about the upgrade process, but for now I'd just like to welcome everyone to the new site.

As part of the upgrade, I've changed the site name and slogan. It was time to put "Stranger, yet..." to rest. That name, while unique, has been making less and less sense to me. This past week, I've been trying to come up with a unique, fun, and descriptive name for the site and I settled on "The Good Life". Initially, I thought the name might sound a bit like bragging, but my wife has convinced me otherwise.

In any case, I have several new posts in the pipeline and I hope to be updating more frequently now that I'm happy again with the way my site looks and functions. Special thanks to Eirik and Moose for their help with the design. It really wouldn't have turned out nearly so well without them.

Disabled Trackbacks

I disabled trackbacks just now. The majority of mail I've gotten in the past week has been notification of spam trackbacks on my web site. I haven't really used them that much and I don't think it's a big loss. It's just not worth the hassle.

Apology in advance

I'd like to apologize in advance to those readers that like the non-technical posts that sometimes appear here for the large number of posts about Opera likely to appear this week. I have a lot about Opera I want to say. We'll be back to intermittent Opera posts shortly. We appreciate your patience.

Gallery upgrade

I've just finished upgrading my image gallery to Gallery 2. Please let me know if you run into any problems.

Access Denied

Apparently my site has been denying access to everyone since about 9:30pm Saturday night, completely unbeknownst to me. I don't really know what happened, but this certainly wasn't on purpose. Sorry. Everything should be up and working again now. Please let me know if that isn't the case.

The War on Spam

About two months ago, I installed Bad Behavior for Drupal to ward off some of the spam that was plaguing my site. I had previously installed spam.module to hide spam when it was posted. Bad Behavior tries to prevent spam from being posted at all. I can tell you that Bad Behavior has greatly reduced the amount of spam posted to my site, but not without costly side effects.

A couple weeks after I installed Bad Behavior, I got an e-mail from an Opera user that was having problems accessing my site. He was getting HTTP 412 errors when browsing to the site and was also having problems with my RSS feed. I looked into it and found that Bad Behavior was blocking him because it was expecting an HTTP header that wasn't present in his page request. We never found the root of the problem, but it definitely seemed like it was out of his control.

At the time, I decided that a user here or there who couldn't access my site was an acceptable casualty of my personal war on spam. Here's what I told him:

The war on spam is a balance of inconvenience. I have information I want to present to others. Others want that information. Still others are trying to feed off our desire to spread information. Some might say it's my responsibility to keep my site clear of such rubbish. To do so, I must make hard decisions based on time (cleaning up spam), money (increased hosting costs because of bots trawling my site), and the quality of my site (if spam is visible and for how long). Weighing these factors, I decided to install Bad Behavior. I've already seen a drop in spam, so it's paid off. Additionally, you're the first one that's complained. I'd rather deal with complaints one-by-one and try to improve Bad Behavior than deal with spam as I was before.

That might sound a bit high and mighty, but it's how I felt at the time.

I've since periodically browsed my Bad Behavior logs and found a lot of requests from spam bots, but also a lot of requests that looked pretty innocent. When I came across a post in the Opera forums about problems accessing my site this morning, I decided it was time for a change. A couple days ago, I installed comment_closer.module, which closes old comments after a specified period of time. That should help cut down on spam. As of this posting, I've disabled Bad Behavior, so no one should have problems accessing my site now. I decided there are too many legitimate requests being blocked.

Spam on my site is the same as unsolicited phone calls or mail: it's a pain to put up with, but we do it because we want the legitimate phone calls and mail. I want people to have access to the information here and that's more valuable to me than dealing with the current load of spam. I'll see how it goes without Bad Behavior for a bit. Hopefully I won't need to do anything else just yet.

To anyone that has had problems accessing my site over the past two months: I'm sorry, that was never my intention. And welcome back!

Bad Behavior for Drupal

I've just installed Bad Behavior for Drupal to try to ward off some spammers and bots. It seems like it's catching some requests for the RSS feed as invalid. Let me know if you're having problems.

PHP-based Genealogy Software?

Anyone know of some good PHP-based genealogy software? Rebekah started working on our family tree, but doesn't like the site she's using. I'd also like to have it at our disposal, so we can customize it and have full control of the information.

I found TUFaT and TNG 5, but I'm a big fan of try-before-you-buy (or not buying, but contributing back to the project). I just ran across PhpGedView, which we'll probably give a try. It seems like it's under active development and has been around for a couple of years.

Updated spam module

I've just updated the spam module on my site to take advantage of the rewrite, so please let me know if you run into any problems. I had turned on trackback moderation because I found a bunch of trackback spam that hadn't been caught. I've turned it back off now, so I hope things work out.