The Good Life... a weblog about life, technology, and the Opera web browser

Posts from the “Drupal” Category

Behind My Site Relaunch

As part of my site relaunch, I did some major behind-the-scenes work on the software used to run this site. I had been using Drupal 4.6, which was released in April 2005. Since then, Drupal has had three major releases, the most recent in February. In addition to using the core Drupal software, I use several third-party add-ons (called modules). Since modules usually take some time to become compatible with new core releases, I decided to base my site on Drupal 5, which works with a large number of modules.

When I first launched my site using Drupal, I used the core blog module, since I figured it made sense to blog using the blog module (a fair conclusion). However, since I was the only user blogging, I found the extra bits of UI added by the blog module got in the way rather than helped. As part of the upgrade and based on Dag Wieërs' advice, I decided to ditch the blog module and use the generic story module. It's been a seamless transition so far.

I enjoy the dialogue that sometimes occurs in post comments, so I wanted to see if upgrading could improve the usability of comments. Previously, I allowed site visitors to register at the site, which gave them a couple of additional benefits, such as e-mail notification of new comments and posts. Unfortunately, the vast majority of these registrations were used to post spam and very few registered users actually used the e-mail notification, so I decided to disallow new registrations and delete all old registrations. The site now automatically remembers commenter information, and I added the ability for users to receive e-mail notification of new comments on a specific post if they've commented on it. Additionally, I added several new RSS feeds: all comments, comments on specific categories, and comments on specific posts.

As if that weren't enough, I also changed the way comment subjects work. By default, Drupal will take a chunk of the comment body and use it as the comment subject if a commenter doesn't fill in a subject. In practice, this leads to a bunch of duplicate text between the comment subjects and bodies. Instead, my site will use the post title as the default comment subject. The alternative was disabling comment subjects, but some of the other functionality I added requires comment subjects, so this ended up being a good compromise.
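To make the difference concrete, here is a minimal Python sketch of the two defaults. It is not the module's actual code (Drupal modules are written in PHP); the function names are hypothetical, and the 30-character excerpt length is an arbitrary choice for the example, not Drupal's real value.

```python
def excerpt_subject(body: str, length: int = 30) -> str:
    """Excerpt-style default: reuse the start of the comment body as the subject."""
    return body.strip()[:length]


def title_subject(subject: str, post_title: str) -> str:
    """Keep the commenter's subject if given; otherwise fall back to the post title."""
    cleaned = subject.strip()
    return cleaned if cleaned else post_title


body = "Great write-up, thanks for sharing the module list!"

# The excerpt default repeats text that is already in the comment body.
print(excerpt_subject(body))

# The title-based default avoids that duplication.
print(title_subject("", "Behind My Site Relaunch"))
print(title_subject("A question", "Behind My Site Relaunch"))
```

The fallback only kicks in when the subject is blank, so commenters who do write a subject keep it.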

Spam is one of the big concerns for anyone allowing comments on their site. There's a great third-party spam module for Drupal that I used on my old site. That caught a large amount of spam, but I'd like to prevent spam from being posted in the first place. So, in addition to the spam module, I decided to require commenters to enter a CAPTCHA. Instead of using the CAPTCHA module's default arithmetic CAPTCHA, I decided to use the reCAPTCHA service. reCAPTCHA, a project of Carnegie Mellon University, uses words that couldn't be understood by OCR software when books were scanned. So, the CAPTCHAs that are solved on my site are helping to archive books digitally. It's a really neat idea and I'm glad I can be a part of the project. As an additional measure, I'm using the commentcloser module to stop comments on all posts a set amount of time after their publication.

I decided the relaunch was a good time to give posts more recognizable URLs, rather than just using "/node/1234". I wanted the URLs to be as meaningful as possible, so using the pathauto module, they are now title- and date-based, e.g. "archive/2008/03/02/title". Many sites that use this format don't have true path hierarchies for the URL, meaning that visiting "archive/2008/03/02/" would return a "Page Not Found" error. I'm a big fan of path hierarchies, so I decided to use the archive module to help out. By default, the archive module uses the "all" keyword in the URL to determine which types of content to display. Since I only have one type of content I want displayed in the archives, I hacked the archive module to remove the keyword handling, so it automatically handles the path hierarchies for me. Now post URLs are meaningful and allow easy access to archives.
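As a rough illustration of what a true path hierarchy means here, the Python sketch below builds a date- and title-based path and then lists every parent path that should also resolve to an archive page. The helper names and the slug rules are assumptions made for the example; pathauto's real cleaning rules are configurable and differ in detail.

```python
import re
from datetime import date


def slugify(title: str) -> str:
    """Lowercase the title and collapse anything non-alphanumeric into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def post_path(published: date, title: str) -> str:
    """Build a path of the form archive/YYYY/MM/DD/title."""
    return f"archive/{published:%Y/%m/%d}/{slugify(title)}"


def parent_paths(path: str) -> list[str]:
    """List every ancestor of a path, nearest first."""
    parts = path.split("/")
    return ["/".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]


path = post_path(date(2008, 3, 2), "Behind My Site Relaunch")
print(path)
for parent in parent_paths(path):
    # Each of these should show the day, month, year, or full archive.
    print(parent)
```

With a true hierarchy, trimming the URL one segment at a time always lands on a working page rather than a "Page Not Found" error.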

Of course, I would be committing the cardinal sin of web sites if I changed all my URLs without also making the content accessible from the old URLs. Drupal has built-in functionality for this using path aliases. If I used only this functionality, my content would be available at both the old and the new URLs. Unfortunately, Google dislikes such duplicate content. Global redirect to the rescue! This module uses redirects to forward the reader to the new URL rather than allowing content to be accessible at both URLs.
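The redirect behaviour can be sketched in a few lines of Python. This is only an illustration of the idea, not the globalredirect module's code; the alias table, the function name, and the example alias are hypothetical.

```python
# Hypothetical alias table: internal Drupal path -> new, meaningful path.
ALIASES = {
    "node/1234": "archive/2008/03/02/behind-my-site-relaunch",
}


def resolve(request_path: str, aliases: dict[str, str]) -> tuple[int, str]:
    """Return (status, path): a 301 to the alias if one exists, else serve as-is."""
    path = request_path.strip("/")
    if path in aliases:
        # Permanent redirect, so only the new URL is treated as canonical.
        return 301, "/" + aliases[path]
    return 200, "/" + path


print(resolve("/node/1234", ALIASES))
print(resolve("/archive/2008/03/02/behind-my-site-relaunch", ALIASES))
```

Old links keep working, but the content itself is only ever served from one URL.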

Another new addition to the site is a copyright notice. Previously, my site didn't indicate how the content was copyrighted. Though I never saw evidence of misuse, I figured I would rather be safe than sorry.

Here's a list of the modules I found essential for this site; these modules should be seriously considered for any Drupal-based blog:

  • archive: date-based post archives
  • captcha: anti-spam measure: require users to enter a captcha to post a comment
  • comment_info: remember anonymous commenter information
  • comment_notify: e-mail notification of new comments to previous commenters
  • comment_subject: default comment subjects based on node title
  • commentcloser: anti-spam measure: close comments after a set period of time
  • commentmail: e-mail notification about new comments for site administrators
  • commentrss: provide various RSS feeds for comments
  • copyright: add copyright notices
  • elf: add an image indicating off-site links in posts
  • globalredirect: uses 301 redirects to stop duplicate content from arising when the path module is enabled
  • pathauto: creates memorable URLs
  • pathfilter: filter for referencing internal paths, e.g. previous posts
  • quote: node and comment quoting functionality
  • recaptcha: anti-spam measure: use the reCAPTCHA service for CAPTCHAs
  • spam: anti-spam measure: apply Bayesian filtering to submitted comments
  • xmlsitemap: provide a sitemap for search engine spiders to improve search engine ranking

I'd be happy to give additional information about my site setup as requested.

Pet project: CreativeBits

My pet project over the past couple of months finally launched on April 11. CreativeBits is a design and Mac enthusiast community, featuring design critiques, advice, and tips. Ivan, the site owner, approached me in December to work with him to convert the Blogger and phpBB site to Drupal. Between January and April, most of my free time was spent preparing the site for launch.

I wrote an import script for Movable Type previously, so I already had a lot of the know-how for the Blogger and phpBB import scripts. Blogger has very straightforward template documentation, which made it easy to create a template that slurps up all of the blog posts and comments for import. On the other hand, I had to grope around in the dark to figure out the phpBB database structure. The finished products are available in my Drupal.org sandbox.

While I worked on the import scripts, Ivan designed two themes for the site, giving it a distinct, fresh look. We converted the site over a weekend, choosing to take the old content offline to ease importation. The site launched to rave reviews from previous users, but wasn't without its hiccups. Due to a problem with the way I set up the new site, some of the posts imported from phpBB were attributed to the wrong user. I spent a lot of time hand-editing the database to correct user IDs, but everything turned out fine in the end.

Since its relaunch, CreativeBits has doubled its user base. The site is easier to use than ever, thanks to features like a single login for the entire site (compared to separate logins for Blogger and phpBB). Original content keeps pouring in and it looks like Drupal is giving them everything they need to turn into a mature, sophisticated community.

I've posted various other details about the conversion in the CreativeBits forum and the Drupal showcase forum. All in all, a job well done.

New pictures, Drupal, and other ramblings

Rebekah and I have added two new albums in our image gallery: Weekend in the Woods and honeymoon pictures. Enjoy!

While working on these pictures, I was struck by how much of a hassle it is to use different software for my image gallery and weblog. I'm going to reinvestigate the possibility of using Drupal for my image gallery in addition to my weblog. I mean, it is a content management framework. Images are content. It seems like a no-brainer.

I know there's an image module for Drupal, but I think I tried it out and found it lacking features I wanted. I'll have to have another look. Of course, if I end up deciding that a Drupal-based solution can work, I'll have to import all my pictures. And there's probably no Gallery import script, so necessity will again force me to write one. Oh well, at least I'm bringing something unique to the Drupal community.

Speaking of importers, I've been working on a Blogger importer. I'm almost finished, save for an issue with importing comments. I've had to hack my way to access the Drupal functions for adding nodes, comments, paths, and users, since no formal method to do so exists. I may work on that (again) once I'm finished with this importer work. Ideally, there would be an API to access these things.

There are so many more things I want to write, yet I feel bad for hijacking my own post. And I don't seem to have the courage to sit down and get the words through my fingertips. I miss staying up until two or three in the morning, posting about whatever came to mind. I don't feel like I have that freedom anymore (and said freedom-deficiency has very little to do with being married, thankyouverymuch).

Sometimes, I'll be experiencing life and think to myself, "Self, this would be a fantastic topic to write about!" I'll begin to think through the opening sentences in my head, working on the wording, etc. Then, I either get caught up in the reality of my full-time job or computer-less apartment and lose track of the whole thing. Part of the reason it's taken so long to upload the honeymoon pictures is that I wanted them to accompany a story I was going to write about our trip. That story never materialized, so neither did the pictures. I finally gave in and just posted the pictures, hoping that I'll get to the story sometime. We shall see.

A lot of Things seem to be building together to try to sour my life. Well, it ain't working, Things. You're going to have to try harder. Or, better yet, give up your futile attempts to give me a nervous breakdown! I will not succumb! Muhahaha!

Anyway, I hope more will be coming your way from Oslo soon. 7:30 comes pretty early, so I must be on my way. And yes, I was a bit exclamation happy this evening. Toodles.

Drupal 4.5.2 Released

Drupal 4.5.2 has been released. As well as fixing bugs, it also fixes a security issue. So, if you're running Drupal, you should upgrade. More details are available at the Drupal release page.

Comment Spam

I've temporarily disabled anonymous comments due to a recent onslaught of comment spam affecting Drupal sites. I hope to have a solution to this problem soon.

UPDATE: Alright, I've installed a spam module to see if this helps. It's going to take a while to learn from the spam, so some stuff may show up, but it should be eradicated shortly thereafter.

Site upgrade

I've just updated my site to Drupal 4.5.0, so things might be a bit wonky. Please let me know if you have any problems.

The Plunge

Well, I've finally taken the plunge and updated my site to Drupal. I'm sure there are some broken things, so if you run into problems, add a comment to this post or send me an e-mail.

With this new software, you can sign up for an account if you post comments often. Otherwise, you'll get stuck in the moderation queue before comments will be posted.

Migrating from Movable Type to Drupal

When moving between content management systems, one of the first things you have to think about is whether you can import your data from one system to the next. Importing from Blogger to Movable Type was easy. They had straightforward instructions that were easy to find and follow. They even document their import file format.

Drupal—what I'm planning on using here soon—is another story. They have documents scattered all over the web. And it's mostly technical jargon that's over my head. So, I'm going to detail the process I used, hopefully in a way that'll help someone in the future.

All the conversion scripts I found required Perl 5.8.2+, which none of the machines I have access to have installed (they all have 5.6.1). The conversion scripts also require the XML::SAX module, so I had to install and configure that. This, of course, is easier said than done.

Installing Perl for Windows

The version of Perl available for Windows is called ActivePerl. ActivePerl comes in two installation packages, an MSI installer and a ZIP file with a batch file installer. They recommend the MSI installer, but that might require the installation of an updated version of the Windows Installer (which is linked to from the ActivePerl download page). Detailed installation notes are also available.

Installing XML::SAX

In order to run the script, you need to have the XML::SAX module installed. Using ppm to install it didn't seem to work (according to the error message and documentation I found about the error message), so I used cpan instead. To use cpan, you have to install nmake, then install cpan. Then, finally, install XML::SAX.

(Optional) Install XML::SAX::Expat

The Perl-XML documentation recommends installing a faster XML parser, so I installed XML::SAX::Expat. While in cpan, this should be a simple matter of typing install XML::SAX::Expat.

Export your Movable Type entries

At this point, I started following the Drupal documentation, though not exactly. The first step is to add a new Movable Type template (call it drupal.xml), which you'll then use as input for the converter script. The Drupal documentation has a sample template. After trying this the first time, I noticed that my entries appeared in Drupal's blog management page backwards (they appeared correctly on my public blog page). I decided that was enough of a pain to do the import again. So, if you want your newest posts at the top, change <MTEntries lastn="1000" sort_order="ascend"> to <MTEntries lastn="1000" sort_order="descend">. This file should then be available for download from your web site.

Making the conversion script

Step 2 according to the Drupal documentation is to create and run the conversion script. This was one of the major stumbling blocks for me due to an error in the Drupal docs. So, save the conversion script to a file named convert.pl. The fourth line from the bottom should read my $filename = $ARGV[0]; (otherwise you'll run into a parse error). This change and another change (setting the value for the changed column in the table) should be visible in the Drupal documentation soon.

Running the conversion script

The moment of truth. Now it's time to run the conversion script. Put convert.pl and drupal.xml in the same directory. Assuming Perl is in your path (type path in DOS to check the value of your path variable), navigate (in a DOS window) to the directory where convert.pl and drupal.xml reside and type perl convert.pl drupal.xml > nodes.mysql. Assuming this runs correctly, you'll now have a file full of SQL insert statements which you can then import into Drupal.

Importing into Drupal

After the conversion script runs successfully, upload nodes.mysql to your Drupal directory. Now, from a shell session, navigate to your Drupal directory and type mysql -h hostname -u username -p database_name < nodes.mysql (replacing hostname, username, and database_name with the appropriate values). Now all your entries should be imported into Drupal. Almost done!

Fixing Drupal's incrementer

Drupal has an auto-incrementer that normally tracks which entry ID to use next. But when using raw SQL insert statements, it never gets updated. So, now we have to update the incrementer manually. If you're using phpMyAdmin, this is as simple as opening the database, finding the highest nid in the node table, then setting the id of the node_nid row in the sequences table to that number plus 1. (Counting the rows in the node table gives the same number only if the nids have no gaps.) If you don't have phpMyAdmin, you'll have to run three SQL queries:

  1. select max(nid) from node;
  2. select * from sequences;
  3. update sequences set id = (result from first query plus 1) where name = 'node_nid';
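The three queries above can be sketched as one small Python function. To keep the example runnable anywhere, it uses an in-memory SQLite database as a stand-in for MySQL, with a stripped-down node table and a sequences table laid out as (name, id); the function name and sample rows are made up for illustration. Back up your real database before trying anything like this on it.

```python
import sqlite3


def resync_node_sequence(conn: sqlite3.Connection) -> int:
    """Point the node_nid counter just past the highest nid in the node table."""
    # Query 1: find the highest node ID that the raw inserts created.
    max_nid = conn.execute("SELECT COALESCE(MAX(nid), 0) FROM node").fetchone()[0]
    new_value = max_nid + 1
    # Query 3: store the new counter value in the node_nid row.
    conn.execute("UPDATE sequences SET id = ? WHERE name = 'node_nid'", (new_value,))
    conn.commit()
    return new_value


# Stand-in schema and data (simplified; a real node table has many more columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE sequences (name TEXT PRIMARY KEY, id INTEGER)")
conn.execute("INSERT INTO sequences VALUES ('node_nid', 0)")
conn.executemany(
    "INSERT INTO node VALUES (?, ?)",
    [(1, "First post"), (2, "Second post"), (5, "Post after a gap")],
)

new_value = resync_node_sequence(conn)

# Query 2: inspect the sequences table to confirm the update.
print(conn.execute("SELECT * FROM sequences").fetchall())
```

Using max(nid) rather than a row count matters in the sample data: there are three rows, but the highest ID in use is 5.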

That's it! Now, you have all your entries imported. Unfortunately, you've left behind your comments and categories. I'm still working on those. I'll let you know what I figure out.