TODO: Refactor this

Anyone who’s worked on a sufficiently large code base will probably have come across a comment like this one.

// TODO: Refactor this

No doubt left there by some well-meaning developer and probably ignored by everybody else since.

Sometimes it will be accompanied by a rant about exactly what’s wrong with the code in question.

Comments like this serve no useful purpose!

They may offer a bit of stress relief to the person who wrote them, but other than that they are a waste of time. If the person leaving the comment wasn’t inclined to do the refactoring, what makes them think someone else will?

Or maybe they optimistically think they can come back when they get some free time and do it. They will almost never find that free time – the comment will sit there until the developers on the team no longer even notice it.

Here’s what I like to do when I come across some code that I think could do with some refactoring…

Consider working on it right away

If I have time and/or it’s in scope I’ll try to refactor it there and then.

If I can’t do it right away…

Raise a tech debt ticket in the bug tracking software

This creates an actionable task that can be discussed by the team and tracked.

Leave a TODO comment referencing the bug tracker number

This ties the code in question back to the ticket.

Add my name/initials to the comment

This gives people someone to talk to if they happen across the comment.
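Put together, a comment that follows these steps might look like the one inside the sketch below. The ticket key TECH-123, the initials XY and the TodoCheck helper are all made up for illustration, and the pattern simply assumes a tracker whose ticket ids look like PROJECT-123.

```java
import java.util.regex.Pattern;

class TodoCheck {
    // Assumed convention (not a standard): "// TODO(<initials>): <TICKET-123> <description>"
    private static final Pattern ACTIONABLE =
            Pattern.compile("//\\s*TODO\\((\\w+)\\):\\s*([A-Z]+-\\d+)\\b.*");

    /** Returns true when a TODO comment names both an owner and a ticket. */
    static boolean isActionable(String comment) {
        return ACTIONABLE.matcher(comment.trim()).matches();
    }

    public static void main(String[] args) {
        // TODO(XY): TECH-123 split this method up, details are in the ticket
        System.out.println(isActionable("// TODO(XY): TECH-123 split this method up"));
        System.out.println(isActionable("// TODO: Refactor this"));
    }
}
```

A check like this could be run over a code base to flag the TODOs that nobody owns.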

Oh yeah, and I save the rants for down the pub!

Fixing UTF-8 encoding on my Tomcat websites

Just spent a few hours fixing some UTF-8 encoding problems on my blog.

I had a problem with non-ASCII characters being displayed incorrectly.

Turns out that I had a number of different problems to solve.

First I read through Cagan Senturk’s (very useful) UTF-8 Encoding fix (Tomcat, JSP, etc) post.

Fortunately I’d already read Joel Spolsky’s epic unicode post so I had the theory.

First off I needed to make sure all my JSPs had the correct pageEncoding at the top.

I also added the ‘Content-Type’ meta header to my template file.

Next I needed to wire in the EncodingFilter that Cagan so kindly provided.

That meant that non-ascii characters in my JSPs rendered fine but I still had two problems.

Any text that I entered into a form was still being screwed up, as was anything read from the database.

Stack Overflow had the solution (as usual) for the form input.

I needed to amend my Tomcat config to ensure my connector had ‘URIEncoding="UTF-8"’ added to it.

That fixed the form input problem.
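To show why that setting matters, here is a small sketch using the JDK's own URLDecoder rather than Tomcat itself. It assumes the form values travel in the URL and that the container decodes them as ISO-8859-1 unless told otherwise, which as far as I know was the default for the older Tomcat connectors. The class name is just for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class UriDecodingDemo {
    /** Decodes a percent-encoded value using the named charset. */
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // The pound sign as a browser sends it from a UTF-8 page: two percent-encoded bytes
        String pound = "%C2%A3";
        System.out.println(decode(pound, "UTF-8"));      // one character, as typed
        System.out.println(decode(pound, "ISO-8859-1")); // two characters, the familiar garbage
    }
}
```

The same two bytes come out as one character or two depending purely on the charset the server assumes.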

That just left my Postgres database.

I first used ‘psql -l’ to see what encoding my database had.

It was set to ‘LATIN1’ – obviously it needed to be ‘UTF-8’.
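A quick way to see why LATIN1 was never going to work is to ask Java which characters the charset can represent. This is only a sketch (LATIN1 in Postgres corresponds to ISO-8859-1, and the class name is my own), but it shows that anything beyond the first 256 code points simply has no LATIN1 encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Latin1Check {
    /** Returns true if every character in the text can be encoded in the given charset. */
    static boolean fits(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String pound = "\u00A3"; // pound sign
        String euro = "\u20AC";  // euro sign
        System.out.println(fits(pound, StandardCharsets.ISO_8859_1));
        System.out.println(fits(euro, StandardCharsets.ISO_8859_1));
        System.out.println(fits(euro, StandardCharsets.UTF_8));
    }
}
```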

To fix this I needed to drop and recreate my database.

Luckily this was only my local development database (my production one was already UTF-8) so that was simple enough.

Finally, after all that was done, I had proper UTF-8 support on my site.

And to prove it – here’s some non-ascii content from the UTF-8 SAMPLER website.

¥ · £ · € · $ · ¢ · ₡ · ₢ · ₣ · ₤ · ₥ · ₦ · ₧ · ₨ · ₩ · ₪ · ₫ · ₭ · ₮ · ₯

Adding Sphinx to your Java website with jsphinx

I’ve been using Sphinx on my FilmDev website to search users’ recipes and it’s been working really well.

So well that I wanted to add it to my Java websites too.

Setting up Sphinx on a Rails site is made very easy thanks to the Thinking Sphinx plugin.

Unfortunately there is no such plugin for Java so setting it up requires a little more work (though not too much).

First off I downloaded and configured Sphinx until I could call search on the command line and get results back from my database.

I then grabbed the sphinxapi.jar from the downloaded package and dropped it into my WEB-INF/lib directory.

The Java source for that jar is included in the downloaded package – plus a file called “” that I used as the starting point for my own code.

The code works but is fairly basic. I’ve expanded upon it a fair bit and have put it in a github project called jsphinx.

Feel free to grab this code and use and amend as appropriate for your own site.

I encourage you to share any changes you make by forking it on github.

Bear in mind it’s coded against the 0.9.9-release version; I have no idea if it works with the 2.0.1-beta version.

The code includes examples for doing weighting, filtering and ordering.

The command object also supports pagination.

I’m using the code on this blog right now and it works great.

The final thing in that code is something to handle delta indexing.

That’s enough of an involved topic to warrant another blog post…

Zsh completion of arbitrary commands

I spent a good few hours over the weekend trying to figure out how to do something with zsh completion that I figured would probably be quite simple but that I just couldn’t find an example of anywhere.

I wanted to do tab completion based on the output of an arbitrary command.

This was so that I could make full use of AndyA‘s very useful directory shortcut pind script.

Essentially I was porting the _cdpin function from bash to zsh.

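function _cdpin() {
    # Offer the pind entries that match the word being completed
    local cur=${COMP_WORDS[COMP_CWORD]}
    COMPREPLY=( $( hasle ~/.pind -cx $cur ) )
}

complete -F _cdpin cdpin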

It looked like it should be simple but after reading sections of the zsh manual, searching online, looking at the various completion scripts in my zsh installation and trawling Stack Overflow’s zsh content I couldn’t find a succinct example.

Eventually I found the answer in a book called From Bash to Z Shell which I was able to read on my Safari Books Online subscription.

The solution turned out to be ludicrously simple:

function _cdpin() {
    compadd $(hasle ~/.pind -cx)
}

compdef _cdpin cdpin

I could make it even simpler if I stuck it in a file called _cdpin but I wanted to keep it in the same place as the functions.

Now when I type cdpin and hit tab I get a list of my existing entries and I can complete from there.

Annoying that something so simple was so hard to find so I’m posting it here in the hope it helps others.

And as AndyA was smart enough to share his scripts on github I’ve forked it and will add zsh support (via a conditional) and see if I can get him to pull in my changes (I’m sure he’ll do it if I buy him a beer).

As an aside, gotta say I’m loving github – I joined a few months back but have only just started uploading some code to it.

You can see my stuff at

Always be Debugging

Back when I was a Junior Programmer at my first programming gig I used to have conversations like this with my mentor.

Me: See, I enter the value here and then I get back the correct result. So, it works.

Mentor: Did you run it through in the debugger and see what the code was doing?

Me: I don’t need to, it works.

Mentor: It might be working just by luck, run it through and check.

I’d do as he said and run it through and around 50% of the time I’d spot something that could have gone wrong (we were writing C++, there was an awful lot that could go wrong).

So, gradually, I got into the habit of always stepping through any new code I had written in the debugger.

Now I’m more experienced I don’t do that so often (I tend not to make those silly mistakes so much – plus we have unit tests for picking up on said silly mistakes, oh yeah, and I no longer code in C++).

But one habit that has stayed with me is to always run my application in debug mode in Eclipse (I develop web apps in Java now).

That way if I do see something dodgy I can set a breakpoint, refresh the browser and immediately find out what’s going on.

The option to run my application in non-debug mode may as well not exist in Eclipse for me – I think I’ve only ever clicked that button by accident.

However, when another team mate asks me to help them out with a coding issue, at some point we might have a conversation like this.

Me: Stick a breakpoint on that line there.

Team mate: I’m not running in debug mode, I’ll need to restart.

Me: Hmm, you should always run in debug mode, it’s great for things like this.

Team mate: Debug mode is slower.

Me: Restarting every time you have a problem isn’t exactly fast either.

I try not to go off on one at this point and lecture on the benefits of running in debug mode, but I do think that stepping through code and seeing what’s going on can help make people better programmers.

It’s not just the scenario described above.

If you’re running your application and see something not quite right it’s very easy to add a breakpoint and find out what’s going on.

If you’re not running in debug mode the temptation is to think “I’ll take a look at that later” and then forget all about it.

If you don’t already do this then you might want to try it for a week or two.

My betting is that you won’t switch back.

Automatically adding photos to Flickr photosets

I’m quite lazy when it comes to organising my photos into photosets on Flickr.

The whole process has always been a bit too manual for my liking.

It’s been on my todo list to find a way of automating it so this weekend I tried to do just that.

My thinking was to somehow link my photosets to the tags I already use for my photos. These are set when I upload from my photo database (photodb).

I know Flickr Set Manager already does this but I wanted something integrated into my photo database.

I’d already decided I didn’t want to store details of the photosets in my database as it would be a maintenance pain if I removed a set.

Plus I’d need to write some code and web pages for managing it all.

As I was pondering alternatives I had the idea to add some metadata to a photoset description on Flickr then parse and match on that in my app.

Cluttering up my set description with such metadata was a little messy but as you can’t add tags to sets it seemed the simplest way.

The basic plan then was to load all my photosets from Flickr when I chose to upload a photo.

Then parse the set descriptions for my metadata and match that against my photo’s tags.

This would then pre-select those sets in a multi-choice select box displayed on my upload page.

I could then de-select any incorrect choices and choose additional sets too.

Once I had knocked together a little prototype it occurred to me that as I store lots of other metadata about my photos I could automatically add to sets based on all sorts of criteria.

So I set about feeding location data, camera and film information into it too.

The really nice part about this solution is that if I want to create a new set based on a particular location or new camera I just need to add an entry into the set description and it all “just works”.
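The matching step can be sketched roughly as follows. The marker line ("auto:" followed by key=value pairs separated by semicolons) is a format invented for this example, as are the class and method names; the real metadata format in my set descriptions is not shown here.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

class PhotosetMatcher {
    // Hypothetical marker: a description line such as "auto: tag=london; camera=olympus xa3"
    private static final String MARKER = "auto:";

    /** Pulls the key=value criteria out of any marker line, trimmed and lower-cased. */
    static Set<String> parseCriteria(String description) {
        Set<String> criteria = new LinkedHashSet<>();
        for (String line : description.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.toLowerCase(Locale.ROOT).startsWith(MARKER)) {
                continue; // ordinary description text, ignore it
            }
            for (String part : trimmed.substring(MARKER.length()).split(";")) {
                String criterion = part.trim().toLowerCase(Locale.ROOT);
                if (criterion.contains("=")) {
                    criteria.add(criterion);
                }
            }
        }
        return criteria;
    }

    /** A set is pre-selected when any of its criteria matches one of the photo's attributes. */
    static boolean shouldPreselect(String description, Set<String> photoAttributes) {
        Set<String> normalised = new HashSet<>();
        for (String attribute : photoAttributes) {
            normalised.add(attribute.trim().toLowerCase(Locale.ROOT));
        }
        for (String criterion : parseCriteria(description)) {
            if (normalised.contains(criterion)) {
                return true;
            }
        }
        return false;
    }
}
```

Treating tags, locations, cameras and films all as plain key=value attributes is what lets a new kind of set work without any code changes.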

I mentioned above my plan to use a multiple choice select box – I forgot to mention how much I hate them though. Luckily for me I’m not the only one who hates them.

This article talks about various alternatives – the best one for me being the jquery-asmselect plugin which provides a clean and elegant solution to the problem.

Of course, all this only works for newly added photos. What about the 2000+ photos I already have on Flickr?

I need some sort of batch process to re-organise my existing photos.

Fortunately I’ve already written something similar for tagging photos which I can re-use.

Finally, here’s how it looks on screen.

If you click through to the photo on Flickr you’ll see notes I’ve added to explain things in more detail.

Machine tags on Flickr

A friend recently alerted me to this post about machine tagging for film photos.

He suggested I might want to add similar tagging to FilmDev.

I did want to add machine tags to FilmDev (other than the obvious one it uses to link recipes to photos), but I was more interested in adding them to my own Flickr photos.

Because I always add my photos via the photo database webapp that I developed (photodb) it makes it fairly easy for me to add any style of machine tags I want to my photos.

Also, because I always store the Flickr photo id after uploading it makes it easy to re-tag all my existing Flickr photos too.

So I set about implementing it.

Using the really useful flickr machine tag browser (by Paul Mison) I was able to quickly determine the most sensible tags to add to my photos.

I have a mixture of film and digital photos on Flickr so had to come up with tags for both types of photos.

I ended up with this for digital photos:

camera:model=Canon EOS 20D

I didn’t bother with tags for apertures and shutter speeds (even though I hold that information in photodb) as it seemed overkill for my uses.

I’ll come back to the photodb:id tag shortly.

For film photos I have something like:

camera:model=Olympus XA3
film:name=Agfa Vista 100

I don’t necessarily know what lens I used for a particular photo so I don’t record that and many of my film cameras don’t have interchangeable lenses anyway.
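Building the tags is little more than string concatenation in the namespace:predicate=value shape shown above. The sketch below is not the actual photodb code, and the helper names are made up. The quoting helper reflects my understanding that tags containing spaces need wrapping in double quotes when they are sent to Flickr as a space-separated list, so treat that part as an assumption.

```java
class MachineTags {
    /** Builds a machine tag of the form namespace:predicate=value. */
    static String format(String namespace, String predicate, String value) {
        return namespace + ":" + predicate + "=" + value;
    }

    /** Wraps a tag in double quotes if it contains whitespace (assumed Flickr convention). */
    static String quoteIfNeeded(String tag) {
        return tag.matches(".*\\s.*") ? "\"" + tag + "\"" : tag;
    }

    public static void main(String[] args) {
        System.out.println(format("camera", "model", "Canon EOS 20D"));
        System.out.println(quoteIfNeeded(format("film", "name", "Agfa Vista 100")));
    }
}
```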

As I mentioned above I hold the Flickr photo id in photodb whenever I upload a photo so I am theoretically able to re-tag all my photos.

To do this I wrote the back-end code to set tags on a photo (photodb is written in Java) then hooked up an Ajax action to it.

I dug around for something I could use to show me the status of my re-tagging action (I had about 1700 photos to tag) and found JQuery PeriodicalUpdater so I wired that up to give me a countdown.

The last thing to mention is the photodb:id machine tag – predictably enough it refers to the id of the photo in photodb.

I hacked together a quick greasemonkey script to check for this tag and generate a link to that photo in photodb.
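The script itself is JavaScript, but the logic is simple enough to sketch in Java to match the rest of photodb. The base URL below is a placeholder, and the sketch assumes the id is numeric.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PhotodbLink {
    // Assumes numeric ids; the real id format may differ
    private static final Pattern ID_TAG = Pattern.compile("photodb:id=(\\d+)");
    // Placeholder address, not the real photodb URL
    private static final String BASE_URL = "https://photodb.example/photo/";

    /** Returns a link for the first photodb:id machine tag found, if there is one. */
    static Optional<String> linkFor(Iterable<String> tags) {
        for (String tag : tags) {
            Matcher matcher = ID_TAG.matcher(tag.trim());
            if (matcher.matches()) {
                return Optional.of(BASE_URL + matcher.group(1));
            }
        }
        return Optional.empty();
    }
}
```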

This makes it super-easy to link between the two websites (the ability to do that has been on my todo list for ages).

Oh yeah, as to the request to add them to FilmDev that prompted all this, I am still working on that (along with a bunch of other features).

Unlike photodb, real people use FilmDev so I can’t just hack together any old crap. 🙂

The development of FilmDev (part one)

So, a while back I blogged about wanting to learn a new programming language.

A mere year after that I actually started doing something about it and a full year on I have some real progress to report!

I’d decided to learn Ruby and as I’m a Web developer by trade I decided to take a look at Ruby on Rails.

My first project was a very basic “point rails script at database” application to track the stupid amounts of photographic film that I keep buying on eBay.

It was simple to set up but as it was little more than a glorified spreadsheet I didn’t feel I had learned a lot about Ruby and/or Rails (it really helped me get my film addiction under control though).

I needed a meatier project to get stuck into…


About this time I’d started developing my own black and white film.

I’d been umming and ahhing over doing it for quite a while and after reading dozens of “It’s easy!” type posts on various film groups on Flickr curiosity overcame my inertia and I decided to take the plunge.

Turns out they were right; it was quite easy. Even the dreaded “getting the film onto the damn reel in pitch darkness” bit seemed to go well (a combination of luck and a brand new Paterson reel I suspect).

As I only had Ilford film I followed their instructions, using Ilford DD-X as my developer.

Of course, there’s no rule saying you have to match a particular film with a particular developer – part of the fun of home developing is experimenting with different combinations.

It was whilst ruminating over this that I was struck by the idea that it would be really useful if there was a Website that provided an easy way to compare the results of developing a particular film with a particular developer.

The basic principle would be to allow people to sign in and describe what method they used to develop a particular film – a film developing “recipe”.

These recipes could then be linked to photos on Flickr via a special tag.

Thanks to the Flickr API this would all be fairly straightforward to put together.

This then would be my first proper Rails project.

I started the project around the beginning of March 2008 and announced it to the world about 5 weeks later.

The site can be seen at

Although the initial version was basic, it fulfilled my brief, which was to learn some Ruby (and some Rails).

But at this point I still had a lot of gaps in my knowledge.

My hosting solution was a bit ropey – relying on Fast CGI (hey, it seemed like a good idea at the time!).

And such wonders as Capistrano were still a closed book to me.

I still had a lot to learn…


As a follow-up to yesterday’s post on importing CDs I’ve decided to add my CD importing code to github.

It lives here.

I suspect it’s more useful as source code for people to look at than as a project for people to use.

It’s a little bit too much “MeWare” at the moment for it to be generally useful.

Of course, it would not be that much work to make it more generally useful:

  • Put all the paths to binaries in a config file (they’re currently hard-coded in the source).
  • Write some documentation (it has JavaDoc but that is all).
  • Make it a little more flexible (it makes some assumptions about output files with specific names and in specific formats).
  • It’s only ever been run on Linux (I’m not sure if all the binaries it requires exist on Windows)
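The first item on that list could look something like the sketch below, using a plain Java properties file. The "<tool>.path" key convention and the tool names are invented for the example and are not the names used in the project. Falling back to the bare tool name means an unconfigured binary is still looked up on the PATH.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class BinaryPaths {
    private final Properties props = new Properties();

    /** Parses properties text such as "ripper.path=/usr/bin/ripper". */
    static BinaryPaths parse(String text) {
        BinaryPaths paths = new BinaryPaths();
        try {
            paths.props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return paths;
    }

    /** Returns the configured path for a tool, or the bare tool name if none is set. */
    String pathFor(String tool) {
        return props.getProperty(tool + ".path", tool);
    }

    public static void main(String[] args) {
        // "ripper" and "encoder" are hypothetical tool names
        BinaryPaths paths = parse("ripper.path=/usr/bin/ripper\n");
        System.out.println(paths.pathFor("ripper"));
        System.out.println(paths.pathFor("encoder"));
    }
}
```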

Plus, there are a million and one programs out there to rip CDs (this is the bit where I’m supposed to justify writing yet another one…).

Anyway, the code is available to browse so feel free to “check it out”.

“Source control ate my files!”

Everyone who has worked in Software Development for long enough must have heard somebody say that source control ate their files – it’s up there with “Works on my machine” and other such silliness.

Invariably source control didn’t eat their files at all – the problem boiled down to a (sadly) not too uncommon condition of “Fear of Source Control”.

Here are some of the symptoms of such a fear.

Unwillingness to do an update

I worked on a project years back with a tight deadline, not a huge amount going on in the way of process and about 6 developers all coding like hell.

Invariably, every other update broke your local build as someone had changed an interface or forgotten to check in a file (we had no daily build either), so yes, updating was a pain.

For two of the less experienced team members their solution to this problem was to hold off updating as long as possible (they’d happily go two weeks without doing an update).

When I queried their approach their answer was that doing an update “broke things” so was best avoided.

This symptom of course goes hand in hand with…

Unwillingness to do a commit

As anyone who has worked with source control systems long enough knows, they won’t let you commit a file if there are outstanding changes to be merged in.

So, leaving long gaps between updates almost always leads to massive problems when you finally commit your changes.

Long update intervals lead to long commit intervals.

My usual solution to this is to do an update every morning before I start any development work for that day.

That way I get the small amount of pain out of the way without huge disruption to whatever I am working on.

Of course, I now work on much saner projects where things don’t break so often (and when a change is likely to break things, someone has already warned you in advance).

I then commit my work once it’s complete and passes its tests.

I also try to pay attention to what my colleagues are doing on the project too so I can avoid nasty surprises.

Deleting other people’s code

I have seen this happen, usually when someone gets a merge conflict.

Merge conflicts are what happens when two people work on the same code at the same time and the changes from one person’s work are merged into the other’s work.

Sometimes this happens smoothly and everybody is happy and sometimes not so smoothly and one person is very unhappy.

In the face of a merge conflict the correct approach is to fix the code by hand, which often involves talking to the other person who worked on the file to ensure that both their changes and your changes are preserved.

The incorrect approach (and yes, I’ve seen it done) is to remove the offending lines (someone else’s code), keep yours and commit away. This is of course a “bad thing”.

Merge conflicts are of course best avoided; even experienced developers strongly dislike them.

The way to do this is through communication with your fellow team members and knowing who is working on what area of the code (Scrum-like daily meetings are great for keeping up with who is doing what).

Often, said conflicts can be avoided by a bit of advance planning (you do your bit, test and commit, then they update and pick up your changes before they do their bit etc).

Commenting out unused code

If you’re lucky the commented-out code will be accompanied by a vague comment along the lines of “I don’t think this is used any more”.

This stems from the fear of not knowing how to use source control to retrieve old versions of a file.

The correct thing to do of course is to remove it, then commit that change and mention said removal in the commit log.

If the code is being replaced then a comment along the lines of “Replaced foobar1 with foobar2 – foobar1 code lives in version 4.1.2 in CVS” would be most appreciated by future developers (which could of course be you).

Committing backup versions of files

I’ve just finished working on a project where a binary file that was kept in CVS had no fewer than 7 alternate versions (none of which were used) checked in alongside it.

I spent ages working out what each one did, seeing it did nothing, then removing it from CVS.

Again it stems from confusion and fear of using source control to get access to older versions of a file.

The solution

The solution to all of these problems is of course to “lose that fear” and learn to love your source control tool.

One good way to start doing this is to stop using a GUI to manage your source control tool and learn the command line instead (assuming your tool has that option).

That will remove a lot of the mystery of what is going on.

Learn the mechanics of your source control tool by reading the manual.

Eric Sink has an excellent series of posts on source control that highlight some of the many benefits of using it.

The final thing to realise is that correct use of source control will save your bacon.

The initial learning investment will be repaid time and time again.

And that is worth its weight in gold.