about the CD list – part two – preparation and importing of data.

Continued from about the CD list – part one.

How – Preparation

The first step was to create the database tables and the model classes that would represent them.

I created a CD table and a Track table and the appropriate classes.

The plan was to populate the data from the cache files and keep a timestamp of when the CD was played, based on the timestamp on the cache file.
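
As a rough illustration of the timestamp idea, the modification time of a file can be read in Java with Files.getLastModifiedTime. This is only a sketch: the class and method names (CacheTimestamp, lastPlayed) are made up for the example, and it assumes the cache file's modification time is what stands in for "last played".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;

final class CacheTimestamp {
    // Returns the last-modified time of a cache file, used here as a
    // stand-in for "when the CD was last played" (an assumption).
    static Instant lastPlayed(Path cacheFile) {
        try {
            return Files.getLastModifiedTime(cacheFile).toInstant();
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
    }
}
```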

How – Importing data

Perl

I then had to consider how to get the data from the file system into a database.

My first instinct was to write it in Perl. Perl is normally my first choice for parsing data in files, but I had no experience of talking to a database from Perl, and I thought it might be good to do something in Java that I didn’t normally do.

Java

So I wrote a standalone program in Java to parse a file created by libcdaudio.

I planned to call this from a script, once for each cache file (libcdaudio creates one per CD). That way I wouldn’t need to write anything to traverse the file system.

Parsing problems

The file was basically a set of key-value pairs separated by an equals sign, as follows:

DTITLE=Stuart Hamm / Radio Free Albemuth

Great, I thought, I can use the built-in Java Properties class to parse that file.

I then realised that the values were sometimes truncated and continued on a second line with the key repeated, as follows:

DTITLE=Tom Tom Club / The Good The Bad and the Funky [extended dub edit

DTITLE=ion]

Damn, the Properties code would probably barf on that.

I decided to try it just in case but, as suspected, keys had to be unique, so the first line in the example above would have been lost.
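
The behaviour is easy to reproduce with java.util.Properties. The sketch below is not the original test, and the class name PropertiesDuplicateDemo is made up for the example. It loads the two DTITLE lines from the example above and shows that only the value from the second line survives.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class PropertiesDuplicateDemo {
    // Loads the given text as a properties file and returns the DTITLE value.
    static String loadTitle(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("DTITLE");
    }

    public static void main(String[] args) {
        String content =
            "DTITLE=Tom Tom Club / The Good The Bad and the Funky [extended dub edit\n"
          + "DTITLE=ion]\n";
        // The second DTITLE replaces the first, so this prints "ion]".
        System.out.println(loadTitle(content));
    }
}
```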

So, I wrote my own parser.
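
A parser for this format does not need to be big. The following is a minimal sketch of one way to do it, not the original code: it treats a repeated key as a continuation and appends its value to what has already been collected. The class name CacheFileParser is made up, and skipping lines that start with # is an assumption about the file format.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CacheFileParser {
    // Parses KEY=VALUE lines. When a key appears more than once, the later
    // values are appended to the earlier ones in file order.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String line : lines) {
            int eq = line.indexOf('=');
            if (line.startsWith("#") || eq < 1) {
                continue; // assumption: ignore comments, blank and malformed lines
            }
            String key = line.substring(0, eq);
            String value = line.substring(eq + 1);
            // First occurrence stores the value, later ones concatenate onto it.
            entries.merge(key, value, String::concat);
        }
        return entries;
    }
}
```

Fed the two DTITLE lines from the example above, this gives a single DTITLE entry ending in "[extended dub edition]".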

Uh-oh

Once I was happy with that, I put in the hooks to write the data to the database.

The plan was that entries that weren’t there would be added, and existing entries would have their timestamp updated.
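
The add-or-update logic can be sketched without a real database. In the example below an in-memory map stands in for the CD table. The names (CdStore, addOrTouch, discKey) are hypothetical, and it assumes each CD can be identified by some unique key.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

final class CdStore {
    // Stand-in for the CD table: hypothetical disc key -> last-played time.
    private final Map<String, Instant> lastPlayedByDisc = new HashMap<>();

    // Adds the entry if it is missing, otherwise updates its timestamp.
    // Returns true if a new entry was added, false if one was updated.
    boolean addOrTouch(String discKey, Instant lastPlayed) {
        return lastPlayedByDisc.put(discKey, lastPlayed) == null;
    }

    // Returns the stored timestamp, or null if the disc is unknown.
    Instant lastPlayed(String discKey) {
        return lastPlayedByDisc.get(discKey);
    }
}
```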

That was when I hit my next problem.

libcdaudio didn’t update the timestamp when reading from the cache. That is quite sensible of course, but I needed it to do that.

It was time to "hack the source".

I modified libcdaudio to update the timestamp of the cache file whenever it read from the file.

Thankfully, this only involved adding one line of C code to the main source file.

I now had a program that could parse a libcdaudio cache file and write the results to the database.

It was a little rough around the edges but it worked.

Displaying the data

I then wrote the code to display the data on the site.

This consisted of a set of classes and JSP pages, one set for listing and one for showing an individual entry.

Continued in part three.
