Since the CVS version worked less and less reliably for me over time, I finally got over it, threw away my gtktalog catalog file and re-read all my DVDs to build a new catalog from scratch.
And because none of the gtktalog-like programs I tried convinced me, I just wrote my own little Python script that reads the directory structure of a CD/DVD and creates a tab-separated file listing each file's name, size, modification time, md5sum, and file type. I create one such file for every DVD and put them all into one directory. The result is a file-system-based catalog that I can search for a specific file with any tool I like (grep, for example).
Including a simple shell script wrapper, it's fewer than 100 lines of code altogether and does everything I want.
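For anyone who would rather not use grep, searching such a catalog directory could look roughly like the sketch below. It is written for Python 3 and assumes one catalog file per disc, named after the disc ID, with five tab-separated columns (name, size, modification time, md5sum, file type). The function name `search_catalog` is made up for illustration.

```python
import os
import re

def search_catalog(catalog_dir, pattern):
    """Yield (disc_id, fields) for each catalog line whose file name matches pattern."""
    regex = re.compile(pattern, re.IGNORECASE)
    for disc_id in sorted(os.listdir(catalog_dir)):
        path = os.path.join(catalog_dir, disc_id)
        if not os.path.isfile(path):
            continue
        # errors="replace" keeps odd file name encodings from aborting the search
        with open(path, "r", errors="replace") as handle:
            for line in handle:
                fields = line.rstrip("\n").split("\t")
                # assumed column order: name, size, mtime, md5sum, type
                if len(fields) == 5 and regex.search(fields[0]):
                    yield disc_id, fields
```

Something like `for disc, fields in search_catalog(".", r"\.iso$"): print(disc, fields[0])` would then list every matching file together with the disc it lives on.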
EDIT:
In case anyone's interested in this simplistic approach, here it is. You'll note that the path to read from (/mnt/cdrom in my case) is hardcoded in two places:
Code:
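#!/usr/bin/env python
import os, sys
import hashlib  # replaces the deprecated md5 module
import time
import subprocess

def md5sum(filepath):
    "Calculate and return the md5 checksum for a file."
    h = hashlib.md5()
    try:
        f = open(filepath, "rb")
        try:
            # read in 16 MB chunks so large files don't have to fit in memory
            s = f.read(1<<24)
            while len(s):
                h.update(s)
                s = f.read(1<<24)
        finally:
            f.close()
    except IOError:
        # report on stderr so the message doesn't end up in the catalog file
        sys.stderr.write("Error reading file %s\n" % filepath)
        return "IOError"
    return h.hexdigest()
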
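def filetype(filepath):
    "Use the system tool 'file' to get a string that describes the type of a file."
    proc = subprocess.Popen(("file", "-b", filepath), stdout=subprocess.PIPE,
                            universal_newlines=True)
    output = proc.communicate()[0]  # also waits for 'file' to exit
    # replace tabs with spaces so the tab-separated columns stay intact
    return output.strip().expandtabs(1)
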
def filesize(filepath):
    "Return the size of a file as a human readable string (rounded down)."
    _abbrevs = [ (1<<50, 'P'),
                 (1<<40, 'T'),
                 (1<<30, 'G'),
                 (1<<20, 'M'),
                 (1<<10, 'k'),
                 (1, '') ]
    size = os.path.getsize(filepath)
    for factor, suffix in _abbrevs:
        # >= so that a file of exactly 1024 bytes is shown as 1k
        if size >= factor:
            break
    return "%d%s" % (size // factor, suffix)

def filetime(filepath):
    "Return the modification date of a file (in UTC) as a human readable string."
    t = os.path.getmtime(filepath)
    t = time.gmtime(t)
    t = time.strftime("%Y-%m-%d %H:%M:%S", t)
    return t

def katalog_file(filepath):
    "Return a line with all information about a file, separated by tabs."
    # will work only for depth 3 base paths like /mnt/cdrom/filenamehere.
    filename = filepath.split("/", 3)[-1]
    return "%s\t%s\t%s\t%s\t%s" % (filename, filesize(filepath), filetime(filepath), md5sum(filepath), filetype(filepath))

def dirwalk(dir):
    "walk a directory tree, using a generator"
    liste = os.listdir(dir)
    liste.sort()
    for f in liste:
        fullpath = os.path.join(dir, f)
        if os.path.isdir(fullpath) and not os.path.islink(fullpath):
            for x in dirwalk(fullpath):  # recurse into subdir
                yield x
        else:
            yield fullpath

# let's go for a walk.
for x in dirwalk("/mnt/cdrom"):
    print(katalog_file(x))
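The hardcoded /mnt/cdrom could probably be avoided by taking the mount point from the command line and deriving the stored name with `os.path.relpath` instead of splitting at a fixed depth. A rough Python 3 sketch; the helper names `relative_name` and `mount_point` are hypothetical and not part of the script above:

```python
import os

def relative_name(filepath, root):
    """Return filepath relative to root, whatever the depth of root is."""
    return os.path.relpath(filepath, root)

def mount_point(argv, default="/mnt/cdrom"):
    """Use the first command-line argument as mount point, else the default."""
    return argv[1] if len(argv) > 1 else default
```

With these, the last lines of the script could walk `mount_point(sys.argv)` and katalog_file() could call `relative_name(filepath, root)`, so a different mount point would only need to be given once.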
And wrap it in a shell script:
Code:
#!/bin/bash
# highest existing catalog number, with leading zeros stripped
i=$(ls 0* 2>/dev/null | tail -1 | sed -e 's@^0*@@')
while true
do
    i=$((i + 1))
    cdrom_id=$(printf "%05d" "$i")
    echo "Waiting for cdrom $cdrom_id. Press Enter when ready."
    read -r answer
    sleep 15    # wait for the drive to become ready
    mount /mnt/cdrom
    ./katalog_cdrom.py > "$cdrom_id"
    umount /mnt/cdrom
    eject /dev/hdd
done
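The numbering logic in the first line of the wrapper (find the highest existing catalog file, strip the zeros, add one, pad to five digits again) can also be sketched in Python 3. This is only an illustration: `next_disc_id` is a made-up name, and unlike `ls 0*` it considers every all-digit file name, not just those starting with 0.

```python
import os
import re

def next_disc_id(catalog_dir, width=5):
    """Return the zero-padded ID following the highest numeric file name."""
    numbers = [int(name) for name in os.listdir(catalog_dir)
               if re.fullmatch(r"\d+", name)]
    # an empty directory starts the numbering at 1
    return str(max(numbers, default=0) + 1).zfill(width)
```

Calling `next_disc_id(".")` in the catalog directory would give the name for the next disc to be read.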