PowerFactor Veteran
Joined: 30 Jan 2003 Posts: 1693 Location: out of it
|
Posted: Wed Apr 28, 2004 8:19 pm Post subject: Scan multi-page documents directly to pdf quickly. |
|
|
I don't know if anyone else has been as frustrated as I have by the lack of easy-to-use software focused on document scanning on Linux. I'm not talking about OCR, just scanning documents into a portable multi-page image format. In fact, the only software I've found that does exactly what I wanted was Adobe Acrobat on Windows. But with that I had to go through the Windows TWAIN driver interface for my scanner, which seems designed to make the process as slow and clumsy as possible. Still, that was the method I used for the last couple of years on occasions when it was useful.
Finally, after I bought a new printer/scanner a couple of months ago (Epson CX3200, it's nice), I decided it was time to figure out how to do the job on Linux. By then I knew all the command-line tools to do what I wanted were available; it was "just" a matter of writing a little script to tie it all together. Being the amateur I am, it took me a couple of days to figure it all out, but I got it working. This was back in January.
The other day I was playing around with controlling it with kdialog and I thought maybe someone else would find it useful (the non-KDE-dependent version, that is), so I figured why not post it. I did try to make it a little more user friendly. It's still an ugly hack, but it works for me. Not only that, but once it's set up it's better at its specific purpose than anything else I've tried.
The script depends on the following packages.
sane-frontends
imagemagick
netpbm
ghostscript
If your scanner has a decent 1-bit (Lineart) mode (or if you can actually get convert's threshold function to work for you), then you can modify the script slightly and get rid of the netpbm dependency.
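For anyone wondering what the threshold step does to a greyscale scan, here is a minimal Python sketch. It is only an illustration of simple thresholding, not pgmtopbm's actual code; the function name is made up, and treating samples strictly above the cutoff as white is an assumption about the boundary case.

```python
def threshold_row(pixels, maxval=255, threshold=0.55):
    """Turn grey samples into 1-bit values: True = white, False = black.

    threshold is a fraction of maxval, like the THRESHOLD setting in the script.
    """
    cutoff = threshold * maxval
    return [sample > cutoff for sample in pixels]


# One row of grey samples, from black (0) to white (255).
row = [0, 100, 140, 141, 200, 255]
print("".join("." if white else "#" for white in threshold_row(row)))
# prints "###..."
```

Raising the threshold turns more of the light grey background (colored paper, for example) black, and lowering it does the opposite, which is why the value needs some trial and error.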
You need to know how to use the scanimage program with your scanner, as you will need to modify the SCANDEVICE and SCANCMD variables to fit. The rest of the configuration is pretty self explanatory I think.
To use it, just put your first page in the scanner, then run the script with the name of the file to save as the argument. It will immediately scan the first page, then prompt you for more. The rest is gravy.
It's not very robust. If your scanner has a warm-up period, make sure it's finished before you start; otherwise scanimage may time out and the script gets a little confused. And it's not designed to work with scanners that have an ADF.
Anyway, hope someone can use it. Even if just for inspiration.
EDIT: Later versions are posted further down the thread. Chrwei posted one that should work with an ADF (I don't have the hardware to try it) and I've posted the Python version I've been using for a while. This version is left here mainly for reference.
Code: |
#!/bin/bash
#
# scan-pdf version 2
# April 27, 2004
# Copyright 2004 Zacchaeus Pearsall (zap4260 at yahoo.com)
# Distributed under the terms of the GNU General Public License v2
# Set this to a value between 0 and 1
# You may have to play with it some to get good scans
# of colored paper
THRESHOLD=0.55
# Scan resolution
RES=300
# Set these for the size paper you are scanning
# defaults are X=212.5, Y=275; for US letter paper
X=212.5
Y=275
# Paper size for pdf output. See "man convert" for possibilities
PAGESIZE="Letter"
# Set this to the appropriate sane device for your scanner
SCANDEVICE="epson:/dev/usb/scanner0"
# Leave this as Gray unless you have a scanner that has
# a good usable 1-bit mode
SCANMODE=Gray
# If your scanner has a good 1-bit mode and you plan
# to use it then uncomment this. I have no way to test this.
# You may need to make some other modifications as well.
#SCANNER_HAS_BW="yes"
# Modify the scan command as needed to fit your scanner
SCANCMD="scanimage -d ${SCANDEVICE} --mode ${SCANMODE} \
-x ${X} -y ${Y} --resolution ${RES}"
# The output file name is the only argument
if [ -z "${1}" ]
then
    echo "USAGE: scan-pdf file.pdf"
    exit 0
fi
if [ -e "${1}" ]
then
    echo "${1}: file exists"
    echo "Press any key to continue, Ctrl-c to quit"
    read -s -n 1 junk
fi
PDFFILE="${1}"
# Work in a temporary directory that is removed on exit
TMPDIR=`/bin/mktemp -td scan-pdf.XXXXXXXXXX`
trap "rm -fr ${TMPDIR}" 0
trap "exit 2" 1 2 3 15
i=1
MORE_PAGES="y"
echo "To scan more sheets press space; press q when done"
# read leaves MORE_PAGES empty when space is pressed, so only "q" stops the loop
while [ "${MORE_PAGES}" != "q" ]
do
    # Zero-pad the page number so the pg*.pbm glob sorts in scan order
    if [ ${i} -lt 10 ]
    then
        PG="pg0${i}"
    else
        PG="pg${i}"
    fi
    if [ -n "${SCANNER_HAS_BW}" ]
    then
        ${SCANCMD} > "${TMPDIR}/${PG}.pbm"
    else
        ${SCANCMD} | pgmtopbm -threshold -value ${THRESHOLD} > \
            "${TMPDIR}/${PG}.pbm"
    fi
    read -s -p "More?" -n 1 MORE_PAGES
    echo ' '
    let i++
done
# Join all pages into one PostScript file, then convert it to PDF
convert "${TMPDIR}"/pg*.pbm -adjoin -page ${PAGESIZE} "${TMPDIR}/pgs.ps"
ps2pdf13 "${TMPDIR}/pgs.ps" "${TMPDIR}/pgs.pdf"
mv -i "${TMPDIR}/pgs.pdf" "${PDFFILE}"
|
Last edited by PowerFactor on Sun Feb 11, 2007 4:42 pm; edited 1 time in total |
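A side note on the pg01, pg02, ... naming in the script above: the zero padding is what keeps the pg*.pbm glob in scan order. The short Python sketch below shows the effect with plain string sorting, which is how a glob orders names in the C locale (other locales may collate differently). The helper name is hypothetical.

```python
def page_name(number, width=2):
    """Build a zero-padded page name such as 'pg07'."""
    return "pg" + str(number).zfill(width)


unpadded = sorted("pg%d" % n for n in (1, 2, 10))
padded = sorted(page_name(n) for n in (1, 2, 10))

print(unpadded)  # ['pg1', 'pg10', 'pg2'] : page 10 lands before page 2
print(padded)    # ['pg01', 'pg02', 'pg10'] : scan order preserved
```

With two digits the ordering only holds up to 99 pages; a wider pad avoids the problem for longer documents.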
|
FatherBusa Apprentice
Joined: 21 Mar 2004 Posts: 166
|
Posted: Sat Jan 08, 2005 2:13 am Post subject: |
|
|
Dude, you're a genius. This is just what I was looking for. Thanks! |
|
chrwei n00b
Joined: 16 Feb 2005 Posts: 2
|
Posted: Wed Feb 16, 2005 5:20 am Post subject: |
|
|
very nice, here's my enhancements :)
summary:
- Added command line options with defaults
- Added ADF support with a command line toggle to use the flatbed. Can be set to use the flatbed by default with a command line toggle to use the ADF.
- Changed to use scanimage's batch mode and prompt so that timeouts shouldn't be an issue. ADF doesn't use the prompt.
- Made the scanner device name optional, as scanimage will normally detect your scanner automatically.
scanners tried:
- HP Officejet 6110
TODO:
- add more paper size options
- NetPBM says pgmtopbm is deprecated as of 7/2004 and to use pamditherbw instead. I plan on only doing color or full greyscale documents so I'm not touching this.
bugs:
- "mode" seems to be scanner specific, some want "Grey" others want "Greyscale". - needs testing
- might be an issue with providing -x and -y when using ADF, I need to test more
things-i-wish-worked-better:
- too many temp files!
and the code:
Code: |
#!/bin/bash
#
# scan-pdf version 3
# April 27, 2004
# Copyright 2004 Zacchaeus Pearsall (zap4260 at yahoo.com)
# Distributed under the terms of the GNU General Public License v2
#
# February 15, 2005
# Chris Weiss
# - Added ADF and color support
# - Changed to use sane's built in batch mode
# - Added command line options
###defaults - set these so you don't have to supply them
### on the command line every time
# Set this to a value between 0 and 1
# You may have to play with it some to get good scans
# of colored paper, only used for BW scans
THRESHOLD=0.55
# Scan resolution
RES=300
# Set this to the appropriate sane device for your scanner
# if you have more than one or sane doesn't autodetect your scanner
# SCANDEVICE="epson:/dev/usb/scanner0"
SCANDEVICE=""
# use ADF in no-prompt batch mode. Add the options your printer needs for this
ADF=Y
ADFOPTS="--batch-scan=yes" #hpoj
# for black and white choose grey
# if you have a scanner with a good 1-bit mode choose lineart
# for full color PDF's choose color
SCANMODE="color"
# If your scanner has a good 1-bit mode and you plan
# to use it then change this to Y
SCANNER_HAS_BW=N
# Paper size for pdf output. See "man convert" for possibilities
PAGESIZE="Letter"
#TODO: add X and Y sizes for more paper
# additional options
ADDOPT=""
###end defaults - you shouldn't need to modify anything below here
myname=`basename "$0"`
usage() {
cat<<EOF
$myname scans documents from your flatbed or ADF scanner and stores them in a multi page pdf.
Usage: $myname [Options] filename.pdf
Options:
-page "size" Page size for the PDF. See "man convert" for possibilities
-mode "mode" lineart, greyscale, or color.
-1bit [Y/N] If your scanner has a good 1-bit mode and you want lineart, use Y here.
-adf [Y/N] Use ADF in no-prompt batch mode (Y/N) - edit this script and set your scanner's options
-res dpi Resolution to scan at in DPI
-opts "options" Additional options to pass to the 'scanimage' program
-threshold 0.55 Value between 0 and 1 to pass to 'pgmtopbm'
-h Help This info.
Open $myname in your favorite editor to change the default values.
$myname requires: sane-frontends, imagemagick, netpbm, and ghostscript
EOF
exit 0
}
while [ $# -ne 0 ];
do
case "$1" in
-page) shift; PAGESIZE=$1 ;;
-mode) shift; SCANMODE=$1 ;;
-1bit) shift; SCANNER_HAS_BW=$1 ;;
-res) shift; RES=$1 ;;
-threshold) shift; THRESHOLD=$1 ;;
-opts) shift; ADDOPT=$1 ;;
-adf) shift; ADF=$1 ;;
-h) usage ;;
*) PDFFILE=${1} ;;
esac
shift
done
if [ -z ${PDFFILE} ]; then
usage
fi
if [ -e ${PDFFILE} ]
then
echo "${PDFFILE}: file exists"
echo "Press any key to overwrite, Ctrl-c to quit"
read -s -n 1 junk
fi
outdir=`dirname "$PDFFILE"`
OPTIONS=""
BITCONVERT=""
case "$SCANMODE" in
color)
OPTIONS=" --mode color"
;;
# accept the spellings used in the usage text and the comments above
gray|grey|grayscale|greyscale)
OPTIONS=" --mode Greyscale"
;;
lineart)
if [ "$SCANNER_HAS_BW" = "Y" ] || [ "$SCANNER_HAS_BW" = "y" ]; then
OPTIONS=" --mode Lineart"
else
BITCONVERT="Y"
OPTIONS=" --mode Greyscale"
fi
;;
esac
OPTIONS="$OPTIONS --resolution $RES"
if [ "$SCANDEVICE" != "" ]; then
OPTIONS="-d $SCANDEVICE $OPTIONS"
fi
if [ "$ADF" = "Y" ] || [ "$ADF" = "y" ]; then
OPTIONS="$OPTIONS $ADFOPTS"
else
OPTIONS="$OPTIONS --batch-prompt"
fi
if [ -n "$ADDOPT" ]; then
OPTIONS="$OPTIONS $ADDOPT"
fi
case "${PAGESIZE}" in
Letter)
OPTIONS="$OPTIONS -x 212.5 -y 275" ;;
esac
origdir=`pwd`
cd "$outdir"
#echo "scanimage $OPTIONS -b "
scanimage $OPTIONS -b
if [ "$BITCONVERT" != "" ]; then
echo "Converting greyscale to lineart"
for f in out*.pnm; do
#echo "pgmtopbm -threshold -value ${THRESHOLD} < $f > $f.pbm"
pgmtopbm -threshold -value "${THRESHOLD}" < "$f" > "$f.pbm"
rm -f "$f"
done
#echo "convert out*.pbm -adjoin -page ${PAGESIZE} ${PDFFILE}.ps"
echo "creating postscript"
convert out*.pbm -adjoin -page ${PAGESIZE} ${PDFFILE}.ps
rm -f out*.pbm
else
#echo "convert out*.pnm -adjoin -page ${PAGESIZE} ${PDFFILE}.ps"
echo "creating postscript"
convert out*.pnm -adjoin -page ${PAGESIZE} ${PDFFILE}.ps
rm -f out*.pnm
fi
#echo "ps2pdf13 ${PDFFILE}.ps ${PDFFILE}"
echo "Convert postscript to PDF"
ps2pdf13 ${PDFFILE}.ps ${PDFFILE}
rm -f ${PDFFILE}.ps
cd "$origdir"
|
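As a rough illustration of how the scanimage command line in the script above differs between flatbed and ADF use, here is a small Python sketch that only builds the argument list and does not run anything. The function and parameter names are hypothetical. `--batch`, `--batch-prompt` and `--batch-count` are scanimage options; `--batch-scan=yes` is the backend-specific ADF option taken from the script (marked hpoj there), so other scanners will likely need something else.

```python
def batch_scan_args(mode="Greyscale", resolution=300, adf=False,
                    adf_opts=("--batch-scan=yes",), device=None, count=None):
    """Build a scanimage argument list for batch scanning (illustrative only)."""
    args = ["scanimage"]
    if device:
        # Only name a device when autodetection is not wanted.
        args += ["-d", device]
    args += ["--mode", mode, "--resolution", str(resolution), "--batch"]
    if adf:
        # The feeder pulls pages on its own, so no prompt between pages.
        args += list(adf_opts)
    else:
        # Flatbed: wait for the user to place each page.
        args.append("--batch-prompt")
    if count is not None:
        # Stop after a fixed number of pages instead of waiting for the user.
        args.append("--batch-count=%d" % count)
    return args


print(" ".join(batch_scan_args(adf=False, count=3)))
```

Passing a page count is one way to end a flatbed batch without having to interrupt scanimage from the keyboard.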
|
|
r.abbott Tux's lil' helper
Joined: 16 Aug 2004 Posts: 113 Location: Herat, Afghanistan
|
Posted: Sun Feb 20, 2005 2:34 am Post subject: |
|
|
This thing is great! Thanks. |
|
gcediel n00b
Joined: 27 Jul 2004 Posts: 21 Location: Madrid, Spain
|
Posted: Fri Apr 22, 2005 7:13 pm Post subject: |
|
|
One (maybe silly) question: How can I make scanimage stop scanning more pages? I have tried several keys, but I can't stop it. _________________ Best regards.
Guillermo |
|
r.abbott Tux's lil' helper
Joined: 16 Aug 2004 Posts: 113 Location: Herat, Afghanistan
|
Posted: Fri Apr 22, 2005 8:17 pm Post subject: |
|
|
Use <Ctrl-D> |
|
chrwei n00b
Joined: 16 Feb 2005 Posts: 2
|
Posted: Sat Apr 23, 2005 1:52 am Post subject: |
|
|
I haven't used it in a while, but I think it tells you that on screen, at least it did on mine. You should run it in a terminal and not just from a "run" dialog. |
|
gcediel n00b
Joined: 27 Jul 2004 Posts: 21 Location: Madrid, Spain
|
Posted: Sun Apr 24, 2005 10:37 am Post subject: |
|
|
Well, CTRL+D doesn't work for me.
BTW: very nice stuff! _________________ Best regards.
Guillermo |
|
djmaze n00b
Joined: 25 Jun 2003 Posts: 36 Location: Berlin, Germany
|
Posted: Sun Apr 24, 2005 11:09 am Post subject: |
|
|
CTRL+C works for me. (Try it two times, if it doesn't work.) |
|
gcediel n00b
Joined: 27 Jul 2004 Posts: 21 Location: Madrid, Spain
|
Posted: Sun Apr 24, 2005 3:49 pm Post subject: |
|
|
Thanks, it works, although not a clean way. _________________ Best regards.
Guillermo |
|
zatalian Apprentice
Joined: 27 Aug 2002 Posts: 179 Location: Gent, Belgium
|
Posted: Mon Sep 04, 2006 2:39 pm Post subject: |
|
|
This script used to work for me but now convert gives me trouble...
convert -page letter converts the original image to a blank postscript file. Converting without the -page option works, but then the PDF document is not in the correct format. Is this happening to anybody else? Any solutions? |
|
bludger Guru
Joined: 09 Apr 2003 Posts: 389
|
Posted: Thu Sep 21, 2006 7:36 am Post subject: |
|
|
My HP 3500c doesn't have the mode function at all. This means that it can only output colour images. How would you convert something like this to black and white?
Also I had a number of tiff files and managed to convert them into a multi page pdf with:
convert <tif1> <tif2> <tif3> file.pdf
Why not just convert like this, leaving out the intermediate ps stage? |
|
bludger Guru
Joined: 09 Apr 2003 Posts: 389
|
Posted: Fri Sep 22, 2006 4:29 pm Post subject: |
|
|
I solved my problem with the following:
scanimage -d <device> --resolution 150|ppmtopgm|pamthreshold -simple >tempscanfile1.pbm
convert -compress fax tempscanfile*.pbm outfile.pdf
This produced a 33kB file with resolution 150 and a 60kB file with resolution 300. The 150 res version was readable, but a bit ugly and the 300 version was excellent. |
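The two commands above can also be driven from Python. The sketch below is one possible way to do it, not part of anyone's script in this thread: the helper names are hypothetical, and only the scanimage, ppmtopgm, pamthreshold and convert invocations come from the post. The builders just return argument lists; `run_pipeline` connects the stages with pipes and writes the last stage's output to a file.

```python
import subprocess


def lineart_stages(device, resolution=300):
    """scanimage | ppmtopgm | pamthreshold -simple, as argument lists."""
    return [
        ["scanimage", "-d", device, "--resolution", str(resolution)],
        ["ppmtopgm"],
        ["pamthreshold", "-simple"],
    ]


def fax_pdf_command(pbm_files, out_pdf):
    """convert -compress fax page1.pbm page2.pbm ... out.pdf"""
    return ["convert", "-compress", "fax"] + list(pbm_files) + [out_pdf]


def run_pipeline(stages, out_path):
    """Run the stages as a shell-style pipeline; return their exit codes."""
    procs = []
    prev_stdout = None
    with open(out_path, "wb") as out:
        for index, argv in enumerate(stages):
            is_last = index == len(stages) - 1
            proc = subprocess.Popen(
                argv,
                stdin=prev_stdout,
                stdout=out if is_last else subprocess.PIPE,
            )
            if prev_stdout is not None:
                # The child holds its own copy; close ours so EOF propagates.
                prev_stdout.close()
            prev_stdout = proc.stdout
            procs.append(proc)
        return [proc.wait() for proc in procs]
```

A call such as `run_pipeline(lineart_stages("<device>"), "tempscanfile1.pbm")` followed by `subprocess.call(fax_pdf_command(["tempscanfile1.pbm"], "outfile.pdf"))` would mirror the two shell commands, assuming the tools are installed.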
|
bludger Guru
Joined: 09 Apr 2003 Posts: 389
|
Posted: Mon Jan 01, 2007 6:38 pm Post subject: |
|
|
I have been using the above method successfully and conveniently for the last few months now. One problem that I have found is that when I try to convert multiple pbm files into one multi page pdf, I can quickly run out of memory if I get above 6 pages or so. Does anyone have any suggestions as to how to get around this? |
|
martoss n00b
Joined: 09 Dec 2003 Posts: 25
|
Posted: Wed Jan 03, 2007 8:52 pm Post subject: ...xsane? |
|
|
Isn't xsane doing the same?
My xsane version has an option to scan "pages" to a pdf. Works pretty well AFAIR. I don't see a big difference.
Xsane also has other nice features like just "copying stuff" and "emailing stuff". Anyway, your script sounds nice too |
|
PowerFactor Veteran
Joined: 30 Jan 2003 Posts: 1693 Location: out of it
|
Posted: Sun Feb 11, 2007 3:55 pm Post subject: |
|
|
It seems I have been lax in keeping up with this. Better late than never I guess.
zatalian wrote: | this script used to work for me but now convert gives me trouble...
convert -page letter converts the original image to a blank postscript file. Converting without the -page option works, but then the PDF document is not in the correct format. Is this happening to anybody else? Any solutions? |
I've run into this several times. It seems to be some interdependency between imagemagick and ghostscript that causes a problem when I upgrade one or the other. Usually recompiling imagemagick after a ghostscript upgrade fixes it.
bludger wrote: | I have been using the above method successfully and conveniently for the last few months now. One problem that I have found is that when I try to convert multiple pbm files into one multi page pdf, I can quickly run out of memory if I get above 6 pages or so. Does anyone have any suggestions as to how to get around this? |
I ran into this a while back too. You need to use the "-limit Memory" and possibly the "-limit Map" options for convert to limit its RAM usage. Usually 1/4 of my physical RAM seems to work well enough. It does take a long time to convert though.
martoss wrote: | Isn't xsane doing the same? ... |
It is now and I'm glad to see it. It didn't have those options 3 years ago when I posted this, though. It still looks like this script might be more convenient for some tasks. Xsane is probably less buggy though. |
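To make the "1/4 of physical RAM" rule of thumb concrete, here is a small Python sketch that turns a RAM size into `-limit` arguments for convert. The helper names are hypothetical. The `MiB` suffix is what recent ImageMagick releases accept as far as I know; older releases took a plain number of megabytes, so treat the exact value format as an assumption and check your version's documentation. The `os.sysconf` lookup is Linux-specific.

```python
import os


def physical_ram_bytes():
    """Total physical RAM in bytes (Linux-specific sysconf names)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def convert_limit_args(total_ram_bytes, fraction=0.25):
    """Build '-limit memory' arguments capped at a fraction of RAM."""
    mib = max(1, int(total_ram_bytes * fraction) // (1024 * 1024))
    return ["-limit", "memory", "%dMiB" % mib]


# Example: a machine with 2 GiB of RAM gets a 512 MiB cap.
print(convert_limit_args(2 * 1024 ** 3))
# ['-limit', 'memory', '512MiB']
```

The resulting list would go right after `convert` on the command line, before the input files.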
|
PowerFactor Veteran
Joined: 30 Jan 2003 Posts: 1693 Location: out of it
|
Posted: Sun Feb 11, 2007 4:36 pm Post subject: |
|
|
I've also made some changes since that original version. I converted it to Python and added some ncurses "eyecandy" using dialog. Also got rid of the netpbm dependency. I had intended to rewrite it as a "proper" modular program with a separate config file and such but never got very far with it. It's not something I use very often anyway.
Anyhow, here's my latest working version. Plenty of bugs I'm sure but it mostly works when I need it.
Dependencies have changed a little:
dev-lang/python
media-gfx/sane-frontends
media-gfx/imagemagick
virtual/ghostscript
dev-util/dialog
Code: |
#!/usr/bin/python
#
# scan-pdf version 8
# Sep 13, 2004
# Copyright 2004 Zack Pearsall (zap4260@yahoo.com)
# Distributed under the terms of the GNU General Public License v2
import sys
import os
import signal
import shutil
import math
from tempfile import mkdtemp
# Set this to a value between 0 and 1
# You may have to play with it some to get good scans
# of colored paper
THRESHOLD="0.55"
# Scan resolution
RES="300"
# Set these for the size paper you are scanning
# defaults are X=212.5, Y=275; for US letter paper
X="212.5"
Y="275"
# Paper size for pdf output. See "man convert" for possibilities
PAGESIZE="Letter"
# Set this to the appropriate sane device for your scanner
SCANDEVICE="epson:"
# Leave this as Gray unless you have a scanner that has
# a good usable 1-bit mode
SCANMODE="Gray"
# Modify the scan command as needed to fit your scanner
SCANCMD="scanimage -d " + SCANDEVICE + " --mode " + SCANMODE + \
    " -x " + X + " -y " + Y + " --resolution " + RES
# ImageMagick limits
MEMLIMIT="128"
MAPLIMIT="256"
# End of configuration options
def cleanup(signum=0 , stkframe=0):
    # Remove the temporary directory; also used as the signal handler
    if os.path.isdir(TMPDIR):
        shutil.rmtree(TMPDIR, 1)
    sys.exit(signum)
def imove(src, dest, ask=True):
    # Move src to dest, asking before overwriting an existing file
    if ask and os.path.isfile(dest):
        userin=raw_input("overwrite file '" + dest + "'? ")
        if userin == "y" or userin == "yes":
            shutil.move(src, dest)
    else:
        shutil.move(src, dest)
def scanpage(scancmd, pgname):
    # Scan one page, copying scanimage's PNM output to a file
    # while feeding a percentage to a dialog gauge
    buffsize=1024
    progress=os.popen("dialog --gauge \"Scanning...\" 6 60", "w")
    scan=os.popen(scancmd)
    magic=scan.readline()
    if magic == "P4\n":
        outfilename=pgname + ".pbm"
    elif magic == "P5\n":
        outfilename=pgname + ".pgm"
    else:
        outfilename=pgname + ".ppm"
    outfile=file(outfilename, "w")
    outfile.write(magic)
    # Skip comment and blank lines, then read the "width height" line
    buff=scan.readline()
    while buff.startswith("#") or buff.startswith("\n"):
        outfile.write(buff)
        buff=scan.readline()
    hw=buff.split()
    outfile.write(buff)
    # Expected raster size in bytes. Only P4 and P5 are handled;
    # color (P6) output would leave bmsize undefined.
    if magic == "P4\n":
        bmsize=int(hw[0]) * int(hw[1]) / 8
    elif magic =="P5\n":
        buffsize=8192
        buff=scan.readline()
        bmsize=int(float(hw[0]) * float(hw[1]) * math.ceil(math.log(float(buff)+1, 2)) / 8)
        outfile.write(buff)
    bytescopyed=0
    prevpercent=0
    buff=scan.read(buffsize)
    while len(buff) != 0:
        bytescopyed+=len(buff)
        if int(math.ceil(float(bytescopyed) / float(bmsize) * 100)) > prevpercent:
            prevpercent=int(math.ceil(float(bytescopyed) / float(bmsize) * 100))
            progress.write(str(prevpercent) + "\n")
            progress.flush()
        outfile.write(buff)
        buff=scan.read(buffsize)
    progress.close()
    scan.close()
    outfile.close()
# start of main program
if len(sys.argv) <= 1:
    print "USAGE: scan-pdf file.pdf\n"
    sys.exit(0)
PDFFILE=sys.argv[1]
if os.path.isfile(PDFFILE):
    print '\a'
    if os.system("dialog --yesno \"" + PDFFILE + ": File exists! Continue?\" 15 60") != 0:
        os.system("clear")
        sys.exit(0)
elif os.path.isdir(PDFFILE):
    print "\a'" + PDFFILE + "' is a directory!"
    sys.exit(0)
TMPDIR=mkdtemp("scan-pdf")
signal.signal(signal.SIGHUP, cleanup)
signal.signal(signal.SIGINT, cleanup)
signal.signal(signal.SIGQUIT, cleanup)
signal.signal(signal.SIGTERM, cleanup)
os.system("dialog --msgbox \"Insert first page\" 5 24")
i=1
retval=0
while retval == 0:
    PG="pg" + str(i).zfill(5)
    scanpage(SCANCMD, TMPDIR + "/" + PG)
    i+=1
    retval=os.system("dialog --yesno \"Scan another page?\" 5 25")
os.system("dialog --infobox \"Converting...\" 3 23")
os.system("convert -limit Memory " + MEMLIMIT + " -limit Map " + MAPLIMIT + \
    " -threshold " + str(int(float(THRESHOLD) * 65535)) + " " + \
    TMPDIR + "/pg*.pgm -adjoin -page " + PAGESIZE + " " + TMPDIR + "/pgs.ps")
os.system("ps2pdf13 " + TMPDIR + "/pgs.ps " + TMPDIR + "/pgs.pdf")
os.system("clear")
imove(TMPDIR + "/pgs.pdf", PDFFILE)
cleanup()
|
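The progress gauge in the script above depends on working out, from the PNM header, how many raster bytes scanimage is going to send. Below is a standalone Python 3 sketch of that calculation following the binary PNM layout as I understand it (rows of a P4 bitmap padded to whole bytes, one byte per sample when maxval is below 256 and two otherwise). The function name is hypothetical, and unlike the script it also covers color (P6) data.

```python
def expected_raster_bytes(magic, width, height, maxval=1):
    """Number of raster bytes that follow a binary PNM header."""
    if magic == "P4":
        # Bitmap: 8 pixels per byte, each row padded to a whole byte.
        return ((width + 7) // 8) * height
    channels = {"P5": 1, "P6": 3}[magic]
    bytes_per_sample = 1 if maxval < 256 else 2
    return width * height * channels * bytes_per_sample


print(expected_raster_bytes("P5", 100, 50, 255))    # 5000
print(expected_raster_bytes("P5", 100, 50, 65535))  # 10000
print(expected_raster_bytes("P4", 10, 2))           # 4
print(expected_raster_bytes("P6", 100, 50, 255))    # 15000
```

For 8-bit and 16-bit greyscale this gives the same numbers as the log-based formula in the script; for bitmaps it differs slightly whenever the width is not a multiple of 8.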
|
|
bludger Guru
Joined: 09 Apr 2003 Posts: 389
|
Posted: Thu Feb 15, 2007 10:08 am Post subject: |
|
|
PowerFactor wrote: | I ran into this a while back too. You need to use the "-limit Memory" and possibly the "-limit Map" options for convert to limit its RAM usage. Usually 1/4 of my physical RAM seems to work well enough. It does take a long time to convert though. | Thanks for this. I just found this out independently today and was returning to the thread to post my results, but it appears that you beat me to it.
I have just one question though. I used only the memory limit option. What does the map limit option actually do? The documentation seems rather sparse. |
|
PowerFactor Veteran
Joined: 30 Jan 2003 Posts: 1693 Location: out of it
|
Posted: Fri Feb 16, 2007 3:41 am Post subject: |
|
|
As I understand it, the Map limit option limits the amount of file space that can be mmapped for the pixel cache.
http://en.wikipedia.org/wiki/Memory-mapped_file
I think there's probably no need to use the Map limit on most systems. I think I just put it in mine because I had no clue how mmapping worked back then. It doesn't seem to make any performance difference when I remove it. |
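For readers who, like the earlier me in that post, have never touched mmap: the Python sketch below shows the general idea of memory-mapping a file, where the file's contents are accessed like a byte array and the operating system pages data in on demand. It illustrates memory mapping in general using Python's standard mmap module, not how ImageMagick manages its pixel cache. The helper name is hypothetical.

```python
import mmap
import os
import tempfile


def first_bytes_via_mmap(path, count):
    """Read the first bytes of a file through a read-only memory map."""
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing the map only touches the pages that hold these bytes.
            return mapped[:count]


# Write a tiny 2x2 greyscale PNM file and peek at its magic number.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"P5\n2 2\n255\n" + bytes([0, 64, 128, 255]))

print(first_bytes_via_mmap(tmp.name, 2))  # b'P5'
os.unlink(tmp.name)
```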
|
bludger Guru
Joined: 09 Apr 2003 Posts: 389
|
Posted: Fri Feb 16, 2007 4:01 pm Post subject: |
|
|
To get the scan device, I had been performing the following:
SCANDEVICE=$(scanimage -L|grep hp3500|awk -F '`' '{print $2}'|awk -F \' '{print $1}')
(my device is an hp3500)
This would read the correct usb port. From your script, I see that it might be possible to use just "hp3500:". I'll give that a try. |
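The grep/awk chain above can be written as a short Python function, sketched here under the assumption that `scanimage -L` prints lines in the "device `name' is a description" form quoted elsewhere in this thread. The function names and the sample text are illustrative; nothing here calls scanimage itself.

```python
import re

# Matches: device `NAME' is a DESCRIPTION
DEVICE_RE = re.compile(r"^device `(?P<name>[^']+)' is an? (?P<desc>.+)$")


def parse_devices(listing):
    """Return (name, description) pairs from 'scanimage -L' style output."""
    devices = []
    for line in listing.splitlines():
        match = DEVICE_RE.match(line.strip())
        if match:
            devices.append((match.group("name"), match.group("desc")))
    return devices


def find_device(listing, needle):
    """Return the first device name containing needle, or None."""
    for name, _desc in parse_devices(listing):
        if needle in name:
            return name
    return None


SAMPLE = """device `v4l:/dev/video0' is a Noname stk11xx virtual device
device `plustek:libusb:002:002' is a Canon LiDE25 USB flatbed scanner"""

print(find_device(SAMPLE, "plustek"))  # plustek:libusb:002:002
```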
|
csim n00b
Joined: 13 Feb 2006 Posts: 23
|
Posted: Sat Jan 26, 2008 4:16 pm Post subject: |
|
|
Hi,
i have a small suggestion:
i think it would be cool to have the basic parameters accessible via some kind of menu for example:
scanimage -L lists all available devices, it would be cool to select them via dropdown menu...
Code: |
device `v4l:/dev/video0' is a Noname stk11xx virtual device
device `plustek:libusb:002:002' is a Canon LiDE25 USB flatbed scanner
|
scanimage --resolution=300 -x 210 -y 297 -d plustek:libusb:002:002 > /home/user1/image.pnm
Basically, having two dropdown menus specifying paper size (A4 would translate to -x 210 -y 297) and DPI would also be cool.
Let me dream a bit about this: having such a simple (preferably GTK+ based) interface with a Finish button and a place where you can name your PDF would be great.
Code: |
Select device: Select Paper Size: Select Resolution in DPI:
[Canon Lide 25] [A4] [300]
[Your Name of pdf] . pdf
[/your/location] [Choose location...]
[Scan] [Finish]
|
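The paper-size dropdown boils down to a lookup table from a name to scanimage's -x and -y values in millimetres. A minimal Python sketch follows; the function and table names are hypothetical. A4 uses the 210 x 297 figures from the post above, and Letter uses the 212.5 x 275 values from the scripts earlier in the thread, which are slightly under the nominal 215.9 x 279.4 mm.

```python
# Scan area in millimetres, keyed by paper name.
PAPER_MM = {
    "A4": (210, 297),
    "Letter": (212.5, 275),  # values used by the scripts in this thread
}


def geometry_args(paper):
    """Return the scanimage -x/-y arguments for a named paper size."""
    width, height = PAPER_MM[paper]
    return ["-x", str(width), "-y", str(height)]


print(geometry_args("A4"))      # ['-x', '210', '-y', '297']
print(geometry_args("Letter"))  # ['-x', '212.5', '-y', '275']
```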
|
|
redwood Guru
Joined: 27 Jan 2006 Posts: 306
|
Posted: Mon Sep 19, 2011 10:37 pm Post subject: Another version based on Zacchaeus Pearsall's scan script |
|
|
I was googling for Zacchaeus Pearsall's original version of this script, when I found this page.
I too used his script as a starting point when writing a shell script for batch document scanning using scanadf.
My version "bscan" is available at http://www.acjlaw.net:8080/~jeremy/Ricoh/usage_bscan.html
It uses a configuration file, ~/.bscanrc
where one can list all your scanners in a bash array,
with device names as shown by "scanimage -L"
and the default scanner being SCANDEVICE="${scanners[0]}"
Importantly, specifying the scanner names in ~/.bscanrc saves time
since the script then skips finding the scanners using "scanimage -L"
One can also specify which scanners are true duplex,
so the script will scan fake duplex mode when true duplex is not available.
One can also specify lp printer instances so one can scan directly to printer;
e.g. if you scan a document in duplex mode on letter-sized paper,
it will be printed in duplex from the appropriate tray holding letter-sized paper.
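The thread does not show how bscan implements its fake duplex mode, so the following Python sketch is only one common way of doing manual duplex, stated as an assumption: scan all the front sides, flip the whole stack, scan the back sides (which then arrive in reverse order), and interleave the two runs. The function name and page labels are made up.

```python
def interleave_manual_duplex(fronts, backs_as_scanned):
    """Merge front-side and back-side scans into reading order.

    backs_as_scanned is assumed to be in reverse page order because
    the stack was flipped over before the second pass.
    """
    backs = list(reversed(backs_as_scanned))
    if len(backs) != len(fronts):
        raise ValueError("front and back page counts differ")
    pages = []
    for front, back in zip(fronts, backs):
        pages.extend([front, back])
    return pages


print(interleave_manual_duplex(["p1", "p3", "p5"], ["p6", "p4", "p2"]))
# ['p1', 'p2', 'p3', 'p4', 'p5', 'p6']
```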
By default the script scans from the ADF in grayscale @300dpi and saves to format PDF.
So to scan a letter-sized document from the ADF @300dpi grayscale,
then compress using lzw, binarize using djvu and save to OUTFILE.pdf
one would use:
bscan --mode=8-bit --shades=2 --page=Letter --comp=lzw -BW OUTFILE
or for legal-sized paper
bscan --mode=8-bit --shades=2 --page=Legal --comp=lzw -BW OUTFILE
or letter-sized paper from the FlatBed:
bscan --mode=8-bit --shades=2 --page=Letter --comp=lzw --source=FB -BW OUTFILE
To simplify things, I usually define some aliases for black/white, grayscale and color scanning:
alias b='bscan --mode=1-bit --page=Letter --comp=lzw'
alias bl='bscan --mode=1-bit --page=Legal --comp=lzw'
alias B='bscan --mode=8-bit --shades=2 --page=Letter --comp=lzw'
alias BL='bscan --mode=8-bit --shades=2 --page=Legal --comp=lzw'
alias C='bscan --mode=color --shades=32 --page=Letter --comp=lzw'
alias CL='bscan --mode=color --shades=32 --page=Legal --comp=lzw'
alias truecolor='bscan --mode=color --shades=truecolor --page=Letter --comp=lzw'
Then to scan in b/w from the ADF @300dpi grayscale a letter-sized document:
b OUTFILE
for legal-sized:
bl OUTFILE
To scan in grayscale and binarize using djvu wavelet compression:
For letter:
B -BW OUTFILE
For Legal:
BL -BW OUTFILE
For letter using pnmtools' truecolor shades:
truecolor -c44 --djvutopdf=25 OUTFILE
For letter using duplexing and djvu binarization:
B -duplex -BW OUTFILE
Or to rotate the document 180 degrees:
B --rot=r180 OUTFILE
To save to another format, use --format={pnm,tif,pdf,ps,djv} or alternatively,
-pnm <equivalent to --format=pnm>
-tif <equivalent to --format=tif>,
and similarly for the other output options:
-pdf, -ps, -djv
Shortcut options, like the above switches take a single '-'
and arguments requiring a value have the form '--option=value'
One can specify various binarization algorithms,
such as those from Fred Weinhaus http://www.fmwconcepts.com/imagemagick/index.html
using the option --thresh={bw, constant, 2color, fuzzy, isodata, kmeans, sahoo, triangle, }
where the various binarization scripts must be in your $PATH.
If you use xsane or gscan2pdf to scan some images because, e.g. you need to crop the image
or tweak the contrast/brightness/gamma settings,
you can save the images as OUTFILE.%d.pnm
e.g. OUTFILE.0001.pnm, OUTFILE.0002.pnm, ...
Then you can use bscan with the option "-noscan" to skip the scanning,
and instead just process the images:
e.g., to rotate the images 180degrees and binarize using djvu compression:
B -noscan -BW --rot=180 OUTFILE
which would process the series of images and create one multipage OUTFILE.pdf
One can also deskew images using unpaper from http://unpaper.berlios.de/
The options to "unpaper" are hardwired into bscan because the options are just too numerous
to specify on the commandline.
so it might be best to just make a local copy of bscan,
and modify the line which runs unpaper using whatever unpaper options you need.
Alternatively, you could add an option for unpaper settings
so that you could scan, e.g. B --unpaper=setting1 -BW OUTFILE
where setting1 would be specified in ~/.bscanrc or hardwired into bscan.
To photocopy, i.e. scan then print to printer:
For letter printed to PRINTERLETTER
B -prn --n=<number of copies>
For legal printed to PRINTERLEGAL
BL -prn --n=<#copies>
Or for duplex letter to PRINTERLTRDUP
B -duplex -prn --n=<#copies>
And legal duplex to PRINTERLGLDUP
BL -duplex -prn --n=<#copies>
You just need to define the lp printer instances in /etc/cups/lpoptions or ~/.cups/lpoptions
However, I find that KDE keeps modifying/deleting any printer instances in ~/.cups/lpoptions
so I've given up and just use /etc/cups/lpoptions, which KDE leaves untouched.
You can define lp printer instances using lpoptions,
but I find it easier to just directly edit /etc/cups/lpoptions
e.g. for my Xerox Phaser8860 print queue
I can define a letter,color,simplex queue:
Dest Phaser8860/letter Duplex=None fitplot=false InputSlot=Tray2 media=letter MediaType=Auto OutputMode=Enhanced PageRegion=letter PageSize=letter
And in my ~/.bscanrc, I add the name of the printer destination:
PRINTERLETTER="Phaser8860/letter"
And similarly for a color duplex-letter queue:
Dest Phaser8860/ltrdup Duplex=DuplexNoTumble fitplot=false InputSlot=Tray2 media=letter MediaType=Auto OutputMode=Enhanced PageRegion=letter PageSize=letter sides=two-sided-long-edge
with the destination
PRINTERLTRDUP="Phaser8860/ltrdup"
in my ~/.bscanrc
"bscan" will choose the appropriate letter/legal simplex/duplex printer destinations depending on whether the scan was letter/legal, simplex/duplex. |
|
undrwater Guru
Joined: 28 Jan 2003 Posts: 312 Location: Caucasia
|
Posted: Thu Oct 20, 2011 6:21 pm Post subject: Re: Another version based on Zacchaeus Pearsall's scan scrip |
|
|
redwood wrote: | I was googling for Zacchaeus Pearsall's original version of this script, when I found this page.
I too used his script as a starting point when writing a shell script for batch document scanning using scanadf.
My version "bscan" is available at... |
Thank you for this! Brother had provided some scripts but they used a tool that no longer works. I will have to see if I can use this. _________________ Open-mindedness is painful... |
|