Gentoo Forums
Scan multi-page documents directly to pdf quickly.

PowerFactor
Veteran

Joined: 30 Jan 2003
Posts: 1693
Location: out of it

Posted: Wed Apr 28, 2004 8:19 pm    Post subject: Scan multi-page documents directly to pdf quickly.

I don't know if anyone else has been as frustrated as I have by the lack of easy-to-use document-scanning software on linux. I'm not talking about ocr, just scanning documents into a portable multi-page image format. In fact the only software I've found that does exactly what I wanted was Adobe Acrobat on windows. But with that I had to go through the windows twain driver interface for my scanner, which seems designed to make the process as slow and clumsy as possible. Still, that was the method I used for the last couple of years on occasions when it was useful.
Finally, after I bought a new printer/scanner a couple of months ago (epson cx3200, it's nice), I decided it was time to figure out how to do the job on linux. By then I knew all the command-line tools to do what I wanted were available; it was "just" a matter of writing a little script to tie it all together. Being the amateur I am, it took me a couple of days to figure it all out, but I got it working. This was back in January.
The other day I was playing around with controlling it with kdialog and I thought maybe someone else would find it useful (the non-kde-dependent version, that is), so I figured why not post it. I did try to make it a little more user friendly. It's still an ugly hack, but it works for me. Not only that, but once it's set up it's better at its specific purpose than anything else I've tried. 8)

The script depends on the following packages.

sane-frontends
imagemagick
netpbm
ghostscript

If your scanner has a decent 1-bit (Lineart) mode, or if you can actually get convert's threshold function to work for you, then you can modify the script slightly and get rid of the netpbm dependency.
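
For anyone wondering what the threshold step boils down to: every grey pixel darker than THRESHOLD times the maximum grey value becomes black, everything else becomes white. Below is a rough pure-Python (Python 3) sketch of that idea. It is only an illustration, not what pgmtopbm or convert do internally; the function name pgm_to_pbm is made up, the handling of pixels exactly at the cutoff is my assumption, and it only copes with binary PGM files that use 8-bit samples.

```python
def pgm_to_pbm(pgm_bytes, threshold=0.55):
    """Threshold a binary PGM (P5, maxval < 256) into a binary PBM (P4)."""
    tokens = []
    pos = 0
    # The header holds four whitespace-separated tokens: magic, width,
    # height, maxval. Anything from '#' to the end of the line is a comment.
    while len(tokens) < 4:
        while pgm_bytes[pos:pos + 1].isspace():
            pos += 1
        if pgm_bytes[pos:pos + 1] == b"#":
            while pgm_bytes[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(pgm_bytes) and not pgm_bytes[pos:pos + 1].isspace():
            pos += 1
        tokens.append(pgm_bytes[start:pos])
    pos += 1  # exactly one whitespace byte separates maxval from the pixels

    magic = tokens[0]
    width, height, maxval = (int(t) for t in tokens[1:])
    if magic != b"P5" or maxval > 255:
        raise ValueError("only 8-bit binary PGM (P5) is handled here")

    cutoff = threshold * maxval
    row_bytes = (width + 7) // 8  # PBM rows are padded to whole bytes
    out = bytearray()
    for y in range(height):
        row = pgm_bytes[pos + y * width:pos + (y + 1) * width]
        for byte_index in range(row_bytes):
            byte = 0
            for bit in range(8):
                x = byte_index * 8 + bit
                # In PBM a set bit means black, so dark pixels set the bit.
                if x < width and row[x] < cutoff:
                    byte |= 0x80 >> bit
            out.append(byte)
    return b"P4\n%d %d\n" % (width, height) + bytes(out)
```

Raising the threshold turns more of the page black, which is why colored paper usually needs some fiddling with the value.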

You need to know how to use the scanimage program with your scanner, as you will need to modify the SCANDEVICE and SCANCMD variables to fit. The rest of the configuration is pretty self explanatory I think.

To use it you just put your first page in the scanner, then run the script with the name of the file to save as the argument. It will immediately scan the first page, then prompt you for more. The rest is gravy.

It's not very robust. If your scanner has a warm-up period, make sure it's finished before you start; otherwise scanimage may time out and the script gets a little confused. And it's not designed to work with scanners that have an ADF.
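
One way the warm-up problem could be softened is to retry a failed scan a few times before giving up. A minimal Python 3 sketch of that idea follows. It is not part of the script below; run_with_retry, the attempt count and the delay are names and values I made up, and it assumes that scanimage exits with a non-zero status when it times out.

```python
import subprocess
import time


def run_with_retry(argv, attempts=3, delay=5.0):
    """Run argv, retrying on a non-zero exit status; return its stdout bytes."""
    for attempt in range(1, attempts + 1):
        result = subprocess.run(argv, stdout=subprocess.PIPE)
        if result.returncode == 0:
            return result.stdout
        if attempt < attempts:
            time.sleep(delay)  # give the scanner time to finish warming up
    raise RuntimeError("command failed %d times: %r" % (attempts, argv))
```

It would be called with the scan command split into a list, e.g. run_with_retry(["scanimage", "-d", "epson:/dev/usb/scanner0", "--mode", "Gray"]), and the returned bytes written to the page file.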

Anyway, hope someone can use it. Even if just for inspiration. :lol:
EDIT: Later versions are posted further down the thread. Chrwei posted one that should work with an ADF (I don't have the hardware to try it) and I've posted the python version I've been using for a while. This version is left here mainly for reference.

Code:
#!/bin/bash
#
# scan-pdf version 2
# April 27, 2004
# Copyright 2004 Zacchaeus Pearsall (zap4260 at yahoo.com)
# Distributed under the terms of the GNU General Public License v2


# Set this to a value between 0 and 1
# You may have to play with it some to get good scans
# of colored paper
THRESHOLD=0.55

# Scan resolution
RES=300

# Set these for the size paper you are scanning
# defaults are X=212.5, Y=275; for US letter paper
X=212.5
Y=275

# Paper size for pdf output. See "man convert" for possibilities
PAGESIZE="Letter"

# Set this to the appropriate sane device for your scanner
SCANDEVICE="epson:/dev/usb/scanner0"

# Leave this as Gray unless you have a scanner that has
# a good usable 1-bit mode
SCANMODE=Gray

# If your scanner has a good 1-bit mode and you plan
# to use it then uncomment this. I have no way to test this.
# You may need to make some other modifications as well.
#SCANNER_HAS_BW="yes"

# Modify the scan command as needed to fit your scanner
SCANCMD="scanimage -d ${SCANDEVICE} --mode ${SCANMODE} \
    -x ${X} -y ${Y} --resolution ${RES}"


if [ -z "${1}" ]
then
    echo "USAGE: scan-pdf file.pdf"
    exit 1
fi

if [ -e "${1}" ]
then
    echo "${1}: file exists"
    echo "Press any key to continue, Ctrl-c to quit"
    read -s -n 1 junk
fi


PDFFILE="${1}"
TMPDIR=`/bin/mktemp -td scan-pdf.XXXXXXXXXX`

# Remove the temp dir on any exit; turn HUP/INT/QUIT/TERM into a normal exit
trap 'rm -fr "${TMPDIR}"' 0
trap "exit 2" 1 2 3 15

i=1
MORE_PAGES="y"
echo "To scan more sheets press space; press q when done"
while [ "${MORE_PAGES}" != "q" ]
do
    if [ ${i} -lt 10 ]
    then
        PG="pg0${i}"
    else
        PG="pg${i}"
    fi

    # SCANCMD is left unquoted on purpose so it splits into command + arguments
    if [ -n "${SCANNER_HAS_BW}" ]
    then
        ${SCANCMD} > "${TMPDIR}/${PG}.pbm"
    else
        ${SCANCMD} | pgmtopbm -threshold -value ${THRESHOLD} > \
            "${TMPDIR}/${PG}.pbm"
    fi

    read -s -p "More?" -n 1 MORE_PAGES
    echo  ' '
    let i++
done

convert "${TMPDIR}"/pg*.pbm -adjoin -page ${PAGESIZE} "${TMPDIR}/pgs.ps"
ps2pdf13 "${TMPDIR}/pgs.ps" "${TMPDIR}/pgs.pdf"
mv -i "${TMPDIR}/pgs.pdf" "${PDFFILE}"


Last edited by PowerFactor on Sun Feb 11, 2007 4:42 pm; edited 1 time in total

FatherBusa
Apprentice

Joined: 21 Mar 2004
Posts: 166

Posted: Sat Jan 08, 2005 2:13 am

Dude, you're a genius. This is just what I was looking for. Thanks!

chrwei
n00b

Joined: 16 Feb 2005
Posts: 2

Posted: Wed Feb 16, 2005 5:20 am

very nice, here's my enhancements :)

summary:
- Added command line options with defaults
- Added ADF support with a command line toggle to use the flatbed. Can be set to use the flatbed by default, with a command line toggle to use the ADF.
- Changed to use scanimage's batch mode and prompt so that timeouts shouldn't be an issue. ADF doesn't use the prompt.
- Made the scanner device name optional, as scanimage will normally detect your scanner automatically.

scanners tried:
- HP Officejet 6110

TODO:
- add more paper size options
- NetPBM says pgmtopbm is deprecated as of 7/2004 and to use pamditherbw instead. I plan on only doing color or full greyscale documents so I'm not touching this.

bugs:
- "mode" seems to be scanner specific, some want "Grey" others want "Greyscale". - needs testing
- might be an issue with providing -x and -y when using ADF, I need to test more
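
On the scanner-specific "mode" names: one possible workaround is to ask the backend which names it accepts and pick the grey one from that list. The Python 3 sketch below assumes the option listing printed by "scanimage --help -d <device>" contains a line of the form "--mode Lineart|Gray|Color [Gray]"; the sample text in the comment, the alias list and the function names are all made up for the example, so check what your own backend prints.

```python
import re

# Spellings of "grey" that different backends might use (an assumption).
GRAY_ALIASES = ("gray", "grey", "grayscale", "greyscale")


def supported_modes(help_text):
    """Return the mode names from a '--mode A|B|C [default]' line, if any."""
    match = re.search(r"^\s*--mode\s+(\S+)", help_text, re.MULTILINE)
    return match.group(1).split("|") if match else []


def pick_gray_mode(help_text):
    """Return the backend's own spelling of the grey mode, or None."""
    for mode in supported_modes(help_text):
        if mode.lower() in GRAY_ALIASES:
            return mode
    return None


# Example input (illustrative only, not captured from a real scanner):
#   Scan Mode:
#     --mode Lineart|Gray|Color [Gray]
```

The value returned by pick_gray_mode could then be passed to scanimage as the --mode argument instead of a hard-coded "Greyscale".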

things-i-wish-worked-better:
- too many temp files!

and the code:
Code:

#!/bin/bash
#
# scan-pdf version 3
# April 27, 2004
# Copyright 2004 Zacchaeus Pearsall (zap4260 at yahoo.com)
# Distributed under the terms of the GNU General Public License v2
#
# February 15, 2005
# Chris Weiss
# - Added ADF and color support
# - Changed to use sane's built in batch mode
# - Added command line options

###defaults - set these so you don't have to supply them
###                    on the command line every time

# Set this to a value between 0 and 1
# You may have to play with it some to get good scans
# of colored paper,  only used for BW scans
THRESHOLD=0.55

# Scan resolution
RES=300

# Set this to the appropriate sane device for your scanner
# if you have more than one or sane doesn't autodetect your scanner
# SCANDEVICE="epson:/dev/usb/scanner0"
SCANDEVICE=""

# use ADF in no-prompt batch mode.  Add the options your printer needs for this
ADF=Y
ADFOPTS="--batch-scan=yes"  #hpoj

# for black and white choose grey
# if you have a scanner with a good 1-bit mode choose lineart
# for full color PDF's choose color
SCANMODE="color"

# If your scanner has a good 1-bit mode and you plan
# to use it then change this to Y
SCANNER_HAS_BW=N

# Paper size for pdf output. See "man convert" for possibilities
PAGESIZE="Letter"
#TODO: add X and Y sizes for more paper

# additional options
ADDOPT=""

###end defaults - you shouldn't need to modify anything below here

myname=`basename "$0"`

usage() {
cat<<EOF
$myname scans documents from your flatbed or ADF scanner and stores them in a multi page pdf.

Usage: $myname [Options] filename.pdf
   Options:
   -page "size"   Page size for the PDF.  See "man convert" for possibilities
   -mode "mode"   lineart, greyscale, or color.
   -1bit [Y/N]   If your scanner has a good 1-bit mode and you want lineart, use Y here.
   -adf [Y/N]   Use ADF in no-prompt batch mode (Y/N) - edit this script and set your scanner's options
   -res dpi   Resolution to scan at in DPI
   -opts "options"   Additional option to pass to 'scanimage' program
   -threshold 0.55   Value between 0 and 1 to pass to 'pgmtopbm'
   -h Help      This info.
   
Open $myname in your favorite editor to change the default values.

$myname requires: sane-frontends, imagemagick, netpbm, and ghostscript

EOF
exit 0
}

while [ $# -ne 0 ];
do
    case "$1" in
   -page)     shift;   PAGESIZE=$1 ;;
   -mode)      shift;   SCANMODE=$1 ;;
   -1bit)      shift;   SCANNER_HAS_BW=$1 ;;
   -res)      shift;   RES=$1 ;;
   -threshold)   shift;   THRESHOLD=$1 ;;
   -opts)      shift;   ADDOPT=$1 ;;
   -adf)      shift;   ADF=$1 ;;
   -h)      usage ;;
   *)      PDFFILE=${1} ;;
    esac
    shift
done

if [ -z "${PDFFILE}" ]; then
   usage
fi

if [ -e "${PDFFILE}" ]
then
    echo "${PDFFILE}: file exists"
    echo "Press any key to overwrite, Ctrl-c to quit"
    read -s -n 1 junk
fi

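outdir=`dirname "$PDFFILE"`
# The script cd's into $outdir below, so from here on use the bare file name
PDFFILE=`basename "$PDFFILE"`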

OPTIONS=""
BITCONVERT=""

case "$SCANMODE" in
color)
   OPTIONS=" --mode color"
   ;;
grayscale|greyscale|gray|grey)
   # accept every spelling that the usage text and the comments mention
   OPTIONS=" --mode Greyscale"
   ;;
lineart)
   if [ "$SCANNER_HAS_BW" = "Y" ] || [ "$SCANNER_HAS_BW" = "y" ]; then
      OPTIONS=" --mode Lineart"
   else
      BITCONVERT="Y"
      OPTIONS=" --mode Greyscale"
   fi
   ;;
esac

OPTIONS="$OPTIONS --resolution $RES"
if [ "$SCANDEVICE" != "" ]; then
   OPTIONS="-d $SCANDEVICE $OPTIONS"
fi
if [ "$ADF" = "Y" ] || [ "$ADF" = "y" ]; then
   OPTIONS="$OPTIONS $ADFOPTS"
else
   OPTIONS="$OPTIONS --batch-prompt"
fi
if [ -n "$ADDOPT" ]; then
   OPTIONS="$OPTIONS $ADDOPT"
fi

case "${PAGESIZE}" in
Letter)
   OPTIONS="$OPTIONS -x 212.5 -y 275" ;;
esac


origdir=`pwd`
cd "$outdir"
#echo "scanimage $OPTIONS -b "
scanimage $OPTIONS -b

if [ "$BITCONVERT" != "" ]; then
   echo "Converting greyscale to lineart"
   # The glob must stay unquoted so the loop sees one file per iteration
   for f in out*.pnm; do
      #echo "pgmtopbm -threshold -value ${THRESHOLD} < $f > $f.pbm"
      pgmtopbm -threshold -value ${THRESHOLD} < "$f" > "$f.pbm"
      rm -f "$f"
   done
   #echo "convert out*.pbm -adjoin -page ${PAGESIZE} ${PDFFILE}.ps"
   echo "creating postscript"
   convert out*.pbm -adjoin -page ${PAGESIZE} "${PDFFILE}.ps"
   rm -f out*.pbm
else
   #echo "convert out*.pnm -adjoin -page ${PAGESIZE} ${PDFFILE}.ps"
   echo "creating postscript"
   convert out*.pnm -adjoin -page ${PAGESIZE} "${PDFFILE}.ps"
   rm -f out*.pnm
fi

#echo "ps2pdf13 ${PDFFILE}.ps ${PDFFILE}"
echo "Convert postscript to PDF"
ps2pdf13 "${PDFFILE}.ps" "${PDFFILE}"
rm -f "${PDFFILE}.ps"

cd "$origdir"
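
A side note on page order with the script above: as far as I know scanimage's batch mode names its files out1.pnm, out2.pnm, ... out10.pnm by default, and a shell glob like out*.pnm sorts them as text, so out10.pnm lands before out2.pnm once there are ten or more pages. A small Python 3 sketch of sorting by the embedded page number instead (the function names are made up, and the default file name pattern is an assumption worth checking against your scanimage version):

```python
import re


def page_number(filename):
    """Return the first run of digits in the name as an int (-1 if none)."""
    match = re.search(r"(\d+)", filename)
    return int(match.group(1)) if match else -1


def in_page_order(filenames):
    """Sort batch output files by page number rather than alphabetically."""
    return sorted(filenames, key=page_number)
```

The sorted list can then be handed to convert in place of the glob.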

r.abbott
Tux's lil' helper

Joined: 16 Aug 2004
Posts: 113
Location: Herat, Afghanistan

Posted: Sun Feb 20, 2005 2:34 am

This thing is great! Thanks.

gcediel
n00b

Joined: 27 Jul 2004
Posts: 21
Location: Madrid, Spain

Posted: Fri Apr 22, 2005 7:13 pm

One (maybe silly) question: How can I make scanimage stop scanning more pages? I have tried several keys, but I can't stop it.
_________________
Best regards.

Guillermo

r.abbott
Tux's lil' helper

Joined: 16 Aug 2004
Posts: 113
Location: Herat, Afghanistan

Posted: Fri Apr 22, 2005 8:17 pm

Use <Ctrl-D> :)

chrwei
n00b

Joined: 16 Feb 2005
Posts: 2

Posted: Sat Apr 23, 2005 1:52 am

I haven't used it in a while, but I think it tells you that on screen, at least it did on mine. You should run it in a terminal and not just from a "run" dialog.

gcediel
n00b

Joined: 27 Jul 2004
Posts: 21
Location: Madrid, Spain

Posted: Sun Apr 24, 2005 10:37 am

Well, CTRL+D doesn't work for me.

BTW: very nice stuff!
_________________
Best regards.

Guillermo

djmaze
n00b

Joined: 25 Jun 2003
Posts: 36
Location: Berlin, Germany

Posted: Sun Apr 24, 2005 11:09 am

CTRL+C works for me. (Try it two times, if it doesn't work.)

gcediel
n00b

Joined: 27 Jul 2004
Posts: 21
Location: Madrid, Spain

Posted: Sun Apr 24, 2005 3:49 pm

Thanks, it works, although it's not a clean way to do it.
_________________
Best regards.

Guillermo

zatalian
Apprentice

Joined: 27 Aug 2002
Posts: 179
Location: Gent, Belgium

Posted: Mon Sep 04, 2006 2:39 pm

this script used to work for me but now convert gives me trouble...

convert -page letter converts the original image to a blank postscript file. Converting without the -page option works, but then the pdf document is not in the correct format. Is this happening to anybody else? Any solutions?

bludger
Guru

Joined: 09 Apr 2003
Posts: 389

Posted: Thu Sep 21, 2006 7:36 am

My HP 3500c doesn't have the mode function at all. This means that it can only output colour images. How would you convert something like this to black and white?

Also I had a number of tiff files and managed to convert them into a multi page pdf with:
convert <tif1> <tif2> <tif3> file.pdf

Why not just convert like this, leaving out the intermediate ps stage?

bludger
Guru

Joined: 09 Apr 2003
Posts: 389

Posted: Fri Sep 22, 2006 4:29 pm

I solved my problem with the following:

scanimage -d <device> --resolution 150|ppmtopgm|pamthreshold -simple >tempscanfile1.pbm
convert -compress fax tempscanfile*.pbm outfile.pdf

This produced a 33kB file with resolution 150 and a 60kB file with resolution 300. The 150 res version was readable, but a bit ugly and the 300 version was excellent.

bludger
Guru

Joined: 09 Apr 2003
Posts: 389

Posted: Mon Jan 01, 2007 6:38 pm

I have been using the above method successfully and conveniently for the last few months now. One problem that I have found is that when I try to convert multiple pbm files into one multi page pdf, I can quickly run out of memory if I get above 6 pages or so. Does anyone have any suggestions as to how to get around this?

martoss
n00b

Joined: 09 Dec 2003
Posts: 25

Posted: Wed Jan 03, 2007 8:52 pm    Post subject: ...xsane?

Isn't xsane doing the same?

My xsane version has an option to scan "pages" to a pdf. Works pretty well AFAIR. I don't see a big difference.
Xsane has also other nice features like just "copying stuff" and "emailing stuff". Anyways, your script sounds also nice :-)

PowerFactor
Veteran

Joined: 30 Jan 2003
Posts: 1693
Location: out of it

Posted: Sun Feb 11, 2007 3:55 pm

It seems I have been lax in keeping up with this. Better late than never I guess.


zatalian wrote:
this script used to work for me but now convert gives me trouble...

convert -page letter converts the original image to a blank postscript file. Converting without the -page option works, but then the pdf document is not in the correct format. Is this happening to anybody else? Any solutions?

I've run into this several times. There seems to be some interdependency between imagemagick and ghostscript that causes a problem when I upgrade one or the other. Usually recompiling imagemagick after a ghostscript upgrade fixes it.

bludger wrote:
I have been using the above method successfully and conveniently for the last few months now. One problem that I have found is that when I try to convert multiple pbm files into one multi page pdf, I can quickly run out of memory if I get above 6 pages or so. Does anyone have any suggestions as to how to get around this?


I ran into this a while back too. You need to use the "-limit Memory" and possibly the "-limit Map" options for convert to limit its ram usage. Usually 1/4 of my physical ram seems to work well enough. It does take a long time to convert though.
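
To make the "1/4 of physical ram" rule of thumb concrete, here is a small Python 3 sketch that works out that figure and builds the matching convert arguments. The helper names are made up; it assumes a Linux-like system where os.sysconf knows SC_PAGE_SIZE and SC_PHYS_PAGES, and an ImageMagick that accepts a size suffix such as "MiB" after -limit.

```python
import os


def quarter_of_ram_mib():
    """Return one quarter of the physical RAM in MiB (Linux-style sysconf)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count // (4 * 1024 * 1024)


def convert_limit_args(memory_mib, map_mib=None):
    """Build the -limit arguments to put in front of convert's input files."""
    args = ["-limit", "memory", "%dMiB" % memory_mib]
    if map_mib is not None:
        args += ["-limit", "map", "%dMiB" % map_mib]
    return args
```

For example, ["convert"] + convert_limit_args(quarter_of_ram_mib()) + pages + ["out.pdf"] would give a command line along the lines of what is described above, with pages being your list of page files.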

martoss wrote:
Isn't xsane doing the same? ...

It is now, and I'm glad to see it. It didn't have those options 3 years ago when I posted this, though. It still looks like this script might be more convenient for some tasks. Xsane is probably less buggy though. ;)

PowerFactor
Veteran

Joined: 30 Jan 2003
Posts: 1693
Location: out of it

Posted: Sun Feb 11, 2007 4:36 pm

I've also made some changes since that original version. I converted it to python and added some ncurses "eyecandy" using dialog. Also got rid of the netpbm dependency. I had intended to rewrite it as a "proper" modular program with a separate config file and such, but never got very far with it. It's not something I use very often anyway.

Anyhow, here's my latest working version. Plenty of bugs I'm sure but it mostly works when I need it.

Dependencies have changed a little:

dev-lang/python
media-gfx/sane-frontends
media-gfx/imagemagick
virtual/ghostscript
dev-util/dialog

Code:
#!/usr/bin/python
#
# scan-pdf version 8
# Sep 13, 2004
# Copyright 2004 Zack Pearsall (zap4260@yahoo.com)
# Distributed under the terms of the GNU General Public License v2


import sys
import os
import signal
import shutil
import math
from tempfile import mkdtemp

# Set this to a value between 0 and 1
# You may have to play with it some to get good scans
# of colored paper
THRESHOLD="0.55"

# Scan resolution
RES="300"

# Set these for the size paper you are scanning
# defaults are X=212.5, Y=275; for US letter paper
X="212.5"
Y="275"

# Paper size for pdf output. See "man convert" for possibilities
PAGESIZE="Letter"

# Set this to the appropriate sane device for your scanner
SCANDEVICE="epson:"

# Leave this as Gray unless you have a scanner that has
# a good usable 1-bit mode
SCANMODE="Gray"

# Modify the scan command as needed to fit your scanner
SCANCMD="scanimage -d " + SCANDEVICE + " --mode " + SCANMODE + \
    " -x " + X + " -y " + Y + " --resolution " + RES

# ImageMagick limits
MEMLIMIT="128"
MAPLIMIT="256"

# End of configuration options
   
def cleanup(signum=0 , stkframe=0):
    if os.path.isdir(TMPDIR):
        shutil.rmtree(TMPDIR, 1)
    sys.exit(signum)

def imove(src, dest, ask=True):
    if ask and os.path.isfile(dest):
        userin=raw_input("overwrite file '" + dest + "'? ")
        if userin == "y" or userin == "yes":
            shutil.move(src, dest)
    else:
        shutil.move(src, dest)

   
   
def scanpage(scancmd, pgname):
    buffsize=1024
   
    progress=os.popen("dialog --gauge \"Scanning...\" 6 60", "w")
    scan=os.popen(scancmd)
   
    magic=scan.readline()
    if magic == "P4\n":
        outfilename=pgname + ".pbm"
    elif magic == "P5\n":
        outfilename=pgname + ".pgm"
    else:
        outfilename=pgname + ".ppm"
       
    outfile=file(outfilename, "w")
    outfile.write(magic)
   
    buff=scan.readline()
    while buff.startswith("#") or buff.startswith("\n"):
        outfile.write(buff)
        buff=scan.readline()
       
    hw=buff.split()
    outfile.write(buff)
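    if magic == "P4\n":
        # 1 bit per pixel and no maxval line
        bmsize=int(hw[0]) * int(hw[1]) / 8
    else:
        # P5 (gray) and P6 (color) have a maxval line after the dimensions;
        # P6 carries 3 samples per pixel
        buffsize=8192
        buff=scan.readline()
        samples=1
        if magic == "P6\n":
            samples=3
        bmsize=int(float(hw[0]) * float(hw[1]) * samples * math.ceil(math.log(float(buff)+1, 2)) / 8)
        outfile.write(buff)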
       
    bytescopyed=0
    prevpercent=0
    buff=scan.read(buffsize)
    while len(buff) != 0:
        bytescopyed+=len(buff)
        if int(math.ceil(float(bytescopyed) / float(bmsize) * 100)) > prevpercent:
            prevpercent=int(math.ceil(float(bytescopyed) / float(bmsize) * 100))
            progress.write(str(prevpercent) + "\n")
            progress.flush()
           
        outfile.write(buff)
        buff=scan.read(buffsize)
 
    progress.close()       
    scan.close()
    outfile.close()
   
   
# start of main program
if len(sys.argv) <= 1:
    print "USAGE: scan-pdf file.pdf\n"
    sys.exit(0)

PDFFILE=sys.argv[1]

if os.path.isfile(PDFFILE):
    print '\a'
    if os.system("dialog --yesno \"" + PDFFILE + ": File exists! Continue?\" 15 60") != 0:
        os.system("clear")
        sys.exit(0)
elif os.path.isdir(PDFFILE):
    print "\a'" + PDFFILE + "' is a directory!"
    sys.exit(0)


# mkdtemp's first positional argument is the suffix; pass the name as a prefix
TMPDIR=mkdtemp(prefix="scan-pdf.")

signal.signal(signal.SIGHUP, cleanup)
signal.signal(signal.SIGINT, cleanup)
signal.signal(signal.SIGQUIT, cleanup)
signal.signal(signal.SIGTERM, cleanup)


os.system("dialog --msgbox \"Insert first page\" 5 24")

i=1
retval=0
while retval == 0:
    PG="pg" + str(i).zfill(5)
    scanpage(SCANCMD, TMPDIR + "/" + PG)
    i+=1
    retval=os.system("dialog --yesno \"Scan another page?\" 5 25")


os.system("dialog --infobox \"Converting...\" 3 23")
os.system("convert -limit Memory " + MEMLIMIT + " -limit Map " + MAPLIMIT + \
    " -threshold " + str(int(float(THRESHOLD) * 65535)) + " " + \
    TMPDIR + "/pg*.pgm -adjoin -page " + PAGESIZE + " " + TMPDIR + "/pgs.ps")
os.system("ps2pdf13 " + TMPDIR + "/pgs.ps " + TMPDIR + "/pgs.pdf")
os.system("clear")
imove(TMPDIR + "/pgs.pdf", PDFFILE)
cleanup()
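
The progress gauge in scanpage above depends on knowing how many bytes of image data to expect, which follows from the PNM header. Here is that calculation on its own as a Python 3 sketch (the script itself is Python 2). The function names are made up; the sizes follow the binary PBM/PGM/PPM layout as I understand it, where PBM rows are padded to whole bytes and samples take two bytes once maxval is 256 or more.

```python
def raster_bytes(magic, width, height, maxval=1):
    """Expected size in bytes of the pixel data of a binary PNM image."""
    if magic == "P4":
        # Bitmap: 1 bit per pixel, each row padded to a whole byte.
        return ((width + 7) // 8) * height
    samples = {"P5": 1, "P6": 3}[magic]  # gray has 1 sample, color has 3
    bytes_per_sample = 1 if maxval < 256 else 2
    return width * height * samples * bytes_per_sample


def percent_done(bytes_copied, total_bytes):
    """Integer percentage for a progress gauge, capped at 100."""
    if total_bytes <= 0:
        return 100
    return min(100, bytes_copied * 100 // total_bytes)
```

A US letter page scanned in gray at 300 dpi is about 2550 x 3300 pixels, so roughly 8.4 million bytes of image data per page.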

bludger
Guru

Joined: 09 Apr 2003
Posts: 389

Posted: Thu Feb 15, 2007 10:08 am

PowerFactor wrote:
I ran into this a while back too. You need to use the "-limit Memory" and possibly the "-limit Map" options for convert to limit its ram usage. Usually 1/4 of my physical ram seems to work well enough. It does take a long time to convert though.
Thanks for this. I just found this out independently today and was returning to the thread to post my results, but it appears that you beat me to it.

I have just one question though. I used only the memory limit option. What does the map limit option actually do? The documentation seems rather sparse.

PowerFactor
Veteran

Joined: 30 Jan 2003
Posts: 1693
Location: out of it

Posted: Fri Feb 16, 2007 3:41 am

As I understand it, the Map limit option limits the amount of file space that can be mmapped for the pixel cache.

http://en.wikipedia.org/wiki/Memory-mapped_file

I think there's probably no need to use the Map limit on most systems. I think I just put it in mine because I had no clue how mmapping worked back then. It doesn't seem to make any performance difference when I remove it.
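
For anyone as unclear on mmapping as I was, here is a tiny Python 3 sketch of the general idea: the file is mapped into the process's address space and read like an in-memory buffer, with the kernel paging data in from disk as needed. It only illustrates memory-mapped files in general and says nothing about how ImageMagick's pixel cache is implemented; the function name is made up.

```python
import mmap


def first_bytes_via_mmap(path, count):
    """Read the first `count` bytes of a file through a read-only mapping."""
    with open(path, "rb") as handle:
        # Length 0 maps the whole file; an empty file cannot be mapped.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[:count]
```

Called on a scanned page file with a count of 2, it returns the PNM magic number, e.g. b"P5" for a gray scan.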

bludger
Guru

Joined: 09 Apr 2003
Posts: 389

Posted: Fri Feb 16, 2007 4:01 pm

To get the scan device, I had been performing the following:
SCANDEVICE=$(scanimage -L|grep hp3500|awk -F '`' '{print $2}'|awk -F \' '{print $1}')
(my device is an hp3500)

This would read the correct usb port. From your script, I see that it might be possible to use just "hp3500:". I'll give that a try.
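
The grep/awk pipeline above can also be written as a small parser. This Python 3 sketch pulls the device names out of "scanimage -L" style output, assuming lines of the form "device `NAME' is a DESCRIPTION" as in the listings quoted in this thread; the function names are made up.

```python
import re

# Matches lines such as: device `hp3500:libusb:001:004' is a ... scanner
DEVICE_LINE = re.compile(r"^device `([^']+)' is an? (.+)$")


def parse_devices(listing):
    """Return (device_name, description) pairs from scanimage -L output."""
    devices = []
    for line in listing.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match:
            devices.append((match.group(1), match.group(2)))
    return devices


def find_device(listing, backend):
    """Return the first device whose name starts with 'backend:', or None."""
    for name, _description in parse_devices(listing):
        if name.startswith(backend + ":"):
            return name
    return None
```

With the listing text captured from scanimage -L, find_device(listing, "hp3500") would return the full device string to pass to -d.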

csim
n00b

Joined: 13 Feb 2006
Posts: 23

Posted: Sat Jan 26, 2008 4:16 pm

Hi,

I have a small suggestion:
I think it would be cool to have the basic parameters accessible via some kind of menu. For example:

scanimage -L lists all available devices; it would be cool to select them via a dropdown menu...
Code:

device `v4l:/dev/video0' is a Noname stk11xx virtual device
device `plustek:libusb:002:002' is a Canon LiDE25 USB flatbed scanner

scanimage --resolution=300 -x 210 -y 297 -d plustek:libusb:002:002 > /home/user1/image.pnm
Basically, having two dropdown menus specifying paper size (A4 would translate to -x 210 -y 297) and DPI would also be cool.

Let me dream a bit about this: having such a simple interface (preferably GTK+ based) with a Finish button and a place where you can name your pdf would be great.

Code:


Select device:     Select Paper Size:   Select Resolution in DPI:
[Canon Lide 25]    [A4]                 [300]

[Your Name of pdf] . pdf   
[/your/location]                     [Choose location...]               


                                          [Scan]    [Finish]
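
Whatever the front end ends up looking like, the menu choices only have to be turned into a scanimage command line. A minimal Python 3 sketch of that mapping follows. The table and function name are made up; A4 and Letter are given at their nominal sizes in millimetres, whereas the scripts earlier in the thread use 212.5 x 275 for Letter, so adjust the numbers to whatever area your scanner accepts.

```python
# Nominal paper sizes in millimetres (width, height).
PAPER_MM = {
    "A4": (210, 297),
    "Letter": (215.9, 279.4),
}


def scanimage_argv(device, paper, dpi):
    """Build the scanimage argument list for a device, paper size and DPI."""
    width, height = PAPER_MM[paper]
    return [
        "scanimage",
        "-d", device,
        "--resolution", str(dpi),
        "-x", str(width),
        "-y", str(height),
    ]
```

The resulting list can be passed to subprocess with stdout redirected to the .pnm file, which matches the example command in the post above.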

redwood
Guru

Joined: 27 Jan 2006
Posts: 306

Posted: Mon Sep 19, 2011 10:37 pm    Post subject: Another version based on Zacchaeus Pearsall's scan script

I was googling for Zacchaeus Pearsall's original version of this script, when I found this page.
I too used his script as a starting point when writing a shell script for batch document scanning using scanadf.

My version "bscan" is available at http://www.acjlaw.net:8080/~jeremy/Ricoh/usage_bscan.html

It uses a configuration file, ~/.bscanrc
where one can list all your scanners in a bash array,
with device names as shown by "scanimage -L"
and the default scanner being SCANDEVICE="${scanners[0]}"

Importantly, specifying the scanner names in ~/.bscanrc saves time
since the script then skips finding the scanners using "scanimage -L"

One can also specify which scanners are true duplex,
so the script will scan fake duplex mode when true duplex is not available.
One can also specify lp printer instances so one can scan directly to printer;
e.g. if you scan a document in duplex mode on letter-sized paper,
it will be printed in duplex from the appropriate tray holding letter-sized paper.
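
To illustrate what "fake duplex" usually involves, here is a Python 3 sketch of the page reordering step. It is not taken from bscan, whose internals are not shown here. It assumes the common procedure of scanning all front sides first, flipping the stack, and scanning the back sides, which then arrive in reverse order; the function name is made up.

```python
def interleave_fake_duplex(fronts, backs_as_scanned):
    """Merge front pages with back pages that were scanned in reverse order."""
    if len(fronts) != len(backs_as_scanned):
        raise ValueError("need the same number of front and back scans")
    backs = list(reversed(backs_as_scanned))  # restore reading order
    pages = []
    for front, back in zip(fronts, backs):
        pages.extend([front, back])
    return pages
```

For a three-sheet document the fronts are pages 1, 3, 5 and the backs come out of the second pass as 6, 4, 2; the function returns them as 1 through 6.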

By default the script scans from the ADF in grayscale @300dpi and saves to format PDF.
So to scan a letter-sized document from the ADF @300dpi grayscale,
then compress using lzw, binarize using djvu and save to OUTFILE.pdf
one would use:

bscan --mode=8-bit --shades=2 --page=Letter --comp=lzw -BW OUTFILE

or for legal-sized paper
bscan --mode=8-bit --shades=2 --page=Legal --comp=lzw -BW OUTFILE

or letter-sized paper from the FlatBed:
bscan --mode=8-bit --shades=2 --page=Letter --comp=lzw --source=FB -BW OUTFILE

To simplify things, I usually define some aliases for black/white, grayscale and color scanning:

alias b='bscan --mode=1-bit --page=Letter --comp=lzw'
alias bl='bscan --mode=1-bit --page=Legal --comp=lzw'

alias B='bscan --mode=8-bit --shades=2 --page=Letter --comp=lzw'
alias BL='bscan --mode=8-bit --shades=2 --page=Legal --comp=lzw'

alias C='bscan --mode=color --shades=32 --page=Letter --comp=lzw'
alias CL='bscan --mode=color --shades=32 --page=Legal --comp=lzw'

alias truecolor='bscan --mode=color --shades=truecolor --page=Letter --comp=lzw'

Then to scan in b/w from the ADF @300dpi grayscale a letter-sized document:
b OUTFILE

for legal-sized:
bl OUTFILE

To scan in grayscale and binarize using djvu wavelet compression:
For letter:
B -BW OUTFILE

For Legal:
BL -BW OUTFILE

For letter using pnmtools' truecolor shades:
truecolor -c44 --djvutopdf=25 OUTFILE

For letter using duplexing and djvu binarization:
B -duplex -BW OUTFILE

Or to rotate the document 180 degrees:
B --rot=r180 OUTFILE

To save to another format, use --format={pnm,tif,pdf,ps,djv} or alternatively,
-pnm <equivalent to --format=pnm>
-tif <equivalent to --format=tif>,
and similarly for the other output options:
-pdf, -ps, -djv

Shortcut options, like the above switches take a single '-'
and arguments requiring a value have the form '--option=value'

One can specify various binarization algorithms,
such as those from Fred Weinhaus http://www.fmwconcepts.com/imagemagick/index.html
using the option --thresh={bw, constant, 2color, fuzzy, isodata, kmeans, sahoo, triangle, }
where the various binarization scripts must be in your $PATH.


If you use xsane or gscan2pdf to scan some images because, e.g. you need to crop the image
or tweak the contrast/brightness/gamma settings,
you can save the images as OUTFILE.%d.pnm
e.g. OUTFILE.0001.pnm, OUTFILE.0002.pnm, ...
Then you can use bscan with the option "-noscan" to skip the scanning,
and instead just process the images:
e.g., to rotate the images 180degrees and binarize using djvu compression:
B -noscan -BW --rot=180 OUTFILE
which would process the series of images and create one multipage OUTFILE.pdf

One can also deskew images using unpaper from http://unpaper.berlios.de/
The options to "unpaper" are hardwired into bscan because the options are just too numerous
to specify on the commandline,
so it might be best to just make a local copy of bscan
and modify the line which runs unpaper using whatever unpaper options you need.
Alternatively, you could add an option for unpaper settings
so that you could scan, e.g. B --unpaper=setting1 -BW OUTFILE
where setting1 would be specified in ~/.bscanrc or hardwired into bscan.


To photocopy, i.e. scan then print to printer:
For letter printed to PRINTERLETTER
B -prn --n=<number of copies>

For legal printed to PRINTERLEGAL
BL -prn --n=<#copies>

Or for duplex letter to PRINTERLTRDUP
B -duplex -prn --n=<#copies>
And legal duplex to PRINTERLGLDUP
BL -duplex -prn --n=<#copies>

You just need to define the lp printer instances in /etc/cups/lpoptions or ~/.cups/lpoptions.
However, I find that KDE keeps modifying/deleting any printer instances in ~/.cups/lpoptions,
so I've given up and just use /etc/cups/lpoptions, which KDE leaves untouched.

You can define lp printer instances using lpoptions,
but I find it easier to just directly edit /etc/cups/lpoptions
e.g. for my Xerox Phaser8860 print queue

I can define a letter,color,simplex queue:
Dest Phaser8860/letter Duplex=None fitplot=false InputSlot=Tray2 media=letter MediaType=Auto OutputMode=Enhanced PageRegion=letter PageSize=letter

And in my ~/.bscanrc, I add the name of the printer destination:
PRINTERLETTER="Phaser8860/letter"

And similarly for a color duplex-letter queue:
Dest Phaser8860/ltrdup Duplex=DuplexNoTumble fitplot=false InputSlot=Tray2 media=letter MediaType=Auto OutputMode=Enhanced PageRegion=letter PageSize=letter sides=two-sided-long-edge

with the destination
PRINTERLTRDUP="Phaser8860/ltrdup"
in my ~/.bscanrc

"bscan" will choose the appropriate letter/legal simplex/duplex printer destinations depending on whether the scan was letter/legal, simplex/duplex.

undrwater
Guru

Joined: 28 Jan 2003
Posts: 312
Location: Caucasia

Posted: Thu Oct 20, 2011 6:21 pm    Post subject: Re: Another version based on Zacchaeus Pearsall's scan scrip

redwood wrote:
I was googling for Zacchaeus Pearsall's original version of this script, when I found this page.
I too used his script as a starting point when writing a shell script for batch document scanning using scanadf.

My version "bscan" is available at...


Thank you for this! Brother had provided some scripts but they used a tool that no longer works. I will have to see if I can use this.
_________________
Open-mindedness is painful...