Gentoo Forums
[Script] Handle concurrent jobs
konaran
Posted: Fri Aug 18, 2006 11:39 am    Post subject: [Script] Handle concurrent jobs

Hi people,

After some work on handling concurrent processes, I wrote this script so you can run a list of processes concurrently, "n" jobs at a time. For instance, with 2 processors and a list of 30 jobs, you can run 4 at a time (the right number depends mostly on the architecture; 2 jobs per CPU is the most widely accepted rule). That gives you a balance between load and speed. I hope you find it useful. It has been tested on Linux and on bash under Solaris 2.6.
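
As a rough illustration of the "2 jobs per CPU" rule, here is a minimal sketch of how the job count could be derived. The helper name suggest_jobs is made up for this example and is not part of the library; it assumes getconf understands _NPROCESSORS_ONLN (true on Linux as far as I know), and falls back to /proc/cpuinfo and then to a single CPU.

```shell
#!/bin/bash
# Hypothetical helper (not part of lib_cpu.sh): suggest two jobs per online CPU.
suggest_jobs() {
   local cpus
   # Assumption: getconf knows _NPROCESSORS_ONLN on this system.
   cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null)
   # Fall back to counting "processor" lines (Linux only).
   if ! [ "$cpus" -gt 0 ] 2>/dev/null; then
      cpus=$(grep -c '^processor' /proc/cpuinfo 2>/dev/null)
   fi
   # Last resort: assume a single CPU.
   if ! [ "$cpus" -gt 0 ] 2>/dev/null; then
      cpus=1
   fi
   echo $((cpus * 2))
}

PARALELO=$(suggest_jobs)
echo "Running $PARALELO jobs at a time"
```

The result can be exported as PARALELO before calling the function, which is the same variable the test script sets by hand.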

The test script below shows how the function must be called.
Code:

#!/bin/bash


PARALELO=8  # How many processes to run simultaneously
DEBUG=true  # Print progress messages (set it to false in a production environment)

source lib_cpu.sh  # Load the library that defines the paralelo() function

# Each job must be passed as one double-quoted string (perhaps there is a better way).
# Jobs that are scripts must be passed in the same way (inside double quotes).
paralelo "sleep 5" "sleep 10" "sleep 1" "sleep 2" "sleep 2" "sleep 10" "sleep 1" "sleep 2"

exit 0
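
A note on the quoting: the library launches each job by expanding its string unquoted, so the string is split into a command and its arguments, but shell syntax inside it (pipes, redirections) is not interpreted. The sketch below shows the difference. run_job is a hypothetical helper, not something the library provides; with the library as posted, a job that needs a pipe would probably have to live in its own script file.

```shell
#!/bin/bash
# run_job is a hypothetical helper, not part of lib_cpu.sh.
# It hands the whole job string to a new shell, so pipes are interpreted.
run_job() {
   bash -c "$1"
}

job="echo one two three | wc -w"

# Plain unquoted expansion, the way the library launches a job:
# the "|" and everything after it are passed to echo as ordinary words.
plain=$($job)

# Through bash -c: the pipe is a real pipe and wc counts the words.
wrapped=$(run_job "$job")

echo "plain:   $plain"
echo "wrapped: $wrapped"
```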


Then the file lib_cpu.sh, the script that holds the "paralelo" function.
I translated only the comments; the messages the script prints are still in Spanish. Sorry, but I was so excited when I saw it working that I couldn't wait. :)

Code:

##############################################
# Low-level libraries
# By Pablo Niklas - Under the GPL license
##############################################

# paralelo(): Parallel process manager
# By Pablo Niklas (pablo.niklas@gmail.com)
# Under the GPL license
#
# Usage: paralelo [<list of processes to launch concurrently>]
# Log of the run in /tmp/corrida_paralela.PID($$).<day>.log
#
# Change history:
# 27/07/2006 - PSRN - Initial version.
# 28/07/2006 - PSRN - Added logs.
#                     Replaced seq with myseq. ;)
# 29/07/2006 - PSRN - CPU autodetection on Linux.
# 31/07/2006 - PSRN - Solaris compatibility and debugging.
# 01/08/2006 - PSRN - CPU autodetection on Solaris 2.6 and 9.
# 05/08/2006 - PSRN - $DEBUG is no longer set here, so the parent's value is used.
# 07/08/2006 - PSRN - $PARALELO is passed in as a variable from the parent.
#

function paralelo() {

# PARALELO:
# Number of processes to run in parallel.
# Depends on the OS and/or the architecture. It can be:
# 1) Number of CPUs + 1 (x86)
# 2) Number of CPUs * 2 (x86)
# 3) Number of CPUs     (SPARC)
if [ -z "$PARALELO" ]; then
   # Note the spaces around "=": without them the test is always true.
   if [ "`uname -s`" = "SunOS" ]; then
      # For SunOS...
      if [ "`uname -r`" = "5.6" ]; then
         # $(...) is used here because backticks cannot be nested without escaping.
         PARALELO=$(/usr/platform/$(uname -m)/sbin/prtdiag -v|grep "US-"|wc -l)   # Solaris 2.6
      else
         if [ "`uname -r`" = "5.9" ]; then
            PARALELO=$(/usr/platform/$(uname -m)/sbin/prtdiag -v|grep ^CPU|wc -l)   # Solaris 9
         else
            PARALELO=1   # Default
         fi
      fi
   else
      PARALELO=$(($(grep -c ^processor /proc/cpuinfo)*2)) # Linux
   fi
fi

# DIRLOG:
# Directory where the temporary logs of each job are left.
DIRLOG=/tmp

############################ START OF THE ALGORITHM ################################

# An unset DEBUG counts as false.
if ${DEBUG:-false}; then
   echo "::: Se correran $# procesos en total ($PARALELO en forma concurrente)."
fi

# Initialize the slot table. Solaris has no seq, so the list of
# slot numbers (0 .. PARALELO-1) is built by hand. :)
I=0;SEQ=""
while [ $I -le $(($PARALELO-1)) ] ; do
   PID[$I]=0
   SEQ="$SEQ$I "
   I=$(($I+1))
done

TERMINO=false
JOBLOGTMP="job.$$.`date +'%d'`"
TAREA=0
while [ $# != 0 ] || ! $TERMINO; do
   for A in $SEQ; do
      # Assign a job to this slot if it is free and jobs remain.
      if [ ${PID[$A]} -eq 0 ] && [ $# != 0 ]; then
         TAREA=$(($TAREA+1))
         JOBLOG=$DIRLOG/$JOBLOGTMP.`printf %.3d $TAREA`.log   # Temporary log of this job
         echo "::::: JOB #$TAREA INICIADO - `date +'%d/%m/%Y - %H:%M:%S'` - CPU: #$A" >> $JOBLOG
         echo "::: COMIENZO detalle del Job." >> $JOBLOG
         echo $1 >> $JOBLOG
         echo "::: FIN detalle del Job." >> $JOBLOG
         echo "::: COMIENZO salida del Job." >> $JOBLOG
         # $1 is left unquoted on purpose so it splits into a command and its arguments.
         $1 1>> $JOBLOG 2>&1 &
         PID[$A]=$!
         TID[$!]=$TAREA   # Map the PID back to its job number

         if ${DEBUG:-false} ; then
            echo "::::: JOB #$TAREA INICIADO - `date +'%d/%m/%Y - %H:%M:%S'` - CPU: #$A (PID: ${PID[$A]})"
         fi

         shift
      fi
   done

   # Job completion check.
   TERMINO=true
   for A in $SEQ; do

      # Each *nix handles processes in its own way. :)
      # Only SunOS and Linux are handled here.
      FINALIZO=false
      if [ "`uname -s`" = "SunOS" ] && [ ${PID[$A]} -gt 0 ]; then
         [ -z "`ps -p ${PID[$A]}|grep -v "   PID TTY      TIME CMD"`" ] && FINALIZO=true
      fi

      if [ "`uname -s`" = "Linux" ] && [ ${PID[$A]} -gt 0 ]; then
         [ -z "`ps --no-heading --pid ${PID[$A]}`" ] && FINALIZO=true
      fi

      if $FINALIZO ; then
         JOBLOG=$DIRLOG/$JOBLOGTMP.`printf %.3d ${TID[${PID[$A]}]}`.log   # Temporary log of this job
         echo "::: FIN salida del Job." >> $JOBLOG
         echo "::::: JOB #${TID[${PID[$A]}]} FINALIZADO - `date +'%d/%m/%Y - %H:%M:%S'` - CPU: #$A (PID: ${PID[$A]})" >> $JOBLOG

         if ${DEBUG:-false} ; then
            echo "::::: JOB #${TID[${PID[$A]}]} FINALIZADO - `date +'%d/%m/%Y - %H:%M:%S'` - CPU: #$A (PID: ${PID[$A]})"
         fi

         echo >> $JOBLOG
         PID[$A]=0   # Free the slot
      fi

      # The main loop ends only when every slot is free.
      if [ ${PID[$A]} -gt 0 ]; then
         TERMINO=false
      fi

   done

   # Pause briefly while jobs are still running, so the polling loop does not spin at full speed.
   $TERMINO || sleep 1
done
############################### END OF THE ALGORITHM ################################

# Merge all the temporary logs into a single log for the whole run.
cat $DIRLOG/$JOBLOGTMP* > $DIRLOG/corrida_paralela.$$.`date +'%d'`.log
rm -f $DIRLOG/$JOBLOGTMP*

}
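
The per-OS ps parsing above could probably be avoided. The sketch below is not the posted library but a smaller, hypothetical pool (run_pool, MAX_JOBS and pids are names made up for this example) that asks "is this PID still alive?" with kill -0, which sends no signal and only reports whether the process exists. It assumes bash 3.1 or later for the += array syntax, and like the ps approach it can be fooled if the system reuses a PID.

```shell
#!/bin/bash
# Hypothetical sketch, not the posted library: a small job pool that polls
# with "kill -0" instead of parsing ps output.
MAX_JOBS=2        # assumed limit for this example
pids=()           # PIDs of the jobs believed to be running

run_pool() {
   local job pid alive
   for job in "$@"; do
      # Wait until fewer than MAX_JOBS jobs are alive.
      while [ "${#pids[@]}" -ge "$MAX_JOBS" ]; do
         alive=()
         for pid in "${pids[@]}"; do
            # kill -0 succeeds only while the process still exists.
            kill -0 "$pid" 2>/dev/null && alive+=("$pid")
         done
         pids=("${alive[@]}")
         [ "${#pids[@]}" -ge "$MAX_JOBS" ] && sleep 1
      done
      $job &      # same unquoted, word-splitting launch as the library
      pids+=("$!")
   done
   wait           # collect the jobs that are still running
}

run_pool "sleep 1" "sleep 1" "sleep 1"
echo "all jobs finished"
```

This version keeps no logs and no job numbers; it only illustrates the slot bookkeeping and the liveness check.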


I hope this trick will be useful to you.

Pablo.