Bash multiple files processing

感动是毒 2021-01-23 19:21

I have a file named data_file with data: london paris newyork italy...50 more items

I have a directory with over 75 files, say dfile1, dfile2...dfile75, in which I am perform

5 Answers
  • 2021-01-23 20:01

    You could use grep's -q option to stop searching after the first match and its -f option to read the patterns from a file:

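    # Read NUL-delimited names so paths with spaces survive; skip the pattern file itself
    while IFS= read -r -d '' f; do
        # -q: print nothing and exit at the first match; -f: read patterns from data_file
        if grep -qf data_file "$f"; then
            : # ... perform the task on "$f" here
        fi
    done < <(find . -type f ! -name data_file -print0)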
    

    If data_file contains:

    xxx
    yyy
    zzz
    

    then grep -qf data_file "$f" succeeds (exit status 0) if any of xxx, yyy, or zzz is found in $f.
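
    A quick way to check this behavior is sketched below. The file names are made up and live in a throwaway directory; since grep -q prints nothing, the result is read from the exit status.

```bash
# Throwaway directory; all file names here are illustrative only
demo_dir=$(mktemp -d)
printf '%s\n' xxx yyy zzz > "$demo_dir/data_file"
printf 'a line containing yyy\n' > "$demo_dir/match.txt"
printf 'nothing to see here\n' > "$demo_dir/no_match.txt"

# Exit status is 0 when any pattern matches, 1 when none does
match_status=0
grep -qf "$demo_dir/data_file" "$demo_dir/match.txt" || match_status=$?
no_match_status=0
grep -qf "$demo_dir/data_file" "$demo_dir/no_match.txt" || no_match_status=$?

echo "match.txt: $match_status, no_match.txt: $no_match_status"
rm -rf "$demo_dir"
```

    Two things to keep in mind: the lines of data_file are treated as regular expressions unless -F is added, and an empty line in data_file is an empty pattern that matches every line.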

  • 2021-01-23 20:07

    You can do it like this:

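    # NUL-delimited file names keep paths with spaces intact; data_file itself is skipped
    while IFS= read -r -d '' f; do
        while IFS= read -r line; do
            {
                # -q: no output, exit status 0 on the first match; --: end of options
                if grep -q -- "$line" "$f"; then
                    : ## perform task here
                fi
            } &
        done < data_file
    done < <(find . -type f ! -name data_file -print0)
    # Block until every background job has finished
    wait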
    

    This runs the block within {} in the background, so it starts one background process for every combination of file and data_file line (not just one per file). If you want finer control over how many processes are spawned at once, you can use GNU parallel instead.
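
    If GNU parallel is not available, xargs -P (offered by GNU and BSD xargs, although POSIX does not require it) is another way to cap the number of concurrent processes. This is a minimal sketch under assumptions: it runs one grep per file with all patterns at once, at most 4 jobs at a time, and every directory and file name in it is made up.

```bash
# Illustrative setup in a throwaway directory (all names are hypothetical)
work_dir=$(mktemp -d)
mkdir "$work_dir/files"
printf '%s\n' london paris > "$work_dir/data_file"
printf 'I live in london\n' > "$work_dir/files/dfile1"
printf 'no city here\n'     > "$work_dir/files/dfile2"
printf 'paris in spring\n'  > "$work_dir/files/dfile3"

# Run at most 4 greps at once, one file per grep.
# -l: print the file name on a match, -F: fixed strings, -f: patterns from a file.
# "|| true" keeps xargs from reporting failure when a file simply has no match.
matches=$(find "$work_dir/files" -type f -print0 |
    xargs -0 -P 4 -n 1 sh -c 'grep -lFf "$1" "$2" || true' _ "$work_dir/data_file" |
    sort)

echo "$matches"
rm -rf "$work_dir"
```

    The real task would replace grep -l inside the sh -c string; the sort is only there to make the order of the output stable.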

  • 2021-01-23 20:08

    The following example is a full-blown parallel execution method that deals with:

    • Execution time (warns after a certain execution time, and stops tasks after more time has passed)
    • Async logging (keeps logging what is going on while tasks are being executed)
    • Parallelism (lets you specify the number of simultaneous tasks)
    • IO-related zombie tasks (they will not block the execution)
    • Killing of grandchild pids
    • Lots more

    In your example, your (hardened) code would look like:

    # Load the ExecTasks function described below (must be in the same directory as this one)
    source ./exectasks.sh
    
    directoryToProcess="/my/dir/to/find/stuff/into"
    tasklist=""
    
    # Prepare task list separated by semicolons
    # ($line is expected to hold the current entry from data_file)
    while IFS= read -r -d $'\0' file; do
        if grep -q -- "$line" "$file" 2> /dev/null; then
            tasklist="$tasklist""my_task;"
        fi
    done < <(find "$directoryToProcess" -type f -print0)
    
    # Run tasks
    # The id becomes part of a variable name inside ExecTasks, so it must not contain '-'
    ExecTasks "$tasklist" "trivial_task_id" false 1800 3600 18000 36000 true 1 1800 true false false 8
    

    Here we used a complex function, ExecTasks, that queues the tasks for parallel execution and lets you keep control of what is going on without fear of blocking the script because of a hung task.
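
    Since ExecTasks splits its first argument on semicolons (with IFS=';' read -r -a, as its source shows), each task has to be a single command string without semicolons or newlines. The sketch below shows that split in isolation; the two task strings are made up for illustration.

```bash
# Two hypothetical tasks, joined the same way the loop above builds tasklist
tasklist=""
for file in "report a.txt" "report b.txt"; do
    tasklist="${tasklist}wc -l '$file';"
done

# The same split ExecTasks applies to its first argument
IFS=';' read -r -a commandsArray <<< "$tasklist"

echo "${#commandsArray[@]} tasks"
printf '[%s]\n' "${commandsArray[@]}"
```

    A task that really needs a semicolon can be wrapped in a shell function or a small script and queued by name.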

    Quick explanation of ExecTasks arguments, in order:

    "$tasklist" = variable containing the task list
    second argument = task id, used to identify the run in logs and in global variable names (only characters valid in bash variable names, so no '-')
    false = read tasks from a file (true) or from the list given as first argument (false); pass a file path if there are too many tasks to fit in a variable
    1800 = maximum number of seconds a task may run before a warning is raised
    3600 = maximum number of seconds a task may run before an error is raised and the task is stopped
    18000 = maximum number of seconds the whole run may take before a warning is raised
    36000 = maximum number of seconds the whole run may take before an error is raised and all tasks are stopped
    true = count execution time from the beginning of task execution (true) or from the beginning of the script (false)
    1 = number of seconds between each state check (accepts floats like .1)
    1800 = number of seconds between each "I am alive" log, just to know everything works as expected
    true = show spinner (true) or not (false)
    false = log errors when reaching max times (false) or do not log them (true)
    false = do not log any errors at all (true) or do log them (false)
    
    And finally
    8 = number of simultaneous tasks to launch (8 in our case)
    

    Here's the source to exectasks.sh (which you can also copy paste directly into your script header instead of source ./exectasks.sh):

    function Logger {
        # Dummy log function, replace with whatever you need
    
        echo "$2: $1"
    }
    
    # Nice cli spinner so we know execution is ongoing
    _OFUNCTIONS_SPINNER="|/-\\"
    function Spinner {
        printf " [%c]  \b\b\b\b\b\b" "$_OFUNCTIONS_SPINNER"
        _OFUNCTIONS_SPINNER=${_OFUNCTIONS_SPINNER#?}${_OFUNCTIONS_SPINNER%%???}
        return 0
    }
    
    # Portable child (and grandchild) kill function tested under Linux, BSD and MacOS X
    function KillChilds {
        local pid="${1}" # Parent pid to kill childs
        local self="${2:-false}" # Should parent be killed too ?
    
        # Paranoid checks, we can safely assume that $pid should not be 0 nor 1
        if [ $(IsInteger "$pid") -eq 0 ] || [ "$pid" == "" ] || [ "$pid" == "0" ] || [ "$pid" == "1" ]; then
            Logger "Bogus pid given [$pid]." "CRITICAL"
            return 1
        fi
    
        if kill -0 "$pid" > /dev/null 2>&1; then
            if children="$(pgrep -P "$pid")"; then
                if [[ "$pid" == *"$children"* ]]; then
                    Logger "Bogus pgrep implementation." "CRITICAL"
                    children="${children/$pid/}"
                fi
                for child in $children; do
                    Logger "Launching KillChilds \"$child\" true" "DEBUG"   #__WITH_PARANOIA_DEBUG
                    KillChilds "$child" true
                done
            fi
        fi
    
        # Try to kill nicely, if not, wait 15 seconds to let Trap actions happen before killing
        if [ "$self" == true ]; then
            # We need to check for pid again because it may have disappeared after recursive function call
            if kill -0 "$pid" > /dev/null 2>&1; then
                # Test kill directly so that its exit status, not Logger's, decides the fallback
                if kill -s TERM "$pid"; then
                    Logger "Sent SIGTERM to process [$pid]." "DEBUG"
                else
                    Logger "Sending SIGTERM to process [$pid] failed." "DEBUG"
                    sleep 15
                    if ! kill -9 "$pid"; then
                        Logger "Sending SIGKILL to process [$pid] failed." "DEBUG"
                        return 1
                    fi
                fi
                return 0
            else
                return 0
            fi
        else
            return 0
        fi
    }
    
    
    function ExecTasks {
        # Mandatory arguments
        local mainInput="${1}"              # Contains list of pids / commands separated by semicolons or filepath to list of pids / commands
    
        # Optional arguments
        local id="${2:-base}"               # Optional ID in order to identify global variables from this run (only bash variable names, no '-'). Global variables are WAIT_FOR_TASK_COMPLETION_$id and HARD_MAX_EXEC_TIME_REACHED_$id
        local readFromFile="${3:-false}"        # Is mainInput / auxInput a semicolon separated list (false) or a filepath (true)
        local softPerProcessTime="${4:-0}"      # Max time (in seconds) a pid or command can run before a warning is logged, unless set to 0
        local hardPerProcessTime="${5:-0}"      # Max time (in seconds) a pid or command can run before the given command / pid is stopped, unless set to 0
        local softMaxTime="${6:-0}"         # Max time (in seconds) for the whole function to run before a warning is logged, unless set to 0
        local hardMaxTime="${7:-0}"         # Max time (in seconds) for the whole function to run before all pids / commands given are stopped, unless set to 0
        local counting="${8:-true}"         # Should softMaxTime and hardMaxTime be accounted since function begin (true) or since script begin (false)
        local sleepTime="${9:-.5}"          # Seconds between each state check. The shorter the value, the snappier ExecTasks will be, but as a tradeoff, more cpu power will be used (good values are between .05 and 1)
        local keepLogging="${10:-1800}"         # Every keepLogging seconds, an alive message is logged. Setting this value to zero disables any alive logging
        local spinner="${11:-true}"         # Show spinner (true) or do not show anything (false) while running
        local noTimeErrorLog="${12:-false}"     # Log errors when reaching soft / hard execution times (false) or do not log errors on those triggers (true)
        local noErrorLogsAtAll="${13:-false}"       # Do not log any errors at all (useful for recursive ExecTasks checks)
    
        # Parallelism specific arguments
        local numberOfProcesses="${14:-0}"      # Number of simultaneous commands to run, given as mainInput. Set to 0 by default (WaitForTaskCompletion mode). Setting this value enables ParallelExec mode.
        local auxInput="${15}"              # Contains list of commands separated by semicolons or filepath to list of commands. The exit codes of those commands decide whether main commands will be executed or not
        local maxPostponeRetries="${16:-3}"     # If a conditional command fails, how many times shall we try to postpone the associated main command. Set this to 0 to disable postponing
        local minTimeBetweenRetries="${17:-300}"    # Time (in seconds) between postponed command retries
        local validExitCodes="${18:-0}"         # Semi colon separated list of valid main command exit codes which will not trigger errors
    
        local i
    
        # Expand validExitCodes into array
        IFS=';' read -r -a validExitCodes <<< "$validExitCodes"
    
        # ParallelExec specific variables
        local auxItemCount=0        # Number of conditional commands
        local commandsArray=()      # Array containing commands
        local commandsConditionArray=() # Array containing conditional commands
        local currentCommand        # Variable containing currently processed command
        local currentCommandCondition   # Variable containing currently processed conditional command
        local commandsArrayPid=()   # Array containing commands indexed by pids
        local commandsArrayOutput=()    # Array containing command results indexed by pids
        local postponedRetryCount=0 # Number of current postponed commands retries
        local postponedItemCount=0  # Number of commands that have been postponed (keep at least one in order to check once)
        local postponedCounter=0
        local isPostponedCommand=false  # Is the current command from a postponed file ?
        local postponedExecTime=0   # How much time has passed since last postponed condition was checked
        local needsPostponing       # Does currentCommand need to be postponed
        local temp
    
        # Common variables
        local pid           # Current pid working on
        local pidState          # State of the process
        local mainItemCount=0       # number of given items (pids or commands)
        local readFromFile      # Should we read pids / commands from a file (true)
        local counter=0
        local log_ttime=0       # local time instance for comparison
    
        local seconds_begin=$SECONDS    # Seconds since the beginning of the script
        local exec_time=0       # Seconds since the beginning of this function
    
        local retval=0          # return value of monitored pid process
        local subRetval=0       # return value of condition commands
        local errorcount=0      # Number of pids that finished with errors
        local pidsArray         # Array of currently running pids
        local newPidsArray      # New array of currently running pids for next iteration
        local pidsTimeArray     # Array containing execution begin time of pids
        local executeCommand        # Boolean to check if currentCommand can be executed given a condition
    
        local functionMode
        local softAlert=false       # Does a soft alert need to be triggered, if yes, send an alert once
        local failedPidsList        # List containing failed pids with exit code separated by semicolons (eg : 2355:1;4534:2;2354:3)
        local randomOutputName      # Random filename for command outputs
        local currentRunningPids    # String of pids running, used for debugging purposes only
    
        # fnver 2019081401
    
        # Initialise global variable
        eval "WAIT_FOR_TASK_COMPLETION_$id=\"\""
        eval "HARD_MAX_EXEC_TIME_REACHED_$id=false"
    
        # Init function variables depending on mode
    
        if [ $numberOfProcesses -gt 0 ]; then
            functionMode=ParallelExec
        else
            functionMode=WaitForTaskCompletion
        fi
    
        if [ $readFromFile == false ]; then
            if [ $functionMode == "WaitForTaskCompletion" ]; then
                IFS=';' read -r -a pidsArray <<< "$mainInput"
                mainItemCount="${#pidsArray[@]}"
            else
                IFS=';' read -r -a commandsArray <<< "$mainInput"
                mainItemCount="${#commandsArray[@]}"
                IFS=';' read -r -a commandsConditionArray <<< "$auxInput"
                auxItemCount="${#commandsConditionArray[@]}"
            fi
        else
            if [ -f "$mainInput" ]; then
                mainItemCount=$(wc -l < "$mainInput")
                readFromFile=true
            else
                Logger "Cannot read main file [$mainInput]." "WARN"
            fi
            if [ "$auxInput" != "" ]; then
                if [ -f "$auxInput" ]; then
                    auxItemCount=$(wc -l < "$auxInput")
                else
                    Logger "Cannot read aux file [$auxInput]." "WARN"
                fi
            fi
        fi
    
        if [ $functionMode == "WaitForTaskCompletion" ]; then
            # Force first while loop condition to be true because we don't deal with counters but pids in WaitForTaskCompletion mode
            counter=$mainItemCount
        fi
    
    
        # soft / hard execution time checks that needs to be a subfunction since it is called both from main loop and from parallelExec sub loop
        function _ExecTasksTimeCheck {
            if [ $spinner == true ]; then
                Spinner
            fi
            if [ $counting == true ]; then
                exec_time=$((SECONDS - seconds_begin))
            else
                exec_time=$SECONDS
            fi
    
            if [ $keepLogging -ne 0 ]; then
                # This log solely exists for readability purposes before having next set of logs
                if [ ${#pidsArray[@]} -eq $numberOfProcesses ] && [ $log_ttime -eq 0 ]; then
                    log_ttime=$exec_time
                    Logger "There are $((mainItemCount-counter+postponedItemCount)) / $mainItemCount tasks in the queue of which $postponedItemCount are postponed. Currently, ${#pidsArray[@]} tasks running with pids [$(joinString , ${pidsArray[@]})]." "NOTICE"
                fi
                if [ $(((exec_time + 1) % keepLogging)) -eq 0 ]; then
                    if [ $log_ttime -ne $exec_time ]; then # Fix when sleep time lower than 1 second
                        log_ttime=$exec_time
                        if [ $functionMode == "WaitForTaskCompletion" ]; then
                            Logger "Current tasks still running with pids [$(joinString , ${pidsArray[@]})]." "NOTICE"
                        elif [ $functionMode == "ParallelExec" ]; then
                            Logger "There are $((mainItemCount-counter+postponedItemCount)) / $mainItemCount tasks in the queue of which $postponedItemCount are postponed. Currently, ${#pidsArray[@]} tasks running with pids [$(joinString , ${pidsArray[@]})]." "NOTICE"
                        fi
                    fi
                fi
            fi
    
            if [ $exec_time -gt $softMaxTime ]; then
                if [ "$softAlert" != true ] && [ $softMaxTime -ne 0 ] && [ $noTimeErrorLog != true ]; then
                    Logger "Max soft execution time [$softMaxTime] exceeded for task [$id] with pids [$(joinString , ${pidsArray[@]})]." "WARN"
                    softAlert=true
                    SendAlert true
                fi
            fi
    
            if [ $exec_time -gt $hardMaxTime ] && [ $hardMaxTime -ne 0 ]; then
                if [ $noTimeErrorLog != true ]; then
                    Logger "Max hard execution time [$hardMaxTime] exceeded for task [$id] with pids [$(joinString , ${pidsArray[@]})]. Stopping task execution." "ERROR"
                fi
                for pid in "${pidsArray[@]}"; do
                    KillChilds $pid true
                    if [ $? -eq 0 ]; then
                        Logger "Task with pid [$pid] stopped successfully." "NOTICE"
                    else
                        if [ $noErrorLogsAtAll != true ]; then
                            Logger "Could not stop task with pid [$pid]." "ERROR"
                        fi
                    fi
                    errorcount=$((errorcount+1))
                done
                if [ $noTimeErrorLog != true ]; then
                    SendAlert true
                fi
                eval "HARD_MAX_EXEC_TIME_REACHED_$id=true"
                if [ $functionMode == "WaitForTaskCompletion" ]; then
                    return $errorcount
                else
                    return 129
                fi
            fi
        }
    
        function _ExecTasksPidsCheck {
            newPidsArray=()
    
            if [ "$currentRunningPids" != "$(joinString " " ${pidsArray[@]})" ]; then
                Logger "ExecTask running for pids [$(joinString " " ${pidsArray[@]})]." "DEBUG"
                currentRunningPids="$(joinString " " ${pidsArray[@]})"
            fi
    
            for pid in "${pidsArray[@]}"; do
                if [ $(IsInteger $pid) -eq 1 ]; then
                    if kill -0 $pid > /dev/null 2>&1; then
                        # Handle uninterruptible sleep state or zombies by omitting them from the running process array (how to kill what is already dead ? :)
                        pidState="$(eval $PROCESS_STATE_CMD)"
                        if [ "$pidState" != "D" ] && [ "$pidState" != "Z" ]; then
    
                            # Check if pid hasn't run more than soft/hard perProcessTime
                            pidsTimeArray[$pid]=$((SECONDS - seconds_begin))
                            if [ ${pidsTimeArray[$pid]} -gt $softPerProcessTime ]; then
                                if [ "$softAlert" != true ] && [ $softPerProcessTime -ne 0 ] && [ $noTimeErrorLog != true ]; then
                                    Logger "Max soft execution time [$softPerProcessTime] exceeded for pid [$pid]." "WARN"
                                    if [ "${commandsArrayPid[$pid]}" != "" ]; then
                                        Logger "Command was [${commandsArrayPid[$pid]}]." "WARN"
                                    fi
                                    softAlert=true
                                    SendAlert true
                                fi
                            fi
    
    
                            if [ ${pidsTimeArray[$pid]} -gt $hardPerProcessTime ] && [ $hardPerProcessTime -ne 0 ]; then
                                if [ $noTimeErrorLog != true ] && [ $noErrorLogsAtAll != true ]; then
                                    Logger "Max hard execution time [$hardPerProcessTime] exceeded for pid [$pid]. Stopping command execution." "ERROR"
                                    if [ "${commandsArrayPid[$pid]}" != "" ]; then
                                        Logger "Command was [${commandsArrayPid[$pid]}]." "WARN"
                                    fi
                                fi
                                KillChilds $pid true
                                if [ $? -eq 0 ]; then
                                     Logger "Command with pid [$pid] stopped successfully." "NOTICE"
                                else
                                    if [ $noErrorLogsAtAll != true ]; then
                                    Logger "Could not stop command with pid [$pid]." "ERROR"
                                    fi
                                fi
                                errorcount=$((errorcount+1))
    
                                if [ $noTimeErrorLog != true ]; then
                                    SendAlert true
                                fi
                            fi
    
                            newPidsArray+=($pid)
                        fi
                    else
                        # pid is dead, get its exit code from wait command
                        wait $pid
                        retval=$?
                        # Check for valid exit codes
                        if [ $(ArrayContains $retval "${validExitCodes[@]}") -eq 0 ]; then
                            if [ $noErrorLogsAtAll != true ]; then
                                Logger "${FUNCNAME[0]} called by [$id] finished monitoring pid [$pid] with exitcode [$retval]." "ERROR"
                                if [ "$functionMode" == "ParallelExec" ]; then
                                    Logger "Command was [${commandsArrayPid[$pid]}]." "ERROR"
                                fi
                                if [ -f "${commandsArrayOutput[$pid]}" ]; then
                                    Logger "Truncated output:\n$(head -c16384 "${commandsArrayOutput[$pid]}")" "ERROR"
                                fi
                            fi
                            errorcount=$((errorcount+1))
                            # Welcome to variable variable bash hell
                            if [ "$failedPidsList" == "" ]; then
                                failedPidsList="$pid:$retval"
                            else
                                failedPidsList="$failedPidsList;$pid:$retval"
                            fi
                        else
                            Logger "${FUNCNAME[0]} called by [$id] finished monitoring pid [$pid] with exitcode [$retval]." "DEBUG"
                        fi
                    fi
                fi
            done
    
            # hasPids can be false on last iteration in ParallelExec mode
            pidsArray=("${newPidsArray[@]}")
    
            # Trivial wait time for bash to not eat up all CPU
            sleep $sleepTime
        }
    
        while [ ${#pidsArray[@]} -gt 0 ] || [ $counter -lt $mainItemCount ] || [ $postponedItemCount -ne 0 ]; do
            _ExecTasksTimeCheck
            retval=$?
            if [ $retval -ne 0 ]; then
                return $retval;
            fi
    
            # The following execution block is only needed in ParallelExec mode since WaitForTaskCompletion does not execute commands, but only monitors them
            if [ $functionMode == "ParallelExec" ]; then
                while [ ${#pidsArray[@]} -lt $numberOfProcesses ] && ([ $counter -lt $mainItemCount ] || [ $postponedItemCount -ne 0 ]); do
                    _ExecTasksTimeCheck
                    retval=$?
                    if [ $retval -ne 0 ]; then
                        return $retval;
                    fi
    
                    executeCommand=false
                    isPostponedCommand=false
                    currentCommand=""
                    currentCommandCondition=""
                    needsPostponing=false
    
                    if [ $readFromFile == true ]; then
                        # awk identifies first line as 1 instead of 0 so we need to increase counter
                        currentCommand=$(awk 'NR == num_line {print; exit}' num_line=$((counter+1)) "$mainInput")
                        if [ $auxItemCount -ne 0 ]; then
                            currentCommandCondition=$(awk 'NR == num_line {print; exit}' num_line=$((counter+1)) "$auxInput")
                        fi
    
                        # Check if we need to fetch postponed commands
                        if [ "$currentCommand" == "" ]; then
                            currentCommand=$(awk 'NR == num_line {print; exit}' num_line=$((postponedCounter+1)) "$RUN_DIR/$PROGRAM.${FUNCNAME[0]}-postponedMain.$id.$SCRIPT_PID.$TSTAMP")
                            currentCommandCondition=$(awk 'NR == num_line {print; exit}' num_line=$((postponedCounter+1)) "$RUN_DIR/$PROGRAM.${FUNCNAME[0]}-postponedAux.$id.$SCRIPT_PID.$TSTAMP")
                            isPostponedCommand=true
                        fi
                    else
                        currentCommand="${commandsArray[$counter]}"
                        if [ $auxItemCount -ne 0 ]; then
                            currentCommandCondition="${commandsConditionArray[$counter]}"
                        fi
    
                        if [ "$currentCommand" == "" ]; then
                            currentCommand="${postponedCommandsArray[$postponedCounter]}"
                            currentCommandCondition="${postponedCommandsConditionArray[$postponedCounter]}"
                            isPostponedCommand=true
                        fi
                    fi
    
                    # Check if we execute postponed commands, or if we delay them
                    if [ $isPostponedCommand == true ]; then
                        # Get first value before '@'
                        postponedExecTime="${currentCommand%%@*}"
                        postponedExecTime=$((SECONDS-postponedExecTime))
                        # Get everything after first '@'
                        temp="${currentCommand#*@}"
                        # Get first value before '@'
                        postponedRetryCount="${temp%%@*}"
                        # Replace currentCommand with actual filtered currentCommand
                        currentCommand="${temp#*@}"
    
                        # Since we read a postponed command, we may decrease postponedItemCount
                        postponedItemCount=$((postponedItemCount-1))
                        #Since we read one line, we need to increase the counter
                        postponedCounter=$((postponedCounter+1))
    
                    else
                        postponedRetryCount=0
                        postponedExecTime=0
                    fi
                    if ([ $postponedRetryCount -lt $maxPostponeRetries ] && [ $postponedExecTime -ge $minTimeBetweenRetries ]) || [ $isPostponedCommand == false ]; then
                        if [ "$currentCommandCondition" != "" ]; then
                            Logger "Checking condition [$currentCommandCondition] for command [$currentCommand]." "DEBUG"
                            eval "$currentCommandCondition" &
                            ExecTasks $! "subConditionCheck" false 0 0 1800 3600 true "${SLEEP_TIME:-.5}" "${KEEP_LOGGING:-1800}" true true true
                            subRetval=$?
                            if [ $subRetval -ne 0 ]; then
                                # is postponing enabled ?
                                if [ $maxPostponeRetries -gt 0 ]; then
                                    Logger "Condition [$currentCommandCondition] not met for command [$currentCommand]. Exit code [$subRetval]. Postponing command." "NOTICE"
                                    postponedRetryCount=$((postponedRetryCount+1))
                                    if [ $postponedRetryCount -ge $maxPostponeRetries ]; then
                                        Logger "Max retries reached for postponed command [$currentCommand]. Skipping command." "NOTICE"
                                    else
                                        needsPostponing=true
                                    fi
                                    postponedExecTime=0
                                else
                                    Logger "Condition [$currentCommandCondition] not met for command [$currentCommand]. Exit code [$subRetval]. Ignoring command." "NOTICE"
                                fi
                            else
                                executeCommand=true
                            fi
                        else
                            executeCommand=true
                        fi
                    else
                        needsPostponing=true
                    fi
    
                    if [ $needsPostponing == true ]; then
                        postponedItemCount=$((postponedItemCount+1))
                        if [ $readFromFile == true ]; then
                            echo "$((SECONDS-postponedExecTime))@$postponedRetryCount@$currentCommand" >> "$RUN_DIR/$PROGRAM.${FUNCNAME[0]}-postponedMain.$id.$SCRIPT_PID.$TSTAMP"
                            echo "$currentCommandCondition" >> "$RUN_DIR/$PROGRAM.${FUNCNAME[0]}-postponedAux.$id.$SCRIPT_PID.$TSTAMP"
                        else
                            postponedCommandsArray+=("$((SECONDS-postponedExecTime))@$postponedRetryCount@$currentCommand")
                            postponedCommandsConditionArray+=("$currentCommandCondition")
                        fi
                    fi
    
                    if [ $executeCommand == true ]; then
                        Logger "Running command [$currentCommand]." "DEBUG"
                        randomOutputName=$(date '+%Y%m%dT%H%M%S').$(PoorMansRandomGenerator 5)
                        # Build the output path once so the file written and the file recorded for this pid are the same
                        temp="$RUN_DIR/$PROGRAM.${FUNCNAME[0]}.$id.$randomOutputName.$SCRIPT_PID.$TSTAMP"
                        eval "$currentCommand" >> "$temp" 2>&1 &
                        pid=$!
                        pidsArray+=($pid)
                        commandsArrayPid[$pid]="$currentCommand"
                        commandsArrayOutput[$pid]="$temp"
                        # Initialize pid execution time array
                        pidsTimeArray[$pid]=0
                    else
                        Logger "Skipping command [$currentCommand]." "DEBUG"
                    fi
    
                    if [ $isPostponedCommand == false ]; then
                        counter=$((counter+1))
                    fi
                    _ExecTasksPidsCheck
                done
            fi
    
            _ExecTasksPidsCheck
        done
    
        # Return exit code if only one process was monitored, else return number of errors
        # As we cannot return multiple values, a global variable WAIT_FOR_TASK_COMPLETION contains all pids with their return value
    
        eval "WAIT_FOR_TASK_COMPLETION_$id=\"$failedPidsList\""
    
        if [ $mainItemCount -eq 1 ]; then
            return $retval
        else
            return $errorcount
        fi
    }
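
    The listing above calls several helpers and reads several globals that it does not define: IsInteger, joinString, ArrayContains, PoorMansRandomGenerator, SendAlert, and RUN_DIR, PROGRAM, SCRIPT_PID, TSTAMP, SLEEP_TIME, KEEP_LOGGING, PROCESS_STATE_CMD. The block below gives minimal stand-ins inferred only from how the listing uses them. They are assumptions, not the original author's implementations, and would need to be defined before ExecTasks is called.

```bash
# Hypothetical stand-ins, inferred from the call sites in the listing

# Echo 1 if the argument is a non-negative integer, 0 otherwise
IsInteger() {
    case "$1" in
        ''|*[!0-9]*) echo 0 ;;
        *) echo 1 ;;
    esac
}

# Join all arguments after the first, using the first as a one-character separator
joinString() {
    local IFS="$1"
    shift
    echo "$*"
}

# Echo 1 if the first argument equals any of the remaining ones, 0 otherwise
ArrayContains() {
    local needle="$1" item
    shift
    for item in "$@"; do
        if [ "$item" = "$needle" ]; then
            echo 1
            return 0
        fi
    done
    echo 0
}

# Echo a pseudo-random string of $1 decimal digits (default 5)
PoorMansRandomGenerator() {
    local digits="${1:-5}" out=""
    while [ "${#out}" -lt "$digits" ]; do
        out="$out$((RANDOM % 10))"
    done
    echo "$out"
}

# Alert hook; a no-op in this sketch
SendAlert() { :; }

# Globals the listing reads but never sets (all values are assumptions)
RUN_DIR="${TMPDIR:-/tmp}"               # where command output files are written
PROGRAM="exectasks"                     # prefix for those file names
SCRIPT_PID=$$
TSTAMP=$(date '+%Y%m%dT%H%M%S')
SLEEP_TIME=.5                           # mirrors the sleepTime default
KEEP_LOGGING=1800                       # mirrors the keepLogging default
# Evaluated with eval inside ExecTasks, so $pid is expanded at that point
PROCESS_STATE_CMD='ps -p $pid -o state= 2> /dev/null'
```

    With these in place, output files of the queued commands land in RUN_DIR, so that directory must exist and be writable.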
    

    Hope you have fun.

  • 2021-01-23 20:16

    The find command will slow things down and the script is more complicated than it needs to be.

    If you want to do this with grep, it is better to loop through data_file and, inside that loop, run grep -q -- "$line" * && do_something (or grep -Rq -- "$line" * && do_something if there are subdirectories to deal with).
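
    A sketch of that loop follows, with the pattern quoted and the search limited to one directory so that data_file itself is never searched. do_something and the directory layout are placeholders, and -r is available in GNU and BSD grep but is not required by POSIX.

```bash
# Hypothetical task; here it only records which entry was found
do_something() {
    found_items="$found_items $1"
}

# Illustrative setup in a throwaway directory (all names are made up)
work_dir=$(mktemp -d)
mkdir "$work_dir/files"
printf '%s\n' london paris italy > "$work_dir/data_file"
printf 'flights to london\n' > "$work_dir/files/dfile1"
printf 'made in italy\n'     > "$work_dir/files/dfile2"

found_items=""
while IFS= read -r line; do
    # Skip empty lines: an empty pattern would match every file
    [ -n "$line" ] || continue
    # -r: recurse, -q: stop at the first match, -F: fixed string, --: end of options
    if grep -rqF -- "$line" "$work_dir/files"; then
        do_something "$line"
    fi
done < "$work_dir/data_file"

echo "found:$found_items"
rm -rf "$work_dir"
```

    This runs one grep per entry of data_file rather than one per file, which is the trade-off this answer argues for.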

  • 2021-01-23 20:25

    Using GNU Parallel you can do something like this:

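    doit() {
        local f="$1"
        local line="$2"
        # -q: no output, exit status 0 on the first match
        if grep -q -- "$line" "$f"; then
            : # perform task here
        fi
    }
    export -f doit
    
    # :::: reads input sources from files; "-" is stdin (the find output).
    # Every file name is combined with every line of data_file.
    find . -type f | parallel doit :::: - data_file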
    