Introduction
evQueue is a task queueing engine and a fast, scalable job scheduler. It can handle simple tasks or combinations of tasks (called “Workflows”).
The engine is entirely based on kernel events. There is no active wait in any part of the process. When chaining tasks, the next task starts when the kernel signals the end of the preceding process. This allows very low overhead and effective CPU management.
The web control interface allows simple monitoring of tasks and workflows and helps system administrators keep track of what happens on a platform. The user management system also provides a simple way to restrict rights, so you can allow some users to launch defined tasks when they need to.
Besides the web interface, evQueue also provides a full XML-based API for launching workflows and gathering their state. This can be used by web processes to launch time-consuming jobs (possibly on a remote machine) and monitor their status asynchronously (with AJAX, for example).
Definitions
- Task
This is the smallest unit of work used by evQueue. A task is an executable file that will be launched by the engine. The task can be a standard system binary (like "ls", "bzip" or "rsync") or a custom program written in any language.
- Job
A job is a group of tasks. All tasks in a job are executed concurrently (subject to queue limits). The job is considered terminated when all tasks of the group are terminated. If any of the tasks fails, the job is considered failed.
- Workflow
A workflow is a sequence of jobs and tasks. Tasks can be chained by different means: task groups (jobs), sub-jobs, loops, conditions... evQueue only handles workflows; you can't launch a task directly. To launch a single task, you have to create a workflow containing only that task.
- Workflow instance
A workflow is a definition of how tasks should be chained. A workflow instance is the execution of a defined workflow. It contains task results (output) and expanded variables and loops. A workflow instance is created by launching a workflow with its specified parameters (if needed).
- Queue
A queue controls the execution of tasks. It is defined by a name and a maximum concurrency. Each task must be associated with a queue. When a task is launched, it increments the running tasks counter of its queue. When the queue is full (i.e. maximum concurrency is reached), no more tasks can be launched in this queue until one of the running tasks terminates. Queue management is event driven, so as soon as one process exits, the next one is launched.
- Retry schedule
evQueue can be configured to automatically retry tasks if they fail. The retry schedule defines when and how many times a task must be retried before it is considered failed. This can be useful if you're accessing external resources that may not be available (FTP, web services...).
- Scheduled workflow
A scheduled workflow is a workflow that will be launched by evQueue at a specified time. It is basically a scheduled job (like cron).
Queues
Queues are used to manage task concurrency. Each task must be associated with a queue within a workflow. The same task can be associated with different queues in separate workflows. When a task is executed, the running tasks counter of the queue is incremented. When the task terminates, the counter is decremented. The running tasks counter cannot exceed the maximum concurrency specified when creating the queue.
When the running tasks counter is below the maximum concurrency, tasks in the queue are executed immediately. If the maximum concurrency is reached, tasks are queued (not executed). They are executed when one or more running tasks terminate.
Queues are global: they do not depend on a workflow or workflow instance. Tasks bound to a queue are placed in that same queue, even when they are launched from different workflows. This gives good control over CPU usage and can prevent the system from being overloaded.
Example
Let's take the example of picture resizing, which is quite a CPU-intensive task. Suppose we have a queue named “photo_resize” with a maximum concurrency of 4. The resize workflow contains one (or more) tasks to resize pictures that are bound to this queue. No matter how many workflow instances are launched, the number of tasks running concurrently in photo_resize will never exceed 4.
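The counter behaviour described above can be illustrated with a small Python model. This is only a sketch of the idea, not evQueue's implementation: the class name `Queue` and the methods `submit` and `task_terminated` are invented for this example.

```python
from collections import deque


class Queue:
    """Toy model of a queue: a name plus a maximum concurrency."""

    def __init__(self, name, max_concurrency):
        self.name = name
        self.max_concurrency = max_concurrency
        self.running = set()    # tasks currently executing
        self.waiting = deque()  # tasks held back because the queue is full

    def submit(self, task):
        """Start the task at once if a slot is free, otherwise queue it."""
        if len(self.running) < self.max_concurrency:
            self.running.add(task)
        else:
            self.waiting.append(task)

    def task_terminated(self, task):
        """Free the slot and immediately start the next waiting task, if any."""
        self.running.remove(task)
        if self.waiting:
            self.running.add(self.waiting.popleft())


photo_resize = Queue("photo_resize", max_concurrency=4)

# Ten resize tasks arrive at once, possibly from different workflow instances.
for i in range(10):
    photo_resize.submit(f"resize-{i}")

print(len(photo_resize.running), len(photo_resize.waiting))  # 4 6

# One task ends: its slot is reused right away by the first waiting task.
photo_resize.task_terminated("resize-0")
print(len(photo_resize.running), len(photo_resize.waiting))  # 4 5
```

Setting `max_concurrency=1` in the same model gives the mutual exclusion behaviour: a second task only starts once the first one has terminated.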
Utility of queues
- Prevent CPU overloading. If 300 workflows are launched at the same time to resize pictures, the queue keeps the load on the machine reasonable.
- Keep room for other tasks. Limiting concurrency for one kind of task keeps CPUs or cores available for other tasks.
- Mutual exclusion. A queue with a concurrency of 1 lets you create a “mutex” that prevents a task from running several times at once. This is very useful for accessing shared resources.
- Execution speedup. Using loops, you can easily create groups of parallel tasks. You can then let evQueue manage execution and parallelization and focus on task writing.
- Reporting. The reporting screen of the web control interface, available in “System state → Queues” can give a very good idea of the current load.
Create a queue
To create a queue, go to “Settings → Queues”. Use the blue “+” at the right of the title bar. Enter the queue name and maximum concurrency, then click “submit”.
Note that you have to restart the evQueue engine after creating or editing a queue.
Tasks
A task is the smallest unit of control in evQueue. A task is mainly composed of a name, used to call the task in a workflow, and a path to a binary file that will be executed. When a task is used in a workflow, it must be bound to a queue. The same task can be bound to different queues within different workflows.
A task can be executed remotely using SSH. In this case, you have to give a user and host for remote execution. The path to the binary is then relative to the destination machine. Specifying a remote host in the task definition means that the task will always be executed on this host, regardless of the workflow execution host (task definition has the highest priority).
Parameters can be passed as command line arguments (the traditional way of passing parameters) or as environment variables. Environment variables are very useful when the number of arguments grows, because they are much easier to read and maintain. On the other hand, system binaries (like “ls” or “bzip”) only accept command line arguments.
Task output is imported into the workflow instance. Two output formats are supported: text and XML. Only the result of a task with XML output can be reused. The result can be used to create loops, conditions, or as arguments for sub-tasks. Text output is generally used with system binaries.
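The difference between the two ways of passing parameters can be shown with Python's standard `subprocess` module. This is a generic illustration, not how evQueue launches tasks: the stand-in task, the helper `run_task` and the variable name `RESIZE_WIDTH` are all invented for the example.

```python
import os
import subprocess
import sys

# Stand-in "task": a tiny program that reports what it received.
# A real task would be any executable file.
TASK_CODE = (
    "import os, sys\n"
    "print('args=' + ','.join(sys.argv[1:]))\n"
    "print('width=' + os.environ.get('RESIZE_WIDTH', ''))\n"
)


def run_task(arguments=(), variables=None):
    """Run the stand-in task and return its output lines.

    `arguments` are passed on the command line, `variables` are added
    to the environment of the child process.
    """
    env = dict(os.environ)
    env.update(variables or {})
    result = subprocess.run(
        [sys.executable, "-c", TASK_CODE, *arguments],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


# Positional style: the task has to know what each position means.
print(run_task(arguments=["photo.jpg", "800"]))

# Environment style: each parameter is named, which stays readable as
# the number of parameters grows.
print(run_task(arguments=["photo.jpg"], variables={"RESIZE_WIDTH": "800"}))
```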
Workflows
A workflow is a set of tasks that are executed in a defined order. This order is controlled by specific control structures used when defining the workflow. The workflow itself is described by an XML structure, which is also used by evQueue for its internal representation.
The workflow contains the description of the tasks that must be executed. It can also contain loops, conditions or variables. When the workflow is launched, a workflow instance is created. In this instance, all variables are replaced by their values and tasks are expanded with their results.
The workflow instance is a well-formed XML document. A task declaring XML output will be considered failed if its XML output is not well formed (regardless of its return value). For text output, evQueue creates an “output” node to encapsulate the task output.
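The two output rules can be sketched in Python with the standard `xml.etree.ElementTree` module. The “output” node name comes from the text; the function name `import_task_output`, the status strings and the way XML output is attached are assumptions made for this sketch, not evQueue's actual code.

```python
import xml.etree.ElementTree as ET


def import_task_output(raw_output, output_format, return_code):
    """Return (status, element) for a finished task.

    status is "ok" or "failed"; element is what would be stored in the
    workflow instance, or None when nothing usable was produced.
    """
    status = "ok" if return_code == 0 else "failed"
    if output_format == "xml":
        try:
            element = ET.fromstring(raw_output)
        except ET.ParseError:
            # Not well formed: the task fails regardless of its return value.
            return "failed", None
        return status, element
    # Text output: wrap it in an "output" node so the instance stays
    # well formed whatever characters the task printed.
    element = ET.Element("output")
    element.text = raw_output
    return status, element


status, node = import_task_output("a < b", "text", 0)
print(status, ET.tostring(node, encoding="unicode"))
# ok <output>a &lt; b</output>

print(import_task_output("<files><file/>", "xml", 0))
# ('failed', None)
```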
A task can refer to any of the values returned by its ancestors by using XPath expressions. These expressions can be used in any of the control structures that are explained below. These values can be used as input parameters or in loops and conditions.
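As an illustration of the idea, the following Python sketch reads values from an earlier task's XML output with an XPath expression. The XML layout, the element and attribute names, and the task name `list_files` are invented for this example and are not evQueue's actual instance format. Python's `xml.etree.ElementTree` only supports a subset of XPath, which is enough here.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow instance fragment: a first task has already run
# and its XML output has been imported under the task node.
INSTANCE = """
<workflow>
  <job>
    <task name="list_files">
      <output>
        <file>a.jpg</file>
        <file>b.jpg</file>
      </output>
    </task>
  </job>
</workflow>
"""

root = ET.fromstring(INSTANCE)

# Select every <file> produced by the "list_files" task.
expression = ".//task[@name='list_files']/output/file"
files = [node.text for node in root.findall(expression)]
print(files)  # ['a.jpg', 'b.jpg']

# The same values could drive a condition (run only if files exist)...
should_run = len(files) > 0

# ...or a loop (one resize task per file).
planned_tasks = [("resize", name) for name in files]
print(should_run, planned_tasks)
```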
Control structures
In order to create a workflow, you have to understand how control structures work:
- Task
The task is the smallest control unit. The task status is determined by the return code (0 meaning “OK” and any other value indicating an error). A task can be assigned a retry schedule. When a task with a retry schedule fails, it is retried according to that schedule. The task is considered failed if all the retries fail. If any of the retries is successful, the task is considered successful.
- Job
A job is a set of one or more tasks. The job is terminated when all tasks are terminated. The job is considered successful if all tasks are successful. The job is considered failed if one or more tasks have failed. Every task must be included in a job, even if the job contains only one task.
- Sub-job
A sub-job is a job that is executed only once the current job is terminated. If the current job contains more than one task, the sub-job is executed only when all tasks of the current job are terminated.
- Condition
A condition is an XPath expression that is evaluated to possibly skip the job. If the condition evaluates to “true” the job is executed, otherwise the job is skipped.
- Loop
A loop is an XPath expression that is evaluated to duplicate jobs or tasks. This can be used to apply the same operation to different files (the first job lists the files and the sub-job loops over them).
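The status rules for tasks and jobs can be summarized in a short Python sketch. It is a simplification under stated assumptions: tasks are modelled as callables returning an exit code, the delays of a retry schedule are left out (only the number of retries is kept), and the tasks of a job run one after another here rather than concurrently. The names `run_with_retries`, `run_job` and `make_flaky` are invented for the example.

```python
def run_with_retries(task, retries=0):
    """Run `task` once, then up to `retries` more times while it fails.

    Returns True as soon as one attempt returns 0, False if every
    attempt fails.
    """
    for _ in range(1 + retries):
        if task() == 0:
            return True
    return False


def run_job(tasks, retries=0):
    """A job is successful only if all of its tasks are successful."""
    results = [run_with_retries(task, retries) for task in tasks]
    return all(results)


def make_flaky(failures):
    """Return a task that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def task():
        if state["left"] > 0:
            state["left"] -= 1
            return 1  # non-zero return code: error
        return 0      # zero return code: OK

    return task


print(run_with_retries(make_flaky(2), retries=2))   # True
print(run_with_retries(make_flaky(3), retries=2))   # False
print(run_job([lambda: 0, lambda: 1]))              # False
```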
Workflow instances
Workflow instances (executing or terminated) can be found in “System state → Workflow instances”. Here you can see which workflows are currently executing as well as terminated workflows. You can also see the output and status of tasks (executing, terminated, skipped...).
Scheduled workflows
A scheduled workflow is a workflow that will be launched by evQueue at given time(s). The execution of these workflows is also event driven. There is no active polling to determine when to launch a workflow. Instead, the evQueue engine computes the time of the next execution and asks the kernel to be woken up at that time. This allows a very fine resolution for launching workflows. The smallest time unit is set to one second for practical reasons, although it could be reduced to about 100 ms.
A schedule is never triggered again while its workflow is still running. For example, if a workflow that runs for 60 minutes is scheduled every minute, it is launched again 1 minute after the previous execution ends (61 minutes after the previous launch in this case).
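The 61-minute figure can be reproduced with a small Python model. It rests on an assumption that the text does not spell out: the next launch is taken to be the first scheduled tick strictly after the previous run ends. The function names are invented, times are in seconds, and the model only computes launch times; it does not sleep or ask the kernel for a wake-up.

```python
def next_tick_after(t, period):
    """First multiple of `period` strictly greater than `t`."""
    return (t // period + 1) * period


def launch_times(period, duration, count):
    """First `count` launch times of a workflow scheduled every `period`
    seconds whose runs each last `duration` seconds.

    A new run is never started while the previous one is still running.
    """
    launches = []
    start = 0
    for _ in range(count):
        launches.append(start)
        end = start + duration
        start = next_tick_after(end, period)
    return launches


# Scheduled every minute, but each run takes 60 minutes:
print(launch_times(period=60, duration=3600, count=3))
# [0, 3660, 7320], that is, one launch every 61 minutes

# A run shorter than the period does not delay anything:
print(launch_times(period=60, duration=30, count=3))
# [0, 60, 120]
```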
Scheduled workflow status
The status of scheduled workflows is summarized in “System state → Scheduled workflows”. Here you can see all scheduled workflows with their last execution time and status, and their next execution time. The "eye" icon can be used to see all the instances that have been launched for this scheduled workflow.