2.1. HTCondor Glossary

ClassAd

ClassAd stands for “Classified Advertisement.” It’s a flexible and expressive language used by HTCondor for representing jobs, machines, and other resources. ClassAds are similar to attribute-value pairs, and they’re used to match jobs with appropriate resources.

Scheduler

The component in HTCondor that queues and manages users’ job submissions. It finds appropriate matches for the jobs in the queue using the ClassAd system.

Startd (Start Daemon)

This is the daemon running on the worker node that advertises the node’s resources and capabilities. It’s responsible for executing and managing jobs on the node.

Negotiator

Part of the central manager services, it is responsible for making match decisions, pairing submitted jobs with suitable execution resources.

Worker Nodes (or Execute Nodes)

The computational resources or machines where the jobs are executed. Each worker node runs a startd process.

Submit Node

The machine or location from which jobs are submitted into the HTCondor system. The scheduler runs on this node.

Central Manager

The central authority in an HTCondor pool that hosts the Negotiator and the Collector services. It’s essential for resource matchmaking and information gathering.

Collector

A service running on the Central Manager that gathers ClassAd information from other daemons (like startd and schedd) across the pool.

Condor Pool

A collection of machines working under a single HTCondor system. This includes the Central Manager, Worker Nodes, and potentially multiple submit nodes.

Universe

In HTCondor, a Universe is a specific execution environment for a job. Examples include the Vanilla Universe, Parallel Universe, and Docker Universe. The chosen Universe determines how a job is executed and what features are available to it.

Checkpointing

A feature that allows jobs to be paused and resumed. This is especially useful if a job gets preempted or if the machine it’s running on goes down.

Preemption

The act of suspending or stopping a currently running job to free up resources for another job that has higher priority or better matches the resources.

Rank

An expression in the ClassAd system that indicates a preference for a match. For example, a job might rank execution machines by available memory, favoring matches with more memory.

Requirements

Expressions in the ClassAd system that must be satisfied for a match to occur. If a job’s requirements do not match the attributes of a machine, then the job will not be sent to that machine.

Dedicated Scheduling

In the HTCondor Parallel Universe, “dedicated” scheduling refers to the process by which certain compute nodes (machines) are reserved exclusively for running parallel jobs. Such a setup ensures that parallel jobs, like MPI jobs, have consistent and predictable communication between the nodes without interference from other non-parallel jobs. Dedicated scheduling is advantageous for jobs that require tight inter-process communication or a specific arrangement of nodes. When machines are part of the dedicated scheduler, they won’t execute other tasks outside of the designated parallel jobs.