Docker Driver
Name: docker
The docker
driver provides a first-class Docker workflow on Nomad. The Docker
driver handles downloading containers, mapping ports, and starting, watching,
and cleaning up after containers.
Note: If you are using Docker Desktop for Windows or macOS, please check our FAQ.
Task Configuration
task "webservice" {
driver = "docker"
config {
image = "redis:7"
labels {
group = "webservice-cache"
}
}
}
The docker driver supports the following configuration in the job spec. Only image is required.
image
- The Docker image to run. The image may include a tag or custom URL and should include https:// if required. By default it will be fetched from Docker Hub. If the tag is omitted or equal to latest, the driver will always try to pull the image. If the image to be pulled exists in a registry that requires authentication, credentials must be provided to Nomad. Please see the Authentication section.
config {
  image = "https://hub.docker.internal/redis:7"
}
image_pull_timeout
- (Optional) A time duration that controls how long Nomad will wait before cancelling an in-progress pull of the Docker image as specified in image. Defaults to "5m".
args
- (Optional) A list of arguments to the optional command. If no command is specified, the arguments are passed directly to the container. References to environment variables or any interpretable Nomad variables will be interpreted before launching the task. For example:
config {
  args = [
    "-bind", "${NOMAD_PORT_http}",
    "${nomad.datacenter}",
    "${MY_ENV}",
    "${meta.foo}",
  ]
}
auth
- (Optional) Provide authentication for a private registry (see below).
auth_soft_fail (bool: false)
- Don't fail the task on an auth failure. Attempt to continue without auth. If the Nomad client configuration has an auth.helper block, the helper will be tried for all images, including public images. If you mix private and public images, you will need to include auth_soft_fail=true in every job using a public image.
command
- (Optional) The command to run when starting the container.
config {
  command = "my-command"
}
container_exists_attempts
- (Optional) A number of attempts to be made to purge a container if during task creation Nomad encounters an existing one in non-running state for the same task. Defaults to 5.
dns_search_domains
- (Optional) A list of DNS search domains for the container to use. If you are using bridge networking mode with a network block in the task group, you must set all DNS options in the network.dns block instead.
dns_options
- (Optional) A list of DNS options for the container to use. If you are using bridge networking mode with a network block in the task group, you must set all DNS options in the network.dns block instead.
dns_servers
- (Optional) A list of DNS servers for the container to use (e.g. ["8.8.8.8", "8.8.4.4"]). Requires Docker v1.10 or greater. If you are using bridge networking mode with a network block in the task group, you must set all DNS options in the network.dns block instead.
entrypoint
- (Optional) A string list overriding the image's entrypoint.
extra_hosts
- (Optional) A list of hosts, given as host:IP, to be added to /etc/hosts. This option may not work as expected in bridge network mode when there is more than one task within the same group. Refer to the upgrade guide for more information.
force_pull
- (Optional) true or false (default). Always pull the most recent image instead of using an existing local image. Should be set to true if repository tags are mutable. If the image's tag is latest or omitted, the image will always be pulled regardless of this setting.
group_add
- (Optional) A list of supplementary groups to be applied to the container user.
healthchecks
- (Optional) A configuration block for controlling how the docker driver manages HEALTHCHECK directives built into the container. Set healthchecks.disable to disable any built-in healthcheck.
config {
  healthchecks {
    disable = true
  }
}
hostname
- (Optional) The hostname to assign to the container. When launching more than one of a task (using count) with this option set, every container the task starts will have the same hostname.
init
- (Optional) true or false (default). Enable init (tini) system when launching your container. When enabled, an init process will be used as the PID1 in the container. Specifying an init process ensures the usual responsibilities of an init system, such as reaping zombie processes, are performed inside the created container.
The default init process used is the first docker-init executable found in the system path of the Docker daemon process. This docker-init binary, included in the default installation, is backed by tini.
interactive
- (Optional) true or false (default). Keep STDIN open on the container.
isolation
- (Optional) Specifies Windows isolation mode: "hyperv" or "process". Defaults to "hyperv".
sysctl
- (Optional) A key-value map of sysctl configurations to set to the containers on start.
config {
  sysctl = {
    "net.core.somaxconn" = "16384"
  }
}
ulimit
- (Optional) A key-value map of ulimit configurations to set to the containers on start.
config {
  ulimit {
    nproc = "4242"
    nofile = "2048:4096"
  }
}
privileged
- (Optional) true or false (default). Privileged mode gives the container access to devices on the host. Note that this also requires the Nomad agent and Docker daemon to be configured to allow privileged containers.
ipc_mode
- (Optional) The IPC mode to be used for the container. The default is none for a private IPC namespace. Other values are host for sharing the host IPC namespace or the name or id of an existing container. Note that it is not possible to refer to Docker containers started by Nomad since their names are not known in advance. Note that setting this option also requires the Nomad agent to be configured to allow privileged containers.
ipv4_address
- (Optional) The IPv4 address to be used for the container when using user defined networks. Requires Docker 1.13 or greater.
ipv6_address
- (Optional) The IPv6 address to be used for the container when using user defined networks. Requires Docker 1.13 or greater.
labels
- (Optional) A key-value map of labels to set to the containers on start.
config {
  labels {
    foo = "bar"
    zip = "zap"
  }
}
load
- (Optional) Load an image from a tar archive file instead of from a remote repository. Equivalent to the docker load -i <filename> command. If you're using an artifact block to fetch the archive file, you'll need to ensure that Nomad keeps the archive intact after download.
artifact {
  source = "http://path.to/redis.tar"
  options {
    archive = false
  }
}
config {
  load = "redis.tar"
  image = "redis"
}
logging
- (Optional) A key-value map of Docker logging options. Defaults to json-file with log rotation (max-file=2 and max-size=2m).
config {
  logging {
    type = "fluentd"
    config {
      fluentd-address = "localhost:24224"
      tag = "your_tag"
    }
  }
}
mac_address
- (Optional) The MAC address for the container to use (e.g. "02:68:b3:29:da:98").
memory_hard_limit
- (Optional) The maximum allowable amount of memory used (megabytes) by the container. If set, the memory parameter of the task resource configuration becomes a soft limit passed to the docker driver as --memory-reservation, and memory_hard_limit is passed as the --memory hard limit. When the host is under memory pressure, the behavior of soft limit activation is governed by the kernel.
network_aliases
- (Optional) A list of network-scoped aliases, which provide a way for a container to be discovered by an alternate name by any other container within the scope of a particular network. Network-scoped aliases are supported only for containers in user defined networks.
config {
  network_mode = "user-network"
  network_aliases = [
    "${NOMAD_TASK_NAME}",
    "${NOMAD_TASK_NAME}-${NOMAD_ALLOC_INDEX}"
  ]
}
network_mode
- (Optional) The network mode to be used for the container. In order to support userspace networking plugins in Docker 1.9 this accepts any value. The default is bridge for all operating systems but Windows, which defaults to nat. Other networking modes may not work without additional configuration on the host (which is outside the scope of Nomad). Valid values pre-docker 1.9 are default, bridge, host, none, or container:name.
The default network_mode for tasks that use group networking in bridge mode will be container:<name>, where the name is the container name of the parent container used to share network namespaces between tasks. If you set the group network.mode to "bridge" you should not set this Docker network_mode config, otherwise the container will be unable to reach other containers in the task group. This will also prevent Connect-enabled tasks from reaching the Envoy sidecar proxy. You must also set any DNS options in the network.dns block and not in the task configuration.
If you are in the process of migrating from the default Docker network to group-wide bridge networking, you may encounter issues preventing your containers from reaching networks outside of the bridge interface on systems with firewalld enabled. This behavior is often caused by the CNI plugin not registering the group network as trusted and can be resolved as described in the network block documentation.
oom_score_adj
- (Optional) A positive integer to indicate the likelihood of the task being OOM killed (valid only for Linux). Defaults to 0.
pid_mode
- (Optional) host or not set (default). Set to host to share the PID namespace with the host. Note that this also requires the Nomad agent to be configured to allow privileged containers. See below for more details.
ports
- (Optional) A list of port labels to map into the container (see below).
port_map
- (Optional) Deprecated. A key-value map of port labels (see below).
security_opt
- (Optional) A list of string flags to pass directly to --security-opt. For example:
config {
  security_opt = [
    "credentialspec=file://gmsaUser.json",
  ]
}
shm_size
- (Optional) The size (bytes) of /dev/shm for the container.
storage_opt
- (Optional) A key-value map of storage options set to the containers on start. This overrides the host dockerd configuration. For example:
config {
  storage_opt = {
    size = "40G"
  }
}
tty
- (Optional) true or false (default). Allocate a pseudo-TTY for the container.
uts_mode
- (Optional) host or not set (default). Set to host to share the UTS namespace with the host. Note that this also requires the Nomad agent to be configured to allow privileged containers.
userns_mode
- (Optional) host or not set (default). Set to host to use the host's user namespace (effectively disabling user namespacing) when user namespace remapping is enabled on the docker daemon. This field has no effect if the docker daemon does not have user namespace remapping enabled.
volumes
- (Optional) A list of host_path:container_path strings to bind host paths to container paths. Mounting host paths outside of the allocation working directory is prevented by default and limits volumes to directories that exist inside the allocation working directory. You can allow mounting host paths outside of the allocation working directory on individual clients by setting the docker.volumes.enabled option to true in the client's configuration. We recommend using mount if you wish to have more control over volume definitions.
config {
  volumes = [
    # Use absolute paths to mount arbitrary paths on the host
    "/path/on/host:/path/in/container",

    # Use relative paths to rebind paths already in the allocation dir
    "relative/to/task:/also/in/container"
  ]
}
volume_driver
- (Optional) The name of the volume driver used to mount volumes. Must be used along with volumes. If volume_driver is omitted, then relative paths will be mounted from inside the allocation dir. If a "local" or other driver is used, then they may be named volumes instead. If docker.volumes.enabled is false then volume drivers and paths outside the allocation directory are disallowed.
config {
  volumes = [
    # Use named volume created outside nomad.
    "name-of-the-volume:/path/in/container"
  ]
  # Name of the Docker Volume Driver used by the container
  volume_driver = "pxd"
}
work_dir
- (Optional) The working directory inside the container.
mount
- Since 1.0.1 (Optional) Specify a mount to be mounted into the container. Volume, bind, and tmpfs type mounts are supported. May be specified multiple times.
config {
  # sample volume mount
  mount {
    type = "volume"
    target = "/path/in/container"
    source = "name-of-volume"
    readonly = false
    volume_options {
      no_copy = false
      labels {
        foo = "bar"
      }
      driver_config {
        name = "pxd"
        options {
          foo = "bar"
        }
      }
    }
  }

  # sample bind mount
  mount {
    type = "bind"
    target = "/path/in/container"
    source = "/path/in/host"
    readonly = false
    bind_options {
      propagation = "rshared"
    }
  }

  # sample tmpfs mount
  mount {
    type = "tmpfs"
    target = "/path/in/container"
    readonly = false
    tmpfs_options {
      size = 100000 # size in bytes
    }
  }
}
mounts
- (deprecated: Replaced by mount in 1.0.1) (Optional) A list of mounts to be mounted into the container. Volume, bind, and tmpfs type mounts are supported.
config {
  mounts = [
    # sample volume mount
    {
      type = "volume"
      target = "/path/in/container"
      source = "name-of-volume"
      readonly = false
      volume_options = {
        no_copy = false
        labels = {
          foo = "bar"
        }
        driver_config = {
          name = "pxd"
          options = {
            foo = "bar"
          }
        }
      }
    },
    # sample bind mount
    {
      type = "bind"
      target = "/path/in/container"
      source = "/path/in/host"
      readonly = false
      bind_options = {
        propagation = "rshared"
      }
    },
    # sample tmpfs mount
    {
      type = "tmpfs"
      target = "/path/in/container"
      readonly = false
      tmpfs_options = {
        size = 100000 # size in bytes
      }
    }
  ]
}
devices
- (Optional) A list of devices to be exposed to the container. host_path is the only required field. By default, the container will be able to read, write and mknod these devices. Use the optional cgroup_permissions field to restrict permissions.
config {
  devices = [
    {
      host_path = "/dev/sda1"
      container_path = "/dev/xvdc"
      cgroup_permissions = "r"
    },
    {
      host_path = "/dev/sda2"
      container_path = "/dev/xvdd"
    }
  ]
}
cap_add
- (Optional) A list of Linux capabilities as strings to pass directly to --cap-add. Effective capabilities (computed from cap_add and cap_drop) must be a subset of the allowed capabilities configured with the allow_caps plugin option key in the client node's configuration. Note that "all" is not permitted here if the allow_caps field in the driver configuration doesn't also allow all capabilities. For example:
config {
  cap_add = ["net_raw", "sys_time"]
}
cap_drop
- (Optional) A list of Linux capabilities as strings to pass directly to --cap-drop. Effective capabilities (computed from cap_add and cap_drop) must be a subset of the allowed capabilities configured with the allow_caps plugin option key in the client node's configuration. For example:
config {
  cap_drop = ["mknod"]
}
cpu_hard_limit
- (Optional) true or false (default). Use hard CPU limiting instead of soft limiting. By default this is false which means soft limiting is used and containers are able to burst above their CPU limit when there is idle capacity.
cpu_cfs_period
- (Optional) An integer value that specifies the duration in microseconds of the period during which the CPU usage quota is measured. The default is 100000 (0.1 second) and the maximum allowed value is 1000000 (1 second). See here for more details.
advertise_ipv6_address
- (Optional) true or false (default). Use the container's IPv6 address (GlobalIPv6Address in Docker) when registering services and checks. See IPv6 Docker containers for details.
readonly_rootfs
- (Optional) true or false (default). Mount the container's filesystem as read only.
runtime
- (Optional) A string representing a configured runtime to pass to docker. This is equivalent to the --runtime argument in the docker CLI. For example, to use gVisor:
config {
  # gVisor runtime is runsc
  runtime = "runsc"
}
pids_limit
- (Optional) An integer value that specifies the pid limit for the container. Defaults to unlimited.
Additionally, the docker driver supports customization of the container's user through the task's user
option.
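Several of the options above have no example of their own. The following is a minimal sketch that combines a few of them, not a recommended configuration. The task name, host entry, timeout, memory sizes, and user are placeholder assumptions chosen only for illustration.

```hcl
task "cache" {
  driver = "docker"

  # Task-level user option, applied to the container's user.
  # "redis" is a placeholder; use a user that exists in your image.
  user = "redis"

  config {
    image = "redis:7"

    # Pull on every start, useful when repository tags are mutable.
    force_pull = true

    # Wait longer than the default "5m" before cancelling a slow pull.
    image_pull_timeout = "10m"

    # host:IP pairs added to /etc/hosts (hypothetical host and address).
    extra_hosts = ["db.internal:10.0.0.5"]

    # Hard memory limit in megabytes; resources.memory below then acts
    # as the soft limit.
    memory_hard_limit = 512
  }

  resources {
    memory = 256
  }
}
```

With this sketch the container can use up to 512 MB, while 256 MB is what the scheduler reserves for the task.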
Container Name
Nomad creates a container after pulling an image. Containers are named {taskName}-{allocId}. This is necessary in order to place more than one container from the same task on a host (e.g. with count > 1). This also means that each container's name is unique across the cluster.
This is not configurable.
Authentication
If you want to pull from a private repo (for example on dockerhub or quay.io), you will need to specify credentials in your job via:
- the auth option in the task config.
- by storing explicit repository credentials or by specifying Docker credHelpers in a file and setting the auth config value on the client in the plugin options.
- by specifying an auth helper on the client in the plugin options.
The auth object supports the following keys:
username
- (Optional) The account username.
password
- (Optional) The account password.
email
- (Optional) The account email.
server_address
- (Optional) The server domain/IP without the protocol. Docker Hub is used by default.
Example task-config:
task "example" {
driver = "docker"
config {
image = "secret/service"
auth {
username = "dockerhub_user"
password = "dockerhub_password"
}
}
}
Example docker-config, using two helper scripts in $PATH, "docker-credential-ecr-login" and "docker-credential-vault":
{
"auths": {
"internal.repo": {
"auth": "`echo -n '<username>:<password>' | base64 -w0`"
}
},
"credHelpers": {
"<XYZ>.dkr.ecr.<region>.amazonaws.com": "ecr-login"
},
"credsStore": "secretservice"
}
Example agent configuration, using a helper script "docker-credential-ecr-login" in $PATH:
client {
enabled = true
}
plugin "docker" {
config {
auth {
# Nomad will prepend "docker-credential-" to the helper value and call
# that script name.
helper = "ecr-login"
}
}
}
Be Careful! At this time these credentials are stored in Nomad in plain text. Secrets management will be added in a later release.
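When a client is configured with an auth helper as above, the helper is tried for every image, including public ones. The sketch below shows how a job that uses a public image might opt out of failing on an auth error. The task name is a placeholder assumption.

```hcl
task "public-image" {
  driver = "docker"

  config {
    # A public image that needs no registry credentials.
    image = "redis:7"

    # Continue without auth if the client's auth helper fails for this image.
    auth_soft_fail = true
  }
}
```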
Insecure Registries
In order to pull images from a registry without TLS, you must configure the
Docker daemon's insecure-registries
flag. No additional Nomad client
configuration is required. You should only allow insecure registries for
registries running locally on the client or when the communication to the
registry is otherwise encrypted. List the insecure-registries
flag in the
dockerd
configuration file.
{
"insecure-registries": ["example.local:5000"]
}
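Once the daemon lists the registry as insecure, a task can reference an image in it by registry address. The sketch below assumes an image named redis:7 has already been pushed to the example.local:5000 registry from the configuration above; the task name and image are placeholders.

```hcl
task "local-registry" {
  driver = "docker"

  config {
    # Registry host and port match the insecure-registries entry above.
    image = "example.local:5000/redis:7"
  }
}
```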
Networking
Docker supports a variety of networking configurations, including using host
interfaces, SDNs, etc. Nomad uses bridged
networking by default, like Docker.
You can specify other networking options, including custom networking plugins in Docker 1.9. You may need to perform additional configuration on the host in order to make these work. This additional configuration is outside the scope of Nomad.
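For group-wide bridge networking, the network_mode option description above says not to set the Docker network_mode and to move DNS settings into the network.dns block. The following is a minimal sketch of that arrangement. The group name, task name, image, container port, and DNS values are illustrative assumptions.

```hcl
group "web" {
  network {
    mode = "bridge"

    port "http" {
      to = 8080 # hypothetical port the application listens on
    }

    # DNS settings live here rather than in the task's docker config.
    dns {
      servers  = ["8.8.8.8", "8.8.4.4"]
      searches = ["example.internal"]
      options  = ["ndots:2"]
    }
  }

  task "api" {
    driver = "docker"

    config {
      image = "example/api:1.0" # hypothetical image
      ports = ["http"]
      # network_mode, dns_servers, dns_search_domains, and dns_options
      # are deliberately left unset.
    }
  }
}
```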
Allocating Ports
You can allocate ports to your task using the port syntax described on the networking page. Here is a recap:
group {
network {
port "http" {}
port "https" {}
}
task "example" {
driver = "docker"
config {
ports = ["http", "https"]
}
}
}
Forwarding and Exposing Ports
A Docker container typically specifies which port a service will listen on by specifying the EXPOSE directive in the Dockerfile.
Because dynamic ports will not match the ports exposed in your Dockerfile, Nomad will automatically expose any ports specified in the ports field.
These ports will be identified via environment variables. For example:
group {
network {
port "http" {}
}
task "api" {
driver = "docker"
config {
ports = ["http"]
}
}
}
If Nomad allocates port 23332 to your api task for http, 23332 will be automatically exposed and forwarded to your container, and the driver will set an environment variable NOMAD_PORT_http with the value 23332 that you can read inside your container.
This provides an easy way to use the host networking option for better performance.
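As a rough sketch of the host networking approach, the task below passes the allocated port to the application through args, so the process binds directly to the dynamic port on the host. The image and the -bind flag are hypothetical and stand in for however your application accepts a port.

```hcl
group "api" {
  network {
    port "http" {}
  }

  task "api" {
    driver = "docker"

    config {
      image        = "example/api:1.0" # hypothetical image
      network_mode = "host"

      # NOMAD_PORT_http is interpolated before the task starts.
      args = ["-bind", "${NOMAD_PORT_http}"]
    }
  }
}
```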
Using the Port Map
If you prefer to use the traditional port-mapping method, you can specify the to field in the port configuration. It looks like this:
group "example" {
network {
port "redis" { to = 6379 }
}
task "example" {
driver = "docker"
config {
image = "redis"
ports = ["redis"]
}
}
}
If Nomad allocates port 23332 to your allocation, the Docker driver will automatically set up the port mapping from 23332 on the host to 6379 in your container, so it will just work.
Note that by default this only works with bridged
networking mode. It may
also work with custom networking plugins which implement the same API for
expose and port forwarding.
Deprecated port_map Syntax
Up until Nomad 0.12, ports could be specified in a task's resource block and set using the docker port_map field. As more features have been added to the group network resource allocation, task-based network resources are deprecated. With them, the port_map field is also deprecated and can only be used with task network resources.
Users should migrate their jobs to define ports in the group network block and specify which ports a task maps with the ports field.
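A migration might look roughly like the sketch below. The commented-out block is a reconstruction of the deprecated form based on the description above, so treat its exact shape as an assumption; the group-level form mirrors the port mapping example earlier in this section.

```hcl
# Deprecated form (reconstructed for comparison, not verified against
# any specific Nomad release):
#
# task "example" {
#   driver = "docker"
#   config {
#     image = "redis"
#     port_map {
#       redis = 6379
#     }
#   }
#   resources {
#     network {
#       port "redis" {}
#     }
#   }
# }

# Group network form: the port and its container-side target move to
# the group, and the task lists the labels it uses.
group "example" {
  network {
    port "redis" { to = 6379 }
  }

  task "example" {
    driver = "docker"

    config {
      image = "redis"
      ports = ["redis"]
    }
  }
}
```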
Advertising Container IPs
When using network plugins like weave that assign containers a routable IP address, that address will automatically be used in any service advertisements for the task. You may override what address is advertised by using the address_mode parameter on a service. See service for details.
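As an illustration only, the sketch below sets address_mode explicitly on a service. It assumes a Docker network named weave already exists on the host and that the application listens on port 8080 inside the container; the image and service name are hypothetical. Check the service documentation for the address_mode values your Nomad version accepts.

```hcl
task "api" {
  driver = "docker"

  config {
    image        = "example/api:1.0" # hypothetical image
    network_mode = "weave"           # assumes this network exists on the host
  }

  service {
    name = "api"

    # Advertise the address assigned by the Docker network plugin.
    address_mode = "driver"

    # Assumed container port; used as-is with the driver address.
    port = 8080
  }
}
```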
Networking Protocols
The Docker driver configures ports on both the tcp
and udp
protocols.
This is not configurable.
Other Networking Modes
Some networking modes like container
or none
will require coordination
outside of Nomad. First-class support for these options may be improved later
through Nomad plugins or dynamic job configuration.
Capabilities
The docker
driver implements the following capabilities.
Feature | Implementation |
---|---|
nomad alloc signal | true |
nomad alloc exec | true |
filesystem isolation | image |
network isolation | host, group, task |
volume mounting | all |
Client Requirements
Nomad requires Docker to be installed and running on the host alongside the Nomad agent.
By default Nomad communicates with the Docker daemon using the daemon's Unix socket. Nomad will need to be able to read/write to this socket. If you do not run Nomad as root, make sure you add the Nomad user to the Docker group so Nomad can communicate with the Docker daemon.
For example, on Ubuntu you can use the usermod
command to add the nomad
user to the docker
group so you can run Nomad without root:
$ sudo usermod -G docker -a nomad
Nomad clients manage a cpuset cgroup for each task to reserve or share CPU cores. In order for Nomad to be compatible with Docker's own cgroups management, it must write to cgroups owned by Docker, which requires running as root. If Nomad is not running as root, CPU isolation and NUMA-aware scheduling will not function correctly for workloads with resources.cores, including workloads using task drivers other than docker on the same host.
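For reference, a workload that relies on this behavior is one that reserves whole cores. A minimal sketch, with a placeholder task name and sizes:

```hcl
task "pinned" {
  driver = "docker"

  config {
    image = "redis:7"
  }

  resources {
    # Reserves two whole CPU cores; relies on Nomad managing cpuset
    # cgroups, which needs the client to run as root as described above.
    cores  = 2
    memory = 256
  }
}
```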
For the best performance and security features you should use recent versions of the Linux Kernel and Docker daemon.
If you would like to change any of the options related to the docker
driver
on a Nomad client, you can modify them with the plugin block
syntax. Below is an example of a configuration (many of the values are the
default). See the next section for more information on the options.
plugin "docker" {
config {
endpoint = "unix:///var/run/docker.sock"
auth {
config = "/etc/docker-auth.json"
helper = "ecr-login"
}
tls {
cert = "/etc/nomad/nomad.pub"
key = "/etc/nomad/nomad.pem"
ca = "/etc/nomad/nomad.cert"
}
extra_labels = ["job_name", "job_id", "task_group_name", "task_name", "namespace", "node_name", "node_id"]
gc {
image = true
image_delay = "3m"
container = true
dangling_containers {
enabled = true
dry_run = false
period = "5m"
creation_grace = "5m"
}
}
volumes {
enabled = true
selinuxlabel = "z"
}
allow_privileged = false
allow_caps = ["chown", "net_raw"]
}
}
Plugin Options
endpoint
- If using a non-standard socket, HTTP or another location, or if TLS is being used, endpoint must be set. If unset, Nomad will attempt to instantiate a Docker client using the DOCKER_HOST environment variable and then fall back to the default listen address for the given operating system. Defaults to unix:///var/run/docker.sock on Unix platforms and npipe:////./pipe/docker_engine for Windows.
allow_privileged
- Defaults to false. Changing this to true will allow containers to use privileged mode, which gives the containers full access to the host's devices. Note that you must set a similar setting on the Docker daemon for this to work.
pull_activity_timeout
- Defaults to 2m. If Nomad receives no communication from the Docker engine during an image pull within this timeframe, Nomad will time out the request that initiated the pull command. (Minimum of 1m)
pids_limit
- Defaults to unlimited (0). An integer value that specifies the pid limit for all the Docker containers running on that Nomad client. You can override this limit by setting pids_limit in your task config. If this value is greater than 0, your task pids_limit must be less than or equal to the value defined here.
allow_caps
- A list of allowed Linux capabilities. Defaults to
["audit_write", "chown", "dac_override", "fowner", "fsetid", "kill", "mknod",
"net_bind_service", "setfcap", "setgid", "setpcap", "setuid", "sys_chroot"]
which is the same list of capabilities allowed by docker by default (without NET_RAW). Allows the operator to control which capabilities can be obtained by tasks using cap_add and cap_drop options. Supports the value "all" as a shortcut for allow-listing all capabilities supported by the operating system. Note that due to a limitation in Docker, tasks running as non-root users cannot expand the capabilities set beyond the default. They can only have their capabilities reduced.
Warning: Allowing more capabilities beyond the default may lead to undesirable consequences, including untrusted tasks being able to compromise the host system.
allow_runtimes
- Defaults to ["runc", "nvidia"]. A list of the allowed docker runtimes a task may use.
auth block:
  config
  - Allows an operator to specify a JSON file which is in the dockercfg format containing authentication information for a private registry, from either (in order) auths, credsStore or credHelpers.
  helper
  - Allows an operator to specify a credsStore-like script on $PATH to lookup authentication information from external sources. The script's name must begin with docker-credential- and this option should include only the basename of the script, not the path.
  If you set an auth helper, it will be tried for all images, including public images. If you mix private and public images, you will need to include auth_soft_fail=true in every job using a public image.
tls block:
  cert
  - Path to the server's certificate file (.pem). Specify this along with key and ca to use a TLS client to connect to the docker daemon. endpoint must also be specified or this setting will be ignored.
  key
  - Path to the client's private key (.pem). Specify this along with cert and ca to use a TLS client to connect to the docker daemon. endpoint must also be specified or this setting will be ignored.
  ca
  - Path to the server's CA file (.pem). Specify this along with cert and key to use a TLS client to connect to the docker daemon. endpoint must also be specified or this setting will be ignored.
disable_log_collection
- Defaults to false. Setting this to true will disable Nomad's log collection for Docker tasks. If you don't rely on Nomad's log capabilities and exclusively use host-based log aggregation, you may consider this option to avoid the overhead of Nomad log collection.
extra_labels
- Extra labels to add to Docker containers. Available options are job_name, job_id, task_group_name, task_name, namespace, node_name, node_id. Globs are supported (e.g. task*).
logging block:
  type
  - Defaults to "json-file". Specifies the logging driver docker should use for all containers Nomad starts. Note that for older versions of Docker, only json-file or journald will allow Nomad to read the driver's logs via the Docker API; choosing another driver will prevent commands such as nomad alloc logs from functioning.
  config
  - Defaults to { max-file = "2", max-size = "2m" }. This option can also be used to pass further configuration to the logging driver.
gc block:
  image
  - Defaults to true. Changing this to false will prevent Nomad from removing images from stopped tasks.
  image_delay
  - A time duration, as defined here, that defaults to 3m. The delay controls how long Nomad will wait between an image being unused and deleting it. If a task is received that uses the same image within the delay, the image will be reused. If an image is referenced by more than one tag, image_delay may not work correctly.
  container
  - Defaults to true. This option can be used to disable Nomad from removing a container when the task exits. Under a name conflict, Nomad may still remove the dead container.
  dangling_containers block for controlling dangling container detection and cleanup:
    enabled
    - Defaults to true. Enables dangling container handling.
    dry_run
    - Defaults to false. Only log dangling containers without cleaning them up.
    period
    - Defaults to "5m". A time duration that controls interval between Nomad scans for dangling containers.
    creation_grace
    - Defaults to "5m". Grace period after a container is created during which the GC ignores it. Only used to prevent the GC from removing newly created containers before they are registered with the GC. Should not need adjusting higher but may be adjusted lower to GC more aggressively.
volumes block:
  enabled
  - Defaults to false. Allows tasks to bind host paths (volumes) inside their container and use volume drivers (volume_driver). Binding relative paths is always allowed and will be resolved relative to the allocation's directory.
  selinuxlabel
  - Allows the operator to set a SELinux label to the allocation and task local bind-mounts to containers. If used with docker.volumes.enabled set to false, the labels will still be applied to the standard binds in the container.
infra_image
- This is the Docker image to use when creating the parent container necessary when sharing network namespaces between tasks. Defaults to gcr.io/google_containers/pause-<goarch>:3.1. The image will only be pulled from the container registry if its tag is latest or the image doesn't yet exist locally.
infra_image_pull_timeout
- A time duration that controls how long Nomad will wait before cancelling an in-progress pull of the Docker image as specified in infra_image. Defaults to "5m".
windows_allow_insecure_container_admin
- Indicates that on Windows, docker checks the task.user field or, if unset, the container image manifest after pulling the container, to see if it's running as ContainerAdmin. If so, exits with an error unless the task config has privileged=true. Defaults to false.
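Some of the plugin options above do not appear in the earlier example configuration. The sketch below shows how they might be set together. All values are illustrative assumptions rather than recommendations, and the nested shape of the logging block follows the task-level logging example earlier in this page.

```hcl
plugin "docker" {
  config {
    # Time out a pull after 5 minutes without activity (minimum is 1m).
    pull_activity_timeout = "5m"

    # Upper bound for pids_limit in any task config on this client.
    pids_limit = 4096

    # Allow the default runc runtime plus gVisor's runsc.
    allow_runtimes = ["runc", "runsc"]

    # Keep Nomad's own log collection enabled (the default).
    disable_log_collection = false

    # Default logging driver and rotation for containers Nomad starts.
    logging {
      type = "json-file"
      config {
        max-file = "3"
        max-size = "10m"
      }
    }

    # Allow more time for pulling the infra (pause) image.
    infra_image_pull_timeout = "10m"
  }
}
```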
Client Configuration
Note: client configuration options will soon be deprecated. Please use plugin options instead. See the plugin block documentation for more information.
The docker
driver has the following client configuration
options:
docker.endpoint
- If using a non-standard socket, HTTP or another location, or if TLS is being used,docker.endpoint
must be set. If unset, Nomad will attempt to instantiate a Docker client using theDOCKER_HOST
environment variable and then fall back to the default listen address for the given operating system. Defaults tounix:///var/run/docker.sock
on Unix platforms andnpipe:////./pipe/docker_engine
for Windows.docker.auth.config
- Allows an operator to specify a JSON file which is in the dockercfg format containing authentication information for a private registry, from either (in order)auths
,credsStore
orcredHelpers
.docker.auth.helper
- Allows an operator to specify a credsStore -like script on \$PATH to lookup authentication information from external sources. The script's name must begin withdocker-credential-
and this option should include only the basename of the script, not the path.docker.tls.cert
- Path to the server's certificate file (.pem). Specify this along with docker.tls.key and docker.tls.ca to use a TLS client to connect to the Docker daemon. docker.endpoint must also be specified or this setting will be ignored.
docker.tls.key
- Path to the client's private key (.pem). Specify this along with docker.tls.cert and docker.tls.ca to use a TLS client to connect to the Docker daemon. docker.endpoint must also be specified or this setting will be ignored.
docker.tls.ca
- Path to the server's CA file (.pem). Specify this along with docker.tls.cert and docker.tls.key to use a TLS client to connect to the Docker daemon. docker.endpoint must also be specified or this setting will be ignored.
docker.cleanup.image
- Defaults to true. Changing this to false will prevent Nomad from removing images from stopped tasks.
docker.cleanup.image.delay
- A time duration that defaults to 3m. The delay controls how long Nomad will wait between an image being unused and deleting it. If a task is received that uses the same image within the delay, the image will be reused.
docker.volumes.enabled
- Defaults to false. Allows tasks to bind host paths (volumes) inside their container and use volume drivers (volume_driver). Binding relative paths is always allowed and will be resolved relative to the allocation's directory.
docker.volumes.selinuxlabel
- Allows the operator to set a SELinux label to the allocation and task local bind-mounts to containers. If used with docker.volumes.enabled set to false, the labels will still be applied to the standard binds in the container.
docker.privileged.enabled
- Defaults to false. Changing this to true will allow containers to use privileged mode, which gives the containers full access to the host's devices. Note that you must set a similar setting on the Docker daemon for this to work.
docker.caps.allowlist
- A list of allowed Linux capabilities. Defaults to "CHOWN,DAC_OVERRIDE,FSETID,FOWNER,MKNOD,NET_RAW,SETGID,SETUID,SETFCAP,SETPCAP,NET_BIND_SERVICE,SYS_CHROOT,KILL,AUDIT_WRITE", which is the list of capabilities allowed by Docker by default. Allows the operator to control which capabilities can be obtained by tasks using the cap_add and cap_drop options. Supports the value "ALL" as a shortcut for allowlisting all capabilities.
docker.cleanup.container
- Defaults to true. This option can be used to prevent Nomad from removing a container when the task exits. Under a name conflict, Nomad may still remove the dead container.
docker.nvidia_runtime
- Defaults to nvidia. This option allows operators to select the runtime that should be used in order to expose Nvidia GPUs to the container.
Note: When testing or using the -dev flag you can use DOCKER_HOST, DOCKER_TLS_VERIFY, and DOCKER_CERT_PATH to customize Nomad's behavior. If docker.endpoint is set Nomad will only read client configuration from the config file.
An example is given below:
client {
  options {
    "docker.cleanup.image" = "false"
  }
}
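A slightly fuller sketch of the same options block shows how the endpoint and TLS settings described above fit together. The endpoint address and file paths are hypothetical placeholders. Only the option names come from this page.

```hcl
client {
  options {
    # Hypothetical remote daemon address; docker.endpoint must be set
    # or the docker.tls.* settings are ignored.
    "docker.endpoint" = "tcp://docker.example.internal:2376"

    # Hypothetical file locations; all three are specified together.
    "docker.tls.cert" = "/etc/nomad.d/tls/docker-cert.pem"
    "docker.tls.key"  = "/etc/nomad.d/tls/docker-key.pem"
    "docker.tls.ca"   = "/etc/nomad.d/tls/docker-ca.pem"

    # Allow tasks to bind host paths and use volume drivers.
    "docker.volumes.enabled" = "true"
  }
}
```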
Client Attributes
The docker driver will set the following client attributes:
driver.docker
- This will be set to "1", indicating the driver is available.
driver.docker.bridge_ip
- The IP of the Docker bridge network if one exists.
driver.docker.version
- This will be set to the version of the Docker server.
Here is an example of using these properties in a job file:
job "docs" {
  # Require docker version higher than 1.2.
  # The "version" operator compares version strings rather than plain text.
  constraint {
    attribute = "${attr.driver.docker.version}"
    operator  = "version"
    value     = "> 1.2"
  }
}
Resource Isolation
CPU
Nomad limits containers' CPU based on CPU shares. CPU shares allow containers
to burst past their CPU limits. CPU limits will only be imposed when there is
contention for resources. When the host is under load your process may be
throttled to stabilize QoS depending on how many shares it has. You can see how
many CPU shares are available to your process by reading NOMAD_CPU_LIMIT.
1000 shares are approximately equal to 1 GHz.
Please keep the implications of CPU shares in mind when you load test workloads on Nomad.
If the cores parameter of the resources block is set, the task is given an
isolated, reserved set of CPU cores to make use of. The total set of cores the
task may run on is the private set combined with the variable set of unreserved
cores. The private set of CPU cores is available to your process by reading
NOMAD_CPU_CORES.
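As a rough illustration of the share arithmetic above, the following sketch requests 500 CPU shares, which by the approximation in this section is about 0.5 GHz. The task name and image are reused from the earlier task example, and the value 500 is an arbitrary choice.

```hcl
task "webservice" {
  driver = "docker"

  config {
    image = "redis:7"
  }

  resources {
    # 500 shares is roughly 0.5 GHz (1000 shares ~ 1 GHz).
    # Shares only limit the task when the host is under contention.
    cpu = 500
  }
}
```

A task that needs reserved cores would set cores in the resources block in place of cpu and read NOMAD_CPU_CORES at runtime.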
Memory
Nomad limits containers' memory usage based on total virtual memory. This means that containers scheduled by Nomad cannot use swap. This is to ensure that a swappy process does not degrade performance for other workloads on the same host.
Since memory is not an elastic resource, you will need to make sure your
container does not exceed the amount of memory allocated to it, or it will be
terminated or crash when it tries to malloc. A process can inspect its memory
limit by reading NOMAD_MEMORY_LIMIT, but will need to track its own memory
usage. Memory limit is expressed in megabytes so 1024 = 1 GB.
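A minimal sketch of a memory request, assuming a task that needs 1 GB. The block belongs inside a task, and the value is only an example.

```hcl
resources {
  # Memory is expressed in megabytes: 1024 = 1 GB.
  # The container cannot use swap, so size this for peak usage.
  memory = 1024
}
```

With this setting the process should see the same figure when it reads NOMAD_MEMORY_LIMIT.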
IO
Nomad's Docker integration does not currently provide QoS around network or filesystem IO. These will be added in a later release.
Security
Docker provides resource isolation by way of cgroups and namespaces. Containers essentially have a virtual file system all to themselves. If you need a higher degree of isolation between processes for security or other reasons, it is recommended to use full virtualization like QEMU.
Caveats
Dangling Containers
Nomad has a detector and a reaper for dangling Docker containers, containers that Nomad starts yet does not manage or track. Though rare, they lead to unexpectedly running services, potentially with stale versions.
When the Docker daemon becomes unavailable as Nomad starts a task, it is possible for Docker to successfully start the container but return a 500 error code from the API call. In such cases, Nomad retries and eventually attempts to kill such containers. However, if the Docker Engine remains unhealthy, subsequent retries and stop attempts may still fail, and the started container becomes a dangling container that Nomad no longer manages.
The newly added reaper periodically scans for such containers. It only targets
containers that have a com.hashicorp.nomad.allocation_id label or that match
Nomad's conventions for naming and bind-mounts (i.e. /alloc, /secrets, local).
Containers that don't match Nomad container patterns are left untouched.
Operators can run the reaper in a dry-run mode, where it only logs dangling
container ids without killing them, or disable it by setting the
gc.dangling_containers config block.
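The dry-run mode described above might be configured as in the following sketch. It assumes the gc block sits inside the config of a plugin "docker" block, and the option names inside dangling_containers (enabled, dry_run) are assumptions that should be checked against the plugin options reference.

```hcl
plugin "docker" {
  config {
    gc {
      dangling_containers {
        # Assumed option names; verify against the plugin reference.
        enabled = true # set to false to disable the reaper entirely
        dry_run = true # only log dangling container ids, do not kill them
      }
    }
  }
}
```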
Docker for Windows
Docker for Windows only supports running Windows containers. Because Docker for Windows is relatively new and rapidly evolving, you may want to consult the list of relevant issues on GitHub.