(Archived) root360 Container Platform

You have tried to access an archived page. Please go to the new https://root360.atlassian.net/wiki/spaces/KB to find more documents.

Terminology

Container: a docker container
Container type: a term describing a set of containers that have the same role, e.g. api, core, booking-engine. It is defined by a corresponding Task Definition.
ECS Service: manages the containers of a certain container type.
Image: a docker image
Docker Cluster: A logical entity consisting of a number of docker hosts
Task Definition: an extended, ECS-specific docker compose file (JSON)
ECS Task: Instance of a Task Definition, usually represents a single container
Desired Tasks: A number of containers based on the same image (horizontal scale)
Revision: Version of a Task Definition that has a set of properties, such as an image-URL pointing to an image in a registry
Registry: a system for storing and managing images
Repository: a logical namespace within the registry that separates images, e.g. by type or purpose

The deployment process

Docker Container Deployment describes the process of updating a fleet of running docker containers that are operated via the AWS EC2 Container Service (ECS).

Deployment strategies

By default a rolling update is used, which starts new containers based on a new docker image. The method is explained below in the context of a web application. Knowledge of typical ECS terms is useful. Besides rolling updates, replacing updates are also possible: the running containers of a container type are terminated first, and then new ones are launched based on the submitted image.

The main difference between replacing and rolling updates is that deployments run with zero downtime only in the latter case. However, rolling updates require additional capacity to be provided permanently, which increases the cost of the permanently running docker hosts in the cluster.
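
The capacity trade-off can be illustrated with a small calculation. The following is only a sketch: it assumes that during a rolling update all new containers are started before any old one is shut down, so that up to twice the desired number of containers run at the same time. The function names and the CPU figures are hypothetical and not part of any root360 tooling.

```python
def peak_containers(desired: int, strategy: str) -> int:
    """Return the highest number of containers running at once during a deployment."""
    if strategy == "rolling":
        # Assumption: old and new containers run side by side until the switch is done
        return 2 * desired
    if strategy == "replacing":
        # Old containers are terminated before the new ones are launched
        return desired
    raise ValueError(f"unknown strategy: {strategy}")


def peak_cpu(desired: int, cpu_per_container: int, strategy: str) -> int:
    """Return the CPU reservation the cluster must be able to hold at the peak."""
    return peak_containers(desired, strategy) * cpu_per_container


print(peak_cpu(3, 256, "rolling"))    # 1536
print(peak_cpu(3, 256, "replacing"))  # 768
```

Under these assumptions, the rolling strategy needs room for twice the reservation of the replacing strategy, which is the permanently provided extra capacity mentioned above.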

Process example

Initiation of deployment from the Jump-Server via CLI for a Container Type providing a new image-URL
Creation of a new ECS Task Definition with the new image-URL, which has its own version number (revision)
Update of the respective ECS Service managing the Containers of the Container Type under consideration
Creation of new Containers based on the new image-URL in the Docker Cluster
All new Containers are taken into service by the Loadbalancer
Connections to all old containers are drained by the Loadbalancer, meaning established ones are allowed to finish, but no new ones are established to them
Shutdown of all old containers
Finish
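
The order of the container-related steps above can be modelled in a few lines. This is a toy model for illustration only, not how ECS or the Loadbalancer is implemented; the Container class, its fields and the image names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Container:
    image: str
    in_service: bool = False  # registered with the Loadbalancer
    running: bool = True


def rolling_update(old: list[Container], new_image: str) -> list[Container]:
    """Replace the old containers with new ones in the order described above."""
    # New containers are created alongside the old ones
    new = [Container(new_image) for _ in old]
    # The Loadbalancer takes all new containers into service
    for container in new:
        container.in_service = True
    # Old containers are drained: no new connections are routed to them
    for container in old:
        container.in_service = False
    # Old containers are shut down once draining has finished
    for container in old:
        container.running = False
    return new


old_fleet = [Container("registry.example.com/api:1", in_service=True) for _ in range(2)]
new_fleet = rolling_update(old_fleet, "registry.example.com/api:2")
print(len(new_fleet), all(c.in_service for c in new_fleet))
```

At no point in this model is the set of in-service containers empty, which is what makes the rolling update a zero-downtime strategy.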

Safeguards

Several mechanisms are in place to recognize success, handle failure, and roll back:

Container state verification
- A deployment is successful if all new containers are recognized as running and the Loadbalancer has taken them into service successfully
Image-URL verification
- The given Image-URL is checked to be a valid string
- If the docker registry from which images are pulled is part of your environment (within the same AWS account), we try to locate the given image
Roll-back by Roll-forward in case of Timeout
- If a deployment cannot finish for whatever reason and runs into a timeout, a roll-back is initiated
Roll-back by Roll-forward in case of failed deployment
- If a deployment fails to bring up the new docker containers, a roll-back is initiated
Circuit Breaker
- If a deployment fails to bring up the new docker containers, AWS ECS changes the interval at which it tries to start new containers from as soon as possible to every 5 minutes. Ultimately, it stops the deployment process after around 1.5 hours.

The roll-back process

If a deployment process is not successful it must be dealt with. However, AWS ECS does not offer a mechanism to just stop the deployment process (and undo steps taken so far).

To cope with this, three possibilities exist:

If an issue with the docker cluster or the loadbalancer is reported during the deployment process, solve it (fast)
If the image to be deployed is broken, initiate a new deployment with another one
Initiate a deployment process based on the last known stable state

The latter two are referred to as roll-forward deployments.

The phrase Roll-back by Roll-forward used in the Safeguards section above thus refers to:

recognize the running deployment as failed
identify the last known stable state (which is described by the Task Definition, which was associated with the ECS Service before we initiated the deployment)
initiate a new deployment referencing this last known stable state
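
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used by the platform: deploy and is_stable are hypothetical callables standing in for "start a deployment of this Task Definition" and "all new containers are running and in service", and the timeout values are arbitrary. A failed deployment and a timeout are treated alike here, as "not stable before the deadline".

```python
import time
from typing import Callable


def deploy_with_rollback(
    deploy: Callable[[str], None],
    is_stable: Callable[[], bool],
    current_task_definition: str,
    new_task_definition: str,
    timeout_seconds: float = 600.0,
    poll_seconds: float = 5.0,
) -> str:
    """Deploy the new Task Definition and return the one that is active afterwards."""
    # Remember the last known stable state before the deployment starts
    last_known_stable = current_task_definition
    deploy(new_task_definition)

    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if is_stable():
            return new_task_definition
        time.sleep(poll_seconds)

    # The deployment is recognized as failed: roll back by rolling forward
    # to the Task Definition that was active before
    deploy(last_known_stable)
    return last_known_stable


calls = []
active = deploy_with_rollback(
    calls.append, lambda: False, "api:41", "api:42", timeout_seconds=0.0
)
print(active, calls)  # api:41 ['api:42', 'api:41']
```

The point to note is that the roll-back is itself an ordinary deployment; nothing is undone, the previous Task Definition is simply deployed again.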

Native docker commands

The following commands can be called via "sudo" by default, meaning in every environment:

docker ps, docker ps -a
docker images, docker images -a
docker logs [OPTION] CONTAINERID
docker exec -it CONTAINERID bash, docker exec -it CONTAINERID sh
docker stats, docker stats CONTAINERID
docker inspect CONTAINERID

For non-production environments, root360 can also enable access to the following commands, e.g. to support debugging and analysis:

docker run
docker pull
docker start
docker create

You can find further details about the individual commands in the docker CLI reference.

Example usage scenarios

A great example is the docker start command. If you deployed a new container image to your test environment and it does not come up, you are in a dilemma.

On the one hand you want to look into the failing container, but on the other hand AWS ECS keeps trying to launch and terminate containers. Thus it will constantly kill the container you actually want to keep a little longer for analysis.

By using docker start to restart a container that AWS ECS previously created on a docker host, you can analyze it without AWS ECS interfering.

root360 Cloud Platform Management Suite commands

Constraints

All commands operating on docker hosts and containers and the tasks to manipulate them are available in the "r3 container" subcommand.

Aliases are not supported.

A general description of how the r3 CLI Suite works can be found in the root360 Cloud Management CLI Suite Handbook.

The following sections present actual usage examples.

What are the general command line options?

r3 container overview

USER@JUMPSERVER:~$ r3 container -h

# Command Response:
Manage Docker Hosts and Containers on root360 Cloud Platform

optional arguments:
  -h, --help          show this help message and exit

Command Overview:
  {list,deploy,show}
    list              List Container by Hosts
    deploy            Deploy Containers to Hosts
    show              Show detailed Information

Which container is running on which host at the moment, and which resources are reserved?

r3 container list

USER@JUMPSERVER:~$ r3 container list 

# Command Response:
  host: 10.12.57.129  | CPU (free/max): 1024/1024  | memory (free/max): 2000/2000
    container name                      	  image                    	  reserved CPU   	  reserved Memory	  state    	  launched         
    -                                   	  -                        	  -              	  -              	  -        	  -
  host: ...
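
If the host lines of this output need to be processed further, for example to find hosts with free capacity, they can be parsed with a regular expression. The sketch below assumes that the line format is exactly as in the example response above; the function name and the returned keys are hypothetical.

```python
import re

# Matches lines such as:
#   host: 10.12.57.129  | CPU (free/max): 1024/1024  | memory (free/max): 2000/2000
HOST_LINE = re.compile(
    r"host:\s*(?P<host>\S+)\s*\|"
    r"\s*CPU \(free/max\):\s*(?P<cpu_free>\d+)/(?P<cpu_max>\d+)\s*\|"
    r"\s*memory \(free/max\):\s*(?P<mem_free>\d+)/(?P<mem_max>\d+)"
)


def parse_host_line(line: str):
    """Return the host and its free/max resources, or None if the line does not match."""
    match = HOST_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    host = fields.pop("host")
    # All remaining fields are numeric resource values
    return {"host": host, **{key: int(value) for key, value in fields.items()}}


sample = "  host: 10.12.57.129  | CPU (free/max): 1024/1024  | memory (free/max): 2000/2000"
print(parse_host_line(sample))
```

Lines that do not describe a host, such as the container rows, simply yield None and can be skipped.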

Which container types can be deployed?

r3 container list --container-types

USER@JUMPSERVER:~$ r3 container list --container-types

# Command Response:
['container-type-a', 'container-type-b', ...]

How to start a new deployment for container-type x?

r3 container deploy --container-type CONTAINER-TYPE-Name --image-url Registry/Repository:Tag
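
Before starting a deployment it can help to check that the image-URL has the Registry/Repository:Tag shape. The following is a sketch of such a check, not the validation the r3 CLI itself performs; the function name and the example URLs are hypothetical.

```python
def parse_image_url(image_url: str) -> tuple[str, str, str]:
    """Split an image-URL of the form Registry/Repository:Tag into its three parts."""
    # The registry is everything before the first slash (it may contain a port)
    registry, _, rest = image_url.partition("/")
    # The tag is everything after the last colon of the remainder
    repository, _, tag = rest.rpartition(":")
    if not registry or not repository or not tag:
        raise ValueError(f"not of the form Registry/Repository:Tag: {image_url!r}")
    return registry, repository, tag


print(parse_image_url("registry.example.com/api:1.4.2"))
print(parse_image_url("registry.example.com:5000/team/api:latest"))
```

An image-URL without a registry or without a tag raises a ValueError in this sketch, so a typo is caught before a deployment is initiated.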

Which image are the currently running containers of type X based on?

r3 container show --container-type CONTAINER-TYPE-Name --image

Show the latest events for a given container type

r3 container show --container-type CONTAINER-TYPE-Name --events

Show the deployment status for a container type

sudo show-ecs-service-status.py -s CONTAINER-TYPE-Name

The important points here are:

The service is ACTIVE, which means that containers are started in the cluster
"desired" specifies the number of container instances that should run
"running" indicates how many container instances are actually running
"pending" indicates how many container instances are still starting
"deploys" describes existing deployment processes or the current state
- status: "PRIMARY" specifies which container revision ("taskDefinition") is currently running
if there is more than one block under "deploys", a migration process is shown as part of an ongoing deployment (rolling update)
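
The points above can be turned into a simple reading aid. The sketch below uses the field names from the list, but the dictionary structure is an assumption, because the actual output of show-ecs-service-status.py is not shown here; the function name and the example values are hypothetical.

```python
def summarize_service(status: dict) -> str:
    """Give a one-line interpretation of a service status as described above."""
    if status["status"] != "ACTIVE":
        return "service is not active"
    if len(status["deploys"]) > 1:
        # More than one block under "deploys" indicates an ongoing rolling update
        return "deployment in progress"
    if status["pending"] > 0 or status["running"] < status["desired"]:
        return "containers are still starting"
    return "stable"


example = {
    "status": "ACTIVE",
    "desired": 2,
    "running": 2,
    "pending": 0,
    "deploys": [{"status": "PRIMARY", "taskDefinition": "api:42"}],
}
print(summarize_service(example))  # stable
```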

Show detailed information about all containers of container-type X (see following example for explanation)

container-status Value

The value for the attribute --container-status can be either lookup or a valid ECS TaskID. The ECS TaskID is the last portion of the TaskARN.

You can obtain the value from:

an unspecific lookup, as shown below
the event log, e.g. task 5d1c1443-0dff-4378-bce3-ed116db70b96 shown in the example above

Example:

TaskARN = "arn:aws:ecs:eu-west-1:XXXX:task/29dce5e3-16d8-45fe-acc2-0da74d93f069"
TaskID = 29dce5e3-16d8-45fe-acc2-0da74d93f069
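
When the TaskARN is available in a script, the TaskID can be derived by taking everything after the last slash. A minimal sketch, with a hypothetical function name:

```python
def task_id_from_arn(task_arn: str) -> str:
    """Return the last portion of a TaskARN, which is the ECS TaskID."""
    return task_arn.rsplit("/", 1)[-1]


arn = "arn:aws:ecs:eu-west-1:XXXX:task/29dce5e3-16d8-45fe-acc2-0da74d93f069"
print(task_id_from_arn(arn))  # 29dce5e3-16d8-45fe-acc2-0da74d93f069
```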

r3 container show --container-type CONTAINER-TYPE-Name --container-status lookup

Show detailed information about a specific container of container type X

r3 container show --container-type CONTAINER-TYPE-Name --container-status TaskID