[Airflow] Using the KubernetesPodOperator on Cloud Composer
Cloud Composer is a fully managed version of the open source workflow tool Apache Airflow on Google Cloud Platform (GCP). To run a Docker container from Cloud Composer, one way is to use the KubernetesPodOperator, which launches pods into a Kubernetes cluster.
This post will cover these topics:
- Build container images with Google Cloud Build
- Create Kubernetes Secrets
- Using the KubernetesPodOperator
Build Container Images with Google Cloud Build
Google Cloud Build is a fully managed solution for building containers and other artifacts. It can import source code from Google Cloud Storage, Cloud Source Repositories, GitHub, and Bitbucket.
Build config file
A build pipeline can be set up with a YAML or JSON config file placed in the same directory as the application source code and the Dockerfile. Build steps are defined in this config file.
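For example, a minimal cloudbuild.yaml might look like the following sketch (the image name gcr.io/$PROJECT_ID/my-image is a placeholder):

```yaml
steps:
# Step 1: build the container image with the pre-built Docker builder
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-image:latest', '.']
# Step 2: push the image to Container Registry
- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'gcr.io/$PROJECT_ID/my-image:latest']
```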
There are two steps defined in the config file. Cloud Build provides pre-built images (base images) called Cloud Builders. In the example above, the name field specifies which pre-built Docker image Cloud Build runs, and the args field lists the arguments passed to that image for execution.
Build command
Start the build with:
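A minimal sketch, assuming the config file is named cloudbuild.yaml and the command is run from the source directory:

```bash
gcloud builds submit --config cloudbuild.yaml .
```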
When the build completes, Cloud Build uploads the built image to Container Registry. You can also pass parameters (called substitutions) to the build and customize build options such as image tags, as sketched below. For more details, please read the documentation.
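As a sketch, a user-defined substitution (the name must start with an underscore) can be passed on the command line and then referenced as $_IMAGE_TAG inside the config file:

```bash
gcloud builds submit --config cloudbuild.yaml --substitutions=_IMAGE_TAG=v1.0 .
```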
Create Kubernetes Secrets
A Secret object contains a small amount of sensitive data such as a password, a token, or a key. To use a secret, a Pod needs to reference the secret.
Creating a Secret Using kubectl
To create a Kubernetes Secret that sets the value of my_secret to test_value, run the following command:
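A sketch, assuming the Secret object is named airflow-secret (the name referenced in the DAG example later in this post):

```bash
kubectl create secret generic airflow-secret --from-literal=my_secret=test_value
```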
If you need to mount a secret file, you can create your Secret with:
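A sketch, assuming a local key file file_path.json and a hypothetical Secret name airflow-secret-file (the key name becomes the file name at the mount path shown later):

```bash
kubectl create secret generic airflow-secret-file --from-file=file_path.json=/local/path/to/file_path.json
```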
Note that Secrets cannot be accessed from a different namespace.
Define Secret in DAG
To reference the created Secret in a DAG as an environment variable, you can use the following code:
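A minimal sketch using Airflow's Secret helper (the import path below is the contrib one used by older Cloud Composer images; newer Airflow versions expose it under the cncf.kubernetes provider):

```python
from airflow.contrib.kubernetes.secret import Secret

secret_env = Secret(
    deploy_type='env',          # expose the Secret as an environment variable
    deploy_target='MY_SECRET',  # name of the environment variable in the Pod
    secret='airflow-secret',    # name of the Kubernetes Secret object
    key='my_secret',            # key of the value stored in the Secret
)
```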
- deploy_type: How the Secret is exposed to the Pod (env or volume).
- deploy_target: Name of the environment variable, since deploy_type is env rather than volume.
- secret: Name of the created Kubernetes Secret.
- key: Key name of a value stored in this Secret object.
In this example, there is a Secret object named airflow-secret, and one of the keys stored in it is called my_secret. Its value will be deployed to the Kubernetes Pod as an environment variable named MY_SECRET.
To reference the created secret file:
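A sketch matching the kubectl --from-file command above (the Secret name airflow-secret-file is a placeholder):

```python
from airflow.contrib.kubernetes.secret import Secret

secret_file = Secret(
    deploy_type='volume',                  # mount the Secret as a volume
    deploy_target='/path/to/secret/file',  # directory the Secret is mounted into
    secret='airflow-secret-file',          # name of the Kubernetes Secret object
    key='file_path.json',                  # key (file name) stored in the Secret
)
```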
Note that the complete mount path will be /path/to/secret/file/file_path.json.
You also need to pass the Secret to the Pod; how to do that is covered in the next section.
Using the KubernetesPodOperator
Let’s take a look at an example first:
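A minimal sketch of a task definition (the image name and variable names are placeholders; the contrib import path applies to older Airflow versions):

```python
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

task = KubernetesPodOperator(
    task_id='my-task',
    name='my-task',
    namespace='default',
    image='gcr.io/my-project/my-image:latest',       # image built with Cloud Build
    secrets=[secret_env, secret_file],               # Secret objects defined above
    arguments=['--run-date', '{{ ds }}'],            # Jinja templating is allowed
    env_vars={'ENV_VAR': '{{ var.value.my_var }}'},  # Airflow Variable from the UI
)
```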
- task_id: ID specified for the task.
- name: Name of the task you want to run, used to generate the Pod ID.
- namespace: Namespace to run within Kubernetes; the default namespace is default.
- image: The Docker image to use. In our case, it's the image built in the section Build Container Images with Google Cloud Build.
- secrets: The Kubernetes Secrets to expose, defined in the section Define Secret in DAG. The Pod will fail to be created if a Secret you specify does not exist in Kubernetes.
- arguments: Arguments to the entrypoint; Jinja templating is allowed.
- env_vars: Environment variables for the Pod. In this case we get the value of my_var from the Airflow UI and pass it to the environment variable ENV_VAR.
For more information about each configuration variable, see the Airflow reference.