The airflow
utility is an orchestrator that allows workflows to be programmatically authored, scheduled, and monitored.
This utility plugin is meant to be used in place of the Airflow orchestrator plugin type.
EDK Based Plugin
Getting Started
Prerequisites
If you haven't already, follow the initial steps of the Getting Started guide:
Installation and configuration
-
Add the airflow utility to your project using meltano add:
meltano add utility airflow
-
Configure the airflow settings using meltano config:
meltano config airflow set --interactive
Next steps
Use the meltano schedule command to create pipeline schedules in your project, to be run by Airflow.
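A hedged sketch of creating such a schedule (the extractor, loader, and schedule name below are placeholders, not part of this plugin's definition):

```shell
# Create a daily pipeline schedule; the Meltano DAG orchestrator
# deployed by `airflow:initialize` will run it from Airflow.
# `tap-gitlab` and `target-postgres` are placeholder plugin names.
meltano schedule add daily-gitlab-to-postgres \
  --extractor tap-gitlab \
  --loader target-postgres \
  --interval "@daily"
```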
If you're running Airflow for the first time in a new environment:
# explicitly seed the database, create default airflow.cfg, deploy the meltano dag orchestrator
meltano invoke airflow:initialize
# create an airflow user with admin privileges
meltano invoke airflow users create -u admin@localhost -p password --role Admin -e admin@localhost -f admin -l admin
Launch the Airflow UI and log in using the username/password you created:
meltano invoke airflow webserver
By default, the UI will be available at
http://localhost:8080
. You can change this using the webserver.web_server_port
setting documented below.
Start the Scheduler or execute Airflow commands directly using the instructions in the Meltano docs.
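For example, assuming the same meltano invoke passthrough used above, the scheduler can be started in the foreground with:

```shell
# Run the Airflow scheduler via Meltano (Ctrl-C to stop)
meltano invoke airflow scheduler
```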
If you run into any issues, learn how to get help.
Capabilities
This plugin currently has no capabilities defined. If you know the capabilities required by this plugin, please contribute!
Settings
Meltano centralizes the configuration of all of the plugins in your project, including Airflow's. This means that if the Airflow documentation tells you to put something in airflow.cfg
, you can use meltano config
, meltano.yml
, or environment variables instead, and get the benefits of Meltano features like environments.
Any setting you can add to airflow.cfg
can be added to meltano.yml
, manually or using meltano config
. For example, [core] executor = SequentialExecutor
becomes meltano config airflow set core executor SequentialExecutor
on the CLI, or core.executor: SequentialExecutor
in meltano.yml
. Config sections indicated by [section]
in airflow.cfg
become nested dictionaries in meltano.yml
.
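As an illustration of that mapping (a sketch; your project's plugin entry may look different), the [core] executor example above would land in meltano.yml as:

```yaml
# meltano.yml fragment: [section] headers from airflow.cfg
# become nested keys under the plugin's `config`.
utilities:
  - name: airflow
    config:
      core:
        executor: SequentialExecutor
```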
The
airflow
settings that are known to Meltano are documented below. To quickly
find the setting you're looking for, click on any setting name from the list:
core.dags_are_paused_at_creation
core.dags_folder
core.load_examples
core.plugins_folder
database.sql_alchemy_conn
extension.airflow_config
extension.airflow_home
logging.base_log_folder
logging.dag_processor_manager_log_location
scheduler.child_process_log_directory
webserver.web_server_port
You can also list these settings using the meltano config
list
subcommand:
meltano config airflow list
You can override these settings or specify additional ones in your meltano.yml
by adding the settings
key.
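For example (a hypothetical sketch; core.default_timezone stands in for any airflow.cfg option not already listed above), a setting can be declared under the settings key like so:

```yaml
# meltano.yml fragment: declare an extra setting so Meltano
# knows to pass it through to airflow.cfg.
utilities:
  - name: airflow
    settings:
      - name: core.default_timezone
```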
Please consider adding any settings you have defined locally to this definition on MeltanoHub by making a pull request to the YAML file that defines the settings for this plugin.
Pause DAGs at Creation (core.dags_are_paused_at_creation)
-
Environment variable:
AIRFLOW_CORE_DAGS_ARE_PAUSED_AT_CREATION
-
Default Value:
false
Configure this setting directly using the following Meltano command:
meltano config airflow set core dags_are_paused_at_creation [value]
DAGs Folder (core.dags_folder)
-
Environment variable:
AIRFLOW_CORE_DAGS_FOLDER
-
Default Value:
$MELTANO_PROJECT_ROOT/orchestrate/airflow/dags
Configure this setting directly using the following Meltano command:
meltano config airflow set core dags_folder [value]
Load Examples (core.load_examples)
-
Environment variable:
AIRFLOW_CORE_LOAD_EXAMPLES
-
Default Value:
false
Configure this setting directly using the following Meltano command:
meltano config airflow set core load_examples [value]
Plugins Folder (core.plugins_folder)
-
Environment variable:
AIRFLOW_CORE_PLUGINS_FOLDER
-
Default Value:
$MELTANO_PROJECT_ROOT/orchestrate/airflow/plugins
Configure this setting directly using the following Meltano command:
meltano config airflow set core plugins_folder [value]
SQL Alchemy Connection (database.sql_alchemy_conn)
-
Environment variable:
AIRFLOW_DATABASE_SQL_ALCHEMY_CONN
-
Default Value:
sqlite:///$MELTANO_PROJECT_ROOT/.meltano/utilities/airflow/airflow.db
Configure this setting directly using the following Meltano command:
meltano config airflow set database sql_alchemy_conn [value]
Airflow Config (extension.airflow_config)
-
Environment variable:
AIRFLOW_EXTENSION_AIRFLOW_CONFIG
-
Default Value:
$MELTANO_PROJECT_ROOT/orchestrate/airflow/airflow.cfg
The path where the Airflow configuration file will be stored.
Configure this setting directly using the following Meltano command:
meltano config airflow set extension airflow_config [value]
Airflow Home (extension.airflow_home)
-
Environment variable:
AIRFLOW_EXTENSION_AIRFLOW_HOME
-
Default Value:
$MELTANO_PROJECT_ROOT/orchestrate/airflow
The directory where Airflow will store its configuration, logs, and other files.
Configure this setting directly using the following Meltano command:
meltano config airflow set extension airflow_home [value]
Base Log Folder (logging.base_log_folder)
-
Environment variable:
AIRFLOW_LOGGING_BASE_LOG_FOLDER
-
Default Value:
$MELTANO_PROJECT_ROOT/.meltano/utilities/airflow/logs
The folder where Airflow should store its log files. This path must be absolute. There are a few existing configurations that assume this is set to the default. If you choose to override this, you may need to update the dag_processor_manager_log_location and child_process_log_directory settings as well.
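For instance (the /var/log/airflow paths here are hypothetical), keeping the three log settings in sync might look like:

```shell
# Override the base log folder, then update the two dependent
# log paths that default to locations under it.
meltano config airflow set logging base_log_folder /var/log/airflow
meltano config airflow set logging dag_processor_manager_log_location /var/log/airflow/dag_processor_manager/dag_processor_manager.log
meltano config airflow set scheduler child_process_log_directory /var/log/airflow/scheduler
```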
Configure this setting directly using the following Meltano command:
meltano config airflow set logging base_log_folder [value]
Dag Processor Manager Log Location (logging.dag_processor_manager_log_location)
-
Environment variable:
AIRFLOW_LOGGING_DAG_PROCESSOR_MANAGER_LOG_LOCATION
-
Default Value:
$MELTANO_PROJECT_ROOT/.meltano/utilities/airflow/logs/dag_processor_manager/dag_processor_manager.log
Where to send dag parser logs.
Configure this setting directly using the following Meltano command:
meltano config airflow set logging dag_processor_manager_log_location [value]
Child Process Log Directory (scheduler.child_process_log_directory)
-
Environment variable:
AIRFLOW_SCHEDULER_CHILD_PROCESS_LOG_DIRECTORY
-
Default Value:
$MELTANO_PROJECT_ROOT/.meltano/utilities/airflow/logs/scheduler
Where to send the logs of each scheduler process.
Configure this setting directly using the following Meltano command:
meltano config airflow set scheduler child_process_log_directory [value]
Webserver Port (webserver.web_server_port)
-
Environment variable:
AIRFLOW_WEBSERVER_WEB_SERVER_PORT
-
Default Value:
8080
Configure this setting directly using the following Meltano command:
meltano config airflow set webserver web_server_port [value]
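Alternatively, consistent with the environment variable listed above, the port can be set in the environment (8081 is an arbitrary example value):

```shell
# Use the documented environment variable instead of meltano config
export AIRFLOW_WEBSERVER_WEB_SERVER_PORT=8081
meltano invoke airflow:ui
```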
Commands
The airflow utility supports the following commands that can be used with meltano invoke
:
create-admin
-
Equivalent to:
users create --username admin --firstname FIRST_NAME --lastname LAST_NAME --role Admin --email admin@example.org
Create an Airflow user with admin privileges.
meltano invoke airflow:create-admin [args...]
describe
-
Equivalent to:
describe
Describe the Airflow Extension
meltano invoke airflow:describe [args...]
initialize
-
Equivalent to:
initialize
Initialize the Airflow Extension which will seed the database, create the default airflow.cfg, and deploy the Meltano DAG orchestrator.
meltano invoke airflow:initialize [args...]
ui
-
Equivalent to:
webserver
Start the Airflow webserver.
meltano invoke airflow:ui [args...]
Something missing?
This page is generated from a YAML file that you can contribute changes to.
Edit it on GitHub!
Looking for help?
Ask in the
#plugins-general
channel.