CentOS 7
Airflow 1.10.6
Python 3.6.8
MySQL 5.6
redis 3.3
MySQL installation is omitted here (search online for a guide).
Make sure remote connections to the database are allowed (for example, by turning off the firewall).
Set the character set to UTF8 everywhere (utf8mb4 also works) to prevent garbled characters.
A newer version of MySQL or MariaDB will report an error on VARCHAR(5000), so a lower version is suggested.
The reason is that newer database versions limit the maximum length of VARCHAR for efficiency.
PostgreSQL has not been tried yet; notes on it will be added later.
Python installation is also omitted (search online for a guide).
Add Python to the PATH environment variable for convenience.
vim ~/.bashrc
# Add the following environment variable line
export AIRFLOW_HOME=/opt/airflow
source ~/.bashrc
export SLUGIFY_USES_TEXT_UNIDECODE=yes
# Generate the configuration file. Some errors may be reported; ignore them. As long as the .cfg file (airflow.cfg) and the related files appear under the AIRFLOW_HOME directory, the run succeeded.
# If Python's bin directory is on your PATH, run the command directly.
# If not, go to the ${PYTHON_HOME}/lib/python3.6/site-packages/airflow/bin directory and run `./airflow`.
pip install apache-airflow
pip install 'apache-airflow[mysql]'
pip install 'apache-airflow[celery]'
pip install 'apache-airflow[redis]'
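A quick way to confirm that the configuration file was generated is to look for it under AIRFLOW_HOME. The following is only a minimal sketch: it assumes the file is called airflow.cfg and that Airflow falls back to ~/airflow when AIRFLOW_HOME is not set. The helper name config_path is made up for this example.

```python
import os


def config_path(env=None):
    """Return the path where airflow.cfg is expected (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: Airflow uses ~/airflow when AIRFLOW_HOME is not set.
    home = env.get("AIRFLOW_HOME") or os.path.join(os.path.expanduser("~"), "airflow")
    return os.path.join(home, "airflow.cfg")


if __name__ == "__main__":
    path = config_path()
    status = "found" if os.path.isfile(path) else "missing"
    print(f"{path}: {status}")
```

If the script reports the file as missing for an ordinary user but not for root, the two accounts probably do not share the same AIRFLOW_HOME value.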
# SQLAlchemy connection string
sql_alchemy_conn = mysql://username:password@localhost:3306/airflow
# Configure the executor
executor = CeleryExecutor
# Configure the Celery broker_url (Redis listens on port 6379 by default; use the host and port of your own instance)
broker_url = redis://localhost:5379/0
# Configure the result backend that stores task state
result_backend = db+mysql://username:password@localhost:3306/airflow
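The three connection strings above share the same pieces, so it can help to build them in one place, especially when the password contains characters such as @ or /. This is a sketch under assumptions: the function names are invented for this example, the Redis port defaults to 6379 (the Redis default, not necessarily your port), and the database name airflow is taken from the settings above.

```python
from urllib.parse import quote


def mysql_url(user, password, host="localhost", port=3306, db="airflow"):
    """Build a MySQL URL, percent-encoding special characters in the credentials."""
    return f"mysql://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}/{db}"


def celery_settings(user, password, redis_host="localhost", redis_port=6379):
    """Return the three related settings as a dict (hypothetical helper)."""
    base = mysql_url(user, password)
    return {
        "sql_alchemy_conn": base,
        "broker_url": f"redis://{redis_host}:{redis_port}/0",
        # Celery's database result backend uses the same URL with a db+ prefix.
        "result_backend": "db+" + base,
    }


if __name__ == "__main__":
    for key, value in celery_settings("username", "p@ssword").items():
        print(f"{key} = {value}")
```

Check how your Airflow version treats % signs in airflow.cfg before pasting an encoded password there.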
# Create the user group and the user
groupadd airflow
useradd airflow -g airflow
# Change the group of the {AIRFLOW_HOME} directory
cd /opt/
chgrp -R airflow airflow
# Start web service in the foreground
airflow webserver
# Start web service in the background
airflow webserver -D
# Start scheduler in the foreground
airflow scheduler
# Start scheduler in the background
airflow scheduler -D
# On a worker host, the airflow worker only needs to be started as an ordinary user
# Create the user airflow
useradd airflow
# Set a password for the user
passwd airflow
# As root, change the permissions of the airflow folder and make it fully open
chmod -R 777 /opt/airflow
# Switch to the ordinary user and run the airflow worker command
# At startup I found that the ordinary user's ~/.bashrc did not match; adding AIRFLOW_HOME to it again fixes this
# If you configure the environment variables before creating the new ordinary user, you may not hit this problem. I modified the environment variables after creating the user.
airflow worker
# Set a temporary variable before running the worker (it only lasts for the current session, not permanently)
export C_FORCE_ROOT="true"
# No need to switch users
cd /usr/local/python3/bin/
# Start worker service in the foreground
airflow worker
# Start the worker service in the background
airflow worker -D
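The start commands above all follow the same pattern: the component name, plus -D to run in the background. If you want to launch them from a script, something like the following could work. It is a sketch, not an official launcher: the function names are invented here, and it assumes the airflow executable is on PATH with AIRFLOW_HOME exported.

```python
import subprocess

COMPONENTS = ("webserver", "scheduler", "worker")


def airflow_command(component, daemon=False):
    """Return the argument list for one component (hypothetical helper)."""
    if component not in COMPONENTS:
        raise ValueError(f"unknown component: {component}")
    argv = ["airflow", component]
    if daemon:
        argv.append("-D")  # same flag as in the commands above
    return argv


def start(component, daemon=True):
    """Run the command; raises CalledProcessError if airflow exits with an error."""
    return subprocess.run(airflow_command(component, daemon), check=True)
```

Calling airflow_command("schedule") raises a ValueError, which catches the easy typo of dropping the final r in scheduler.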
default_timezone = Asia/Shanghai
The reference is as follows:
cd /usr/local/lib/python3.6/site-packages/airflow
# Below the line utc = pendulum.timezone('UTC') (line 27), add:
from airflow.configuration import conf
try:
    tz = conf.get("core", "default_timezone")
    if tz == "system":
        utc = pendulum.local_timezone()
    else:
        utc = pendulum.timezone(tz)
except Exception:
    pass
# Modify the utcnow() function (on line 69)
Original code: d = dt.datetime.utcnow()
Amended to: d = dt.datetime.now()
# Below the line utc = pendulum.timezone('UTC') (line 37), add:
from airflow.configuration import conf
try:
    tz = conf.get("core", "default_timezone")
    if tz == "system":
        utc = pendulum.local_timezone()
    else:
        utc = pendulum.timezone(tz)
except Exception:
    pass
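To see what these patches change in practice, compare the wall-clock time that UTC and Asia/Shanghai show for the same instant. The sketch below uses only the standard library rather than pendulum, and it stands in for Asia/Shanghai with a fixed +08:00 offset, which is an assumption that holds as long as the zone has no daylight saving time.

```python
import datetime as dt

# Stand-in for Asia/Shanghai: a fixed +08:00 offset (assumption, no DST handling).
SHANGHAI = dt.timezone(dt.timedelta(hours=8), "Asia/Shanghai")


def wall_clock(instant, tz):
    """Return the naive wall-clock time that tz shows for an aware instant."""
    return instant.astimezone(tz).replace(tzinfo=None)


instant = dt.datetime(2020, 1, 1, 0, 0, tzinfo=dt.timezone.utc)
utc_view = wall_clock(instant, dt.timezone.utc)  # what utcnow()-style code reports
local_view = wall_clock(instant, SHANGHAI)       # what now() reports on a Shanghai host
print(local_view - utc_view)  # prints 8:00:00
```

Switching from utcnow() to now() therefore moves every timestamp by the host's UTC offset, so the host clock and default_timezone should agree.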
Change the code var UTCseconds = (x.getTime() + x.getTimezoneOffset()*60*1000);
to var UTCseconds = x.getTime();
Change the code "timeFormat":"H:i:s %UTC%",
to "timeFormat":"H:i:s",
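The arithmetic behind the first change is easy to check by hand. In JavaScript, getTimezoneOffset() returns the difference between UTC and local time in minutes, so a browser in UTC+8 gets -480. The sketch below redoes that calculation in Python; the function name and the sample timestamp are chosen for illustration only.

```python
MS_PER_MINUTE = 60 * 1000


def shifted_ms(epoch_ms, timezone_offset_minutes):
    """Mirror of the original expression: getTime() + getTimezoneOffset()*60*1000."""
    return epoch_ms + timezone_offset_minutes * MS_PER_MINUTE


epoch_ms = 1_577_836_800_000  # 2020-01-01 00:00:00 UTC
offset_minutes = -480         # getTimezoneOffset() for a browser in UTC+8

shift_hours = (shifted_ms(epoch_ms, offset_minutes) - epoch_ms) / 3_600_000
print(shift_hours)  # prints -8.0
```

The original expression moves the clock back by eight hours for such a browser so that it displays UTC. Dropping the term, as the change above does, leaves the browser's local time on screen.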
Refer to the official Airflow documentation.
email_backend = airflow.utils.email.send_email_smtp
# SMTP server address of your mail provider; check your mailbox settings (163 Mail is used here)
smtp_host = smtp.163.com
# Mail transport protocol
smtp_starttls = False
smtp_ssl = True
# Your email address
smtp_user = <your email address>
# Your mailbox authorization code; find it in the mailbox settings or search online for how to get it
smtp_password = <16-character authorization code>
# Mail service port
smtp_port = <port>
# Your email address
smtp_mail_from = <your email address>
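Before relying on Airflow to send alerts, it may be worth testing the same SMTP settings from a plain Python script. This is a minimal sketch using the standard library: it assumes implicit SSL, matching smtp_ssl = True and smtp_starttls = False above. The function names and the message text are invented here, and the host, port, user and authorization code are values you supply yourself.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipient):
    """Create a short plain-text test mail."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Airflow SMTP test"
    msg.set_content("If you can read this, the SMTP settings work.")
    return msg


def send_test_message(host, port, user, password, recipient):
    """Send one mail over an SSL connection, logging in with the authorization code."""
    msg = build_test_message(user, recipient)
    with smtplib.SMTP_SSL(host, port, timeout=10) as server:
        server.login(user, password)
        server.send_message(msg)
```

If send_test_message raises an authentication error, the authorization code is the first thing to check, since it is not the same as the mailbox login password.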
Add the following parameters to default_args in the DAG:
default_args = {
    # Recipient mailboxes
    'email': ['<recipient email address>'],
    # Whether to send mail when a task fails
    'email_on_failure': True,
    # Whether to send mail when a task is retried
    'email_on_retry': False,
}
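How the two flags combine can be written down as a small decision function. This is an illustration of the intended behaviour only, not Airflow's own code: the function name is made up, and it assumes both flags count as True when they are left out.

```python
def should_send_email(default_args, event):
    """Illustrative only: decide whether a mail goes out for 'failure' or 'retry'."""
    if not default_args.get("email"):
        return False  # no recipients configured, nothing to send
    flag = {"failure": "email_on_failure", "retry": "email_on_retry"}[event]
    # Assumption: a missing flag behaves as True.
    return bool(default_args.get(flag, True))


example_args = {
    "email": ["ops@example.com"],  # placeholder recipient
    "email_on_failure": True,
    "email_on_retry": False,
}
print(should_send_email(example_args, "failure"))  # prints True
print(should_send_email(example_args, "retry"))    # prints False
```

With the settings shown above, a task that fails after its retries sends one mail, while the intermediate retries stay silent.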
——————————————————————————————————————————————
Set in Airflow's global configuration
Add parameters to the DAG to control the entire dag
dag = DAG("dag_name",
          default_args=default_args,
          schedule_interval="0 12 * * *",
          max_active_runs=1)
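The schedule_interval value "0 12 * * *" is a cron expression meaning minute 0 of hour 12, every day, and max_active_runs=1 allows only one run of this DAG to be active at a time. To make the cron part concrete, here is a small sketch that computes the next matching time. It handles only this daily pattern, works on naive datetimes, and does not use Airflow's scheduler. The function name is invented for the example.

```python
import datetime as dt


def next_daily_run(now, hour=12, minute=0):
    """Next time matching 'minute hour * * *' strictly after now (naive datetimes)."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += dt.timedelta(days=1)  # today's slot already passed
    return candidate


print(next_daily_run(dt.datetime(2020, 1, 1, 11, 59)))  # prints 2020-01-01 12:00:00
print(next_daily_run(dt.datetime(2020, 1, 1, 12, 0)))   # prints 2020-01-02 12:00:00
```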
Set the parameters in the Operator in each task
t3 = PythonOperator(
    task_id='demo_task',
    provide_context=True,
    python_callable=demo_task,
    task_concurrency=1,
    dag=dag)
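The operator above refers to a demo_task callable that is not shown. A possible shape for it is sketched below, under the assumption that provide_context=True passes the task context as keyword arguments and that the context includes ds, the execution date as a YYYY-MM-DD string. The body is a placeholder.

```python
def demo_task(**context):
    """Placeholder callable; receives the task context as keyword arguments."""
    # Assumption: 'ds' holds the execution date as a YYYY-MM-DD string.
    ds = context.get("ds", "unknown")
    message = f"demo_task running for {ds}"
    print(message)
    return message
```

Because the function is plain Python, it can be tried outside Airflow by calling demo_task(ds="2020-01-01") before wiring it into the DAG.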