Detailed installation and deployment of Airflow on CentOS 7

Airflow (1.10) + Celery + Redis installation under CentOS 7#

Installation environment and versions##

CentOS 7

Airflow 1.10.6

Python 3.6.8

MySQL 5.6

Redis 3.3

Installation##

Database installation###

Omitted (search online for instructions).

airflow installation###

vim ~/.bashrc
# Add a line with the environment variable:
export AIRFLOW_HOME=/opt/airflow
source ~/.bashrc
export SLUGIFY_USES_TEXT_UNIDECODE=yes

Install airflow

pip install apache-airflow

# Generate the configuration file by running `airflow`; some errors may be reported, please ignore them.
# If airflow.cfg and related files are generated under the AIRFLOW_HOME directory, the run succeeded.
# If the Python environment variables are configured, run `airflow` directly;
# otherwise, run `./airflow` from the ${PYTHON_HOME}/lib/python3.6/site-packages/airflow/bin directory.
airflow

Install airflow-related dependencies###

pip install 'apache-airflow[mysql]'
pip install 'apache-airflow[celery]'
pip install 'apache-airflow[redis]'

Configuration##

Modify the configuration file###

# SQLAlchemy connection string
sql_alchemy_conn = mysql://username:password@localhost:3306/airflow
# Configure the executor
executor = CeleryExecutor
# Configure the celery broker_url (Redis listens on port 6379 by default)
broker_url = redis://localhost:6379/0
# Configure the result backend for task metadata
result_backend = db+mysql://username:password@localhost:3306/airflow
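A typo in these URIs (such as a wrong host name or Redis port) only surfaces once the services start. A minimal stdlib sketch for sanity-checking a connection URI before launch; `check_uri` is a hypothetical helper, not part of Airflow:

```python
from urllib.parse import urlsplit

def check_uri(uri, scheme, port):
    """Hypothetical helper: verify a connection URI's scheme and port."""
    parts = urlsplit(uri)
    if parts.scheme != scheme or parts.port != port:
        raise ValueError(f"unexpected {parts.scheme}://...:{parts.port}")
    return parts.hostname

# Redis defaults to port 6379; the MySQL URIs above use 3306
print(check_uri("redis://localhost:6379/0", "redis", 6379))
print(check_uri("mysql://user:pw@localhost:3306/airflow", "mysql", 3306))
```

Both calls print `localhost`; a URI like `redis://lochost:5379/0` would fail fast here instead of at worker startup.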

Create a user (the worker is not allowed to run under the root user)

# Create the user group and user
groupadd airflow
useradd airflow -g airflow
# Change the group of the ${AIRFLOW_HOME} directory
cd /opt/
chgrp -R airflow airflow

Start up##

# Start web service in the foreground
airflow webserver 

# Start web service in the background
airflow webserver -D

# Start scheduler in the foreground
airflow scheduler

# Start scheduler in the background
airflow scheduler -D
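When started with `-D`, airflow daemonizes and drops a pid file (e.g. `airflow-webserver.pid`) under `$AIRFLOW_HOME`. A hedged stdlib sketch of checking whether such a daemon is still alive; `daemon_alive` is a hypothetical helper, not an Airflow command, and the demo uses this process's own pid instead of a real daemon:

```python
import os
import tempfile

def daemon_alive(pid_file):
    """Return True if the pid recorded in pid_file belongs to a live process."""
    try:
        pid = int(open(pid_file).read().strip())
        os.kill(pid, 0)  # signal 0: existence check only, nothing is killed
        return True
    except (OSError, ValueError):
        return False

# Demo: write our own pid to a temporary pid file and probe it
with tempfile.NamedTemporaryFile("w", suffix=".pid", delete=False) as f:
    f.write(str(os.getpid()))
print(daemon_alive(f.name))  # True
```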

Start the worker

# The worker host only needs to start the airflow worker as an ordinary user
# Create the user airflow
useradd airflow

# Set a password for the user
passwd airflow

# As the root user, change the permissions of the airflow folder to fully open
chmod -R 777 /opt/airflow

# Switch to the ordinary user and run the airflow worker command.
# If the ~/.bashrc read at startup turns out to be inconsistent, re-add AIRFLOW_HOME to it.
# If the environment variables were configured before the ordinary user was created, this
# problem may not occur; I modified the environment variables after creating the user.
airflow worker

(Screenshot: worker.png)

# Alternatively, export a temporary variable before running the worker (temporary; it
# cannot be made permanent), so there is no need to switch users
export C_FORCE_ROOT="true"
cd /usr/local/python3/bin/

# Start worker service in the foreground
airflow worker

# Start the work service in the background
airflow worker -D

Modify the time zone##

default_timezone = Asia/Shanghai
The reference is as follows:
cd /usr/local/lib/python3.6/site-packages/airflow

# In utils/timezone.py, under the line `utc = pendulum.timezone('UTC')` (line 27), add:
from airflow.configuration import conf
try:
    tz = conf.get("core", "default_timezone")
    if tz == "system":
        utc = pendulum.local_timezone()
    else:
        utc = pendulum.timezone(tz)
except Exception:
    pass

# Modify the utcnow() function (at line 69):
Original code: d = dt.datetime.utcnow()
Amended to:    d = dt.datetime.now()

# In utils/sqlalchemy.py, under the line `utc = pendulum.timezone('UTC')` (line 37), add:
from airflow.configuration import conf
try:
    tz = conf.get("core", "default_timezone")
    if tz == "system":
        utc = pendulum.local_timezone()
    else:
        utc = pendulum.timezone(tz)
except Exception:
    pass

# In www/templates/admin/master.html, change the code
var UTCseconds = (x.getTime() + x.getTimezoneOffset()*60*1000);
to
var UTCseconds = x.getTime();

and change
"timeFormat": "H:i:s %UTC%",
to
"timeFormat": "H:i:s",
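These edits make the scheduler and UI use local time instead of UTC; for Asia/Shanghai that is a fixed +8 hour shift, which can be illustrated with the stdlib alone (the tzinfo below is hand-built, since pendulum is not assumed here):

```python
from datetime import datetime, timedelta, timezone

# Asia/Shanghai is UTC+8 with no daylight saving, so a fixed offset suffices here
SHANGHAI = timezone(timedelta(hours=8), name="Asia/Shanghai")

utc_time = datetime(2020, 1, 1, 4, 0, tzinfo=timezone.utc)  # a fixed instant
local_time = utc_time.astimezone(SHANGHAI)
print(local_time.strftime("%H:%M"))  # 12:00
```

This is the same 8-hour gap you would otherwise see between the web UI clock and your wall clock before patching.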

Configure email alerts##

The SMTP server itself is configured in the airflow configuration file airflow.cfg (the [smtp] section); in the DAG, set default_args:

default_args = {
    # Recipient mailbox
    'email': ['[email protected]'],
    # Whether to send mail when a task fails
    'email_on_failure': True,
    # Whether to send mail when a task retries
    'email_on_retry': False,
}
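Conceptually, default_args are fallbacks: a value set on an individual task wins over the DAG-wide default. A simplified illustration of that merge (this is not Airflow's actual implementation, and the address is a placeholder):

```python
# DAG-wide defaults, as in the snippet above ([email protected] is a placeholder)
default_args = {
    'email': ['[email protected]'],
    'email_on_failure': True,
    'email_on_retry': False,
}

def effective_settings(task_overrides):
    """Per-task values override the DAG-wide default_args."""
    merged = dict(default_args)
    merged.update(task_overrides)
    return merged

print(effective_settings({})['email_on_failure'])                      # True
print(effective_settings({'email_on_retry': True})['email_on_retry'])  # True
```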

---

Supplement##

When running tasks, it was found that some tasks produce abnormal data when they run in parallel. Solutions:

Set limits in airflow's global configuration (airflow.cfg)

Add a parameter to the DAG to control the entire DAG

dag = DAG("dag_name",
          default_args=default_args,
          schedule_interval="0 12 * * *",
          max_active_runs=1)
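The schedule_interval above is a standard five-field cron expression: read left to right the fields are minute, hour, day of month, month, and day of week, so "0 12 * * *" fires daily at 12:00. A tiny stdlib illustration of that reading:

```python
# Split the cron expression used above into its named fields (illustration only)
fields = dict(zip(
    ["minute", "hour", "day_of_month", "month", "day_of_week"],
    "0 12 * * *".split(),
))
print(fields["hour"], fields["minute"])  # 12 0
```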

Set the parameter on the Operator of each task

t3 = PythonOperator(
    task_id='demo_task',
    provide_context=True,
    python_callable=demo_task,
    task_concurrency=1,
    dag=dag)
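As a rough mental model (this is not Airflow's scheduler code), max_active_runs=1 and task_concurrency=1 each behave like a one-slot semaphore: a second DAG run or task instance is refused until the first finishes. Sketch:

```python
import threading

# One slot, mirroring max_active_runs=1 / task_concurrency=1
slot = threading.BoundedSemaphore(1)

first = slot.acquire(blocking=False)   # the first run takes the slot
second = slot.acquire(blocking=False)  # a concurrent second run is refused
print(first, second)  # True False

slot.release()                         # the first run finishes
third = slot.acquire(blocking=False)   # now a new run may start
print(third)  # True
```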

Please correct me if there are any errors#
