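The installation runs into quite a few dependency problems, so some of the packages are installed online.
1. Download the Anaconda2-4.4.0-Linux-x86_64.sh script from the download page () and run it.
2. During installation you can choose your own install path or accept the default.
3. Configure MySQL to enable CeleryExecutor
1) Install MySQL database support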
yum install mysql mysql-server mysql-devel
pip install airflow[mysql]
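When I (linuxman) ran pip in the previous step, it complained that the Python version was too old; 2.7 is required. In that case, use the python binary in the directory created by the installer script in step 1: /root/anaconda2/bin/python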
vim /usr/bin/pip
Change the shebang line to the python path you found. Mine is
#!/root/anaconda2/bin/python
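The same shebang edit can be done without opening an editor. The following is a minimal sketch, not part of the original notes: `rewrite_shebang` is a hypothetical helper, and the sample launcher text is made up (the real /usr/bin/pip will look different). It only transforms a string; reading and writing the actual file is left out.

```python
def rewrite_shebang(script_text, interpreter):
    """Return script_text with its first line replaced by a shebang for interpreter.

    If the script has no shebang line, one is prepended instead.
    """
    new_first = "#!" + interpreter
    lines = script_text.split("\n")
    if lines and lines[0].startswith("#!"):
        lines[0] = new_first
    else:
        lines.insert(0, new_first)
    return "\n".join(lines)


# Made-up launcher body for illustration; the real /usr/bin/pip differs.
sample = "#!/usr/bin/python\nimport sys\nprint(sys.version)\n"
patched = rewrite_shebang(sample, "/root/anaconda2/bin/python")
print(patched.split("\n")[0])
```

To apply it for real, read /usr/bin/pip, pass its contents through the function, and write the result back (keeping a backup copy first is a sensible precaution).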
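4. Create the user and the database
### Create a database named `airflow`
mysql> CREATE DATABASE airflow;
### Create user `hoch` with password `123456`; this user has full privileges on the `airflow` database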
mysql> GRANT all privileges on airflow.* TO 'hoch'@'localhost' IDENTIFIED BY '123456';
mysql> FLUSH PRIVILEGES;
5. Initialize the database: airflow initdb
1) Initialization fails at the end with an error (sqlalchemy.exc.ProgrammingError: (_mysql_exceptions.ProgrammingError) (1064, "You have an error in ……………………。)
Fix: edit
/root/anaconda2/lib/python2.7/site-packages/airflow/migrations/versions/4addfa1236f1_add_fractional_seconds_to_mysql_tables.py (note: this is the default path; if you chose a different install path in step 1, adjust the prefix accordingly)
Change every mysql.DATETIME(fsp=6) to mysql.DATETIME(), or upgrade MySQL to 5.7 or later.
2) I (linuxman) chose to edit the migration file:
vim 4addfa1236f1_add_fractional_seconds_to_mysql_tables.py
:%s/fsp=6//g
:wq
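For reference, the substitution can also be scripted. This is a rough sketch under assumptions: `drop_fsp` is a hypothetical helper name, and the sample line is only written in the style of the migration file, not copied from it. Unlike the vim command, which deletes the text `fsp=6` wherever it appears, the pattern here only touches `fsp=6` inside a `DATETIME(...)` call.

```python
import re


def drop_fsp(source_text):
    """Replace every DATETIME(fsp=6) with DATETIME() in the given source text."""
    return re.sub(r"DATETIME\(\s*fsp\s*=\s*6\s*\)", "DATETIME()", source_text)


# Illustrative line only; the real migration file may be laid out differently.
line = "op.alter_column(table_name='dag', column_name='last_scheduler_run', type_=mysql.DATETIME(fsp=6))"
fixed = drop_fsp(line)
print(fixed)
```

After patching the file this way, run airflow initdb again.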
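6. Edit the airflow configuration file to use MySQL
airflow.cfg is located at /root/airflow/airflow.cfg by default
## Change the database connection
sql_alchemy_conn = mysql://hoch:123456@localhost/airflow
## The fields are laid out as follows: dialect+driver://username:password@host:port/database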
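To make the field layout concrete, here is a small sketch that splits a connection string into the named parts using only the standard library. `describe_conn` is a hypothetical helper, not something airflow or SQLAlchemy provides, and it assumes the URL contains no characters that need escaping.

```python
try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2.7


def describe_conn(url):
    """Split dialect+driver://username:password@host:port/database into a dict."""
    parts = urlsplit(url)
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,   # None when no "+driver" suffix is given
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,         # None when the URL omits the port
        "database": parts.path.lstrip("/"),
    }


info = describe_conn("mysql://hoch:123456@localhost/airflow")
print(info)
```

In the connection string used here the driver and port are omitted, so those two fields come back as None and the library defaults apply.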
7. Install the celery and rabbitmq components for airflow
pip install airflow[celery,rabbitmq]
1) Install erlang and rabbitmq
wget ~centos~6_amd64.rpm
yum install esl-erlang_18.3-1~centos~6_amd64.rpm
2) wget (if wget cannot download the file, open the page in a browser and download it directly.)
yum install esl-erlang-compat-18.1-1.noarch.rpm
3) Download RabbitMQ
wget
rpm -ivh rabbitmq-server-3.6.1-1.noarch.rpm
Note: if you run into the following problem while installing rabbitmq-server:
Error: Package: rabbitmq-server-3.6.1-1.noarch (/rabbitmq-server-3.6.1-1.noarch)
Requires: erlang >= R16B-3
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
This is caused by the erlang version check and does not actually matter; you can install with the following command instead:
rpm -ivh --nodeps rabbitmq-server-3.6.1-1.noarch.rpm
8. Start rabbitmq
Start rabbitmq: rabbitmq-server -detached
Start rabbitmq at boot: chkconfig rabbitmq-server on
Configure rabbitmq
rabbitmqctl add_user hoch 123456
rabbitmqctl add_vhost hoch_airflow
rabbitmqctl set_user_tags hoch administrator
rabbitmqctl set_permissions -p hoch_airflow hoch ".*" ".*" ".*"
rabbitmq-plugins enable rabbitmq_management
9. Edit the airflow configuration
airflow.cfg is located at /root/airflow/airflow.cfg by default
Set the executor: executor = CeleryExecutor
Set broker_url:
broker_url = amqp://hoch:123456@localhost:5672/hoch_airflow
###Format explanation: transport://userid:password@hostname:port/virtual_host
Set celery_result_backend; it can be the same as broker_url:
celery_result_backend = amqp://hoch:123456@localhost:5672/hoch_airflow
###Format explanation: transport://userid:password@hostname:port/virtual_host
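As an illustration of how the URL is assembled from the rabbitmq user and vhost created in step 8, the sketch below builds it from its parts and prints the three settings from this step. `amqp_url` is a hypothetical helper, and it assumes none of the parts contain special characters such as "@", ":" or "/".

```python
def amqp_url(user, password, host, port, vhost):
    """Build transport://userid:password@hostname:port/virtual_host for AMQP."""
    return "amqp://{0}:{1}@{2}:{3}/{4}".format(user, password, host, port, vhost)


url = amqp_url("hoch", "123456", "localhost", 5672, "hoch_airflow")
cfg_lines = [
    "executor = CeleryExecutor",
    "broker_url = " + url,
    "celery_result_backend = " + url,
]
print("\n".join(cfg_lines))
```

The printed lines match the values set by hand above; only the user, password and vhost need to change for a different setup.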
10. When starting the worker, it refuses to run with root privileges. To work around this:
echo 'export C_FORCE_ROOT="True"' >> /etc/profile
source /etc/profile
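11. Installation is now complete. Open the required firewall ports or disable the firewall, then start the services
Start the web server: airflow webserver --debug
Start the celery worker (not as root, unless C_FORCE_ROOT is set as in step 10): airflow worker
Start the scheduler: airflow scheduler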
12. Enable login authentication for the web UI:
Edit airflow.cfg
[webserver]
authenticate = True
auth_backend = airflow.contrib.auth.backends.password_auth
filter_by_owner = True
Script to add a user: vim filename.py
import airflow
from airflow import models, settings
from airflow.contrib.auth.backends.password_auth import PasswordUser
user = PasswordUser(models.User())
user.username = 'hoch'
user.email = 'hochg@*.com.cn'
user.password = '123456'
session = settings.Session()
session.add(user)
session.commit()
session.close()
exit()
(Note: replace the username, email, and password above with your own account details.)
python filename.py
Importing the user fails with:
[2017-08-02 07:26:43,351] {__init__.py:61} CRITICAL - Cannot import authentication module airflow.contrib.auth.backends.password_auth. Please correct your authentication backend or disable authentication: No module named flask_bcrypt
Fix: pip install flask_bcrypt
Re-run the script; the user is now imported successfully
Open the web UI in a browser to test: ip:8080
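If the page does not load, it can help to check whether the ports are reachable before digging further. The following is a minimal sketch using only the standard library; `port_open` is a hypothetical helper, and the ports are the ones used in these notes (8080 for the web server, 5672 for rabbitmq). Replace 127.0.0.1 with the server's address when checking from another machine, for example to rule out the firewall.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


if __name__ == "__main__":
    for name, port in [("webserver", 8080), ("rabbitmq", 5672)]:
        state = "open" if port_open("127.0.0.1", port) else "closed"
        print("%s port %d is %s" % (name, port, state))
```

An open port only shows that something is listening; it does not confirm that the service behind it is healthy.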