This part is fairly simple; it was only in this lesson that the meaning of multiple instances became clear: underneath, every instance points at the same data.
1) Create a MySQL instance named myinst, listening on port 3306;
su - sdbadmin
cd /opt/sequoiasql/mysql
bin/sdb_sql_ctl addinst myinst -D database/3306/
bin/sdb_sql_ctl start myinst
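To confirm the instance actually came up, `sdb_sql_ctl` also provides a `status` subcommand; checking the listening port is a second, independent signal (the exact output format may vary by SequoiaSQL version):

```
bin/sdb_sql_ctl status myinst
ss -lnt | grep 3306        # the instance should be listening on port 3306
```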
2) Create a metauser user and grant it global privileges; it will be used to create the SparkSQL metadata database metastore, which stores the metadata;
mysql -h127.0.0.1 -uroot
create user 'metauser'@'%' identified by 'metauser';
GRANT ALL ON *.* TO 'metauser'@'%';
flush privileges;
CREATE DATABASE metastore CHARACTER SET 'latin1' COLLATE 'latin1_bin';
Note:
Keep the MySQL instance root user's password empty
3) In the MySQL instance, create database company and table employee ( empno INT, ename VARCHAR(128), age INT ), insert the data below, and then query to confirm the data exists;
create database company;
use company;
create table employee (empno int,ename varchar(128),age int);
insert into employee values(10001,'Georgi',48);
select * from employee;
-- expected row: (empno:10001, ename:'Georgi', age:48)
4) Create a PostgreSQL instance named pginst, listening on port 5432, and connect it to SequoiaDB;
cd /opt/sequoiasql/postgresql
bin/sdb_sql_ctl addinst pginst -D database/5432
bin/sdb_sql_ctl start pginst
bin/sdb_sql_ctl createdb company pginst
5) In the PostgreSQL instance, create database company and a foreign table employee mapped to the partitioned table created in the MySQL instance, then query to confirm the data exists;
psql -p 5432 company
create extension sdb_fdw;
CREATE SERVER sdb_server FOREIGN DATA WRAPPER sdb_fdw
OPTIONS (address '127.0.0.1', service '11810', preferedinstance 'A', transaction 'on');
CREATE FOREIGN TABLE employee
(
empno INTEGER,
ename TEXT,
age INTEGER
) SERVER sdb_server
OPTIONS (collectionspace 'company', collection 'employee', decimal 'on');
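To complete this step's verification, a query through the foreign table should return the row written via MySQL in step 3, since both instances map onto the same SequoiaDB collection company.employee:

```sql
SELECT * FROM employee;
-- expected: one row (10001, 'Georgi', 48), assuming step 3 was completed
```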
6) Install and deploy the SparkSQL instance, and configure SparkSQL to keep its metadata in the MySQL instance; after installation the SPARK home directory must be /home/sdbadmin/spark-2.4.4-bin-hadoop2.7
Note: the SparkSQL installation package is in /home/sdbadmin/soft
cd /home/sdbadmin/soft
tar -zxvf spark* -C /home/sdbadmin
7) In the SparkSQL instance, create database company and table employee corresponding to the database company and table employee created in the MySQL instance, so that the data is shared;
Note:
Use the spark-sql client to operate on SparkSQL
ssh-keygen -t rsa
ssh-copy-id sdbadmin@sdbserver1
cd /home/sdbadmin/spark-2.4.4-bin-hadoop2.7/conf
cp spark-env.sh.template spark-env.sh
echo "SPARK_MASTER_HOST=sdbserver1" >> spark-env.sh
cat > /home/sdbadmin/spark-2.4.4-bin-hadoop2.7/conf/hive-site.xml << EOF
<configuration>
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?useSSL=false</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>metauser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>metauser</value>
  </property>
  <property>
    <name>datanucleus.autoCreateSchema</name>
    <value>true</value>
    <description>creates necessary schema on a startup if one doesn't exist. set this to false, after creating it once</description>
  </property>
</configuration>
EOF
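One subtlety of the `cat > file << EOF` pattern used above: with an unquoted delimiter the shell expands `$variables` inside the here-document. The hive-site.xml content contains no `$` characters, so the unquoted form is safe here, but the delimiter must be quoted (`<< 'EOF'`) whenever the body has to be written literally. A minimal illustration (the /tmp paths are only for this demo):

```shell
# Unquoted delimiter: $USER is expanded before the file is written.
cat > /tmp/heredoc-unquoted.txt << EOF
user is $USER
EOF

# Quoted delimiter: the body is written literally, $USER and all.
cat > /tmp/heredoc-quoted.txt << 'EOF'
user is $USER
EOF

cat /tmp/heredoc-unquoted.txt   # e.g. "user is sdbadmin"
cat /tmp/heredoc-quoted.txt     # always the literal text "user is $USER"
```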
cp /opt/sequoiadb/spark/spark-sequoiadb_2.11-3.4.jar /home/sdbadmin/spark-2.4.4-bin-hadoop2.7/jars/
cp /opt/sequoiadb/java/sequoiadb-driver-3.4.jar /home/sdbadmin/spark-2.4.4-bin-hadoop2.7/jars/
cp /home/sdbadmin/soft/mysql-jdbc.jar /home/sdbadmin/spark-2.4.4-bin-hadoop2.7/jars/
cp log4j.properties.template log4j.properties
sed -i 's/log4j.rootCategory=INFO, console/log4j.rootCategory=ERROR, console/g' log4j.properties
cd /home/sdbadmin/spark-2.4.4-bin-hadoop2.7
sbin/start-all.sh
jps
bin/spark-sql
CREATE DATABASE company;
USE company;
CREATE TABLE company.employee
(
empno INT,
ename STRING,
age INT
) USING com.sequoiadb.spark OPTIONS (host 'localhost:11810', collectionspace 'company', collection 'employee', username '', password '');
SELECT AVG(age) FROM company.employee;
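This shared storage is exactly the point made at the top: all three SQL instances are front ends over the same SequoiaDB collection company.employee, so a write through any one of them is visible from the others. A sketch of a cross-instance check (the second row here is hypothetical, not part of the exercise data):

```sql
-- In the PostgreSQL instance (psql -p 5432 company):
INSERT INTO employee VALUES (10002, 'Bezalel', 30);

-- In the MySQL instance (mysql -h127.0.0.1 -uroot company):
SELECT * FROM employee;                 -- both rows should appear

-- Back in spark-sql:
SELECT COUNT(*) FROM company.employee;
```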