CDH 6.3.2 Installation and Deployment

Official site

https://www.cloudera.com/products/open-source/apache-hadoop/key-cdh-components.html

Hardware and OS requirements

https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_os_requirements.html

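1 Download CM and CDH

CM download URL: https://archive.cloudera.com/cm6/6.2.1/repo-as-tarball/

CDH download URL: https://archive.cloudera.com/cdh6/6.2.1/parcels/

Note that these URLs point at 6.2.1, while the steps below use the CM 6.3.1 and CDH 6.3.2 packages.

# Since January 31, 2021, all Cloudera software requires a valid subscription and is only accessible behind a paywall.
# Workaround: use the mirror below instead of the official Cloudera repository
http://ro-bucharest-repo.bigstepcloud.com/cloudera-repos/

The packages can also be downloaded from:

https://pan.baidu.com/s/125OQ01FsYyTGKPW0e43Cnw?pwd=boom (extraction code: boom)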

2 CDH 6.2.1 component versions

Component          Version
Apache Avro        1.8.2
Apache Flume       1.9.0
Apache Hadoop      3.0.0
Apache HBase       2.1.2
HBase Indexer      1.5
Apache Hive        2.1.1
Hue                4.3.0
Apache Impala      3.2.0
Apache Kafka       2.1.0
Kite SDK           1.0.0
Apache Kudu        1.9.0
Apache Solr        7.4.0
Apache Oozie       5.1.0
Apache Parquet     1.9.0
Parquet-format     2.3.1
Apache Pig         0.17.0
Apache Sentry      2.1.0
Apache Spark       2.4.0
Apache Sqoop       1.4.7
Apache ZooKeeper   3.4.5

3 Setup steps

Installation guide: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/installation.html

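1.1 Hardware check

# Show the operating system version
cat /etc/redhat-release
# Show memory
cat /proc/meminfo
# Count logical CPUs (cores/threads)
cat /proc/cpuinfo | grep "processor" | wc -l
# Count physical CPUs (sockets)
cat /proc/cpuinfo | grep "physical id" | sort | uniq | wc -l
# Show disk capacity
df -h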
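
When the same check has to be repeated on several nodes, the two CPU counts can be wrapped in one helper. The following is a minimal sketch; the function name `cpu_summary` and its optional file argument are illustrative assumptions, not part of the original steps.

```shell
# Summarize logical processors and physical sockets from a cpuinfo-style file.
# Usage: cpu_summary [FILE]   (FILE defaults to /proc/cpuinfo)
# cpu_summary is a hypothetical helper name used only for this sketch.
cpu_summary() {
    file="${1:-/proc/cpuinfo}"
    # Each logical processor contributes one "processor : N" line.
    logical=$(grep -c '^processor' "$file")
    # Each socket reports a distinct "physical id" value.
    sockets=$(grep '^physical id' "$file" | sort -u | wc -l | tr -d ' ')
    echo "logical=$logical sockets=$sockets"
}
```

Calling `cpu_summary` with no argument reads /proc/cpuinfo on the current node.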

1.2 Base environment setup

1.2.1 Configure the IP address (local environment)

# vim /etc/sysconfig/network-scripts/ifcfg-ens33
IPADDR="192.168.219.150"
BOOTPROTO="static"
NETMASK="255.255.255.0"
GATEWAY="192.168.219.2"

1.2.2 Set the hostname

Note: the hostname must be lowercase; otherwise Kerberos will report errors later.

hostnamectl set-hostname <hostname>

Add host mappings

vim /etc/hosts
10.0.6.112 cdh-112
10.0.6.113 cdh-113
10.0.6.114 cdh-114
10.0.6.115 cdh-115
10.0.6.116 cdh-116
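
Since the same mappings go into /etc/hosts on every node and the hostnames must stay lowercase, both rules can be combined in a small helper. This is a sketch under assumptions: `add_host_entry` is a hypothetical name, and the file path is passed in so the function can be tried on a scratch file first.

```shell
# Append "IP HOSTNAME" to a hosts file unless the hostname is already mapped.
# Rejects hostnames that contain uppercase letters.
# add_host_entry is a hypothetical helper name used only for this sketch.
add_host_entry() {
    hosts_file="$1"; ip="$2"; name="$3"
    case "$name" in
        *[[:upper:]]*) echo "hostname must be lowercase: $name" >&2; return 1 ;;
    esac
    # -w matches whole words only, so cdh-11 does not match cdh-112;
    # -F treats the hostname as a literal string, not a regex.
    if grep -qwF -- "$name" "$hosts_file" 2>/dev/null; then
        return 0
    fi
    echo "$ip $name" >> "$hosts_file"
}

# Example (as root):
# add_host_entry /etc/hosts 10.0.6.112 cdh-112
```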

1.2.3 Install the JDK

Run on all nodes

rpm -qa|grep jdk
java-1.8.0-openjdk-1.8.0.242.b08-1.el7.x86_64
java-1.7.0-openjdk-headless-1.7.0.251-2.6.21.1.el7.x86_64
java-1.8.0-openjdk-headless-1.8.0.242.b08-1.el7.x86_64
java-1.7.0-openjdk-1.7.0.251-2.6.21.1.el7.x86_64
copy-jdk-configs-3.3-10.el7_5.noarch

# Remove the packages listed above
rpm -e --nodeps java-1.8.0-openjdk-1.8.0.242.b08-1.el7.x86_64
rpm -e --nodeps java-1.7.0-openjdk-headless-1.7.0.251-2.6.21.1.el7.x86_64
rpm -e --nodeps java-1.8.0-openjdk-headless-1.8.0.242.b08-1.el7.x86_64
rpm -e --nodeps java-1.7.0-openjdk-1.7.0.251-2.6.21.1.el7.x86_64
rpm -e --nodeps copy-jdk-configs-3.3-10.el7_5.noarch

# Install
rpm -ivh oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm

# Default install location
/usr/java/jdk1.8.0_181-cloudera

# Configure environment variables: append the following to the end of /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin

# Apply the environment variables
source /etc/profile

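1.2.4 Disable the firewall

# Check the firewall status
systemctl status firewalld.service
# Stop the firewall
systemctl stop firewalld.service
# Prevent the firewall from starting at boot
systemctl disable firewalld.service

1.2.5 Disable SELinux

# vim /etc/selinux/config and change the SELINUX line to:
SELINUX=disabled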
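
The edit can also be scripted instead of done in vim. The sketch below assumes GNU sed (as shipped with CentOS 7); `disable_selinux_config` is a hypothetical helper name. The config file change only takes effect after a reboot, while `setenforce 0` switches the running system to permissive mode in the meantime.

```shell
# Set SELINUX=disabled in a SELinux config file, whatever the previous value.
# Usage: disable_selinux_config [FILE]   (FILE defaults to /etc/selinux/config)
# disable_selinux_config is a hypothetical helper name; requires GNU sed for -i.
disable_selinux_config() {
    file="${1:-/etc/selinux/config}"
    # Only the uncommented SELINUX= line is rewritten; SELINUXTYPE= is untouched.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$file"
}

# Example (as root):
# disable_selinux_config && setenforce 0
```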

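1.2.6 Install the time synchronization service

Install ntp on all nodes

yum install -y ntp

Edit the configuration file on the master node

vim /etc/ntp.conf
# Master node configuration (ntp.conf)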
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst

# Point to the time server
server ntp.aliyun.com

# Worker node configuration
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst

Note: if your company runs its own time server, point every node at it. If not, pick one node as the time server and point the other nodes at it.
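
The worker configuration above comments out the default pool servers but does not show the line that points at the chosen time server. A minimal sketch of that edit follows; `set_ntp_server` is a hypothetical helper name, GNU sed is assumed, and using cdh-112 as the internal time server is an assumption made for the example only.

```shell
# Comment out the active pool servers and point ntp.conf at one time server.
# Usage: set_ntp_server FILE SERVER
# set_ntp_server is a hypothetical helper name; requires GNU sed for -i.
set_ntp_server() {
    file="$1"; server="$2"
    # Disable every active "server ..." line.
    sed -i 's/^server /#server /' "$file"
    # Add the chosen time server as the only active one.
    echo "server $server iburst" >> "$file"
}

# Example on a worker node (cdh-112 as time server is an assumption):
# set_ntp_server /etc/ntp.conf cdh-112 && systemctl restart ntpd
```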

Enable at boot

systemctl enable ntpd

Sync the system time to the hardware clock

hwclock --systohc

1.2.7 Install MySQL

Pick one node

Remove the bundled MySQL

rpm -qa | grep mysql
rpm -e --nodeps mysql   # force removal

Download

wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm

Install with yum

rpm -ivh mysql-community-release-el7-5.noarch.rpm
yum install mysql-server -y

Start MySQL

systemctl start mysqld.service

Log in to the database; the password is empty

mysql -uroot -p

Set the password

update mysql.user set password=PASSWORD('123456') where User='root';

Enable remote access for root

grant all privileges on *.* to 'root'@'%' identified by '123456' with grant option;

Delete a user

delete from mysql.user where user='ambari';
drop user "ambari"@"%";

Flush privileges and exit

flush privileges; 
exit;

1.2.8 Configure the mysql-connector-java package

On every node

Create the directory

mkdir -p /usr/share/java
cd /usr/share/java

Download the driver

wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.46.tar.gz
tar zxvf mysql-connector-java-5.1.46.tar.gz
cd mysql-connector-java-5.1.46
cp mysql-connector-java-5.1.46-bin.jar /usr/share/java/mysql-connector-java.jar

1.2.9 Configure passwordless SSH login

Generate a key pair

ssh-keygen -t rsa

Copy the public key to the other nodes (including this one)

ssh-copy-id root@cdh-112
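
The copy has to be repeated for every node, so a loop over the hostnames from /etc/hosts saves typing. This is a sketch: `copy_key_to_nodes`, `NODES`, and the `DRY_RUN` switch are illustrative names, and logging in as root is assumed because the later scp commands use root.

```shell
# Hostnames taken from the /etc/hosts mapping earlier in this guide.
NODES="cdh-112 cdh-113 cdh-114 cdh-115 cdh-116"

# Run ssh-copy-id for every node; with DRY_RUN=1 only print the commands.
# copy_key_to_nodes and DRY_RUN are hypothetical names used for this sketch.
copy_key_to_nodes() {
    for node in $NODES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh-copy-id root@$node"
        else
            ssh-copy-id "root@$node"
        fi
    done
}

# Example:
# DRY_RUN=1; copy_key_to_nodes    # preview the commands
# DRY_RUN=0; copy_key_to_nodes    # copy the key, prompting for each password
```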

1.2.10 Install third-party dependencies

All nodes

yum install -y bind-utils psmisc cyrus-sasl-plain cyrus-sasl-gssapi fuse portmap fuse-libs /lib/lsb/init-functions httpd mod_ssl openssl-devel python-psycopg2 MySQL-python libxslt

1.2.11 Create the Cloudera Manager user

useradd --system --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm

1.3 CM installation and deployment

1.3.1 Create databases in MySQL

1) Create the databases required by each component

GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY '123456';
CREATE DATABASE hive DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
CREATE DATABASE hue DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

Alternatively, create them following the official recommendations:

Service                              Database    User
Cloudera Manager Server              scm         scm
Activity Monitor                     amon        amon
Reports Manager                      rman        rman
Hue                                  hue         hue
Hive Metastore Server                metastore   hive
Sentry Server                        sentry      sentry
Cloudera Navigator Audit Server      nav         nav
Cloudera Navigator Metadata Server   navms       navms
Oozie                                oozie       oozie
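
Writing the statements for all nine database/user pairs by hand is error-prone, so they can be generated instead. The sketch below reuses the statement forms already shown in this section; `gen_db_sql` is a hypothetical helper name, and the `GRANT ... IDENTIFIED BY` form assumes a MySQL 5.x server such as the one installed above.

```shell
# Emit CREATE DATABASE and GRANT statements for "database:user" pairs.
# Usage: gen_db_sql PASSWORD db:user [db:user ...]
# gen_db_sql is a hypothetical helper name used only for this sketch.
gen_db_sql() {
    password="$1"; shift
    for pair in "$@"; do
        db="${pair%%:*}"      # text before the colon
        user="${pair##*:}"    # text after the colon
        echo "CREATE DATABASE IF NOT EXISTS $db DEFAULT CHARSET utf8 COLLATE utf8_general_ci;"
        echo "GRANT ALL ON $db.* TO '$user'@'%' IDENTIFIED BY '$password';"
    done
}

# Example: pipe the generated SQL into the mysql client.
# gen_db_sql 123456 scm:scm amon:amon rman:rman hue:hue metastore:hive \
#     sentry:sentry nav:nav navms:navms oozie:oozie | mysql -uroot -p
```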

1.3.2 CM installation

(1) Cluster plan

Node       Services
cdh-112    cloudera-scm-server, cloudera-scm-agent
cdh-113    cloudera-scm-agent
cdh-114    cloudera-scm-agent

(2) Create the cloudera-manager directory to hold the CDH installation files

Create the directory

mkdir /opt/cloudera-manager

Extract

tar -zxvf cm6.3.1-redhat7.tar.gz

Copy the RPMs

cd cm6.3.1/RPMS/x86_64/
cp cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm /opt/cloudera-manager/
cp cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm /opt/cloudera-manager/
cp cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm /opt/cloudera-manager/
cd /opt/cloudera-manager/

ll

total 1185872
-rw-r--r-- 1 2001 2001 10483568 Sep 25 2019 cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 2001 2001 1203832464 Sep 25 2019 cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm
-rw-r--r-- 1 2001 2001 11488 Sep 25 2019 cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm

(3) Copy cloudera-manager to the other nodes

scp -r /opt/cloudera-manager/ root@cdh-113:/opt/
scp -r /opt/cloudera-manager/ root@cdh-114:/opt/

(4) Install cloudera-manager-daemons on every node; the installation creates the /opt/cloudera directory

rpm -ivh /opt/cloudera-manager/cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm

(5) Install cloudera-manager-agent on every node

rpm -ivh /opt/cloudera-manager/cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm

Alternatively, use yum -y install, for example: yum -y install cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm

(6) Point the agents at the server node

Specify the server's host

vim /etc/cloudera-scm-agent/config.ini
server_host=cdh-112
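
Because this edit is identical on every agent node, it can be applied with sed rather than by hand. A minimal sketch, assuming GNU sed and that config.ini already contains a `server_host=` line; `set_agent_server_host` is a hypothetical helper name.

```shell
# Rewrite the server_host= line of an agent config.ini.
# Usage: set_agent_server_host FILE HOST
# set_agent_server_host is a hypothetical helper name; requires GNU sed for -i.
set_agent_server_host() {
    file="$1"; host="$2"
    # Other keys such as server_port are left unchanged.
    sed -i "s/^server_host=.*/server_host=$host/" "$file"
}

# Example (as root, on each agent node):
# set_agent_server_host /etc/cloudera-scm-agent/config.ini cdh-112
```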

(7) Install cloudera-manager-server

Master node only

[root@cdh-112]# rpm -ivh /opt/cloudera-manager/cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm 

(8) Upload the CDH parcel to parcel-repo

mkdir -p /opt/cloudera/parcel-repo
cp CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel /opt/cloudera/parcel-repo
cp CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha1 /opt/cloudera/parcel-repo
cp manifest.json /opt/cloudera/parcel-repo

Rename the .sha1 file to .sha

cd /opt/cloudera/parcel-repo
mv CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha1 CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel.sha
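
A corrupted or truncated parcel typically only shows up later in the wizard, so it can be worth comparing the parcel's SHA-1 digest with the hash file before starting the server. The following is a sketch; `verify_parcel` is a hypothetical helper name, and it assumes the hash is the first field of the `.sha` file.

```shell
# Compare a parcel's SHA-1 digest with the hash stored in its .sha file.
# Usage: verify_parcel PARCEL_FILE   (expects PARCEL_FILE.sha next to it)
# verify_parcel is a hypothetical helper name used only for this sketch.
verify_parcel() {
    parcel="$1"
    # Assumption: the hash is the first whitespace-separated field of the file.
    expected=$(awk '{print $1; exit}' "$parcel.sha")
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    if [ "$expected" = "$actual" ]; then
        echo "OK"
    else
        echo "MISMATCH" >&2
        return 1
    fi
}

# Example:
# verify_parcel /opt/cloudera/parcel-repo/CDH-6.3.2-1.cdh6.3.2.p0.1605554-el7.parcel
```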

(9) Edit the server's db.properties

 vim /etc/cloudera-scm-server/db.properties 
com.cloudera.cmf.db.type=mysql
com.cloudera.cmf.db.host=cdh-112:3306
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm
com.cloudera.cmf.db.password=123456
com.cloudera.cmf.db.setupType=EXTERNAL

(10) Initialize the database

/opt/cloudera/cm/schema/scm_prepare_database.sh mysql -hlocalhost -uroot -p scm scm
[root@cdh-112 cloudera-manager]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -hlocalhost -uroot -p scm scm
Enter database password:
Enter SCM password:
JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.8.0_181-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly

# Official docs: https://docs.cloudera.com/documentation/enterprise/latest/topics/prepare_cm_database.html

(11) Start the server service

systemctl start cloudera-scm-server

Log location

tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log

Enable at boot

chkconfig cloudera-scm-server on

(12) Start the agents

systemctl start cloudera-scm-agent

Log location

/var/log/cloudera-scm-agent/cloudera-scm-agent.log

Enable at boot

chkconfig cloudera-scm-agent on

Initialization is slow after startup; follow the log with tail -f and wait a while before opening the web UI.
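
Instead of watching the log, the wait can be automated with a small retry loop. This is a generic sketch: `wait_for` is a hypothetical helper name, and probing the web UI with curl on port 7180 is just one possible readiness check.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# wait_for is a hypothetical helper name used only for this sketch.
wait_for() {
    attempts="$1"; delay="$2"; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # The probe's own output is discarded; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example: poll the Cloudera Manager web UI for up to 5 minutes.
# wait_for 60 5 curl -sf http://cdh-112:7180 && echo "CM is up"
```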

1.4 Log in to the installation wizard

1.4.1 Login URL

http://cdh-112:7180
# Username/password: admin/admin

1.4.2 Installation wizard

Just click Continue; this step takes a little while.

1.5 Cluster installation

1.5.1 Name the cluster

1.5.2 Select hosts

1.5.3 Select the repository

1.5.4 Run the installation

1.5.5 Inspect network and host performance

Run on all nodes

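# Disable swap
# Temporarily
swapoff -a

# Permanently
vim /etc/fstab
# comment out the swap line

Do both: swapoff disables swap for the running system, and the fstab change keeps it disabled after a reboot.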
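
The fstab edit can be scripted so it is applied the same way on every node. A minimal sketch, assuming GNU sed and a standard fstab layout in which swap entries carry `swap` as a whitespace-delimited field; `comment_out_swap` is a hypothetical helper name.

```shell
# Comment out every active swap entry in an fstab-style file.
# Usage: comment_out_swap [FILE]   (FILE defaults to /etc/fstab)
# comment_out_swap is a hypothetical helper name; requires GNU sed for -i.
comment_out_swap() {
    file="${1:-/etc/fstab}"
    # Match uncommented lines that contain "swap" as a whitespace-delimited
    # field, then prefix them with "#".
    sed -i '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$file"
}

# Example (as root):
# swapoff -a && comment_out_swap
```

Lines that are already commented are skipped, so running the helper twice leaves the file unchanged.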

# Disable transparent huge pages
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled

When done, click Run Again.

1.5.6 Select services

1.5.7 Configure the Hive connection

Use the hive, oozie, and hue databases from section 1.3.1 (create the oozie database the same way if it does not exist yet).

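4 Common problems

The Oozie web page is unavailable

Upload ext-2.2.zip from the installation package to /var/lib/oozie/ and unzip it there.

Hive Metastore fails to create the database

Copy the MySQL driver into Hive's lib directory:

cp /usr/share/java/mysql-connector-java.jar /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/hive/lib

Restart Hive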

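5 Reinstalling after a failure

  1. Drop the scm database

  2. Empty /var/lib/cloudera-scm-server and /var/log/cloudera-scm-agent

  3. Empty /opt/cloudera/parcel-cache and /opt/cloudera/parcels

  4. On the server node, delete the files under /opt/cloudera/parcel-repo that were not downloaded

  5. rm -rf /var/lib/cloudera-scm-agent/cm_guid

  6. Uninstall

rpm -e --nodeps cloudera-manager-daemons-6.3.1-1466458.el7.x86_64
rpm -e --nodeps cloudera-manager-agent-6.3.1-1466458.el7.x86_64
rpm -e --nodeps cloudera-manager-server-6.3.1-1466458.el7.x86_64

Reference: https://blog.csdn.net/wzy0623/article/details/102946646

The commands below wipe every remaining trace of the installation. Note that rm -rf /var/log/* and rm -rf /tmp/* also delete logs and temporary files unrelated to CDH.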

rm -rf /var/log/*
rm -rf /opt/cloudera*
rm -rf /etc/systemd/system/multi-user.target.wants/cloudera*
rm -rf /etc/default/cloudera*
rm -rf /etc/cloudera*
rm -rf /var/lib/cloudera*
rm -rf /var/log/cloudera*
rm -rf /usr/lib/systemd/system/cloudera*
rm -rf /run/cloudera*
rm -rf /sys/fs/cgroup/systemd/system.slice/cloudera*
rm -rf /etc/security/limits.d/cloudera*
rm -rf /var/lib/yum/repos/x86_64/7/cloudera*
rm -rf /var/cache/yum/x86_64/7/cloudera*
rm -rf /tmp/*

rm -rf /var/lib/hadoop-*
rm -rf /var/lib/impala
rm -rf /var/lib/solr
rm -rf /var/lib/zookeeper
rm -rf /var/lib/hue
rm -rf /var/lib/oozie
rm -rf /var/lib/pgsql
rm -rf /var/lib/sqoop2
rm -rf /data/dfs/
rm -rf /data/impala/
rm -rf /data/yarn/
rm -rf /dfs/
rm -rf /impala/
rm -rf /yarn/
rm -rf /var/run/hadoop-*/
rm -rf /var/run/hdfs-*/
rm -rf /usr/bin/hadoop*
rm -rf /usr/bin/zookeeper*
rm -rf /usr/bin/hbase*
rm -rf /usr/bin/hive*
rm -rf /usr/bin/hdfs
rm -rf /usr/bin/mapred
rm -rf /usr/bin/yarn
rm -rf /usr/bin/sqoop*
rm -rf /usr/bin/oozie
rm -rf /etc/hadoop*
rm -rf /etc/zookeeper*
rm -rf /etc/hive*
rm -rf /etc/hue
rm -rf /etc/impala
rm -rf /etc/sqoop*
rm -rf /etc/oozie
rm -rf /etc/hbase*
rm -rf /etc/hcatalog

rm -rf /var/lib/alternatives/impala-conf
rm -rf /var/lib/alternatives/impalad
rm -rf /var/lib/alternatives/impala-collect-diagnostics
rm -rf /var/lib/alternatives/impala-shell
rm -rf /var/lib/alternatives/impala-collect-minidumps

rm -rf /etc/alternatives/impala-shell
rm -rf /etc/alternatives/impalad
rm -rf /etc/alternatives/impala-collect-diagnostics
rm -rf /etc/alternatives/impala-conf
rm -rf /etc/alternatives/impala-collect-minidumps

rm -rf /var/log/impala*

rm -rf /var/lib/alternatives/zookeeper-client
rm -rf /var/lib/alternatives/zookeeper-server
rm -rf /var/lib/alternatives/zookeeper-conf
rm -rf /var/lib/alternatives/zookeeper-server-initialize
rm -rf /var/lib/alternatives/zookeeper-server-cleanup
rm -rf /var/lib/alternatives/zookeeper-security-migration

rm -rf /etc/alternatives/zookeeper-conf
rm -rf /etc/alternatives/zookeeper-server
rm -rf /etc/alternatives/zookeeper-server-cleanup
rm -rf /etc/alternatives/zookeeper-server-initialize
rm -rf /etc/alternatives/zookeeper-security-migration
rm -rf /etc/alternatives/zookeeper-client
rm -rf /var/log/zookeeper