2019-05-07 xiaTianCsDN

Environment:

CentOS 6.5
kafka_2.11-2.2.0

Prerequisite: ZooKeeper must already be installed. Its installation is not covered here; for a detailed walkthrough see: https://blog.csdn.net/xiaTianCsDN/article/details/89917836

Download the Kafka binary package:

http://kafka.apache.org/downloads

After downloading, upload it via FTP to the target directory on the Linux server, e.g. /usr/local/kafka_2.11-2.2.0.tgz

The installation steps are essentially the same on every cluster node, so a single node is used as the example below.

Extract the package:

cd /usr/local
tar -zxvf kafka_2.11-2.2.0.tgz

For convenience, rename the directory:

mv kafka_2.11-2.2.0 kafka

Go into the Kafka installation directory, create a log directory, and edit the Kafka configuration file:

cd /usr/local/kafka
mkdir logs
cd /usr/local/kafka/config/
vim server.properties

The key settings are listed below (all of them must be configured; a small script for stamping out per-node copies follows the list):
broker.id=0 //must be unique per broker; set it to 0, 1, 2 respectively on the three cluster nodes
listeners=PLAINTEXT://192.168.1.151:9092 //bind the IP address of the local node on each server
log.dirs=/usr/local/kafka/logs //directory where Kafka stores its log (message) data
num.partitions=1 //the default is fine
zookeeper.connect=node01:2181,node02:2181,node03:2181 //ZooKeeper cluster connection string
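Since only broker.id and listeners differ between nodes, the per-node files can be generated from the copy edited above. The following is a minimal sketch, assuming node IPs 192.168.1.151-153 (only .151 appears in this article), root SSH access between nodes, and that broker.id and listeners already appear uncommented in the template:

#!/bin/bash
# Sketch: generate and distribute per-node server.properties files.
IPS=(192.168.1.151 192.168.1.152 192.168.1.153)
for i in 0 1 2; do
  cp /usr/local/kafka/config/server.properties /tmp/server-$i.properties
  sed -i "s/^broker.id=.*/broker.id=$i/" /tmp/server-$i.properties
  sed -i "s#^listeners=.*#listeners=PLAINTEXT://${IPS[$i]}:9092#" /tmp/server-$i.properties
  # copy the generated file to the corresponding node
  scp /tmp/server-$i.properties root@${IPS[$i]}:/usr/local/kafka/config/server.properties
done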

Start the Kafka service (note: ZooKeeper must be started before Kafka):

/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties

To start it in the background:

nohup /usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties 1>/dev/null 2>&1 &
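Alternatively, kafka-server-start.sh accepts a -daemon flag that detaches the broker for you, which is slightly cleaner than nohup:

/usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties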

Check whether the service started successfully:
1. On each node, check whether the Kafka port is listening (default port: 9092):

netstat -ntlp

2. Check with jps:

jps

To make sure the port can be reached from other machines, either open the port in the firewall or disable the firewall; here the firewall is simply disabled:

service iptables stop
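If you prefer not to disable the firewall entirely, opening just the Kafka port is the safer alternative. A sketch for iptables on CentOS 6 (adjust the port if you changed listeners):

iptables -I INPUT -p tcp --dport 9092 -j ACCEPT
service iptables save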

That completes the installation on one node; repeat the same steps on the remaining nodes.

2017-08-17 gavinguo1987

 

I. Environment preparation

Kafka depends on ZooKeeper, so make sure the ZooKeeper ensemble is already up and running before installing Kafka.

Operating system:

CentOS-7-x86_64-1611

Node IPs and ports:

192.168.2.200:9092
192.168.2.201:9092
192.168.2.202:9092

 

II. Kafka installation

1. Download Kafka

Kafka website: http://kafka.apache.org/

cd /usr/local
wget https://archive.apache.org/dist/kafka/0.11.0.0/kafka_2.11-0.11.0.0.tgz

(Note: the /dyn/closer.cgi link on the download page is a mirror-chooser HTML page, not the tarball itself, so download from a concrete mirror or from the Apache archive as above.)

2. Extract

cd /usr/local
tar -zxvf kafka_2.11-0.11.0.0.tgz
ls

 

III. Kafka cluster configuration

1. Edit the server.properties configuration file

cd /usr/local/kafka_2.11-0.11.0.0/config/
vi server.properties

The full server.properties after modification:

############################# Server Basics #############################

# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0

# Switch to enable topic deletion or not, default value is false
delete.topic.enable=true

############################# Socket Server Settings #############################

# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
#   FORMAT:
#     listeners = listener_name://host_name:port
#   EXAMPLE:
#     listeners = PLAINTEXT://your.host.name:9092
listeners=PLAINTEXT://192.168.2.200:9092

# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092

# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL

# The number of threads that the server uses for receiving requests from the network and sending responses to the network
num.network.threads=3

# The number of threads that the server uses for processing requests, which may include disk I/O
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=1048576

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=1048576

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600

############################# Log Basics #############################

# A comma separated list of directories under which to store log files
log.dirs=/usr/local/kafka_2.11-0.11.0.0/kafka-logs

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1

# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1

############################# Internal Topic Settings #############################

# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 (such as 3) is recommended to ensure availability.
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1

############################# Log Flush Policy #############################

# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
#    1. Durability: Unflushed data may be lost if you are not using replication.
#    2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
#    3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.

# The number of messages to accept before forcing a flush of data to disk
log.flush.interval.messages=20000

# The maximum amount of time a message can sit in a log before we force a flush
log.flush.interval.ms=10000

############################# Log Retention Policy #############################

# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.

# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=168

# A size-based retention policy for logs. Segments are pruned from the log as long as the remaining
# segments don't drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=192.168.2.200:2181,192.168.2.201:2181,192.168.2.202:2181,192.168.2.203:2181,192.168.2.204:2181

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000

############################# Group Coordinator Settings #############################

# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0

On each node in the cluster, broker.id must be different, similar to ZooKeeper's myid.

 

2. Start the Kafka cluster (run on every broker; start ZooKeeper first)

cd /usr/local/kafka_2.11-0.11.0.0/
nohup bin/kafka-server-start.sh config/server.properties &
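A quick way to confirm that all three brokers joined the cluster is to check their registrations in ZooKeeper. A minimal sketch using the zookeeper-shell script shipped with Kafka (any of the ZooKeeper addresses will do):

bin/zookeeper-shell.sh 192.168.2.200:2181 ls /brokers/ids
# a healthy three-broker cluster should list all three ids, e.g. [0, 1, 2]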

 

3. Create a topic

bin/kafka-topics.sh --create --zookeeper 192.168.2.200:2181,192.168.2.201:2181,192.168.2.202:2181,192.168.2.203:2181,192.168.2.204:2181 --replication-factor 3 --partitions 1 --topic CZ-ICBC-TOPIC

 

4. Describe the topic

bin/kafka-topics.sh --describe --zookeeper 192.168.2.200:2181,192.168.2.201:2181,192.168.2.202:2181,192.168.2.203:2181,192.168.2.204:2181 --topic CZ-ICBC-TOPIC
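For reference, the describe command prints one summary line plus one line per partition. The exact broker ids depend on which broker happened to become the partition leader, so the values below are only illustrative:

Topic:CZ-ICBC-TOPIC   PartitionCount:1   ReplicationFactor:3   Configs:
    Topic: CZ-ICBC-TOPIC   Partition: 0   Leader: 1   Replicas: 1,2,0   Isr: 1,2,0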

 

 

5. Produce messages with the console producer

bin/kafka-console-producer.sh --broker-list 192.168.2.200:9092,192.168.2.201:9092,192.168.2.202:9092 --topic CZ-ICBC-TOPIC

this is a message

 

6. Consume messages with the console consumer

bin/kafka-console-consumer.sh --bootstrap-server 192.168.2.200:9092,192.168.2.201:9092,192.168.2.202:9092 --from-beginning --topic CZ-ICBC-TOPIC
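To see the consumer group that the console consumer created and its offsets, kafka-consumer-groups.sh can list and describe groups. A minimal sketch (the auto-generated console-consumer group name below is illustrative; use whatever the --list command prints):

bin/kafka-consumer-groups.sh --bootstrap-server 192.168.2.200:9092 --list
bin/kafka-consumer-groups.sh --bootstrap-server 192.168.2.200:9092 --describe --group console-consumer-12345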

 

 

2018-04-10 dejunyang

1. Prerequisites
1.1 ZooKeeper is already installed; for ZooKeeper cluster setup see: https://blog.csdn.net/dejunyang/article/details/79874491
1.2 JDK 1.8 or later is already installed; for JDK installation see: https://blog.csdn.net/dejunyang/article/details/79826172
1.3 The Kafka version installed here is 2.11-1.1.0
1.4 Three servers are used, with the following IP addresses:
192.168.174.129
192.168.174.130
192.168.174.131
2. Download
http://kafka.apache.org/downloads
3. Extract
tar zxvf kafka_2.11-1.1.0.tgz
4. Configuration (edit config/server.properties on each server; a scripted version of these edits is sketched after 4.3)
4.1 First server, 192.168.174.129
Settings to modify:
broker.id=0 (the default; each broker needs a unique id)
log.dirs, the directory where messages are persisted, for example:
log.dirs=/home/ydj/AppData/kafka-logs
zookeeper.connect, the ZooKeeper connection string, for example:
zookeeper.connect=192.168.174.129:2181,192.168.174.130:2181,192.168.174.131:2181
Settings to add (below log.retention.hours):
message.max.bytes=5848576
default.replication.factor=2
replica.fetch.max.bytes=5848576
(Note: the broker property names are message.max.bytes and replica.fetch.max.bytes; the variants message.max.byte and replication.fetch.max.bytes are not recognized by Kafka.)
4.2 Second server, 192.168.174.130: identical to the first server except broker.id=1
4.3 Third server, 192.168.174.131: identical to the first server except broker.id=2
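If you prefer to script the per-node edits, something like the following sketch can be run on each server after extracting Kafka. The install path /home/ydj/kafka_2.11-1.1.0 and the BROKER_ID value are assumptions; set them per node:

#!/bin/bash
# Assumed per-node values; adjust BROKER_ID (0/1/2) and the install path on each server.
BROKER_ID=0
KAFKA_HOME=/home/ydj/kafka_2.11-1.1.0
CONF=$KAFKA_HOME/config/server.properties

sed -i "s/^broker.id=.*/broker.id=$BROKER_ID/" "$CONF"
sed -i "s#^log.dirs=.*#log.dirs=/home/ydj/AppData/kafka-logs#" "$CONF"
sed -i "s/^zookeeper.connect=.*/zookeeper.connect=192.168.174.129:2181,192.168.174.130:2181,192.168.174.131:2181/" "$CONF"

# Append the extra settings (appending at the end of the file works just as well
# as placing them below log.retention.hours).
cat >> "$CONF" <<'EOF'
message.max.bytes=5848576
default.replication.factor=2
replica.fetch.max.bytes=5848576
EOF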
5. Start the Kafka service on each of the three servers
5.1 Start ZooKeeper first, before starting Kafka
5.2 Start Kafka by running the following in the bin directory (the config file path must be passed as an argument):
./kafka-server-start.sh -daemon ../config/server.properties
2019-01-11 winterking3

1 Prepare three machines

hostname   ip
server1    192.168.16.151
server2    192.168.16.152
server3    192.168.16.153

1.1 Hosts configuration (/etc/hosts)

# server1 as an example
127.0.0.1       server1
192.168.16.151  server1
192.168.16.152  server2
192.168.16.153  server3

(If other brokers have trouble connecting, consider removing the "127.0.0.1 server1" entry so the hostname resolves to the LAN address rather than loopback.)

2 ZooKeeper configuration

See the previous post on installing a ZooKeeper cluster.

3 Install Kafka

3.1 Set environment variables

After extracting Kafka, add the environment variables:

# edit the environment variables
sudo vim /etc/profile
export JAVA_HOME=/usr/lib/java/jdk1.8.0_181
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

export SCALA_HOME=/usr/lib/scala/scala-2.11.8
export PATH=${SCALA_HOME}/bin:$PATH

export ZOOKEEPER_HOME=/usr/local/programs/zookeeper-3.4.13
export PATH=${ZOOKEEPER_HOME}/bin:$PATH

export KAFKA_HOME=/usr/local/programs/kafka_2.11-2.1.0
export PATH=${KAFKA_HOME}/bin:$PATH
# make the environment variables take effect
source /etc/profile

3.2 Edit the server.properties file in the $KAFKA_HOME/config directory

Edit it on each of the three machines; server1 is shown as the example (see the listeners sketch after this block for the newer way to bind the address):

# this 1 corresponds to the 1 in ZooKeeper's myid file; use 2 and 3 on the other machines
broker.id=1
# the addresses of the three ZooKeeper nodes, separated by commas
zookeeper.connect=server1:2181,server2:2181,server3:2181
# host.name is the address of the current machine
host.name=192.168.16.151
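host.name is a legacy setting that is deprecated in Kafka 2.x; on kafka_2.11-2.1.0 the preferred equivalent is the listeners property. A minimal sketch for server1 (each machine should use its own address):

listeners=PLAINTEXT://192.168.16.151:9092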

3.4 Start ZooKeeper and Kafka

Start them on each of the three machines:

# start zookeeper
zkServer.sh start

# start kafka
kafka-server-start.sh -daemon /usr/local/programs/kafka_2.11-2.1.0/config/server.properties

# check the processes
jps

3.5 Test

# create a topic (can be run from any machine)
kafka-topics.sh --create --zookeeper server1:2181,server2:2181,server3:2181 --replication-factor 3 --partitions 3 --topic WinterTop

# list topics
kafka-topics.sh --list --zookeeper server1:2181,server2:2181,server3:2181

# produce messages on one machine
kafka-console-producer.sh --broker-list server1:9092,server2:9092,server3:9092 --topic WinterTop

# consume messages on another machine
kafka-console-consumer.sh --bootstrap-server server1:9092,server2:9092,server3:9092 --from-beginning --topic WinterTop
2016-06-06 lmaosheng

1. Download kafka_2.10-0.9.0.1.tgz

Download page: http://kafka.apache.org/downloads.html

2. Unpack

# tar -zvxf kafka_2.10-0.9.0.1.tgz

3. Change into the directory

# cd kafka_2.10-0.9.0.1

4. Configure

# cd config

# vi server.properties

【server1】

broker.id=1
port=9092
host.name=server1
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dir=./kafka1-logs
num.partitions=10
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181

【server2】

broker.id=2
port=9092
host.name=server2
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dir=./kafka1-logs
num.partitions=10
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181

【server3】

broker.id=3
port=9092
host.name=server3
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dir=./kafka1-logs
num.partitions=10
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181


5) Start the Kafka service (run on server1, server2, and server3)

# cd bin

# ./kafka-server-start.sh ../config/server.properties &

6) Create a topic

# ./kafka-topics.sh --create --zookeeper zk1:2181,zk2:2181,zk3:2181 --replication-factor 3 --partitions 1 --topic test

7) List topics

# ./kafka-topics.sh --list --zookeeper zk1:2181,zk2:2181,zk3:2181

8) Describe the topic

# ./kafka-topics.sh --describe --zookeeper zk1:2181,zk2:2181,zk3:2181 --topic test

9) Produce messages with the console producer

#./kafka-console-producer.sh --broker-list server1:9092,server2:9092,server3:9092 --topic test

10) Consume messages with the console consumer

#./kafka-console-consumer.sh --zookeeper zk1:2181,zk2:2181,zk3:2181 --topic test --from-beginning
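The consumer command above uses the old ZooKeeper-based consumer; Kafka 0.9 also ships the new consumer, which connects to the brokers directly. A minimal sketch of the equivalent command, assuming the 0.9/0.10 flag names (on current Kafka versions --bootstrap-server alone is used and --new-consumer is gone):

#./kafka-console-consumer.sh --new-consumer --bootstrap-server server1:9092,server2:9092,server3:9092 --topic test --from-beginning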
