Gocollector Configuration Explained
Author: fupeng.li
date: 2020-05-13
Keywords: gocollector, td-agent
Requirements document
Notes on the original EC2 installation
Ansible commands for installing td-agent
Ansible playbook
- name: Delegate to DEB Package(Debian/Ubuntu)
  include: use-deb.yml
  when: ansible_pkg_mgr == "apt"
- name: Delegate to RPM Package(Centos)
  include: use-rpm.yml
  when: ansible_pkg_mgr == "yum"
- name: Install fluent-plugin-*
  shell: /usr/sbin/td-agent-gem install {{ item }}
  with_items:
    - fluent-plugin-s3
  ignore_errors: yes
- name: Deploy fluent config file
  template:
    src: "{{ ENV }}_{{ AREA }}_reaper_agent.conf.j2"
    dest: /etc/td-agent/td-agent.conf
  notify:
    - reload td-agent
- name: Keep fluent running
  service:
    name: td-agent
    state: started
use-deb.yml
---
# A non-zero return code from dpkg means td-agent is not installed yet.
- name: Test if fluentd is installed
  shell: /usr/bin/dpkg -L td-agent
  register: installed
  ignore_errors: true
- name: Install fluent for Ubuntu[xenial]
  shell: curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent3.sh | sh
  when: installed.rc != 0 and ansible_distribution_release == "xenial"
- name: Install fluent for Ubuntu[bionic]
  shell: curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-bionic-td-agent3.sh | sh
  when: installed.rc != 0 and ansible_distribution_release == "bionic"
- name: Install fluent for Ubuntu[trusty]
  shell: curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent3.sh | sh
  when: installed.rc != 0 and ansible_distribution_release == "trusty"
- name: Install fluent for Debian[stretch]
  shell: curl -L https://toolbelt.treasuredata.com/sh/install-debian-stretch-td-agent3.sh | sh
  when: installed.rc != 0 and ansible_distribution_release == "stretch"
- name: Install fluent for Debian[jessie]
  shell: curl -L https://toolbelt.treasuredata.com/sh/install-debian-jessie-td-agent3.sh | sh
  when: installed.rc != 0 and ansible_distribution_release == "jessie"
- name: Make sure fluent runs as root (User)
  replace:
    path: /lib/systemd/system/td-agent.service
    regexp: '^User=.*$'
    replace: 'User=root'
- name: Make sure fluent runs as root (Group)
  replace:
    path: /lib/systemd/system/td-agent.service
    regexp: '^Group=.*$'
    replace: 'Group=root'
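td-agent.conf configuration file. This setup collects logs by tailing files; the current Fargate setup sends logs directly instead.
Plugins used
monitor_agent <monitoring>
tail <file reading>
s3 <upload>
forward <forwarding>
How each plugin is used
monitor_agent
Set up monitoring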
<source>
  @type monitor_agent
  bind 0.0.0.0
  # curl http://127.0.0.1:24220/api/plugins.json shows the Fluentd status.
  # Telegraf uses this endpoint to monitor Fluentd.
  port 24220
</source>
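Tail a file
<source>
  @type tail
  format none
  # Log file that Fluentd should ship.
  path /path/app.log
  # Records the inode and the current read offset (in hexadecimal) of the tailed file.
  # Do not delete it casually: Fluentd would lose its read position.
  pos_file /path/app.log.pos
  # Fluentd's internal tag for this kind of log; other sections use it to select these events.
  tag app.stdout
</source>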
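To make the pos_file warning concrete, here is a minimal Python sketch that decodes one pos_file line. It assumes each line has the form path, offset, inode, separated by tabs, with the last two written as hexadecimal; the names PosEntry and parse_pos_line are hypothetical helpers for illustration, not part of Fluentd.

```python
from typing import NamedTuple, Optional


class PosEntry(NamedTuple):
    path: str    # file being tailed
    offset: int  # number of bytes already read from that file
    inode: int   # inode of the file the offset refers to


def parse_pos_line(line: str) -> Optional[PosEntry]:
    """Decode one pos_file line; return None for blank or malformed lines.

    Assumed layout (not taken from the original text):
    "<path>\t<offset in hex>\t<inode in hex>".
    """
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 3:
        return None
    path, offset_hex, inode_hex = parts
    try:
        return PosEntry(path, int(offset_hex, 16), int(inode_hex, 16))
    except ValueError:
        # One of the numeric fields was not valid hexadecimal.
        return None
```

If the file is deleted, the stored offset is gone, so on restart Fluentd can no longer tell which part of the log it has already shipped.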
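Upload to S3
# The match pattern is a tag set by a source (see the tag parameter of the tail source);
# it tells Fluentd which kind of log this block handles.
<match morpheus.stdout>
  @type s3
  s3_region {{ iuacfront_region }}
  aws_key_id {{ iuacfront_aws_access_key_id }}
  aws_sec_key {{ iuacfront_aws_secret_access_key }}
  s3_bucket bytepower
  # Path prefix inside the S3 bucket.
  path logs/morpheus/stdout/
  format single_value
  # Full object key of the S3 file. %{index} prevents file-name collisions, and the host
  # name identifies the machine, so that several machines writing at the same time do not
  # overwrite each other. inventory_hostname in double braces is Ansible template syntax;
  # the S3 plugin's own %{hostname} placeholder can be used directly instead.
  s3_object_key_format %{path}%Y/%m/%d/%H/%M-%S-{{ inventory_hostname }}-%{index}.%{file_extension}
  store_as gzip_command
</match>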
Ruby has GIL (Global Interpreter Lock), which allows only one thread to execute at a time. While I/O tasks can be multiplexed, CPU-intensive tasks will block other jobs. One of the CPU-intensive tasks in Fluentd is compression. The new version of S3/Treasure Data plugin allows compression outside of the Fluentd process, using gzip. This frees up the Ruby interpreter while allowing Fluentd to process other tasks.
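As an illustration of the s3_object_key_format line in the S3 block, the following Python sketch approximates how such a format string expands into an object key. It is not the plugin's code: render_object_key is a hypothetical helper, the host name web-01 stands in for the value Ansible would render, the index is assumed to start at 0, the extension is assumed to be gz for gzip_command, and `when` stands in for the time Fluentd associates with the chunk.

```python
from datetime import datetime, timezone


def render_object_key(fmt: str, when: datetime, path: str, index: int,
                      file_extension: str = "gz") -> str:
    """Approximate the expansion of an s3_object_key_format string.

    The %{...} placeholders are substituted first so that strftime only
    sees ordinary directives such as %Y or %M. A path containing a
    literal % would confuse strftime; this sketch does not handle that.
    """
    expanded = (fmt.replace("%{path}", path)
                   .replace("%{index}", str(index))
                   .replace("%{file_extension}", file_extension))
    return when.strftime(expanded)


# Format string as it might look after Ansible has rendered the host name
# (web-01 is a made-up example value).
fmt = "%{path}%Y/%m/%d/%H/%M-%S-web-01-%{index}.%{file_extension}"
when = datetime(2020, 5, 13, 8, 30, 15, tzinfo=timezone.utc)
print(render_object_key(fmt, when, "logs/morpheus/stdout/", 0))
```

Two machines uploading in the same second differ in the host-name part of the key, and two uploads from one machine in the same second differ in the index, which is why both appear in the format.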
Receive logs from other machines
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
Full configuration
<source>
  @type monitor_agent
  bind 0.0.0.0
  port 24220
</source>
<source>
  @type tail
  format none
  path {{ APP_LOG_DIR }}/*/event.log
  pos_file {{ APP_LOG_DIR }}/event.log.pos
  tag *
</source>
<match **>
  @type s3
  s3_region {{ prod_cn_region }}
  aws_key_id {{ prod_cn_aws_access_key_id }}
  aws_sec_key {{ prod_cn_aws_secret_access_key }}
  s3_bucket {{ prod_cn_bucket }}
  path {{ s3_directory }}/${tag[3]}/rawdata/
  s3_object_key_format %{path}%Y/%m/%d/%H/%M/%S_%Y%m%d%H%M%S_{{ inventory_hostname }}-%{index}.%{file_extension}
  format single_value
  store_as gzip_command
  <buffer tag,time>
    @type file
    path /mnt/log/td-agent/s3
    flush_thread_count 8
    queued_chunks_limit_size 10
    chunk_limit_size 512m
    timekey 60
    timekey_wait 10
    timekey_use_utc true
  </buffer>
</match>
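The timekey and timekey_wait values in the buffer section determine how long an event waits before its chunk can be uploaded. The Python sketch below works through that arithmetic under the assumption that a chunk covers one timekey-sized window aligned to the epoch and becomes eligible for flushing timekey_wait seconds after the window closes; chunk_window_and_flush is a hypothetical helper for illustration.

```python
from datetime import datetime, timedelta, timezone


def chunk_window_and_flush(event_time: datetime, timekey: int = 60,
                           timekey_wait: int = 10):
    """Return (window_start, window_end, earliest_flush) in UTC.

    event_time should be a timezone-aware datetime. The defaults mirror
    the timekey 60 / timekey_wait 10 settings in the configuration above.
    """
    epoch = int(event_time.timestamp())
    # Align the event to the start of its timekey window.
    start = epoch - epoch % timekey
    window_start = datetime.fromtimestamp(start, tz=timezone.utc)
    window_end = window_start + timedelta(seconds=timekey)
    # The chunk is held a little longer to catch late-arriving events.
    earliest_flush = window_end + timedelta(seconds=timekey_wait)
    return window_start, window_end, earliest_flush
```

With these settings, an event logged at 12:00:42 UTC falls into the 12:00:00 to 12:01:00 window, and its chunk can be flushed from 12:01:10 onward, so logs reach S3 roughly 10 to 70 seconds after they are written, plus upload time.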
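Performance tuning
- Use the compression provided by the S3 plugin: set store_as gzip_command instead of compress gzip. The latter adds to the load on the Fluentd process.
- Set flush_thread_count so that Fluentd uploads files concurrently with multiple threads.
- Use multiple processes.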
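How to monitor Fluentd
Fluentd's monitor_agent exposes an HTTP interface that returns the current status of Fluentd (including buffer_total_queued_size, buffer_queue_length, retry_count, and so on).
Telegraf provides a plugin for monitoring Fluentd that can read the Fluentd status every 10 s.
The corresponding plugin has to be configured in Telegraf; the details of that plugin configuration were covered in the earlier monitoring session.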
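The monitor_agent HTTP interface can also be queried by hand, which is useful when checking a single host. The Python sketch below assumes the endpoint is /api/plugins.json on port 24220 (the port configured earlier) and that each output plugin entry carries buffer_queue_length, buffer_total_queued_size and retry_count; fetch_status and buffer_metrics are hypothetical helper names.

```python
import json
from urllib.request import urlopen

# Assumption: monitor_agent listens on the port configured above and
# serves plugin metrics as JSON at this path.
STATUS_URL = "http://127.0.0.1:24220/api/plugins.json"


def fetch_status(url: str = STATUS_URL, timeout: float = 5.0) -> dict:
    """Download and decode the monitor_agent status document."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def buffer_metrics(status: dict) -> list:
    """Keep only output plugins and the buffer fields worth alerting on."""
    rows = []
    for plugin in status.get("plugins", []):
        if not plugin.get("output_plugin"):
            continue  # input plugins have no buffer to report
        rows.append({
            "type": plugin.get("type"),
            "buffer_queue_length": plugin.get("buffer_queue_length"),
            "buffer_total_queued_size": plugin.get("buffer_total_queued_size"),
            "retry_count": plugin.get("retry_count"),
        })
    return rows
```

On a host running td-agent, buffer_metrics(fetch_status()) returns one row per output plugin. A queue length or retry count that keeps growing usually means uploads are not keeping up with incoming logs.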