filebeat

Common configuration examples:

Receive logs over UDP, decode the raw log as JSON, and print to the console:

processors:
- decode_json_fields:
    fields:
    - message

filebeat.inputs:
- type: udp
  host: :8000

output.console:
  pretty: true
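
To try it out, send a line over UDP and watch filebeat's stdout (a sketch; nc flags vary between netcat implementations):

echo '{"a":1}' | nc -u -w1 localhost 8000
# with no target set, decode_json_fields decodes the message string
# in place, so the console event shows message as an object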

Receive logs over UDP, drop some of the built-in fields, decode the raw JSON log into the top level, drop the raw message, and print to the console:

processors:

- drop_fields:
    fields:
    - input
    - agent
    - host
    - ecs
    - log

- decode_json_fields:
    target: ""
    fields:
    - message

- drop_fields:
    fields:
    - message


filebeat.inputs:
- type: udp
  host: :8000


output.console:
  pretty: true
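
Same test as above; with target: "" plus the two drop_fields, the console event should be reduced to roughly @timestamp plus the decoded top-level fields (a sketch; nc flags vary):

echo '{"user":"alice","msg":"hi"}' | nc -u -w1 localhost 8000
# expected console output, roughly:
# { "@timestamp": "...", "user": "alice", "msg": "hi" }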

Receive logs over TCP/UDP, drop some of the built-in fields, decode the raw JSON log into the top level, drop the raw message, and output to elasticsearch:

processors:

- drop_fields:
    fields:
    - input
    - agent
    - host
    - ecs
    - log

- decode_json_fields:
    target: ""
    fields:
    - message

- drop_fields:
    fields:
    - message


filebeat.inputs:
- type: tcp
  host: localhost:1111
  timeout: 864000  # seconds; ~10 days, so idle connections are not closed early
- type: tcp
  host: :2222
- type: udp
  host: :2222


output.elasticsearch:
  hosts: ${ES_HOSTS:es}
  index: "q-%{+yyyy.MM.dd}"  # added so events actually match the q-* template pattern below


setup:
  ilm.enabled: false
  template:
    name: q
    pattern: q-*
    overwrite: true
    fields: /dev/null
    append_fields:
    - name: "@timestamp"
      type: date
    settings.index:
      number_of_replicas: 0
      translog:
        durability: async
        sync_interval: 3s   

Look at elasticsearch.host/INDEX-NAME/_settings: if you don't want index.query.default_field to accumulate lots of entries, you can point the template at an empty fields file (fields: /dev/null above).
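
For example (host and index name are placeholders):

# inspect what the template wrote into the index settings
curl -s 'http://es:9200/q-2020.11.05/_settings?pretty'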

The index.translog settings trade data safety for write speed; the scenario is a single node with non-precious data, where index.number_of_replicas: 0 also saves storage.

Environment variables can appear in the config file; see https://www.elastic.co/guide/en/beats/packetbeat/current/using-environ-vars.html#_specify_complex_objects_in_environment_variables
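
E.g. the ${ES_HOSTS:es} reference above can be fed a list as comma-separated values (a sketch):

# expanded at config load time; comma-separated values become a list
ES_HOSTS='es1:9200,es2:9200' filebeat -e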

When the elasticsearch output uses a pattern like %{+yyyy.MM.dd}, the date is taken from the clock of the machine filebeat runs on (verify with %{+yyyy.MM.dd.HH}), so mounting the host's /etc/localtime into the container is recommended.
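
With Docker that amounts to something like (image name is a placeholder):

# share the host's timezone so %{+yyyy.MM.dd} rolls over at local midnight
docker run -d -v /etc/localtime:/etc/localtime:ro YOUR/OWN/filebeat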

Putting the configuration together:

processors:

- drop_fields:
    fields:
    - input
    - agent
    - host
    - ecs
    - log

- decode_json_fields:
    target: ""
    fields:
    - message

- drop_fields:
    fields:
    - message


filebeat.inputs:
- type: tcp
  host: :2222
  timeout: 864000
- type: udp
  host: :2222


output.elasticsearch:
  hosts: ${ES_HOSTS:es}


setup:
  ilm:
    rollover_alias: ${NAME:x}
  template:
    name: whatever
    pattern: "*"
    fields: /dev/null
    append_fields:
    - name: "@timestamp"
      type: date
    settings.index:
      number_of_replicas: ${REPLICAS:0}
      translog:
        durability: async
        sync_interval: 3s

Next, package a customized image with Docker:

FROM elastic/filebeat:7.9.3
EXPOSE 2222 2222/udp
ENV ES_HOSTS="" NAME="" REPLICAS=""
COPY filebeat.yml /usr/share/filebeat/
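
Build and run it roughly like this (names and addresses are placeholders):

docker build -t YOUR/OWN/filebeat .
docker run -d -p 2222:2222 -p 2222:2222/udp \
  -e ES_HOSTS=es:9200 -e NAME=q -e REPLICAS=0 \
  YOUR/OWN/filebeat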

A reference k8s deployment:

---
apiVersion: v1
kind: Pod
metadata:
  name: es
  labels:
    foo: es
spec:
  initContainers:
  - name: chown
    image: busybox
    args:
    - chown
    - 1000:1000
    - /data
    volumeMounts:
    - name: data-dir
      mountPath: /data
  containers:
  - name: kibana
    image: elastic/kibana:7.9.3
    env:
    - name: ELASTICSEARCH_HOSTS
      value: http://localhost:9200
  - name: elasticsearch
    image: elastic/elasticsearch:7.9.3
    env:
    - name: discovery.type
      value: single-node
    - name: bootstrap.memory_lock
      value: "true"
    - name: ES_JAVA_OPTS
      value: "-Xms256m -Xmx256m"
    volumeMounts:
    - name: data-dir
      mountPath: /usr/share/elasticsearch/data
  - name: filebeat
    image: YOUR/OWN/filebeat
    env:
    - name: NAME
      value: q
    - name: ES_HOSTS
      value: localhost
    - name: REPLICAS
      value: "0"
  volumes:
  - name: data-dir
    hostPath:
      path: /data/es
      type: DirectoryOrCreate
---
apiVersion: v1
kind: Service
metadata:
  name: es
spec:
  clusterIP: None
  selector:
    foo: es
  ports:
  - name: kibana
    port: 5601
  - name: elasticsearch
    port: 9200
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: es
spec:
  rules:
  - host: kibana.foo.bar
    http:
      paths:
      - backend:
          serviceName: es
          servicePort: 5601
  - host: elasticsearch.foo.bar
    http:
      paths:
      - backend:
          serviceName: es
          servicePort: 9200

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: filebeat-tcp-test
spec:
  replicas: 0
  selector:
    matchLabels:
      foo: test-filebeat
  template:
    metadata:
      labels:
        foo: test-filebeat
    spec:
      containers:
      - name: t
        image: busybox
        args:
        - sh
        - -c
        - sleep 3 && yes '{"a":0}' | nc localhost 2222
      - name: filebeat
        image: YOUR/OWN/filebeat
        env:
        - name: ES_HOSTS
          value: es
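
The test deployment starts at replicas: 0; scale it up to fire the test traffic, then check that the index appeared (hostname from the Ingress above):

kubectl scale deployment filebeat-tcp-test --replicas=1
curl -s 'http://elasticsearch.foo.bar/_cat/indices?v'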

Docker

processors:
- add_docker_metadata: ~
- drop_fields:
    fields:
    - input
    - agent
    - host
    - ecs
    - log
    - container.labels
    - container.id


filebeat.inputs:
- type: container
  paths:
  - '/var/lib/docker/containers/*/*.log'
  ignore_older: 10m
  json.ignore_decoding_error: true

output.elasticsearch:
  hosts: ${ES_HOSTS:es}
  index: "%{[container.name]}-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template:
  name: docker
  pattern: "*"
  fields: /dev/null
  append_fields:
  - name: "@timestamp"
    type: date
  settings.index:
    number_of_replicas: ${REPLICAS:0}
    translog:
      durability: async
      sync_interval: 3s
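
A quick check (assuming the default json-file logging driver and that filebeat can see the host's /var/lib/docker/containers; the ES address is a placeholder) — each container's stdout should end up in an index named after the container:

docker run --name hello --rm busybox echo '{"a":1}'
curl -s 'http://es:9200/_cat/indices/hello-*?v'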
  

http endpoint

Enable an HTTP service to receive logs (application/json); the data is placed under the json field by default (no way found yet to put it at the top level). The index can be customized: just give the JSON log a _ field.

processors:
- drop_fields:
    fields:
    - input
    - agent
    - host
    - ecs
    - log


filebeat.inputs:
- type: http_endpoint
  listen_address: 0.0.0.0
  listen_port: 80
  content_type: ""


output.elasticsearch:
  hosts: ${ES_HOSTS:es}
  indices:
  - index: "%{[json._]}-%{+yyyy.MM.dd}"
    when.has_fields:
    - json._
  - index: "%{+yyyy.MM.dd}"


setup.ilm.enabled: false
setup.template:
  name: http-endpoint
  pattern: "*"
  fields: /dev/null
  append_fields:
  - name: "@timestamp"
    type: date
  settings.index:
    number_of_replicas: ${REPLICAS:0}
    translog:
      durability: async
      sync_interval: 3s
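
A test request; the _ field routes the event to its own index per the indices rules above (a sketch):

curl -s -X POST 'http://localhost/' \
  -H 'Content-Type: application/json' \
  -d '{"_":"myapp","level":"info","msg":"hello"}'
# should land in myapp-<date>; without _, it falls back to the date-only index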

Dump the HTTP logs to a file, adding an extra ts field for the time:

processors:
- timestamp:
    field: '@timestamp'
    target_field: json.ts
    layouts:
    - '2006-01-02T15:04:05Z'

filebeat.inputs:
- type: http_endpoint
  listen_address: 0.0.0.0
  listen_port: 80
  content_type: ""

output.file:
  path: foo
  filename: bar
  codec.format.string: '%{[json]}'
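
Post a log and look at the file (path/filename as configured above):

curl -s -X POST 'http://localhost/' -H 'Content-Type: application/json' -d '{"a":1}'
cat foo/bar   # each line is the json object, with the extra ts field added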

Two filebeats working together: one receives logs over TCP and stores the raw data to files; the other watches the log files and ships them to ES. This way ES has the logs, and the log files are still available for other uses.

filebeat-1:

filebeat.inputs:
- type: tcp
  host: :1111

output.file:
  path: foo
  filename: bar
  rotate_every_kb: 1048576  # 2**20 k = 1G
  number_of_files: 256
  codec.format.string: '%{[message]}'
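
Quick check that the raw lines are written back verbatim (nc flags vary between implementations):

yes '{"a":0}' | head -3 | nc -w1 localhost 1111
cat foo/bar   # the %{[message]} codec keeps only the original line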

filebeat-2:

processors:
- drop_fields:
    fields:
    - input
    - agent
    - host
    - ecs
    - log

filebeat.inputs:
- type: log
  paths:
  - /xxx/foo/bar*
  json.ignore_decoding_error: true

output.elasticsearch:
  hosts: ${ES_HOSTS:es}
  index: "log-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template:
  name: whatever
  pattern: "*"
  fields: /dev/null
  append_fields:
  - name: "@timestamp"
    type: date
  settings.index:
    number_of_replicas: 0
    translog:
      durability: async
      sync_interval: 3s

filebeat-2, improved: decode the JSON explicitly first, then take the ts field as the real event time:

processors:
- drop_fields:
    fields:
    - input
    - agent
    - host
    - ecs
    - log

- decode_json_fields:
    target: ""
    fields:
    - message

- drop_fields:
    fields:
    - message

- timestamp:
    field: ts
    ignore_missing: true
    ignore_failure: true
    layouts:
    - UNIX

filebeat.inputs:
- type: log
  paths:
  - /xxx/foo/bar*

output.elasticsearch:
  hosts: ${ES_HOSTS:es}
  indices:
  - index: "%{[_]}-%{+yyyy.MM.dd}"
    when.has_fields:
    - _
  - index: "${NAME:default}-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template:
  name: whatever
  pattern: "*"
  fields: /dev/null
  append_fields:
  - name: "@timestamp"
    type: date
  settings.index:
    number_of_replicas: ${REPLICAS:0}
    translog:
      durability: async
      sync_interval: 3s
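
For example, a line like the following in /xxx/foo/bar* (paths as configured above):

echo '{"_":"myindex","ts":1600000000,"a":1}' >> /xxx/foo/bar1
# goes to index myindex-2020.09.13: the timestamp processor parses the
# UNIX ts into @timestamp, and %{+yyyy.MM.dd} formats that date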

In simple cases, the intermediate file can be dropped and the two merged into a single filebeat.