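  • Configuring and Using the OpenTelemetry Collector

    Configuring and Using the Collector

    Collector configuration

    The Collector processes the data enabled in the service section through pipelines. A pipeline is built from the components that access telemetry data:

    • Receivers
    • Processors
    • Exporters

    In addition, extensions can add functionality to the Collector. Extensions do not need direct access to telemetry data and are not part of a pipeline, but they are likewise enabled in the service section.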

    Receivers

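    A receiver defines how data gets into the OpenTelemetry Collector. At least one receiver must be configured; none is configured by default.

    Basic examples of the available receivers are given below; see the receiver documentation for more configuration options.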

    receivers:
      opencensus:
        address: "localhost:55678"
    
      zipkin:
        address: "localhost:9411"
    
      jaeger:
        protocols:
          grpc:
          thrift_http:
          thrift_tchannel:
          thrift_compact:
          thrift_binary:
    
      prometheus:
        config:
          scrape_configs:
            - job_name: "caching_cluster"
              scrape_interval: 5s
              static_configs:
                - targets: ["localhost:8889"]
    

    Processors

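    Processors run on data between reception and export. Processors are optional, although some are recommended.

    Basic examples of the available processors are given below; see the processor documentation for more.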

    processors:
      attributes/example:
        actions:
          - key: db.statement
            action: delete
      batch:
        timeout: 5s
        send_batch_size: 1024
      probabilistic_sampler:
        disabled: true
      span:
        name:
          from_attributes: ["db.svc", "operation"]
          separator: "::"
      queued_retry: {}
      tail_sampling:
        policies:
          - name: policy1
            type: rate_limiting
            rate_limiting:
              spans_per_second: 100
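
    The batch entry above combines a size limit (send_batch_size) with a timeout. The Go sketch below illustrates that idea only; it is not the Collector's batch processor. The batcher type and its methods are hypothetical names, and the timeout is modelled as an explicit flush call made by the caller.

```go
package main

import "fmt"

// batcher is a toy model of size-based batching (hypothetical type,
// not part of the Collector).
type batcher struct {
	size    int
	pending []string
}

// add queues one item. It returns a full batch once size items are
// pending, and nil otherwise.
func (b *batcher) add(item string) []string {
	b.pending = append(b.pending, item)
	if len(b.pending) >= b.size {
		return b.flush()
	}
	return nil
}

// flush returns whatever is pending and resets the queue. A real batcher
// would also call this when its timeout fires.
func (b *batcher) flush() []string {
	out := b.pending
	b.pending = nil
	return out
}

func main() {
	b := &batcher{size: 3}
	for _, s := range []string{"a", "b", "c", "d"} {
		if batch := b.add(s); batch != nil {
			fmt.Println("size flush:", batch)
		}
	}
	fmt.Println("timeout flush:", b.flush())
}
```

    With a size of 3 and four items, the first three leave as one batch and the fourth waits for the timeout-style flush.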
    

    Exporters

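    An exporter specifies how data is sent to one or more backends or destinations. At least one exporter must be configured; none is configured by default.

    Basic examples of the available exporters are given below; see the exporter documentation for more.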

    exporters:
      opencensus:
        headers: {"X-test-header": "test-header"}
        compression: "gzip"
        cert_pem_file: "server-ca-public.pem" # optional to enable TLS
        endpoint: "localhost:55678"
        reconnection_delay: 2s
    
      logging:
        loglevel: debug
    
      jaeger_grpc:
        endpoint: "http://localhost:14250"
    
      jaeger_thrift_http:
        headers: {"X-test-header": "test-header"}
        timeout: 5
        endpoint: "http://localhost:14268/api/traces"
    
      zipkin:
        endpoint: "http://localhost:9411/api/v2/spans"
    
      prometheus:
        endpoint: "localhost:8889"
        namespace: "default"
    

    Service

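    The service section configures which features the OpenTelemetry Collector enables, based on the configuration in the receivers, processors, exporters, and extensions sections. It has two parts:

    • extensions
    • pipelines

    extensions lists the enabled extensions, for example: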

        service:
          extensions: [health_check, pprof, zpages]
    

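    Pipelines come in two types:

    • metrics: collects and processes metrics data
    • traces: collects and processes trace data

    A pipeline is a set of receivers, processors, and exporters. Each receiver, processor, and exporter must be configured outside the service section and then included in a pipeline.

    Note: each receiver, processor, or exporter can be used in more than one pipeline. When several pipelines reference the same processor, each pipeline gets its own instance of that processor. This differs from receivers and exporters, for which all pipelines share a single instance.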
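
    The sharing rule in the note can be made concrete with a small counting model. The Go sketch below is an illustration of the rule as stated here, not Collector code; the pipeline type and the instanceCounts function are hypothetical names.

```go
package main

import "fmt"

// pipeline lists the component names one pipeline references
// (hypothetical type).
type pipeline struct {
	Receivers, Processors, Exporters []string
}

// instanceCounts applies the sharing rule from the note: receivers and
// exporters are instantiated once per name, processors once per pipeline
// that references them.
func instanceCounts(pipelines map[string]pipeline) (receivers, processors, exporters int) {
	seenR, seenE := map[string]bool{}, map[string]bool{}
	for _, p := range pipelines {
		for _, r := range p.Receivers {
			seenR[r] = true
		}
		for _, e := range p.Exporters {
			seenE[e] = true
		}
		processors += len(p.Processors)
	}
	return len(seenR), processors, len(seenE)
}

func main() {
	// Both pipelines reference the same receiver, processor, and exporter.
	pipelines := map[string]pipeline{
		"metrics": {Receivers: []string{"opencensus"}, Processors: []string{"batch"}, Exporters: []string{"opencensus"}},
		"traces":  {Receivers: []string{"opencensus"}, Processors: []string{"batch"}, Exporters: []string{"opencensus"}},
	}
	r, p, e := instanceCounts(pipelines)
	fmt.Printf("receivers=%d processors=%d exporters=%d\n", r, p, e)
}
```

    For the two pipelines in main, the model reports one receiver instance, two processor instances, and one exporter instance.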

    An example pipeline configuration follows; see the pipeline documentation for more.

    service:
      pipelines:
        metrics:
          receivers: [opencensus, prometheus]
          exporters: [opencensus, prometheus]
        traces:
          receivers: [opencensus, jaeger]
          processors: [batch, queued_retry]
          exporters: [opencensus, zipkin]
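
    Because every component a pipeline names has to be defined in the top-level sections first, a configuration can be checked for dangling references before it is deployed. The Go sketch below shows one way to do that under stated assumptions: the config and pipeline types and the undefinedRefs, set, and exampleConfig functions are hypothetical, and the configuration is assumed to be already parsed into plain Go values. It is not the Collector's own validation.

```go
package main

import (
	"fmt"
	"sort"
)

// pipeline mirrors one entry under service.pipelines (hypothetical type).
type pipeline struct {
	Receivers, Processors, Exporters []string
}

// config holds the names defined in the top-level sections plus the
// pipelines that reference them (hypothetical type).
type config struct {
	Receivers, Processors, Exporters map[string]bool
	Pipelines                        map[string]pipeline
}

// undefinedRefs returns, sorted, every "pipeline/kind/name" reference that
// points at a component missing from the top-level sections.
func undefinedRefs(c config) []string {
	var missing []string
	check := func(p, kind string, names []string, defined map[string]bool) {
		for _, n := range names {
			if !defined[n] {
				missing = append(missing, p+"/"+kind+"/"+n)
			}
		}
	}
	for name, p := range c.Pipelines {
		check(name, "receivers", p.Receivers, c.Receivers)
		check(name, "processors", p.Processors, c.Processors)
		check(name, "exporters", p.Exporters, c.Exporters)
	}
	sort.Strings(missing)
	return missing
}

// set builds a name set from its arguments.
func set(names ...string) map[string]bool {
	m := map[string]bool{}
	for _, n := range names {
		m[n] = true
	}
	return m
}

// exampleConfig reproduces the component names used in this article.
func exampleConfig() config {
	return config{
		Receivers:  set("opencensus", "zipkin", "jaeger", "prometheus"),
		Processors: set("batch", "queued_retry"),
		Exporters:  set("opencensus", "zipkin", "prometheus"),
		Pipelines: map[string]pipeline{
			"metrics": {Receivers: []string{"opencensus", "prometheus"}, Exporters: []string{"opencensus", "prometheus"}},
			"traces":  {Receivers: []string{"opencensus", "jaeger"}, Processors: []string{"batch", "queued_retry"}, Exporters: []string{"opencensus", "zipkin"}},
		},
	}
}

func main() {
	fmt.Println("undefined references:", undefinedRefs(exampleConfig()))
}
```

    Removing batch from the processors set, for example, makes the check report traces/processors/batch.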
    

    Extensions

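    Extensions can be used, for example, to monitor the health of the OpenTelemetry Collector. Extensions are optional; none is configured by default.

    Basic examples of the available extensions are given below; see the extension documentation for more.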

    extensions:
      health_check: {}
      pprof: {}
      zpages: {}
    

    Using environment variables

    Environment variables can be used in the Collector configuration, for example:

    processors:
      attributes/example:
        actions:
          - key: "$DB_KEY"
            action: "$OPERATION"
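
    To see what such a substitution does to the text of a configuration, the Go sketch below expands $VAR references with the standard library's os.Expand. That the Collector performs its expansion in exactly this way is an assumption; expandConfig is a hypothetical helper written for this illustration.

```go
package main

import (
	"fmt"
	"os"
)

// expandConfig replaces $VAR and ${VAR} references in raw with the values
// returned by lookup. With os.Getenv as lookup, unset variables expand to
// the empty string.
func expandConfig(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

func main() {
	// Values chosen to match the attributes processor shown earlier.
	os.Setenv("DB_KEY", "db.statement")
	os.Setenv("OPERATION", "delete")

	raw := "- key: \"$DB_KEY\"\n  action: \"$OPERATION\"\n"
	fmt.Print(expandConfig(raw, os.Getenv))
}
```

    Since an unset variable silently becomes an empty string in this model, it is worth checking that DB_KEY and OPERATION are set in the Collector's environment.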
    

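    Using the Collector

    The official demo is used below to try out the Collector.

    The example shows how to export trace and metric data from the OpenTelemetry-Go SDK and send it to the OpenTelemetry Collector, which then forwards the trace data to Jaeger and the metric data to Prometheus. The complete flow is: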

                                               -----> Jaeger (trace)
    App + SDK ---> OpenTelemetry Collector ---|
                                               -----> Prometheus (metrics)
    

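    Deploying to Kubernetes

    The k8s directory contains all the deployment files the demo needs. For convenience, the deployment commands are collected in a Makefile; when necessary, the commands in the Makefile can be run by hand.

    Deploying the Prometheus operator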

    git clone https://github.com/coreos/kube-prometheus.git
    cd kube-prometheus
    kubectl create -f manifests/setup
    
    # wait for namespaces and CRDs to become available, then
    kubectl create -f manifests/
    

    The environment can be cleaned up as follows:

    kubectl delete --ignore-not-found=true -f manifests/ -f manifests/setup
    

    Wait for all Prometheus components to reach the Running state:

    # kubectl get pod -n monitoring
    NAME                                   READY   STATUS    RESTARTS   AGE
    alertmanager-main-0                    2/2     Running   0          16m
    alertmanager-main-1                    2/2     Running   0          16m
    alertmanager-main-2                    2/2     Running   0          16m
    grafana-7f567cccfc-4pmhq               1/1     Running   0          16m
    kube-state-metrics-85cb9cfd7c-x6kq6    3/3     Running   0          16m
    node-exporter-c4svg                    2/2     Running   0          16m
    node-exporter-n6tnv                    2/2     Running   0          16m
    prometheus-adapter-557648f58c-vmzr8    1/1     Running   0          16m
    prometheus-k8s-0                       3/3     Running   0          16m
    prometheus-k8s-1                       3/3     Running   1          16m
    prometheus-operator-5b469f4f66-qx2jc   2/2     Running   0          16m
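
    Checking a listing like the one above by eye gets tedious when it has to be repeated. The Go sketch below parses the text output of kubectl get pod and reports pods that are not yet ready. The notReady function is a hypothetical helper, and the parsing assumes the default column layout shown above (NAME, READY, STATUS, ...).

```go
package main

import (
	"fmt"
	"strings"
)

// notReady returns the names of pods whose STATUS is not Running or whose
// READY column (for example "2/3") shows fewer ready containers than expected.
func notReady(kubectlOutput string) []string {
	var pending []string
	for _, line := range strings.Split(strings.TrimSpace(kubectlOutput), "\n") {
		f := strings.Fields(line)
		if len(f) < 3 || f[0] == "NAME" {
			continue // skip blank lines and the header row
		}
		ready := strings.SplitN(f[1], "/", 2)
		if f[2] != "Running" || len(ready) != 2 || ready[0] != ready[1] {
			pending = append(pending, f[0])
		}
	}
	return pending
}

func main() {
	// Two rows copied from the listing above.
	out := `NAME                                   READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    2/2     Running   0          16m
prometheus-k8s-0                       3/3     Running   0          16m`
	fmt.Println("pods not ready:", notReady(out))
}
```

    An empty result means every listed pod is Running with all of its containers ready.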
    

    Using the Makefile

    Next, use the Makefile to deploy Jaeger, the Prometheus monitor, and the Collector by running the following commands in order:

    # Create the namespace
    make namespace-k8s
    
    # Deploy Jaeger operator
    make jaeger-operator-k8s
    
    # After the operator is deployed, create the Jaeger instance
    make jaeger-k8s
    
    # Then the Prometheus instance. Ensure you have enabled a Prometheus operator
    # before executing (see above).
    make prometheus-k8s
    
    # Finally, deploy the OpenTelemetry Collector
    make otel-collector-k8s
    

    Wait for the Jaeger and Collector pods in the observability namespace to reach the Running state:

    # kubectl get pod -n observability
    NAME                              READY   STATUS    RESTARTS   AGE
    jaeger-7b868df4d6-w4tk8           1/1     Running   0          97s
    jaeger-operator-9b4b7bb48-q6k59   1/1     Running   0          110s
    otel-collector-7cfdcb7658-ttc8j   1/1     Running   0          14s
    

    The make clean-k8s command cleans up the environment, but it does not remove the namespace, which has to be deleted by hand:

    kubectl delete namespaces observability
    

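    Configuring the OpenTelemetry Collector

    With the steps above complete, all the required resources are deployed. Next, look at the Collector's configuration file.

    For the application to send data to the OpenTelemetry Collector, an otlp receiver must be configured first; it communicates over gRPC: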

    ...
      otel-collector-config: |
        receivers:
          # Make sure to add the otlp receiver.
          # This will open up the receiver on port 55680.
          otlp:
            endpoint: 0.0.0.0:55680
        processors:
    ...
    

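    This configuration creates the receiver on the Collector side and opens port 55680 to receive traces. The rest of the configuration is fairly standard; the only thing to note is that the Jaeger and Prometheus exporters have to be created: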

    ...
        exporters:
          jaeger_grpc:
            endpoint: "jaeger-collector.observability.svc.cluster.local:14250"
    
          prometheus:
            endpoint: 0.0.0.0:8889
            namespace: "testapp"
    ...
    
    OpenTelemetry Collector service

    Another point worth noting in the configuration is the NodePort used to reach the OpenTelemetry Collector:

    apiVersion: v1
    kind: Service
    metadata:
            ...
    spec:
      ports:
      - name: otlp # Default endpoint for otlp receiver.
        port: 55680
        protocol: TCP
        targetPort: 55680
        nodePort: 30080
      - name: metrics # Endpoint for metrics from our app.
        port: 8889
        protocol: TCP
        targetPort: 8889
      selector:
        component: otel-collector
      type: NodePort
    

    This service binds port 55680, which is used to reach the otlp receiver, to port 30080 on the cluster node, so the Collector can be reached at the static address <node-ip>:30080.
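
    Before running the application it can help to confirm that the NodePort actually accepts connections. The Go sketch below makes a plain TCP connection attempt using only the standard library. The reachable function is a hypothetical helper, and 192.0.2.10 is a placeholder from the documentation address range that has to be replaced with a real node IP. A successful connect only shows that the port is open; it does not prove that an otlp receiver is answering.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// reachable reports whether a TCP connection to addr succeeds within
// timeoutMs milliseconds.
func reachable(addr string, timeoutMs int) bool {
	conn, err := net.DialTimeout("tcp", addr, time.Duration(timeoutMs)*time.Millisecond)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	// Placeholder node IP (assumption); 30080 is the nodePort from the Service above.
	addr := net.JoinHostPort("192.0.2.10", "30080")
	fmt.Println(addr, "reachable:", reachable(addr, 500))
}
```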

    Running the code

    The complete example code is in main.go. Running it requires Go 1.13 or later.

    # go run main.go
    2020/10/20 09:19:17 Waiting for connection...
    2020/10/20 09:19:17 Doing really hard work (1 / 10)
    2020/10/20 09:19:18 Doing really hard work (2 / 10)
    2020/10/20 09:19:19 Doing really hard work (3 / 10)
    2020/10/20 09:19:20 Doing really hard work (4 / 10)
    2020/10/20 09:19:21 Doing really hard work (5 / 10)
    2020/10/20 09:19:22 Doing really hard work (6 / 10)
    2020/10/20 09:19:23 Doing really hard work (7 / 10)
    2020/10/20 09:19:24 Doing really hard work (8 / 10)
    2020/10/20 09:19:25 Doing really hard work (9 / 10)
    2020/10/20 09:19:26 Doing really hard work (10 / 10)
    2020/10/20 09:19:27 Done!
    2020/10/20 09:19:27 exporter stopped
    

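    The example simulates a running application that does work for 10 seconds and then exits.

    Viewing the collected data

    [Figure: data flow when running go run main.go]

    Jaeger UI

    [Figure: the trace as queried in Jaeger]

    Prometheus

    Once main.go has finished, the metric can be viewed in Prometheus. The corresponding Prometheus target is observability/otel-collector/0.

    [Figure: the metric as queried in Prometheus]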
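
    The same data can also be inspected as text, since the Collector's Prometheus exporter serves a scrape endpoint on port 8889. The Go sketch below filters such a text exposition for samples that carry the exporter's namespace. It assumes that the namespace "testapp" appears as a testapp_ prefix on metric names; samplesWithPrefix is a hypothetical helper, and the metric names in main are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// samplesWithPrefix returns the sample lines of a Prometheus text exposition
// whose metric name starts with prefix. Comment lines (# HELP, # TYPE) and
// blank lines are skipped.
func samplesWithPrefix(exposition, prefix string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if strings.HasPrefix(line, prefix) {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	// Made-up exposition text; the metric names are hypothetical.
	body := "# HELP testapp_requests_total Example counter.\n" +
		"# TYPE testapp_requests_total counter\n" +
		"testapp_requests_total 10\n" +
		"go_goroutines 12\n"
	fmt.Println(samplesWithPrefix(body, "testapp_"))
}
```

    In practice the input would be the body fetched from the Collector's /metrics endpoint rather than a string literal.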

    FAQ:

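    • After running the deployment commands, Prometheus may not register a target such as http://10.244.1.33:8889/metrics. Check the logs of the Prometheus pod; a likely cause is that Prometheus lacks the required role permissions. Changing the Prometheus ClusterRole to the following fixes it: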

      kind: ClusterRole
      apiVersion: rbac.authorization.k8s.io/v1
      metadata:
        name: prometheus-k8s
        namespace: monitoring
      rules:
      - apiGroups: [""]
        resources: ["services","pods","endpoints","nodes/metrics"]
        verbs: ["get", "watch", "list"]
      - apiGroups: ["extensions"]
        resources: ["ingresses"]
        verbs: ["get", "watch", "list"]
      - nonResourceURLs: ["/metrics"]
        verbs: ["get", "watch", "list"]
      
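    • Running "go run main.go" may fail with rpc error: code = Internal desc = grpc: error unmarshalling request: unexpected EOF. This is usually caused by the client and the server using different versions of the proto definitions. The client (main.go) uses the proto files under go.opentelemetry.io/otel/exporters/otlp/internal/opentelemetry-proto-gen, while the Collector uses those under go.opentelemetry.io/collector/internal/data/opentelemetry-proto-gen; compare the files in the two directories. If they differ, generate proto files for main.go that match the Collector version, or switch the Collector image instead (check which otel/opentelemetry-collector image version is in use).

      The Collector's proto definitions come from the opentelemetry-proto git repository. Clone that repository, check out the matching version, and run make gen-go to generate the files. The Collector's proto directory carries notes on the proto version in use, along with the following maturity information: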

      Component                   Maturity
      Binary Protobuf Encoding
        collector/metrics/*       Alpha
        collector/trace/*         Stable
        common/*                  Stable
        metrics/*                 Alpha
        resource/*                Stable
        trace/trace.proto         Stable
        trace/trace_config.proto  Alpha
      JSON encoding
        All messages              Alpha
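
      Comparing the two proto-gen directories by hand is error-prone, so a small tool can help. The Go sketch below hashes every file under two directory trees and lists the relative paths that differ or exist on one side only. hashTree and diffTrees are hypothetical helpers; the code assumes Go 1.16 or later (for filepath.WalkDir and os.ReadFile) and takes the two directory paths as command-line arguments. A byte-level comparison is a coarse check, since generated files may differ for reasons other than the proto version.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
)

// hashTree maps each file under root (by relative path) to its SHA-256 digest.
func hashTree(root string) (map[string][32]byte, error) {
	sums := map[string][32]byte{}
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return err // propagate walk errors; descend into directories
		}
		data, err := os.ReadFile(p)
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		sums[rel] = sha256.Sum256(data)
		return nil
	})
	return sums, err
}

// diffTrees returns, sorted, the relative paths whose contents differ between
// a and b or that exist in only one of the two trees.
func diffTrees(a, b string) ([]string, error) {
	sa, err := hashTree(a)
	if err != nil {
		return nil, err
	}
	sb, err := hashTree(b)
	if err != nil {
		return nil, err
	}
	var diff []string
	for rel, sum := range sa {
		if other, ok := sb[rel]; !ok || other != sum {
			diff = append(diff, rel)
		}
	}
	for rel := range sb {
		if _, ok := sa[rel]; !ok {
			diff = append(diff, rel)
		}
	}
	sort.Strings(diff)
	return diff, nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: protodiff <client-proto-dir> <collector-proto-dir>")
		return
	}
	diff, err := diffTrees(os.Args[1], os.Args[2])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("differing files:", diff)
}
```

      An empty list suggests the two trees are byte-identical; any listed path is a candidate for the mismatch behind the unmarshalling error.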
  • Original article: https://www.cnblogs.com/charlieroro/p/13883602.html