Kubernetes Monitor

The Kubernetes monitor collects logs from all containers and pods running on the same node. It is based on the docker_monitor plugin and uses that plugin's raw logs mode to send Kubernetes logs to Scalyr. It also reads labels from the Kubernetes API and associates them with the corresponding logs.

This document is a reference for the configuration options of the Kubernetes monitor and the metrics it collects. For instructions on how to configure the Kubernetes monitor, see Configuring Scalyr Agent (Kubernetes).

Configuration Reference

The following configuration options are available:

Option Usage
module Always scalyr_agent.builtin_monitors.kubernetes_monitor
log_mode Optional (defaults to docker_api). Determines which method is used to gather logs from the local containers. If docker_api, the agent uses the Docker API to contact the local containers and pull logs from them. If syslog, the agent expects the other containers to push logs to it using the syslog Docker log plugin.
container_name Optional (defaults to None). Defines a regular expression that matches the name given to the container running the scalyr-agent. If this is None, the Scalyr agent will look for a container running /usr/sbin/scalyr-agent-2 as the main process.
container_check_interval Optional (defaults to 5). How often (in seconds) to check if containers have been started or stopped.
api_socket Optional (defaults to /var/scalyr/docker.sock). Defines the Unix socket used to communicate with the Docker API. WARNING: if log_mode is set to syslog, you must also set the docker_api_socket configuration option in the syslog monitor to this same value. Note: You need to map the host's /var/run/docker.sock to the same value as specified here, using the -v parameter, e.g., docker run -v /var/run/docker.sock:/var/scalyr/docker.sock ...
docker_api_version Optional (defaults to 'auto'). The version of the Docker API to use. WARNING: if log_mode is set to syslog, you must also set the docker_api_version configuration option in the syslog monitor to this same value.
docker_percpu_metrics Optional (defaults to False). If True, the agent emits CPU usage metrics per core. Note: this is disabled by default because it can lead to an excessive amount of metric data on CPUs with a large number of cores.
container_globs Optional (defaults to None). If set, a list of glob patterns for container names. Only containers whose names match one of the glob patterns will be monitored.
report_container_metrics Optional (defaults to True). If true, metrics will be collected from the container and reported to Scalyr. Note: metrics are only collected from those containers whose logs are being collected.
report_k8s_metrics Optional (defaults to True). If true and report_container_metrics is true, metrics will be collected from k8s and reported to Scalyr.
k8s_ignore_namespaces Optional (defaults to "kube-system"). A comma-delimited list of the namespaces whose pods' logs should be ignored.
k8s_ignore_pod_sandboxes Optional (defaults to True). If True then all containers that have the label io.kubernetes.docker.type with value podsandbox will be excluded from log collection.
k8s_include_all_containers Optional (defaults to True). If True, all containers in all pods will be monitored by the Kubernetes monitor unless they have an include: false or exclude: true annotation. If False, only pods/containers with an include: true or exclude: false annotation will be monitored. See documentation on annotations for further detail.
k8s_cache_init_abort_delay Optional (defaults to 20). The number of seconds to wait for initialization of the Kubernetes cache before aborting the Kubernetes monitor.
k8s_parse_json DEPRECATED. Please use k8s_parse_format. If set to True, this flag overrides k8s_parse_format to auto. If set to False, this flag overrides k8s_parse_format to raw.
k8s_parse_format Optional (defaults to auto). Valid values are: auto, json, cri and raw. If auto, the monitor will try to detect the format of the raw log files, e.g., json or cri. Log files will be parsed in this format before uploading to the server to extract log and timestamp fields. If raw, the raw contents of the log will be uploaded to Scalyr without being parsed. (Note: An incorrect setting can cause parsing to fail which will result in raw logs being uploaded to Scalyr, so please leave this as auto if in doubt.)
k8s_always_use_cri Optional (defaults to False). If True, the Kubernetes monitor will always try to read logs using the Container Runtime Interface (CRI), even when the runtime is detected as Docker.
k8s_cri_query_filesystem Optional (defaults to False). If True, then when in CRI mode, the monitor will only query the filesystem for the list of active containers, rather than first querying the Kubelet API. This is a useful optimization when the Kubelet API is known to be disabled.
k8s_verify_api_queries Optional (defaults to True). If true, then the ssl connection for all queries to the k8s API will be verified using the ca.crt certificate found in the service account directory. If false, no verification will be performed. This is useful for older k8s clusters where certificate verification can fail.
gather_k8s_pod_info Optional (defaults to False). If true, then every gather_sample interval, metrics will be collected from the docker and k8s APIs showing all discovered containers and pods. This is mostly a debugging aid and there are performance implications to always leaving this enabled.
include_daemonsets_as_deployments DEPRECATED.
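
As an illustration of how the options above fit together, the following Python sketch builds one monitor entry and prints it as JSON. It assumes that the agent configuration holds monitor entries in a top-level "monitors" list; the helper name build_k8s_monitor_config and the example namespace "monitoring" are hypothetical, and the values shown are illustrative rather than recommended settings.

```python
import json

# Values accepted by k8s_parse_format, as listed in the table above.
VALID_PARSE_FORMATS = {"auto", "json", "cri", "raw"}


def build_k8s_monitor_config(ignore_namespaces=("kube-system",), parse_format="auto"):
    """Hypothetical helper: return one Kubernetes monitor entry as a dict."""
    if parse_format not in VALID_PARSE_FORMATS:
        raise ValueError(
            "k8s_parse_format must be one of %s" % sorted(VALID_PARSE_FORMATS)
        )
    return {
        # The module option always has this value.
        "module": "scalyr_agent.builtin_monitors.kubernetes_monitor",
        "container_check_interval": 5,
        # k8s_ignore_namespaces is a comma-delimited list of namespaces.
        "k8s_ignore_namespaces": ",".join(ignore_namespaces),
        "k8s_parse_format": parse_format,
        "report_container_metrics": True,
        "report_k8s_metrics": True,
    }


if __name__ == "__main__":
    # Assumption: monitor entries live in a top-level "monitors" list.
    config = {"monitors": [build_k8s_monitor_config(("kube-system", "monitoring"))]}
    print(json.dumps(config, indent=2))
```

Options that are left out of the entry keep the defaults listed in the table.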

Container metrics

Below is a description of all metrics collected by the Scalyr Kubernetes monitor.

Network metrics

Metric Description
docker.net.rx_bytes Total received bytes on the network interface.
docker.net.rx_dropped Total receive packets dropped on the network interface.
docker.net.rx_errors Total receive errors on the network interface.
docker.net.rx_packets Total received packets on the network interface.
docker.net.tx_bytes Total transmitted bytes on the network interface.
docker.net.tx_dropped Total transmitted packets dropped on the network interface.
docker.net.tx_errors Total transmission errors on the network interface.
docker.net.tx_packets Total packets transmitted on the network interface.
k8s.pod.network.rx_bytes The total received bytes on a pod.
k8s.pod.network.rx_errors The total received errors on a pod.
k8s.pod.network.tx_bytes The total transmitted bytes on a pod.
k8s.pod.network.tx_errors The total transmission errors on a pod.
k8s.node.network.rx_bytes The total received bytes on a node.
k8s.node.network.rx_errors The total received errors on a node.
k8s.node.network.tx_bytes The total transmitted bytes on a node.
k8s.node.network.tx_errors The total transmission errors on a node.
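
The network metrics above are described as totals, so a throughput figure has to be derived from two consecutive samples. The sketch below assumes the values are cumulative counters and that a decrease means the counter was reset (for example, after a container restart); the function name counter_rate and the reset handling are illustrative choices, not part of the monitor.

```python
def counter_rate(prev_value, curr_value, interval_seconds):
    """Return the per-second rate between two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_value - prev_value
    if delta < 0:
        # Assumption: a smaller value means the counter restarted from zero,
        # so everything counted so far happened within this interval.
        delta = curr_value
    return delta / float(interval_seconds)


if __name__ == "__main__":
    # Two hypothetical docker.net.rx_bytes samples taken 30 seconds apart.
    print(counter_rate(1000, 4000, 30))
```

The same calculation applies to the packet, error, and dropped counters, and to the k8s.pod.network.* and k8s.node.network.* metrics if they are also cumulative.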

Memory metrics

Metric Description
docker.mem.stat.active_anon The number of bytes of active memory backed by anonymous pages, excluding sub-cgroups.
docker.mem.stat.active_file The number of bytes of active memory backed by files, excluding sub-cgroups.
docker.mem.stat.cache The number of bytes used for the cache, excluding sub-cgroups.
docker.mem.stat.hierarchical_memory_limit The memory limit in bytes for the container.
docker.mem.stat.inactive_anon The number of bytes of inactive memory in anonymous pages, excluding sub-cgroups.
docker.mem.stat.inactive_file The number of bytes of inactive memory in file pages, excluding sub-cgroups.
docker.mem.stat.mapped_file The number of bytes of mapped files, excluding sub-cgroups.
docker.mem.stat.pgfault The total number of page faults, excluding sub-cgroups.
docker.mem.stat.pgmajfault The number of major page faults, excluding sub-cgroups.
docker.mem.stat.pgpgin The number of charging events, excluding sub-cgroups.
docker.mem.stat.pgpgout The number of uncharging events, excluding sub-cgroups.
docker.mem.stat.rss The number of bytes of anonymous and swap cache memory (includes transparent hugepages), excluding sub-cgroups.
docker.mem.stat.rss_huge The number of bytes of anonymous transparent hugepages, excluding sub-cgroups.
docker.mem.stat.unevictable The number of bytes of memory that cannot be reclaimed (mlocked, etc.), excluding sub-cgroups.
docker.mem.stat.writeback The number of bytes being written back to disk, excluding sub-cgroups.
docker.mem.stat.total_active_anon The number of bytes of active memory backed by anonymous pages, including sub-cgroups.
docker.mem.stat.total_active_file The number of bytes of active memory backed by files, including sub-cgroups.
docker.mem.stat.total_cache The number of bytes used for the cache, including sub-cgroups.
docker.mem.stat.total_inactive_anon The number of bytes of inactive memory in anonymous pages, including sub-cgroups.
docker.mem.stat.total_inactive_file The number of bytes of inactive memory in file pages, including sub-cgroups.
docker.mem.stat.total_mapped_file The number of bytes of mapped files, including sub-cgroups.
docker.mem.stat.total_pgfault The total number of page faults, including sub-cgroups.
docker.mem.stat.total_pgmajfault The number of major page faults, including sub-cgroups.
docker.mem.stat.total_pgpgin The number of charging events, including sub-cgroups.
docker.mem.stat.total_pgpgout The number of uncharging events, including sub-cgroups.
docker.mem.stat.total_rss The number of bytes of anonymous and swap cache memory (includes transparent hugepages), including sub-cgroups.
docker.mem.stat.total_rss_huge The number of bytes of anonymous transparent hugepages, including sub-cgroups.
docker.mem.stat.total_unevictable The number of bytes of memory that cannot be reclaimed (mlocked, etc.), including sub-cgroups.
docker.mem.stat.total_writeback The number of bytes being written back to disk, including sub-cgroups.
docker.mem.max_usage The maximum amount of memory used by the container, in bytes.
docker.mem.usage The current number of bytes used for memory including cache.
docker.mem.fail_cnt The number of times the container hit its memory limit.
docker.mem.limit The memory limit for the container in bytes.
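
Because docker.mem.usage includes cache, it can overstate how much memory a container actually needs. The sketch below shows one way to derive two summary figures from a single sample: usage as a percentage of the limit, and a working-set estimate that subtracts inactive file-backed memory. The working-set formula is a common convention and an assumption here, not something the monitor reports; the function name memory_summary and the sample values are hypothetical.

```python
def memory_summary(sample):
    """Derive summary figures from one sample of the memory metrics above.

    `sample` maps metric names to values in bytes.
    """
    usage = sample["docker.mem.usage"]
    limit = sample["docker.mem.limit"]
    inactive_file = sample.get("docker.mem.stat.total_inactive_file", 0)

    # Assumed convention: working set = usage minus inactive file cache,
    # clamped at zero in case the two values were read at different moments.
    working_set = max(usage - inactive_file, 0)

    # A non-positive limit is treated as "no usable limit".
    percent = 100.0 * usage / limit if limit > 0 else None

    return {"working_set_bytes": working_set, "usage_percent_of_limit": percent}


if __name__ == "__main__":
    mib = 1024 * 1024
    print(memory_summary({
        "docker.mem.usage": 512 * mib,
        "docker.mem.limit": 1024 * mib,
        "docker.mem.stat.total_inactive_file": 128 * mib,
    }))
```

A container without a configured limit may report a very large docker.mem.limit, in which case the percentage is close to zero and not meaningful.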

CPU metrics

Metric Description
docker.cpu.usage Total CPU consumed by the container in nanoseconds.
docker.cpu.system_cpu_usage Total CPU consumed by container in kernel mode in nanoseconds.
docker.cpu.usage_in_usermode Total CPU consumed by tasks of the cgroup in user mode in nanoseconds.
docker.cpu.total_usage Total CPU consumed by tasks of the cgroup in nanoseconds.
docker.cpu.usage_in_kernelmode Total CPU consumed by tasks of the cgroup in kernel mode in nanoseconds.
docker.cpu.throttling.periods The number of periods with throttling active.
docker.cpu.throttling.throttled_periods The number of periods where the container hit its throttling limit.
docker.cpu.throttling.throttled_time The aggregate amount of time the container was throttled in nanoseconds.
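
The CPU metrics above are totals in nanoseconds or period counts, so they are most useful when compared across two samples. The sketch below assumes the values are cumulative and shows two derived figures: the average number of CPU cores used over an interval, and the fraction of enforcement periods in which the container was throttled. The function names cpu_cores_used and throttled_fraction are hypothetical.

```python
NANOSECONDS_PER_SECOND = 1e9


def cpu_cores_used(prev_usage_ns, curr_usage_ns, interval_seconds):
    """Average number of cores consumed between two cumulative CPU samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (curr_usage_ns - prev_usage_ns) / (interval_seconds * NANOSECONDS_PER_SECOND)


def throttled_fraction(prev_periods, curr_periods, prev_throttled, curr_throttled):
    """Fraction of periods in the interval during which throttling occurred."""
    periods = curr_periods - prev_periods
    if periods <= 0:
        # No periods elapsed, so there is nothing to report.
        return 0.0
    return (curr_throttled - prev_throttled) / float(periods)


if __name__ == "__main__":
    # Hypothetical docker.cpu.total_usage samples taken 10 seconds apart.
    print(cpu_cores_used(0, 5 * 10**9, 10))
    # Hypothetical docker.cpu.throttling.periods / throttled_periods samples.
    print(throttled_fraction(100, 200, 10, 35))
```

A result of 1.0 from cpu_cores_used corresponds to one fully used core; multiply by 100 for a percentage.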