Journald Monitor

A Scalyr agent monitor that imports log entries from journald.

The journald monitor polls systemd journal files every journal_poll_interval seconds and uploads any new entries to the Scalyr servers.

An agent monitor plugin is a component of the Scalyr Agent. To use a plugin, simply add it to the monitors section of the Scalyr Agent configuration file (/etc/scalyr/agent.json). For more information, see Agent Plugins.

By default, the journald monitor logs all log entries, but it can also be configured to filter entries by specific fields.

Dependencies

The journald monitor depends on the Python systemd library, which is a Python wrapper around the systemd C API. You need to ensure this library is installed on your system in order to use this monitor. If the Scalyr Agent is configured to use the journald monitor but the systemd library cannot be found, a warning message is printed.

You can install the systemd Python library with your package manager. For Debian/Ubuntu:

apt-get install python-systemd

For CentOS/RHEL/Fedora:

dnf install python-systemd

Or install it from source using pip, e.g.:

pip install systemd-python

See here for more information: https://github.com/systemd/python-systemd/
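To check ahead of time whether the dependency is importable, a minimal sketch along these lines may help. The function name systemd_available and the logger name are hypothetical, and the warning text is illustrative rather than the agent's actual message.

```python
import logging

# Hypothetical logger name, used only for this example.
log = logging.getLogger("journald_dependency_check")


def systemd_available():
    """Return True if the python-systemd bindings can be imported."""
    try:
        # The journal module is provided by the python-systemd package.
        from systemd import journal  # noqa: F401
    except ImportError:
        log.warning("systemd Python library not found; install it to use the journald monitor")
        return False
    return True


if __name__ == "__main__":
    print("systemd bindings available:", systemd_available())
```

Running this with the same Python interpreter that runs the Scalyr Agent shows whether that interpreter can see the library.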

Polling the Journal File

The journald monitor polls the journal file every journal_poll_interval seconds to check for new logs. It does this by creating a polling object (https://docs.python.org/2/library/select.html#poll-objects) and calling the poll method of that object. The poll method is called with a 0 second timeout so it never blocks. After processing any new events, or if there are no events to process, the monitor thread sleeps for journal_poll_interval seconds and then polls again.
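The poll-then-sleep cycle described above can be sketched as follows. This is an illustration under stated assumptions, not the monitor's actual code: a pipe stands in for the journal's file descriptor (with python-systemd the descriptor would presumably come from the journal reader), and the names drain_ready and run are hypothetical.

```python
import os
import select
import time


def drain_ready(poller, handle_event):
    """Poll once with a 0 ms timeout (never blocks) and handle ready descriptors."""
    ready = poller.poll(0)  # returns an empty list immediately if nothing is pending
    for fd, event_mask in ready:
        handle_event(fd, event_mask)
    return len(ready)


def run(poller, handle_event, poll_interval, iterations):
    """Poll, then sleep for poll_interval seconds, a fixed number of times."""
    handled = 0
    for _ in range(iterations):
        handled += drain_ready(poller, handle_event)
        time.sleep(poll_interval)
    return handled


# Demo: a pipe stands in for the journal file descriptor (assumption).
read_fd, write_fd = os.pipe()
poller = select.poll()
poller.register(read_fd, select.POLLIN)

seen = []


def collect(fd, event_mask):
    # Read whatever is available and remember it.
    seen.append(os.read(fd, 1024))


idle_count = drain_ready(poller, collect)  # nothing has been written yet
os.write(write_fd, b"new entry\n")
handled = run(poller, collect, poll_interval=0.01, iterations=2)

os.close(read_fd)
os.close(write_fd)
```

Because the timeout is zero, the only delay between checks is the explicit sleep, which is why journal_poll_interval directly controls how quickly new entries are noticed.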

Sample Configuration

The following example configures the agent to query the journal entries located in /var/log/journal (the default location for persisted journald logs):

monitors: [
  {
    module: "scalyr_agent.builtin_monitors.journald_monitor",
  }
]

Here is an example that queries journal entries from volatile/non-persisted journals, and filters those entries to only include ones that originate from the ssh service:

monitors: [
  {
    module: "scalyr_agent.builtin_monitors.journald_monitor",
    journal_path: "/run/log/journal",
    journal_matches: ["_SYSTEMD_UNIT=ssh.service"]
  }
]
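To get a feel for what a journal_matches list selects, the sketch below emulates the matching rules in plain Python on dictionaries. It assumes the semantics documented for the systemd match API (matches on the same field are alternatives, matches on different fields must all hold). The monitor itself delegates filtering to the journal reader, so entry_matches is a hypothetical helper for illustration only.

```python
from collections import defaultdict


def entry_matches(entry, matches):
    """Return True if a journal entry (a dict) satisfies the match strings.

    Values given for the same field are alternatives (OR); different
    fields must all be satisfied (AND). An empty or None list matches
    every entry.
    """
    if not matches:
        return True
    wanted = defaultdict(set)
    for match in matches:
        # Split "FIELD=value" on the first "=" only.
        field, _, value = match.partition("=")
        wanted[field].add(value)
    return all(
        field in entry and str(entry[field]) in values
        for field, values in wanted.items()
    )


ssh_entry = {"_SYSTEMD_UNIT": "ssh.service", "MESSAGE": "Accepted publickey"}
cron_entry = {"_SYSTEMD_UNIT": "cron.service", "MESSAGE": "job started"}
only_ssh = ["_SYSTEMD_UNIT=ssh.service"]

print(entry_matches(ssh_entry, only_ssh))
print(entry_matches(cron_entry, only_ssh))
```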

Setting parsers, redaction rules, attributes

You may set all per-log-file configuration options for the logs collected by the journald monitor just as you can for files collected directly from the file system. For example, you can specify a parser, apply sampling and redaction rules, and assign default attributes to all collected journal entries. In fact, you can specify any field that you can use when defining an entry in the logs list, with the exception of path.

For example, suppose you wish to apply a parser called journaldParser to all journal entries collected by this monitor. You just need to define a top-level JSON array called journald_logs in the agent.json configuration file (or any .json file in the agent.d directory) like this:

journald_logs: [
  {
      journald_unit: ".*",
      parser: "journaldParser"
  }
]

This configuration stanza instructs the monitor to set journaldParser as the parser for all journald entries collected by the journald monitor with a journald unit field matching the regular expression .* (which matches all of them). This essentially sets the default parser for all entries collected by the monitor.

As the example hints, you can specify different parsers for journal entries with different values for unit. For example, this configuration assigns sshParser to the logs from the SSH service, and then journaldParser to all of the rest:

journald_logs: [
  {
       journald_unit: "ssh\\.service",
       parser: "sshParser"
  },
  {
      journald_unit: ".*",
      parser: "journaldParser"
  }
]

To determine the parser, the monitor finds the first entry in the journald_logs array with a regular expression that matches the journal entry's unit field.
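The first-match rule can be sketched as below. The helper name find_log_config is hypothetical, and the use of re.match (which anchors at the start of the unit name) is an assumption; the agent may apply the regular expression differently.

```python
import re

# Same shape as the journald_logs array shown above.
journald_logs = [
    {"journald_unit": "ssh\\.service", "parser": "sshParser"},
    {"journald_unit": ".*", "parser": "journaldParser"},
]


def find_log_config(unit, configs):
    """Return the first config whose journald_unit regex matches the unit."""
    for config in configs:
        # Assumption: match from the start of the unit name.
        if re.match(config["journald_unit"], unit):
            return config
    return None


print(find_log_config("ssh.service", journald_logs)["parser"])
print(find_log_config("cron.service", journald_logs)["parser"])
```

Because the first match wins, the catch-all ".*" entry must come last; if it were listed first, the ssh entry would never be reached.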

Note that this will result in all log entries from the SSH service being uploaded under a different log file name than all other entries. By default, the auto-generated log file name will be /var/log/scalyr-agent-2/journald_XXXX.log, where XXXX is a hash that guarantees the log file name is unique. You may override this behavior by providing a rename_logfile value, such as:

journald_logs: [
  {
       journald_unit: "ssh\\.service",
       parser: "sshParser",
       rename_logfile: "/journald/ssh_service.log"
  },
  {
      journald_unit: ".*",
      parser: "journaldParser"
  }
]

Similar to parser, you may configure sampling and redaction rules for your journald logs as well as default attributes. Here is an example specifying all three:

journald_logs: [
  {
       journald_unit: "ssh\\.service",
       parser: "sshParser",
       attributes: { "service": "ssh" },
       redaction_rules: [ { match_expression: "password", replacement: "REDACTED" } ],
       sampling_rules: [ { match_expression: "INFO", sampling_rate: 0.1} ],
  },
  {
      journald_unit: ".*",
      parser: "journaldParser",
      attributes: { "service": "unknown" }
  }
]

Overriding value escaping

By default, this monitor applies several quoting and escaping rules that some users may wish to override depending on their journald entry format. In particular, it surrounds all fields extracted from the entry with quotes and escapes their values. You may override this behavior on a per-logfile basis by using the emit_raw_details and detect_escaped_strings options.

If emit_raw_details is true, the journald MESSAGE field will not be quoted and escaped. This is useful if you know your messages are already quoted and have special characters escaped, which makes parsing easier.

journald_logs: [
  {
      journald_unit: ".*",
      emit_raw_details: true
  }
]

If detect_escaped_strings is true, an extracted field (including MESSAGE and the fields listed in journal_fields) will not be quoted and escaped if its first and last characters are both a quote ("). This is useful if you expect some fields to already be quoted strings with special characters escaped, and you want to avoid further encoding and escaping.

journald_logs: [
  {
      journald_unit: ".*",
      detect_escaped_strings: true
  }
]
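The effect of the two options can be illustrated with a small sketch. The function names are hypothetical, and json.dumps is used here only as a stand-in for the monitor's quoting and escaping, which may differ in detail.

```python
import json


def format_field(value, detect_escaped_strings=False):
    """Quote and escape a field unless it already looks like a quoted string."""
    already_quoted = len(value) >= 2 and value[0] == '"' and value[-1] == '"'
    if detect_escaped_strings and already_quoted:
        return value  # leave pre-quoted values untouched
    return json.dumps(value)  # stand-in for the monitor's escaping (assumption)


def format_message(message, emit_raw_details=False, detect_escaped_strings=False):
    """Apply emit_raw_details to MESSAGE, else fall back to normal field handling."""
    if emit_raw_details:
        return message  # emitted exactly as journald stored it
    return format_field(message, detect_escaped_strings)


print(format_field('say "hi"'))
print(format_field('"already quoted"', detect_escaped_strings=True))
print(format_message('say "hi"', emit_raw_details=True))
```

Note that emit_raw_details only concerns MESSAGE, while detect_escaped_strings is decided per field based on its first and last characters.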

Modular configuration

Similar to how your logs JSON array can be split across multiple modular configuration files in the agent.d directory, you may split your journald_logs JSON array across multiple files as well.

For example, you may have a file called journald_ssh.json in agent.d with contents:

journald_logs: [
  {
       journald_unit: "ssh\\.service",
       parser: "sshParser",
       attributes: { "service": "ssh" },
       redaction_rules: [ { match_expression: "password", replacement: "REDACTED" } ],
       sampling_rules: [ { match_expression: "INFO", sampling_rate: 0.1} ],
  }
]

And a second file called zz_journald_defaults.json with contents:

journald_logs: [
  {
      journald_unit: ".*",
      parser: "journaldParser",
      attributes: { "service": "unknown" }
  }
]

The actual contents of the journald_logs JSON array will be computed by appending all journald_logs entries in all files in alphabetical order of their file names. In the example above, we use a filename beginning with zz to guarantee that its entry will be the last one in the journald_logs JSON array.
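A minimal sketch of this merge, assuming the files have already been parsed into dictionaries keyed by file name. The helper merge_journald_logs is hypothetical, and Python's default string ordering is used as an approximation of "alphabetical order".

```python
def merge_journald_logs(parsed_files):
    """Concatenate journald_logs arrays in order of file name."""
    merged = []
    for name in sorted(parsed_files):
        # Files without a journald_logs array contribute nothing.
        merged.extend(parsed_files[name].get("journald_logs", []))
    return merged


parsed_files = {
    "zz_journald_defaults.json": {
        "journald_logs": [{"journald_unit": ".*", "parser": "journaldParser"}]
    },
    "journald_ssh.json": {
        "journald_logs": [{"journald_unit": "ssh\\.service", "parser": "sshParser"}]
    },
}

merged = merge_journald_logs(parsed_files)
print([entry["journald_unit"] for entry in merged])
```

Since the first matching entry wins, ordering by file name is what keeps the catch-all entry from shadowing the more specific ones.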

Configuration Reference

Here is the list of all configuration options, in addition to the default options for each monitor, that you may use to configure the journald monitor:

Option Usage
module Always scalyr_agent.builtin_monitors.journald_monitor
journal_path Optional (defaults to /var/log/journal). Location on the filesystem of the journald logs.
journal_poll_interval Optional (defaults to 5). The number of seconds to wait for data while polling the journal file. Fractional values are supported.
journal_fields Optional dict containing journal fields to upload with each message, as well as a field name to map them to on the Scalyr website. Note: Not all fields need to exist in every message, and only fields that exist will be included. Defaults to { "_SYSTEMD_UNIT": "unit", "_PID": "pid", "_MACHINE_ID": "machine_id", "_BOOT_ID": "boot_id", "_SOURCE_REALTIME_TIMESTAMP": "timestamp" }
journal_matches Optional list containing 'match' strings for filtering entries. A match string follows the pattern "FIELD=value", where FIELD is a field of the journal entry (e.g. _SYSTEMD_UNIT, _HOSTNAME, _GID) and "value" is the value to filter that field on. For example, a match string equal to "_SYSTEMD_UNIT=ssh.service" restricts query results to entries that originated from the ssh.service system unit. The journald monitor calls the journal reader method add_match for each string in this list. See the journald filtering documentation for details on how this works. If this config item is empty or None, no filtering occurs.
id Optional id used to differentiate between multiple journald monitors in the same agent.json configuration file. This is useful for configurations that define multiple journald monitors and that want to save unique checkpoints for each monitor. If specified, the id is also sent to the server along with other attributes under the monitor_id field.
staleness_threshold_secs Optional (defaults to 600). When loading the journal events from a checkpoint, if the logs are older than this threshold, then the monitor skips to the end of the journal and only logs new entries.
max_log_rotations Optional (defaults to 2). How many rotated logs to keep before deleting them, when writing journal entries to a log for sending to Scalyr.
max_log_size Optional (defaults to 20MB). Max size of a log file before we rotate it out, when writing journal entries to a log for sending to Scalyr.
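Two of the options above, journal_fields and staleness_threshold_secs, describe simple rules that can be sketched in a few lines. The helper names extract_fields and should_skip_to_end are hypothetical, and treating "older than" as a strict comparison is an assumption.

```python
# Default mapping as listed in the reference above.
DEFAULT_JOURNAL_FIELDS = {
    "_SYSTEMD_UNIT": "unit",
    "_PID": "pid",
    "_MACHINE_ID": "machine_id",
    "_BOOT_ID": "boot_id",
    "_SOURCE_REALTIME_TIMESTAMP": "timestamp",
}


def extract_fields(entry, journal_fields=DEFAULT_JOURNAL_FIELDS):
    """Rename configured journal fields, keeping only those present in the entry."""
    return {
        new_name: entry[old_name]
        for old_name, new_name in journal_fields.items()
        if old_name in entry
    }


def should_skip_to_end(checkpoint_time, now, staleness_threshold_secs=600):
    """Return True if the checkpoint is older than the threshold (times in seconds)."""
    return (now - checkpoint_time) > staleness_threshold_secs


entry = {"_SYSTEMD_UNIT": "ssh.service", "_PID": 42, "MESSAGE": "hello"}
print(extract_fields(entry))
print(should_skip_to_end(checkpoint_time=0, now=601))
```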