Import Logs From An Amazon S3 Bucket

This solution uses AWS's "Assume Role" functionality for authentication. The older access key authentication method is documented separately.

This solution describes how to import log files that are deposited in an Amazon S3 bucket.

Prerequisites

1. Set up an S3 bucket in which you periodically deposit log files. It's best to add a new file every few minutes. If you use longer batches (e.g. one file per hour), that imposes a delay on the logs showing up in Scalyr.

2. Create an SQS queue, and configure your S3 bucket to publish new-object notifications to the queue.

3. Use Amazon's IAM (Identity and Access Management) tools to create an IAM role which can only be used to read this bucket and queue. For instructions, see the section Create IAM Role.
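
As an illustration of prerequisite 2, the bucket-to-queue wiring can be written down as a notification configuration document. The sketch below builds one with Python's standard library only. The dictionary shape (QueueConfigurations, QueueArn, Events) follows the S3 bucket notification configuration format; the helper name, the sample queue ARN, and the choice of the s3:ObjectCreated:* event are assumptions made for this example. Applying the document to a bucket (through the AWS console or an SDK) is not shown.

```python
import json


def build_queue_notification(queue_arn, events=("s3:ObjectCreated:*",)):
    """Return an S3 notification configuration that sends events to one SQS queue."""
    if not queue_arn.startswith("arn:aws:sqs:"):
        raise ValueError("expected an SQS queue ARN, got %r" % queue_arn)
    return {
        "QueueConfigurations": [
            {"QueueArn": queue_arn, "Events": list(events)}
        ]
    }


# Hypothetical queue ARN; substitute your own region, account ID and queue name.
config = build_queue_notification(
    "arn:aws:sqs:us-east-1:123456789012:scalyr-s3-bucket-foo"
)
print(json.dumps(config, indent=2))
```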

Steps

Scalyr uses "monitors" to fetch data from other services. These steps will guide you through creating a monitor to fetch log files from an S3 bucket.

1. From the navigation bar, click Dashboards, and select Monitors.

2. Click Edit Monitors to open the monitors configuration file.

3. Find the monitors section of the configuration file. If you have never edited this file before, the monitors section will look like this:

  monitors: [
    // {
    //   type:        "http",
    //   url:         "http://www.example.com/foo?bar=1"
    // },
    // {
    //   type:        "http",
    //   url:         "http://www.example.com/foo?bar=1"
    // }
  ]

4. Add a stanza for the SQS queue you created earlier. The section might now look like this:

  monitors: [
    {
      type: "s3Bucket",
      region: "us-east-1",
      roleToAssume: "arn:aws:iam::account-id:role/role-name-with-path",
      queueUrl: "https://sqs.us-east-1.amazonaws.com/nnnnnnnnnnnn/scalyr-s3-bucket-foo",
      fileFormat: "text_gzip",
      hostname: "foo",
      parser: "foo"
    }
  ]

Fill in the appropriate values for each field:

Field            Value
type             Always s3Bucket.
region           The AWS region in which your SQS queue is located, e.g. us-east-1.
s3Region         The AWS region in which your S3 bucket is located, e.g. us-east-1. You can omit this unless it is different from region.
roleToAssume     The ARN of the IAM role you created.
queueUrl         The URL of the SQS queue to which your bucket sends new-object notices.
fileFormat       text_gzip if each file is compressed using gzip, text if not compressed.
objectKeyFilter  Optional. If you specify a value, then S3 objects are ignored unless their name (object key) contains this substring. If you have multiple logs being published to the same S3 bucket, use this option to select the appropriate subset.
hostname         The server name under which your bucket access logs will appear in the Overview page.
logfile          The file name under which your bucket access log will appear in the Overview page.
parser           Name of a parser to apply to these logs.
logAttributes    Optional. Specifies extra fields to attach to the messages imported from this log.
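
Before saving the configuration, it can help to sanity-check a stanza against the field descriptions above. The following is a minimal sketch, not part of Scalyr: the function names check_monitor and key_selected are hypothetical, and the check assumes queue URLs have the form https://sqs.<region>.amazonaws.com/<account>/<name>, as in the example stanza. The second helper simply mirrors the documented objectKeyFilter rule (keep an object only if its key contains the substring).

```python
from urllib.parse import urlparse

ALLOWED_FILE_FORMATS = {"text_gzip", "text"}


def check_monitor(monitor):
    """Return a list of problems found in an s3Bucket monitor stanza (empty if none)."""
    problems = []
    if monitor.get("type") != "s3Bucket":
        problems.append("type must be s3Bucket")
    if monitor.get("fileFormat") not in ALLOWED_FILE_FORMATS:
        problems.append("fileFormat must be text_gzip or text")
    if not str(monitor.get("roleToAssume", "")).startswith("arn:aws:iam::"):
        problems.append("roleToAssume should be an IAM role ARN")
    # Assumption: the host part of the queue URL is sqs.<region>.amazonaws.com.
    host = urlparse(monitor.get("queueUrl", "")).hostname or ""
    parts = host.split(".")
    if len(parts) < 2 or parts[0] != "sqs":
        problems.append("queueUrl should be a full SQS queue URL")
    elif parts[1] != monitor.get("region"):
        problems.append("region does not match the region in queueUrl")
    return problems


def key_selected(object_key, object_key_filter=None):
    """Keep an object when no filter is set or its key contains the filter substring."""
    return object_key_filter is None or object_key_filter in object_key


monitor = {
    "type": "s3Bucket",
    "region": "us-east-1",
    "roleToAssume": "arn:aws:iam::account-id:role/role-name-with-path",
    "queueUrl": "https://sqs.us-east-1.amazonaws.com/nnnnnnnnnnnn/scalyr-s3-bucket-foo",
    "fileFormat": "text_gzip",
    "hostname": "foo",
    "parser": "foo",
}
print(check_monitor(monitor))
```

An empty list means none of these particular checks failed; it does not guarantee that the role or queue permissions are correct.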

Always use an IAM role with limited permissions. If you haven't already done so, follow the Create IAM Role instructions to create a special role which only has access to the S3 bucket and SQS queue.

5. Click Update File to save your changes. Scalyr will begin checking for new data batches once per minute.

6. Wait for the initial batch of log data to be retrieved. It may take minutes to hours for Amazon to publish the first batch.

7. In the top navigation bar, click Overview. In the list of servers, you should see an entry named according to the hostname you specified in the monitor configuration. To the right will be a link to your bucket access logs.

Troubleshooting

If your logs don't appear, make sure that at least a few minutes have passed since you saved the Monitors configuration (i.e. since clicking Update File), and that new files have been added to your S3 bucket after you saved it. Then return to the Scalyr Overview page and refresh your browser.

If the logs still don't appear, you may have a configuration error which is preventing the Scalyr monitor from retrieving your logs. To check for error messages, in Scalyr's top navigation bar, click Search. In the Expression box, type tag='S3BucketMonitor' and click the Search button. Click Latest to jump to the most recent log messages, and click on an individual message to see details for that message. If the details page includes an "errorMessage" field, then AWS returned an error when Scalyr attempted to retrieve your logs. Some common error messages:

Cause: Incorrect Role configuration
errorMessage: Status Code: 403, AWS Service: AmazonSQS, AWS Request ID: xxx-xxx-xxx-xxx, AWS Error Code: AccessDenied, AWS Error Message: Access to the resource https://sqs.us-east-1.amazonaws.com/nnnnnnnnnnnn/queue-name is denied.

Cause: Incorrect Role configuration or incorrect Role ARN
errorMessage: Status Code: 403, AWS Service: AWSSecurityTokenService, AWS Request ID: xxx-xxx-xxx-xxx, AWS Error Code: AccessDenied, AWS Error Message: User: arn:aws:iam::913057016266:user/user is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::nnnnnnnnnnnn:role/RoleName
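
If you copy an errorMessage value out of the details page, a short script can split it into its labeled parts and suggest which of the causes listed above applies. This is only a sketch: the function names are hypothetical, and the pattern assumes the message uses exactly the comma-separated labels shown in the examples above.

```python
import re

# Matches the labeled, comma-separated layout of the sample messages above.
_PATTERN = re.compile(
    r"Status Code: (?P<status>\d+), "
    r"AWS Service: (?P<service>[^,]+), "
    r"AWS Request ID: (?P<request_id>[^,]+), "
    r"AWS Error Code: (?P<code>[^,]+), "
    r"AWS Error Message: (?P<message>.*)",
    re.DOTALL,
)


def parse_error_message(text):
    """Return the labeled fields as a dict of strings, or None if the layout differs."""
    match = _PATTERN.search(text)
    return match.groupdict() if match else None


def likely_cause(fields):
    """Map parsed fields to one of the causes listed in this section."""
    if fields["service"] == "AWSSecurityTokenService":
        return "Incorrect Role configuration or incorrect Role ARN"
    if fields["service"] == "AmazonSQS" and fields["code"] == "AccessDenied":
        return "Incorrect Role configuration"
    return "unknown"


sample = (
    "Status Code: 403, AWS Service: AmazonSQS, AWS Request ID: xxx-xxx-xxx-xxx, "
    "AWS Error Code: AccessDenied, AWS Error Message: Access to the resource "
    "https://sqs.us-east-1.amazonaws.com/nnnnnnnnnnnn/queue-name is denied."
)
fields = parse_error_message(sample)
print(fields["service"], fields["code"], likely_cause(fields))
```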

Further Reading

To learn how to work with your imported logs, see the Search overview page, and the Query Language page.

Appendix: Create IAM Role

You can use Amazon IAM to create a role which can only be used to read your S3 bucket access logs. This allows you to grant Scalyr the ability to import the logs, without opening up any other access to your AWS resources. Create the IAM role as follows:

  1. Make a note of your AWS account ID (a 12-digit number). You can find it near the top of the AWS My Account page.
  2. Log into the Amazon AWS console. From the Services menu, choose "IAM".
  3. Go to the Roles list.
  4. Click "Create Role".
  5. Under "Select type of trusted entity" select "Another AWS account".
  6. For "Account ID" enter "913057016266".
  7. Under options check "Require external ID" and enter the value "(Log in to view External Id.)".
  8. Click "Next: Permissions", then "Create policy". The policy editor opens in a new tab.
  9. Select the following values:
      Effect: Allow
      AWS Service: Amazon S3
      Actions: check GetObject
      Amazon Resource Name: arn:aws:s3:::bucket-name/*
    Replace bucket-name with the name of the S3 bucket you specified when setting up bucket access logging.
  10. Click "Add additional permissions".
  11. Update the form with the following values:
      Effect: Allow
      AWS Service: Amazon SQS
      Actions: check GetQueueAttributes, DeleteMessage, and ReceiveMessage
      Amazon Resource Name: arn:aws:sqs:us-east-1:account-id:queue-name
    Replace account-id with your 12-digit AWS account ID, without hyphens. Replace queue-name with the name of the SQS queue you subscribed to the S3 bucket. If your queue is in a different region, replace us-east-1 as well.
  12. Note: If the contents of your S3 bucket are encrypted you will need to also add "KMS" permissions to this policy.
  13. Click "Review policy", name it, then click "Create policy".
  14. Return to the create role tab and select your newly created policy and hit "Next".
  15. Skip past adding tags and give your role a name, then hit "Create role".
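
For reference, the console steps above correspond to two JSON policy documents: a trust policy (who may assume the role) and a permissions policy (what the role may do). The sketch below builds both with Python's standard library so you can compare them against what the console generated. The function names and sample arguments are hypothetical; the account ID 913057016266 and the listed actions come from the steps above; and the exact JSON the console produces may differ in detail. KMS permissions for encrypted buckets are not included.

```python
import json

SCALYR_ACCOUNT_ID = "913057016266"  # the account ID entered in step 6


def trust_policy(external_id):
    """Allow the Scalyr account to assume the role when it presents the external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::%s:root" % SCALYR_ACCOUNT_ID},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
        }],
    }


def permissions_policy(bucket_name, region, account_id, queue_name):
    """Grant read access to the bucket's objects and consume access to the queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": "arn:aws:s3:::%s/*" % bucket_name,
            },
            {
                "Effect": "Allow",
                "Action": [
                    "sqs:GetQueueAttributes",
                    "sqs:DeleteMessage",
                    "sqs:ReceiveMessage",
                ],
                "Resource": "arn:aws:sqs:%s:%s:%s" % (region, account_id, queue_name),
            },
        ],
    }


# Placeholder values for illustration only; use your own external ID and names.
print(json.dumps(trust_policy("your-external-id"), indent=2))
policy = permissions_policy("bucket-name", "us-east-1", "123456789012", "queue-name")
print(json.dumps(policy, indent=2))
```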