Log Management Best Practices

Searching over a large number of logs can lead to long query times. The more you reduce the number of rows scanned, the faster search results are returned.

There are several ways to address this:

1. Datasets

Datasets offer a streamlined approach to organizing log data, enhancing query speed and efficiency.

Why use datasets?

Using datasets, you can define rules that group your logs into separate tables. When querying logs from a single dataset, less irrelevant data is scanned, and therefore results are retrieved faster.

What is a dataset?

A dataset consists of the following:

  • Dataset rule - defines log filters using Lumigo syntax. Each log that matches the filter is saved to the dataset (see the sketch after this list)
  • Retention period - TBD (what are the supported periods?)
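As a minimal sketch, a dataset definition might look like the following. The dataset name is hypothetical, the rule borrows the stack_name field from the example log later in this guide, and the retention period is omitted since the supported values are still TBD:

  Dataset name:  trc-inges-logs
  Dataset rule:  stack_name:"trc-inges-stsls3"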

Planning your dataset grouping strategy

  • Create datasets that match your organization's structure or your business questions (e.g. a dataset per service, environment, etc.)
  • Define dataset rules using filters that are static or change infrequently (e.g. filter by team or business unit); see the sketch below
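For instance, a per-environment grouping could use rules such as the following. The env field and the wildcard-prefix style are borrowed from examples later in this guide; the dataset names and values are illustrative only:

  Dataset "production":  env:"prod*"
  Dataset "staging":     env:"staging*"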

πŸ“˜

Upon ingestion of a new log, it's checked against all dataset rules and stored in each dataset where the rule is satisfied.

If none of the dataset rules is met, the log is stored in the default dataset, named Logs.

Create datasets using the UI

  1. Navigate to -LINK- (TBD)
  2. Click the Create a dataset button
  3. Define the dataset name
  4. Set your dataset rule using Lumigo Search (see the example after this list)
  5. Select the retention period
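As a hypothetical example for step 4, a rule capturing only error-level logs could reuse the levelname field from the example log later in this guide:

  levelname:"ERROR"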

πŸ“˜

Dataset rules apply to newly ingested logs and do not affect previously ingested logs.

2. Log indexing

Lumigo leverages a powerful relational database, where logs are stored in tables and divided into columns according to the log fields.

Searching on columns provides a much faster search experience. Therefore, once a significant amount of logs has been ingested, the most frequent log fields start to get indexed into columns. The rest of the log fields are stored as attributes.

Sometimes you may perform a search on a field that wasn't indexed due to its frequency. In these cases, the column can be created manually by our Support team.
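As a simplified illustration (not the actual storage schema), the most frequent fields of the example log shown later in this section might be promoted to dedicated columns, while the remaining fields are kept in a generic attributes map:

  Indexed columns:  env | levelname | duration | message
  Attributes:       { "query_type": "...", "service_version": "...", ... }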

3. Field search

  1. Log query

Using Lumigo Search Syntax, you can filter your logs on an exact field value, or match a value prefix using wildcards.
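For example, assuming * is the wildcard character (check the Lumigo Search Syntax reference for the exact notation), an exact-match filter and a prefix filter could look like:

  env:"l-0331-16-15"
  lambda_name:l-0331*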

Filters can be set with a single click using the search UI:

  1. Include/exclude field from query - clicking a specific log row opens the log viewer. When hovering over a specific log field, you can either include it in or exclude it from the query.
  2. Query autosuggest - when typing in the query bar, you are offered fields/values that match your input.
  3. Visual filters - available in the left panel of the UI Search page, used to narrow down your query with a click.

Field search example

Say we have the following log line:

{
  "asctime": "2024-03-31 15:02:55,774",
  "customer_id": "c_5ab98f20a3ad4",
  "duration": 152,
  "env": "l-0331-16-15",
  "lambda_name": "l-0331-16-15_trc-inges-stsls3_get-single-transaction-async-v2",
  "levelname": "INFO",
  "message": "logz.io search stat",
  "query_type": "query_specific_invocation",
  "service_version": "1.0.1470",
  "stack_name": "trc-inges-stsls3"
}

And we want to calculate the average of the duration field. Currently, there are 2B log lines to aggregate in the selected time range.

Querying the average of the entire 2B log lines may take up to 30 seconds.

Using filters, we can significantly reduce the number of scanned logs to aggregate, improving search duration by up to 10x.

For this example, we could use the following filters to fine-tune our query:

  1. Filter out all of the logs that do not contain the duration field (syntax: duration:>0)
  2. Filter on a specific environment (syntax: env:"l-0331-16-15")
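Putting it together, and assuming filters can be chained with AND as in Lucene-style query languages (check the Lumigo Search Syntax reference for the exact combinator), the final query could look like:

  duration:>0 AND env:"l-0331-16-15"

With both filters applied, the aggregation scans only the matching subset of the 2B log lines instead of the full table.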