Lumigo OpenTelemetry Distribution for Python

Send Data with the Lumigo Distribution for Python applications

Overview

The Lumigo OpenTelemetry Distribution for Python instruments your application automatically and collects telemetry such as traces, metrics, and logs. It bundles several upstream OpenTelemetry packages with additional logic, customizations, and automated quality assurance. This means you can add distributed tracing without modifying any lines of code in your application. It combines the power and flexibility of OpenTelemetry with Lumigo's enhancements, such as enriching your data for better debugging and improving tracing across services.

For instructions on auto-instrumentation, see the No-code activation section. For additional context and information, see the Lumigo OpenTelemetry Distro for Python GitHub page. If you are deploying on AWS Lambda functions, use the Lumigo Lambda Tracer for Python instead.

📘

Note

If you're using the Lumigo Kubernetes Operator, your code is automatically instrumented with the Lumigo OpenTelemetry Distribution for Python. This means you don't need to add the distribution as described in the Setup section below.

Setup

There are three steps to adding the Lumigo OpenTelemetry Distribution for Python to your application.

1. Add dependencies

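Add the lumigo_opentelemetry package to your application, either by listing it among your dependencies (for example, in requirements.txt):

lumigo_opentelemetry

or by installing it with pip:

pip install lumigo_opentelemetry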

2. Environment-based configuration

Configure the Lumigo OpenTelemetry Python distribution package using environment variables.

LUMIGO_TRACER_TOKEN=<token>
OTEL_SERVICE_NAME="your-service-name"
LUMIGO_ENABLE_LOGS='true'

Each variable is described below:

  • LUMIGO_TRACER_TOKEN: Your Lumigo token.
  • OTEL_SERVICE_NAME: The service name you choose for your application. If your code is running on Kubernetes or ECS, there is no need to set this environment variable because it is set automatically.
  • LUMIGO_ENABLE_LOGS: If true, logs are sent to Lumigo. By default, this is false.
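
Before starting the application, it can help to confirm that the three variables above are visible to the process. The following is a minimal sketch, not part of the Lumigo package: the helper name lumigo_env_summary is hypothetical, and treating only the lowercase string "true" as enabling logs is an assumption about how the value is interpreted.

```python
import os


def lumigo_env_summary(env=None):
    """Summarize the Lumigo-related settings found in an environment mapping.

    Hypothetical helper for illustration; it only reads the mapping and does
    not talk to Lumigo or OpenTelemetry.
    """
    env = os.environ if env is None else env
    return {
        # The token is required for data to be sent to Lumigo.
        "token_set": bool(env.get("LUMIGO_TRACER_TOKEN")),
        # May legitimately be None on Kubernetes or ECS.
        "service_name": env.get("OTEL_SERVICE_NAME"),
        # Assumption: only the string "true" (case-insensitive) enables logs.
        "logs_enabled": env.get("LUMIGO_ENABLE_LOGS", "false").strip().lower() == "true",
    }


if __name__ == "__main__":
    print(lumigo_env_summary())
```

Passing a plain dictionary instead of relying on os.environ makes the check easy to exercise in a unit test.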

3. Automatic instrumentation

There are two approaches to activating the lumigo_opentelemetry package. No-code activation imports the package via the environment. Manual activation imports the package in code. We recommend using no-code activation.

No-code activation (set in the environment):

AUTOWRAPT_BOOTSTRAP=lumigo_opentelemetry

Manual activation (in your application code):

import lumigo_opentelemetry

For standalone Python scripts, cron jobs, and similar use cases that are not built around libraries instrumented by Lumigo or OpenTelemetry, you can use the @lumigo_wrapped decorator. For further details, see the Using the @lumigo_wrapped Decorator section.

Advanced Configuration

The Lumigo OpenTelemetry Distribution for Python supports both standard OpenTelemetry settings and Lumigo-specific configurations. Use the following environment variables to configure your tracing setup.

OpenTelemetry configurations

The distribution integrates several upstream OpenTelemetry packages with additional logic. As a result, any environment variable supported by vanilla, unmodified OpenTelemetry also applies here. The SDK variables documented on this page are listed in the SDK configuration section.

Lumigo-specific configurations

The lumigo_opentelemetry package additionally supports the following configuration options as environment variables:

Environment Variable

Description

LUMIGO_TRACER_TOKEN

[Required] Token required to send data to Lumigo. You can find the value in Lumigo under Settings -> Tracing -> Manual tracing.

LUMIGO_ENABLE_LOGS

If true, enables logging instrumentation. Logs will be enriched with the active span context and sent to Lumigo. For more details, see Logging instrumentation. By default, this is false.

LUMIGO_ENABLE_TRACES

If true, enables tracing instrumentation. By default, this is true.

LUMIGO_SECRET_MASKING_REGEX

Masks values of keys that match the supplied list of regular expressions. Both traces and logs are filtered. Default list is: [".*pass.*", ".*key.*", ".*secret.*", ".*credential.*", ".*passphrase.*"]

LUMIGO_SECRET_MASKING_REGEX_HTTP_REQUEST_BODIES

Secret masking for HTTP request bodies. Overrides LUMIGO_SECRET_MASKING_REGEX.

LUMIGO_SECRET_MASKING_REGEX_HTTP_REQUEST_HEADERS

Secret masking for HTTP request headers. Overrides LUMIGO_SECRET_MASKING_REGEX.

LUMIGO_SECRET_MASKING_REGEX_HTTP_RESPONSE_BODIES

Secret masking for HTTP response bodies. Overrides LUMIGO_SECRET_MASKING_REGEX.

LUMIGO_SECRET_MASKING_REGEX_HTTP_RESPONSE_HEADERS

Secret masking for HTTP response headers. Overrides LUMIGO_SECRET_MASKING_REGEX.

LUMIGO_SECRET_MASKING_REGEX_HTTP_QUERY_PARAMS

Secret masking for HTTP query parameters. Overrides LUMIGO_SECRET_MASKING_REGEX.

LUMIGO_SECRET_MASKING_REGEX_ENVIRONMENT

Secret masking for environment variables. Overrides LUMIGO_SECRET_MASKING_REGEX.

LUMIGO_SWITCH_OFF

If true, disables the Lumigo OpenTelemetry distro entirely. No instrumentation will be injected, and no tracing data will be collected. By default, this is set to false.

LUMIGO_REPORT_DEPENDENCIES

If false, disables built-in dependency reporting to Lumigo SaaS. For more information, refer to the Automated dependency reporting section. By default, this is set to true.

LUMIGO_AUTO_FILTER_EMPTY_SQS

Avoids creating traces for empty SQS messages. See the Filtering out empty SQS messages section.

LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX

Filters client and server endpoints using a list of regular expressions, in the format ["regex1", "regex2"]. See the Filtering HTTP endpoints section.

LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_SERVER

Applies regex filtering exclusively to server spans. Filters according to span attributes: url.path, http.target.

LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT

Applies regex filtering exclusively to client spans. Filters according to span attributes: url.full, http.url.

LUMIGO_DEBUG

If true, enables debug logging for troubleshooting. By default, this is set to false.

LUMIGO_DEBUG_SPANDUMP

A path to a local file to which spans are written for troubleshooting purposes. Should not be used in production unless directed by Lumigo support. Example value: /path/to/spandump.log

LUMIGO_DEBUG_LOGDUMP

A path to a local file to dump logs, used for troubleshooting. Effective only when LUMIGO_ENABLE_LOGS is set to true.
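
To make the effect of the LUMIGO_SECRET_MASKING_REGEX list more concrete, here is a rough sketch of key-based masking applied to a nested payload. It is not the distribution's implementation: the function name mask_secrets, the "****" replacement string, and the choice of case-insensitive full matching against each key are all assumptions made for illustration. Only the default pattern list is taken from the table above.

```python
import re

# Default list as documented for LUMIGO_SECRET_MASKING_REGEX.
DEFAULT_MASKING_REGEXES = (
    ".*pass.*",
    ".*key.*",
    ".*secret.*",
    ".*credential.*",
    ".*passphrase.*",
)


def mask_secrets(payload, patterns=DEFAULT_MASKING_REGEXES, mask="****"):
    """Return a copy of payload with values of matching keys replaced by mask.

    Assumptions: keys are matched in full and case-insensitively; the real
    matching rules and replacement text used by Lumigo may differ.
    """
    compiled = [re.compile(pattern, re.IGNORECASE) for pattern in patterns]

    def key_matches(key):
        return any(regex.fullmatch(str(key)) for regex in compiled)

    def walk(value):
        if isinstance(value, dict):
            return {k: mask if key_matches(k) else walk(v) for k, v in value.items()}
        if isinstance(value, list):
            return [walk(item) for item in value]
        return value

    return walk(payload)


if __name__ == "__main__":
    print(mask_secrets({"user": "ann", "password": "hunter2", "nested": {"api_key": "abc"}}))
```

A narrower list, such as one passed only for request headers, would be supplied through the patterns argument in the same way.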

Execution Tags

Execution tags allow you to dynamically add dimensions to your invocations so that they can be identified, searched for, and filtered in Lumigo. For example, in multi-tenant systems, execution tags are often used to mark invocations with the identifiers of the end users that triggered them, both for analysis (such as in the Explore view) and for alerting purposes.

By leveraging execution tags, you can gain deeper insights into your application's runtime behavior.

Adding Execution Tags

In the Lumigo OpenTelemetry Distribution for Python, execution tags are represented as span attributes with the lumigo.execution_tags. prefix. For example, you could add an execution tag as follows:

from opentelemetry.trace import get_current_span

get_current_span().set_attribute('lumigo.execution_tags.foo','bar')

When you are using OpenTelemetry's get_current_span() API, you do not need to keep track of the current span. You can get it at any point of your program execution.

In OpenTelemetry, span attributes can be strings, numbers (double-precision floating point or signed 64-bit integers), or booleans (collectively known as "primitive types"), as well as arrays of one primitive type (such as an array of strings, an array of numbers, or an array of booleans). In Lumigo, booleans and numbers are transformed to strings.
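
The type rule above can be checked before a value is passed to set_attribute. The sketch below is a hypothetical helper (is_valid_attribute_value is not part of OpenTelemetry or Lumigo) that encodes the rule exactly as stated here; the OpenTelemetry SDK performs its own validation, which may be more permissive.

```python
PRIMITIVE_TYPES = (str, bool, int, float)


def is_valid_attribute_value(value):
    """Return True for a primitive or a sequence whose items share one primitive type."""
    if isinstance(value, PRIMITIVE_TYPES):
        return True
    if isinstance(value, (list, tuple)):
        # Compare exact types so that a mix such as [True, 1] is rejected,
        # even though bool is a subclass of int in Python.
        element_types = {type(item) for item in value}
        return len(element_types) <= 1 and all(t in PRIMITIVE_TYPES for t in element_types)
    return False


if __name__ == "__main__":
    for candidate in ("bar", ["bar", "baz"], ("baz", "xyz"), ["bar", 1], {"a": 1}):
        print(repr(candidate), is_valid_attribute_value(candidate))
```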

When you call the Span.set_attribute API multiple times on the same span with the same key, each new value overwrites the previous one instead of being added to it:

from opentelemetry.trace import get_current_span

get_current_span().set_attribute('lumigo.execution_tags.foo','bar')
get_current_span().set_attribute('lumigo.execution_tags.foo','baz')

In the snippet above, the foo execution tag will only have the baz value in Lumigo. The bar value will have been overwritten.

If you want to set multiple values for an execution tag:

from opentelemetry.trace import get_current_span

get_current_span().set_attribute('lumigo.execution_tags.foo',['bar', 'baz'])

This will include both values without either being overwritten.

We also support using Tuples to specify multiple values for an execution tag:

from opentelemetry.trace import get_current_span

get_current_span().set_attribute('lumigo.execution_tags.bar',('baz','xyz',))

Using the snippet above will result in the bar tag having both baz and xyz values in Lumigo.

Another option for setting multiple values is to use execution tags in different spans of an invocation.

Execution Tags in different spans of an invocation

In Lumigo, multiple spans may be merged together into one invocation. This is the entry that you see, for example, in the Explore view. The invocation will include all execution tags on all its spans, and merge their values:

from opentelemetry import trace

trace.get_current_span().set_attribute('lumigo.execution_tags.foo','bar')

tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span('child_span') as child_span:
    child_span.set_attribute('lumigo.execution_tags.foo','baz')

In the example above, the invocation in Lumigo will have both bar and baz values associated with the foo execution tag. Which spans are merged in the same invocation depends on the parent-child relations among those spans. Explaining this topic is outside the scope of this documentation. A good first read to get deeper into the topic is the Traces documentation of OpenTelemetry. In case your execution tags on different spans appear on different invocations than you expect, get in touch with Lumigo support.

Execution Tag Limitations

Execution tags in Lumigo have the following limitations:

  • Up to a max of 50 execution tag keys per invocation in Lumigo, irrespective of how many spans are part of the invocation or how many values each execution tag has.
  • The key of an execution tag cannot contain the . character. For example, lumigo.execution_tags.my.tag is not a valid tag. The OpenTelemetry Span.set_attribute() API will not fail or log warnings, but the tag will be displayed as my in Lumigo.
  • Each execution tag key can be at most 50 characters long. The lumigo.execution_tags. prefix does not count against the 50 characters limit.
  • Each execution tag value can be at most 70 characters long.
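
Because Span.set_attribute() does not report violations of these limits, a small pre-check can catch them during development. The following sketch assumes a hypothetical helper, check_execution_tags, that takes tag keys without the lumigo.execution_tags. prefix. Measuring value length on str(value) is an approximation, since the exact string form Lumigo produces for numbers and booleans is not specified here.

```python
MAX_KEYS_PER_INVOCATION = 50
MAX_KEY_LENGTH = 50    # the "lumigo.execution_tags." prefix is not counted
MAX_VALUE_LENGTH = 70


def check_execution_tags(tags):
    """Return a list of human-readable problems for a {key: value(s)} mapping.

    Hypothetical helper: keys are given WITHOUT the lumigo.execution_tags. prefix.
    """
    problems = []
    if len(tags) > MAX_KEYS_PER_INVOCATION:
        problems.append(f"{len(tags)} keys exceed the limit of {MAX_KEYS_PER_INVOCATION}")
    for key, value in tags.items():
        if "." in key:
            problems.append(f"key {key!r} contains a '.' character")
        if len(key) > MAX_KEY_LENGTH:
            problems.append(f"key {key!r} is longer than {MAX_KEY_LENGTH} characters")
        # A tag may carry one value or a list/tuple of values.
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            if len(str(item)) > MAX_VALUE_LENGTH:
                problems.append(f"a value of {key!r} is longer than {MAX_VALUE_LENGTH} characters")
    return problems


if __name__ == "__main__":
    print(check_execution_tags({"tenant": "acme", "my.tag": "x" * 80}))
```

Note that the 50-key limit applies per invocation, so a check on a single span's tags can only catch part of that constraint.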

Programmatic Errors

Programmatic errors let you define custom errors for issues that you want to monitor and troubleshoot but that should not necessarily interfere with the service, for example when an application tries to remove a user who does not exist. You can capture these custom errors by adding just a few lines of code to your application.

Programmatic errors indicate that a non-fatal error occurred, such as an application error. You can also log programmatic errors, track custom error issues, and trigger Alerts.

Creating a Programmatic Error

You can create programmatic errors using the create_programmatic_error function provided by the lumigo_opentelemetry package. This function allows you to capture and report errors with minimal setup, without the need for direct interaction with the OpenTelemetry trace SDK.

For example, you can add a programmatic error as follows:

from lumigo_opentelemetry import create_programmatic_error

create_programmatic_error("The customer 123 was not found", "CustomerNotExist")

In this example, the first argument, "The customer 123 was not found", is a descriptive message for the error. The second argument, "CustomerNotExist", represents the type of the error.

Alternatively, programmatic errors can be created by adding span events with a custom attribute named lumigo.type.

For example, you could add a programmatic error as follows:

from opentelemetry.trace import get_current_span

get_current_span().add_event('<error-message>', {'lumigo.type': '<error-type>'})

This gives you a flexible way to record programmatic errors by enriching the active span with additional error context.
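
As an illustration of where such a call typically sits, the sketch below reports a non-fatal error when code tries to remove a user who does not exist, then carries on. The function remove_user and the stand-in reporter are hypothetical. The reporter is passed in as a parameter so that the example runs without the Lumigo package installed; in an instrumented application you would pass create_programmatic_error from lumigo_opentelemetry, which is called with a message and an error type as in the example earlier in this section.

```python
def remove_user(users, user_id, report_error):
    """Remove user_id from users, reporting a non-fatal error if it is missing.

    report_error is any callable taking (message, error_type).
    """
    if user_id not in users:
        # The service keeps running; the problem is only recorded.
        report_error(f"The user {user_id} was not found", "UserNotExist")
        return False
    del users[user_id]
    return True


# Stand-in reporter that collects errors locally instead of sending them anywhere.
reported = []


def fake_reporter(message, error_type):
    reported.append((message, error_type))


users = {"u1": "Ann"}
remove_user(users, "u2", fake_reporter)
print(reported)
```

Injecting the reporter also keeps business logic testable without a tracing backend.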

Using the @lumigo_wrapped Decorator

The @lumigo_wrapped decorator offers a simple way to integrate Lumigo tracing into functions in scripts that are not using libraries or frameworks instrumented by Lumigo or OpenTelemetry.

from lumigo_opentelemetry import lumigo_wrapped

@lumigo_wrapped
def your_function():
    pass

When applied, the decorator creates an OpenTelemetry span for the decorated function, adding attributes such as:

  • input_args: The positional arguments passed to the function.
  • input_kwargs: The keyword arguments passed to the function.
  • return_value: The value returned by the function.

For example, you can use the @lumigo_wrapped decorator:

from lumigo_opentelemetry import lumigo_wrapped

@lumigo_wrapped
def calculate_sum(start, end):
    """Calculates the sum of numbers in a given range."""
    return sum(range(start, end + 1))

if __name__ == "__main__":
    result = calculate_sum(1, 100)
    print(f"The sum of numbers from 1 to 100 is: {result}")

This allows you to automatically create a span when calculate_sum() is called, capturing the function arguments (start and end) and the return value (the calculated sum) as span attributes.

📘

Note

If your script does not make any outbound requests, such as when the code crashes before executing any calls, no traces will be generated in Lumigo. This is due to OpenTelemetry's requirement that spans be initiated through instrumentation or through explicit tracing calls.

Supported runtimes

The following runtimes are supported by Lumigo OpenTelemetry Distribution:

  • cpython: 3.9.x, 3.10.x, 3.11.x, 3.12.x, 3.13.x

Deprecation Notice: As of version 1.0.156, support for Python 3.7 has been deprecated. The last version of the Lumigo OpenTelemetry Distribution to support Python 3.7 is version 1.0.155.

Supported packages

See the latest list of packages supported out of the box and regularly tested by Lumigo here.

Automated dependency reporting

To enhance support and inform data-driven decisions regarding which packages to support next, the Lumigo OpenTelemetry Distribution for Python reports the packages and their versions used in your application to Lumigo SaaS at startup. This report also includes OpenTelemetry resource data, enabling analytics that reveal which platforms are utilizing which dependencies.

The uploaded data consists of a set of key-value pairs representing package names and their corresponding versions. This information complements the tracing data sent to Lumigo by covering dependencies that may not yet have dedicated instrumentation in the Lumigo OpenTelemetry Distribution for Python. The sole purpose of these analytics is to ensure you receive the necessary instrumentations without having to explicitly request them.

Dependency data is transmitted only when a LUMIGO_TRACER_TOKEN is present in the environment. You can opt out of this reporting by setting the environment variable LUMIGO_REPORT_DEPENDENCIES to false.
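
To get a feel for what a set of package-name and version pairs looks like for your own environment, you can list the installed distributions with the standard library. This is only a sketch of the kind of data described above. How the distribution actually collects and formats its report is not documented here, and installed_packages is a hypothetical helper.

```python
from importlib import metadata


def installed_packages():
    """Return a {package name: version} mapping for the current environment."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with missing or broken metadata
            packages[name] = dist.version
    return packages


if __name__ == "__main__":
    found = installed_packages()
    print(f"{len(found)} installed packages found")
```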

Baseline setup

The Lumigo OpenTelemetry Distribution automatically creates the following OpenTelemetry constructs and provides them to a TracerProvider.

Resource attributes

SDK resource attributes

  • Default resource attributes:

    • telemetry.sdk.language: python
    • telemetry.sdk.name: opentelemetry
    • telemetry.sdk.version: depends on the version of the opentelemetry-sdk included in the dependencies.
  • lumigo.distro.version: Contains the version of the Lumigo OpenTelemetry Distribution for Python as specified in the VERSION file.

Process resource attributes

  • As specified in the Process Semantic Conventions, the following process.runtime.* attributes are provided:

    • process.runtime.description
    • process.runtime.name
    • process.runtime.version
  • process.environ: A non-standard resource attribute, which contains a stringified representation of the process environment, with environment variables scrubbed based on the LUMIGO_SECRET_MASKING_REGEX configuration.

Amazon ECS resource attributes

If the instrumented Python application is running on the Amazon Elastic Container Service (ECS), the following resource attributes are automatically added:

  • cloud.provider attribute with value aws
  • cloud.platform with value aws_ecs
  • container.name with the hostname of the ECS Task container
  • container.id with the ID of the Docker container (based on the cgroup id)

If the ECS task uses the ECS agent v1.4.0, and therefore has access to Task metadata endpoint version 4, the following experimental attributes are added, as specified in the AWS ECS Resource Attributes specification:

  • aws.ecs.container.arn
  • aws.ecs.cluster.arn
  • aws.ecs.launchtype
  • aws.ecs.task.arn
  • aws.ecs.task.family
  • aws.ecs.task.revision

Kubernetes resource attributes

  • k8s.pod.uid with the Pod identifier, supported for both cgroups v1 and v2

Span exporters

📘

Note

Do not use LUMIGO_DEBUG_SPANDUMP in production.

SDK configuration

  • The following SDK environment variables are supported:

    • OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT
    • OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT

📘

Note

If the OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT environment variable is not set, the span attribute size limit is taken from the OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT environment variable. If neither is set, the default size limit is 2048.
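
The precedence described in this note can be written out as a short lookup. This is a sketch of the documented fallback order only, using a hypothetical helper name. It is not the SDK's own code, and it does not handle malformed values beyond letting int() raise.

```python
import os

DEFAULT_LIMIT = 2048


def span_attribute_length_limit(env=None):
    """Resolve the span attribute value length limit from an environment mapping."""
    env = os.environ if env is None else env
    # The span-specific variable wins; the general one is the fallback.
    for name in ("OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT", "OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT"):
        raw = env.get(name)
        if raw:
            return int(raw)
    return DEFAULT_LIMIT


if __name__ == "__main__":
    print(span_attribute_length_limit())
```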

Advanced use cases

Access to the TracerProvider

The Lumigo OpenTelemetry Distribution provides access to the TracerProvider it configures (see the Baseline setup section for more information). The TracerProvider is exposed through the tracer_provider attribute of the lumigo_opentelemetry package:

from lumigo_opentelemetry import tracer_provider

# You can do tasks like adding span processors here

Ensure spans are flushed to Lumigo before shutdown

For short-running processes, the BatchSpanProcessor configured by the Lumigo OpenTelemetry Distribution may not get the chance to send all tracing data to Lumigo before the process exits (see the Baseline setup section for more information). Through access to the tracer_provider, however, you can ensure that all spans are flushed to Lumigo. To force a flush of all spans:

from lumigo_opentelemetry import tracer_provider

# Do some logic

tracer_provider.force_flush()

# Now the Python process can terminate, with all the so far closed spans sent to Lumigo

Consuming SQS messages with Boto3 receive_message

Messaging instrumentations that retrieve messages from queues can be counter-intuitive for end users. When retrieving messages from an SQS queue using boto3, it is expected that all subsequent operations using these messages, such as sending data to a database or sending messages to another queue, are captured as child spans of the message retrieval span. Consider the following scenario, which is supported by the boto3 SQS receive_message instrumentation of the Lumigo OpenTelemetry Distribution for Python:

import boto3
from opentelemetry import trace

client = boto3.client("sqs")
tracer = trace.get_tracer(__name__)

response = client.receive_message(...)  # Instrumentation creates a `span_0` span

for message in response.get("Messages", []):
    # span_0 (the SQS.ReceiveMessage span) is active in this scope
    with tracer.start_as_current_span("span_1"):  # span_0 is the parent of span_1
        do_something()  # placeholder for your own message handling

Without the proper scope provided by the iterator over response["Messages"], span_1 would be without a parent span, resulting in a separate invocation and a separate transaction in Lumigo.

Filtering out empty SQS messages

SQS-based applications often poll an SQS queue continuously and process messages as they arrive. Empty polling responses can clutter your tracing data. By default, empty SQS polling messages are filtered out and not sent to Lumigo. To modify this behavior, set the boolean environment variable LUMIGO_AUTO_FILTER_EMPTY_SQS to false.

  • LUMIGO_AUTO_FILTER_EMPTY_SQS: If true, filters out empty SQS polling messages. By default, this is set to true.

Filtering HTTP endpoints

You can selectively filter spans based on HTTP server/client endpoints for various components, including web frameworks.

Global filtering

Set the LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX environment variable to a list of regex strings. Spans with matching server/client endpoints will not be traced.

Specific Filtering

For exclusive server (inbound) or client (outbound) span filtering, use the environment variables:

  • LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_SERVER
  • LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT

Each of these environment variables must be a valid JSON array of strings. For example, to match endpoints with the hostname google.com, set the value to ["google\\.com"]. If an HTTP call to an OpenTelemetry-traced component is filtered out, every subsequent invocation made by that component will also not be traced.

Examples:

  • Filtering out every incoming HTTP request to the /login endpoint (will also match requests such as /login?user=foo, /login/bar):
    • LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_SERVER=["\\/login"]
  • Filtering out every outgoing HTTP request to the google.com domain (will also match requests such as google.com/foo, bar.google.com):
    • LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT=["google\\.com"]
  • Filtering out every outgoing HTTP request to https://www.google.com (will also match requests such as https://www.google.com/, https://www.google.com/foo):
    • LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT=["https:\\/\\/www\\.google\\.com"]
  • Filtering out every HTTP request (incoming or outgoing) with the word login:
    • LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX=["login"]
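
The double backslashes in the examples above come from JSON escaping: after the JSON array is decoded, "google\\.com" becomes the regular expression google\.com. The sketch below shows that decoding step and an unanchored match, which is consistent with the examples above matching /login?user=foo and bar.google.com. The helper names load_patterns and is_filtered are hypothetical, and the use of re.search is an assumption about how the distribution applies the patterns.

```python
import json
import re


def load_patterns(raw):
    """Decode a JSON array of regex strings into compiled patterns."""
    patterns = json.loads(raw) if raw else []
    if not isinstance(patterns, list) or not all(isinstance(p, str) for p in patterns):
        raise ValueError("expected a JSON array of strings")
    return [re.compile(pattern) for pattern in patterns]


def is_filtered(endpoint, patterns):
    """Return True if any pattern matches anywhere in the endpoint (assumed semantics)."""
    return any(pattern.search(endpoint) for pattern in patterns)


# Raw strings hold the values exactly as they would appear in the environment.
client_patterns = load_patterns(r'["google\\.com"]')
server_patterns = load_patterns(r'["\\/login"]')

print(is_filtered("https://bar.google.com/foo", client_patterns))
print(is_filtered("/login?user=foo", server_patterns))
```

Testing a candidate value this way before deploying it helps catch a missing escape, such as an unescaped dot that would match any character.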

Contributing

For guidelines on contributing, please see CONTRIBUTING.md.