Lumigo OpenTelemetry Distribution for Python
Send Data with the Lumigo Distribution for Python applications
Overview
The Lumigo OpenTelemetry Distribution for Python automates application instrumentation by collecting traces, metrics, and logs for you. It integrates several upstream OpenTelemetry packages with additional logic, automated quality assurance, and customizations, so you can add distributed tracing without modifying any lines of code in your application. It combines the power and flexibility of OpenTelemetry with Lumigo’s enhancements, such as enriching your data for better debugging and improving tracing across services.
For instructions on auto-instrumentation, see the No-code activation section. View our Lumigo OpenTelemetry Distro for Python GitHub page for additional context and information. If you are deploying on AWS Lambda functions, use the Lumigo Lambda Tracer for Python.
Note
If you’re using the Lumigo Kubernetes Operator, your code will be automatically instrumented with the Lumigo OpenTelemetry Distribution for Python. This means you don’t need to add the distribution as described in the Setup section below.
Setup
There are three steps to adding the Lumigo OpenTelemetry Distribution for Python to your application.
1. Add dependencies
Add lumigo_opentelemetry to your application.
lumigo_opentelemetry
pip install lumigo_opentelemetry
2. Environment-based configuration
Configure the Lumigo OpenTelemetry Python distribution package using environment variables.
LUMIGO_TRACER_TOKEN=<token>
OTEL_SERVICE_NAME="your-service-name"
LUMIGO_ENABLE_LOGS='true'
Variable | Description |
---|---|
LUMIGO_TRACER_TOKEN | Your Lumigo token |
OTEL_SERVICE_NAME | The service name you choose for your application. If your code is running on K8s or ECS, there is no need to set this env var as it will be updated automatically. |
LUMIGO_ENABLE_LOGS | If true, logs are sent to Lumigo. By default, this is false. |
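The variables in the table above can be checked before the application starts. The following is a minimal sketch that uses only the standard library; the helper names `missing_lumigo_settings` and `logs_enabled` are hypothetical and not part of the lumigo_opentelemetry package, and the case-insensitive comparison of `true` is an assumption.

```python
import os

# Only the token is treated as required here; OTEL_SERVICE_NAME is
# set automatically on K8s and ECS according to the table above.
REQUIRED_VARS = ("LUMIGO_TRACER_TOKEN",)


def missing_lumigo_settings(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def logs_enabled(env=os.environ):
    """Return True only when LUMIGO_ENABLE_LOGS is set to 'true'.

    Assumption: the value is compared case-insensitively.
    """
    return env.get("LUMIGO_ENABLE_LOGS", "false").strip().lower() == "true"
```

Calling `missing_lumigo_settings()` at startup and logging the result gives an early hint when the token was not passed to the process.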
3. Automatic instrumentation
There are two approaches to activating the lumigo_opentelemetry package. No-code activation imports the package via the environment. Manual activation imports the package in code. We recommend using no-code activation.
No-code activation: set the following environment variable:
AUTOWRAPT_BOOTSTRAP=lumigo_opentelemetry
Manual activation: import the package in your application code:
import lumigo_opentelemetry
For standalone Python scripts, cron jobs, and other similar use cases that are not built around Lumigo or OpenTelemetry-instrumented libraries, you can use the @lumigo_wrapped decorator. For further details, see the Using the @lumigo_wrapped Decorator section.
Advanced Configuration
The Lumigo OpenTelemetry Distribution for Python supports both standard OpenTelemetry settings and Lumigo-specific configurations. Use the following environment variables to configure your tracing setup.
OpenTelemetry configurations
The distribution integrates several upstream OpenTelemetry packages with additional logic. Due to this, any environment variables for vanilla, unmodified OpenTelemetry also apply here. Key configurations we support include:
- General configurations
- Batch span processor configurations: The Lumigo OpenTelemetry Distribution for Python uses a batch processor for sending data to Lumigo.
Lumigo-specific configurations
The lumigo_opentelemetry
package additionally supports the following configuration options as environment variables:
Environment Variable | Description |
---|---|
| [Required] Token required to send data to Lumigo. You can find the value in Lumigo under |
| If |
| If |
| Masks values of keys that match the supplied list of regular expressions. Both traces and logs are filtered.
Default list is:
|
| Secret masking for HTTP request bodies, overrides |
| Secret masking for HTTP request headers, overrides |
| Secret masking for HTTP response bodies, overrides |
| Secret masking for HTTP response headers, overrides |
| Secret masking for HTTP query parameters, overrides |
| Secret masking for environment variables, overrides |
| If |
| If |
| Avoids creating traces for empty SQS messages. See Filtering out empty SQS messages section |
| Filters client and server endpoints using a list of regular expressions. In the format of |
| Applies regex filtering exclusively to server spans. Filters according to span attributes: |
| Applies regex filtering exclusively to client spans. Filters according to span attributes: |
| If |
| A path to a local file to which spans are written for troubleshooting purposes. Should not be used in production unless directed by Lumigo support. Example value: |
| A path to a local file to dump logs, used for troubleshooting. Effective only when |
Execution Tags
Execution Tags allow you to dynamically add dimensions to your invocations so that they can be identified, searched for, and filtered in Lumigo. For example, in multi-tenanted systems, execution tags are often used to mark the identifiers of the end-users that trigger invocations, both for analysis (such as in the Explore view) and for alerting purposes.
By leveraging execution tags, you can gain deeper insights into your application's runtime behavior.
Adding Execution Tags
In the Lumigo OpenTelemetry Distribution for Python, execution tags are represented as span attributes with the lumigo.execution_tags. prefix. For example, you could add an execution tag as follows:
from opentelemetry.trace import get_current_span
get_current_span().set_attribute('lumigo.execution_tags.foo','bar')
When you are using OpenTelemetry's get_current_span() API, you do not need to keep track of the current span. You can get it at any point of your program execution.
In OpenTelemetry, span attributes can be strings, numbers (double-precision floating point or signed 64-bit integer), booleans (together known as "primitive types"), and arrays of one primitive type (such as an array of strings, an array of numbers, or an array of booleans). In Lumigo, booleans and numbers are transformed to strings.
When using the Span.set_attribute API multiple times on the same span for the same key, new values overwrite the previous values instead of adding to them:
from opentelemetry.trace import get_current_span
get_current_span().set_attribute('lumigo.execution_tags.foo','bar')
get_current_span().set_attribute('lumigo.execution_tags.foo','baz')
In the snippet above, the foo execution tag will only have the baz value in Lumigo. The bar value will have been overridden.
If you want to set multiple values for an execution tag:
from opentelemetry.trace import get_current_span
get_current_span().set_attribute('lumigo.execution_tags.foo',['bar', 'baz'])
This will include both values without either being overridden.
We also support using Tuples to specify multiple values for an execution tag:
from opentelemetry.trace import get_current_span
get_current_span().set_attribute('lumigo.execution_tags.bar',('baz','xyz',))
Using the snippet above will result in the bar tag having both baz and xyz values in Lumigo.
Another option for setting multiple values is using execution tags in different spans of an invocation.
Execution Tags in different spans of an invocation
In Lumigo, multiple spans may be merged together into one invocation. This is the entry that you see, for example, in the Explore view. The invocation will include all execution tags on all its spans, and merge their values:
from opentelemetry import trace
trace.get_current_span().set_attribute('lumigo.execution_tags.foo','bar')
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span('child_span') as child_span:
    child_span.set_attribute('lumigo.execution_tags.foo','baz')
In the example above, the invocation in Lumigo will have both bar and baz values associated with the foo execution tag. Which spans are merged in the same invocation depends on the parent-child relations among those spans. Explaining this topic is outside the scope of this documentation. A good first read to get deeper into the topic is the Traces documentation of OpenTelemetry. If your execution tags on different spans appear on different invocations than you expect, get in touch with Lumigo support.
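The merging described above can be pictured with a small model. The following sketch is not Lumigo's implementation: it treats each span as a plain dictionary of attributes, the function name `merge_execution_tags` is hypothetical, and dropping duplicate values is an assumption.

```python
from collections import defaultdict

PREFIX = "lumigo.execution_tags."


def merge_execution_tags(spans):
    """Collect execution tag values from all spans of one invocation.

    `spans` is an iterable of attribute dictionaries, one per span.
    Returns a dict mapping each tag key (without prefix) to its values.
    """
    merged = defaultdict(list)
    for attributes in spans:
        for name, value in attributes.items():
            if not name.startswith(PREFIX):
                continue  # Not an execution tag
            key = name[len(PREFIX):]
            values = value if isinstance(value, (list, tuple)) else [value]
            for item in values:
                if item not in merged[key]:  # Assumption: duplicates are dropped
                    merged[key].append(item)
    return dict(merged)


parent = {"lumigo.execution_tags.foo": "bar"}
child = {"lumigo.execution_tags.foo": "baz", "http.method": "GET"}
tags = merge_execution_tags([parent, child])
```

Here `tags` holds both values for the foo key, mirroring the parent and child spans in the example above.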
Execution Tag Limitations
Execution tags in Lumigo have the following limitations:
- Up to a maximum of 50 execution tag keys per invocation in Lumigo, irrespective of how many spans are part of the invocation or how many values each execution tag has.
- The key of an execution tag cannot contain the . character. For example, lumigo.execution_tags.my.tag is not a valid tag. The OpenTelemetry Span.set_attribute() API will not fail or log warnings, but the tag will be displayed as my in Lumigo.
- Each execution tag key can be at most 50 characters long. The lumigo.execution_tags. prefix does not count against the 50-character limit.
- Each execution tag value can be at most 70 characters long.
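Because Span.set_attribute() does not report violations of these limits, it can help to check a tag before setting it. The following is a minimal sketch of such a check; `check_execution_tag` is a hypothetical helper, not part of lumigo_opentelemetry, and it covers only the per-key and per-value limits listed above (not the 50-keys-per-invocation limit).

```python
MAX_KEY_LENGTH = 50    # Excludes the lumigo.execution_tags. prefix
MAX_VALUE_LENGTH = 70


def check_execution_tag(key, value):
    """Return a list of problems; an empty list means the tag fits the limits.

    `key` is the tag key without the lumigo.execution_tags. prefix.
    `value` may be a single value or a list/tuple of values.
    """
    problems = []
    if "." in key:
        problems.append("key contains '.'")
    if len(key) > MAX_KEY_LENGTH:
        problems.append("key is longer than 50 characters")
    values = value if isinstance(value, (list, tuple)) else [value]
    # Values are measured as strings, since Lumigo transforms
    # booleans and numbers to strings.
    if any(len(str(item)) > MAX_VALUE_LENGTH for item in values):
        problems.append("value is longer than 70 characters")
    return problems
```

A caller could log the returned problems and skip or shorten the tag, rather than discovering a truncated key in Lumigo later.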
Programmatic Errors
Programmatic Errors allow you to define custom errors for monitoring and troubleshooting issues that should not necessarily interfere with the service. For example, an application tries to remove a user who does not exist. These custom errors can be captured by adding just a few lines of code to your application.
Programmatic errors indicate that a non-fatal error occurred, such as an application error. You can also log programmatic errors, track custom error issues, and trigger Alerts.
Creating a Programmatic Error
You can create programmatic errors using the create_programmatic_error function provided by the lumigo_opentelemetry package. This function allows you to capture and report errors with minimal setup, without the need for direct interaction with the OpenTelemetry trace SDK.
For example, you can add a programmatic error as follows:
from lumigo_opentelemetry import create_programmatic_error
create_programmatic_error("The customer 123 was not found", "CustomerNotExist")
In this example, the first argument, "The customer 123 was not found", is a descriptive message for the error. The second argument, "CustomerNotExist", represents the type of the error.
Alternatively, programmatic errors can also be created by adding span events with a custom attribute named lumigo.type.
For example, you could add a programmatic error as follows:
from opentelemetry.trace import get_current_span
get_current_span().add_event('<error-message>', {'lumigo.type': '<error-type>'})
This gives you a flexible way to log programmatic errors by enriching an active span with additional error context.
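To show where a programmatic error fits into application code, here is a minimal sketch of the "remove a user who does not exist" case mentioned earlier. The `remove_user` function, the `users` dictionary, and the "UserNotExist" error type are hypothetical; the reporting function is passed in as a parameter so the sketch runs without Lumigo installed.

```python
def remove_user(users, user_id, report_error):
    """Remove user_id from users, reporting a non-fatal error if it is absent.

    `report_error` is any callable taking (message, error_type), for example
    lumigo_opentelemetry.create_programmatic_error.
    """
    if user_id not in users:
        # The service keeps running; the error is only recorded.
        report_error(f"The user {user_id} was not found", "UserNotExist")
        return False
    del users[user_id]
    return True
```

In an instrumented application you would call it as `remove_user(users, "123", create_programmatic_error)` after importing `create_programmatic_error` from lumigo_opentelemetry.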
Using the @lumigo_wrapped Decorator
The @lumigo_wrapped decorator offers a simple way to integrate Lumigo tracing into functions in scripts that are not using libraries or frameworks instrumented by Lumigo or OpenTelemetry.
from lumigo_opentelemetry import lumigo_wrapped
@lumigo_wrapped
def your_function():
    pass
When applied, the decorator creates an OpenTelemetry span for the decorated function, adding attributes such as:
- input_args: The positional arguments passed to the function.
- input_kwargs: The keyword arguments passed to the function.
- return_value: The value returned by the function.
For example, you can use the @lumigo_wrapped decorator as follows:
from lumigo_opentelemetry import lumigo_wrapped
@lumigo_wrapped
def calculate_sum(start, end):
    """Calculates the sum of numbers in a given range."""
    return sum(range(start, end + 1))
if __name__ == "__main__":
    result = calculate_sum(1, 100)
    print(f"The sum of numbers from 1 to 100 is: {result}")
This automatically creates a span when calculate_sum() is called, capturing the function arguments (start and end) and the return value (the calculated sum) as span attributes.
Note
If your script does not make any outbound requests, such as when the code crashes before executing any calls, no traces will be generated in Lumigo. This is due to OpenTelemetry's requirement that spans be initiated through instrumentation or through explicit tracing calls.
Supported runtimes
The following runtimes are supported by Lumigo OpenTelemetry Distribution:
- cpython: 3.9.x, 3.10.x, 3.11.x, 3.12.x, 3.13.x
Deprecation Notice: As of version 1.0.156, support for Python 3.7 has been deprecated. The last version of the Lumigo OpenTelemetry Distribution to support Python 3.7 is version 1.0.155.
Supported packages
See the latest list of packages supported out of the box and regularly tested by Lumigo here.
Automated dependency reporting
To enhance support and inform data-driven decisions regarding which packages to support next, the Lumigo OpenTelemetry Distribution for Python reports the packages and their versions used in your application to Lumigo SaaS at startup. This report also includes OpenTelemetry resource data, enabling analytics that reveal which platforms are utilizing which dependencies.
The uploaded data consists of a set of key-value pairs representing package names and their corresponding versions. This information complements the tracing data sent to Lumigo by covering dependencies that may not yet have dedicated instrumentation in the Lumigo OpenTelemetry Distribution for Python. The sole purpose of these analytics is to ensure you receive the necessary instrumentations without having to explicitly request them.
Dependency data is transmitted only when a LUMIGO_TRACER_TOKEN is present in the environment. You can opt out of this reporting by setting the environment variable LUMIGO_REPORT_DEPENDENCIES=false.
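To get a feel for the shape of the reported data (package names mapped to versions), you can list the packages installed in your own environment. The following sketch uses only the standard library and is not how the distribution collects or uploads the data; `installed_packages` is a hypothetical helper.

```python
from importlib import metadata


def installed_packages():
    """Return a dict mapping installed package names to their versions."""
    packages = {}
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name")
            version = dist.version
        except Exception:
            continue  # Skip distributions with unreadable metadata
        if name:
            packages[name] = version
    return packages
```

Printing `installed_packages()` in the same virtual environment as your application shows the kind of key-value pairs described above.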
Baseline setup
The Lumigo OpenTelemetry Distribution automatically creates the following OpenTelemetry constructs and provides them to a TracerProvider.
Resource attributes
SDK resource attributes
- Default resource attributes:
  - telemetry.sdk.language: python
  - telemetry.sdk.name: opentelemetry
  - telemetry.sdk.version: depends on the version of the opentelemetry-sdk included in the dependencies.
- lumigo.distro.version: Contains the version of the Lumigo OpenTelemetry Distribution for Python as specified in the VERSION file.
Process resource attributes
- As specified in the Process Semantic Conventions, the following process.runtime.* attributes are provided:
  - process.runtime.description
  - process.runtime.name
  - process.runtime.version
- process.environ: A non-standard resource attribute, which contains a stringified representation of the process environment, with environment variables scrubbed based on the LUMIGO_SECRET_MASKING_REGEX configuration.
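The scrubbing of environment variables by key can be illustrated with a short model. This is a sketch under stated assumptions, not the distribution's implementation: the example pattern, the `****` replacement text, case-insensitive matching, and the function name `mask_environment` are all hypothetical.

```python
import re


def mask_environment(env, patterns, mask="****"):
    """Return a copy of env with values masked where the key matches a pattern.

    Assumptions: patterns are matched case-insensitively anywhere in the key,
    and masked values are replaced by a fixed placeholder.
    """
    compiled = [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    return {
        key: mask if any(rx.search(key) for rx in compiled) else value
        for key, value in env.items()
    }


scrubbed = mask_environment(
    {"DB_PASSWORD": "hunter2", "HOME": "/root"},
    [".*pass.*"],  # Hypothetical pattern, not the documented default list
)
```

Only the value of DB_PASSWORD is replaced; HOME is left untouched because its key matches no pattern.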
Amazon ECS resource attributes
If the instrumented Python application is running on the Amazon Elastic Container Service (ECS), the following resource attributes are automatically added:
- cloud.provider with value aws
- cloud.platform with value aws_ecs
- container.name with the hostname of the ECS Task container
- container.id with the ID of the Docker container (based on the cgroup id)
If the ECS task uses the ECS agent v1.4.0, and therefore has access to Task metadata endpoint version 4, the following experimental attributes are added, as specified in the AWS ECS Resource Attributes specification:
aws.ecs.container.arn
aws.ecs.cluster.arn
aws.ecs.launchtype
aws.ecs.task.arn
aws.ecs.task.family
aws.ecs.task.revision
Kubernetes resource attributes
- k8s.pod.uid with the Pod identifier, supported for both cgroups v1 and v2
Span exporters
- If the LUMIGO_TRACER_TOKEN environment variable is set: a BatchSpanProcessor, which uses an OTLPSpanExporter to push tracing data to Lumigo, is used.
- If the LUMIGO_DEBUG_SPANDUMP environment variable is set: a SimpleSpanProcessor, which uses a ConsoleSpanExporter to save the collected spans to a file, is used.
Note
Do not use LUMIGO_DEBUG_SPANDUMP in production.
SDK configuration
- The following SDK environment variables are supported:
  - OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT
  - OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT
Note
If the OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT environment variable is not set, the span attribute size limit is taken from the OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT environment variable. If neither is set, the default size limit is 2048.
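The fallback order in the note can be written out as a small function. This is a sketch of the documented precedence only, using the standard library; `span_attribute_length_limit` is a hypothetical name, and values are assumed to be plain integers.

```python
import os

DEFAULT_LIMIT = 2048


def span_attribute_length_limit(env=os.environ):
    """Resolve the span attribute length limit using the documented order."""
    for name in (
        "OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT",  # Checked first
        "OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT",       # Fallback
    ):
        raw = env.get(name)
        if raw:
            return int(raw)  # Assumption: the value is a plain integer
    return DEFAULT_LIMIT
```

For example, with only OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT=512 set, the function returns 512; with neither variable set, it returns 2048.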
Advanced use cases
Access to the TracerProvider
The Lumigo OpenTelemetry Distribution provides access to the TracerProvider it configures (see the Baseline setup section for more information) through the tracer_provider attribute of the lumigo_opentelemetry package:
from lumigo_opentelemetry import tracer_provider
# You can do tasks like adding span processors here
Ensure spans are flushed to Lumigo before shutdown
For short-running processes, the BatchSpanProcessor configured by the Lumigo OpenTelemetry Distribution may not ensure that tracing data is sent to Lumigo before the process exits. See the Baseline setup section for more information. Through access to the tracer_provider, however, it is possible to ensure that all spans are flushed to Lumigo. To force a flush of all spans:
from lumigo_opentelemetry import tracer_provider
# Do some logic
tracer_provider.force_flush()
# Now the Python process can terminate, with all the so far closed spans sent to Lumigo
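For scripts with several exit paths, one option is to register the flush once so it runs at interpreter exit. The following is a minimal sketch built on the standard library's atexit module; `flush_on_exit` is a hypothetical helper, and the provider is passed in as a parameter so the sketch does not depend on Lumigo being installed.

```python
import atexit


def flush_on_exit(provider, register=atexit.register):
    """Arrange for provider.force_flush() to run when the interpreter exits.

    `provider` is any object with a force_flush() method, such as the
    tracer_provider exposed by lumigo_opentelemetry.
    """
    register(provider.force_flush)
    return provider
```

In an instrumented script you would call `flush_on_exit(tracer_provider)` right after importing `tracer_provider` from lumigo_opentelemetry. Note that atexit handlers do not run when the process is killed by an unhandled signal or exits through `os._exit()`, so an explicit `force_flush()` call is still the safer choice on those paths.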
Consuming SQS messages with Boto3 receive_message
Messaging instrumentations that retrieve messages from queues can be counter-intuitive for end-users. When retrieving messages from an SQS queue using boto3, it is expected that all subsequent operations using these messages, such as sending data to a database or sending messages to another queue, are captured as child spans of the message retrieval span. Consider the following scenario, which is supported by the boto3 SQS receive_message instrumentation of the Lumigo OpenTelemetry Distribution for Python:
import boto3
from opentelemetry import trace
client = boto3.client("sqs")  # Assumes AWS credentials and region are configured
tracer = trace.get_tracer(__name__)
response = client.receive_message(...)  # Instrumentation creates a `span_0` span
for message in response.get("Messages", []):
    # The SQS.ReceiveMessage span (span_0) is active in this scope
    with tracer.start_as_current_span("span_1"):  # span_0 is the parent of span_1
        do_something()
Without the proper scope provided by the iterator over response["Messages"], span_1 would be without a parent span, resulting in a separate invocation and a separate transaction in Lumigo.
Filtering out empty SQS messages
SQS-based applications often continuously poll an SQS queue for messages and then process them as they arrive. Empty responses can clutter your tracing data. By default, empty SQS polling messages are filtered out and not sent to Lumigo. To modify this behavior, set the boolean environment variable LUMIGO_AUTO_FILTER_EMPTY_SQS to false.
LUMIGO_AUTO_FILTER_EMPTY_SQS=true
If true, filters out empty SQS polling messages. By default, this is set to true.
Filtering HTTP endpoints
You can selectively filter spans based on HTTP server/client endpoints for various components, including web frameworks.
Global filtering
Set the LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX environment variable to a list of regex strings. Spans with matching server/client endpoints will not be traced.
Specific Filtering
For exclusive server (inbound) or client (outbound) span filtering, use the environment variables:
LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_SERVER
LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT
The value of each environment variable must be a valid JSON array of strings. For example, to match endpoints with the hostname google.com, set the value to ["google\\.com"]. If you filter out an HTTP call to an OpenTelemetry-traced component, every subsequent invocation made by that component will also not be traced.
Examples:
- Filtering out every incoming HTTP request to the /login endpoint (will also match requests such as /login?user=foo, /login/bar): LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_SERVER=["\\/login"]
- Filtering out every outgoing HTTP request to the google.com domain (will also match requests such as google.com/foo, bar.google.com): LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT=["google\\.com"]
- Filtering out every outgoing HTTP request to https://www.google.com (will also match requests such as https://www.google.com/, https://www.google.com/foo): LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT=["https:\\/\\/www\\.google\\.com"]
- Filtering out every HTTP request (incoming or outgoing) with the word login: LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX=["login"]
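Before deploying a filter, you can check locally that a value parses as a JSON array of strings and matches the endpoints you expect. The following sketch uses only the standard library; the helper names are hypothetical, and the use of `re.search` (a match anywhere in the URL) is an assumption inferred from the examples above, not a statement about the distribution's internals.

```python
import json
import re


def parse_endpoint_filters(raw_value):
    """Parse an environment variable value into compiled regex patterns."""
    patterns = json.loads(raw_value)
    if not isinstance(patterns, list) or not all(isinstance(p, str) for p in patterns):
        raise ValueError("expected a JSON array of strings")
    return [re.compile(pattern) for pattern in patterns]


def is_filtered(url, filters):
    """Return True if any pattern matches anywhere in the URL (assumption)."""
    return any(rx.search(url) for rx in filters)


# Raw strings keep the double backslashes exactly as they appear
# in the environment variable value.
server_filters = parse_endpoint_filters(r'["\\/login"]')
client_filters = parse_endpoint_filters(r'["google\\.com"]')
```

With these filters, `is_filtered("/login?user=foo", server_filters)` and `is_filtered("https://bar.google.com", client_filters)` both return True, while an invalid value such as `google.com` without the JSON array syntax raises an error from `json.loads`.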
Contributing
For guidelines on contributing, please see CONTRIBUTING.md.