Instrumentation
There are two main ways to instrument your application with the Langfuse SDKs:
- Using our native integrations for popular LLM and agent libraries such as OpenAI, LangChain or the Vercel AI SDK. They automatically create observations and traces and capture prompts, responses, usage, and errors.
- Manually instrumenting your application with the Langfuse SDK. The SDKs provide three ways to create observations: context managers, the observe decorator/wrapper, and manually managed observation objects.
All approaches are interoperable. You can nest a decorator-created observation inside a context manager or mix manual spans with our native integrations.
Custom instrumentation
Instrument your application with the Langfuse SDK using the following methods:
Context manager
The context manager allows you to create a new span and set it as the currently active observation in the OTel context for its duration. All new observations created within this block will automatically be its children.
start_as_current_observation() is the primary way to create observations while ensuring the active OpenTelemetry context is updated. Any child observations created inside the with block inherit the parent automatically.
Observations can have different types by setting the as_type parameter.
from langfuse import get_client, propagate_attributes

langfuse = get_client()

with langfuse.start_as_current_observation(
    as_type="span",
    name="user-request-pipeline",
    input={"user_query": "Tell me a joke"},
) as root_span:
    with propagate_attributes(user_id="user_123", session_id="session_abc"):
        with langfuse.start_as_current_observation(
            as_type="generation",
            name="joke-generation",
            model="gpt-4o",
        ) as generation:
            generation.update(output="Why did the span cross the road?")

    root_span.update(output={"final_joke": "..."})

startActiveObservation accepts a callback, makes the new span active for the callback scope, and ends it automatically, even across async boundaries.
Observations can have different types by setting the asType parameter.
import { startActiveObservation, startObservation } from "@langfuse/tracing";

await startActiveObservation("user-request", async (span) => {
  span.update({ input: { query: "Capital of France?" } });

  const generation = startObservation(
    "llm-call",
    { model: "gpt-4", input: [{ role: "user", content: "Capital of France?" }] },
    { asType: "generation" }
  );
  generation.update({ output: { content: "Paris." } }).end();

  span.update({ output: "Answered." });
});

Observe wrapper
The observe decorator automatically captures inputs, outputs, timings, and errors of a wrapped function without modifying the function's internal logic.
Use observe() to decorate a function.
Observations can have different types by setting the as_type parameter.
from langfuse import observe

@observe()
def my_data_processing_function(data, parameter):
    return {"processed_data": data, "status": "ok"}

@observe(name="llm-call", as_type="generation")
async def my_async_llm_call(prompt_text):
    return "LLM response"

Capturing large inputs/outputs may add overhead. Disable IO capture per decorator (capture_input=False, capture_output=False) or via the LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED environment variable.
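To build intuition for what such a wrapper does, here is a minimal, generic capture decorator in plain Python. It is an illustration of the pattern only, not the SDK's observe implementation; it records into a local dict instead of exporting an observation:

```python
import functools
import time

def capture(func):
    """Record inputs, output, timing, and errors around a call (illustrative only)."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        record = {"name": func.__name__, "input": {"args": args, "kwargs": kwargs}}
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            record["output"] = result
            return result
        except Exception as exc:
            record["error"] = repr(exc)
            raise  # errors are recorded but still propagate to the caller
        finally:
            record["duration_s"] = time.perf_counter() - start
            wrapper.last_record = record  # a real SDK would export this as an observation

    return wrapper

@capture
def process(data, parameter="x"):
    return {"processed_data": data, "status": "ok"}

process([1, 2], parameter="y")
print(process.last_record["name"])  # → process
```

The real decorator additionally handles async functions, generators, and nesting via the OpenTelemetry context, but the capture-and-re-raise shape is the same.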
Use observe() to wrap a function and automatically capture inputs, outputs, timings, and errors.
Observations can have different types by setting the asType parameter.
import { observe, updateActiveObservation } from "@langfuse/tracing";

async function fetchData(source: string) {
  updateActiveObservation({ metadata: { source: "API" } });
  return { data: `some data from ${source}` };
}

const tracedFetchData = observe(fetchData, {
  name: "fetch-data",
  asType: "span",
});

const result = await tracedFetchData("API");

Capturing large inputs/outputs may add overhead. Disable IO capture per wrapper (captureInput: false, captureOutput: false) or via the LANGFUSE_OBSERVE_DECORATOR_IO_CAPTURE_ENABLED environment variable.
Manual observations
You can also manually create observations. This is useful when you need to:
- Record work that is self-contained or happens in parallel to the main execution flow but should still be part of the same overall trace (e.g., a background task initiated by a request).
- Manage the observation's lifecycle explicitly, perhaps because its start and end are determined by non-contiguous events.
- Obtain an observation object reference before it's tied to a specific context block.
Use start_observation() when you need manual control without changing the active context.
You can pass the as_type parameter to specify the type of observation to create.
from langfuse import get_client

langfuse = get_client()

span = langfuse.start_observation(name="manual-span")
span.update(input="Data for side task")

child = span.start_observation(name="child-span", as_type="generation")
child.end()

span.end()

Manual Ending Required
If you use start_observation(), you are responsible for calling .end() on the returned observation object. Failure to do so will result in incomplete or missing observations in Langfuse. Their start_as_current_... counterparts used with a with statement handle this automatically.
Key Characteristics:
- No Context Shift: Unlike their start_as_current_... counterparts, these methods do not set the new observation as the active one in the OpenTelemetry context. The previously active span (if any) remains the current context for subsequent operations in the main execution flow.
- Parenting: The observation created by start_observation() will still be a child of the span that was active in the context at the moment of its creation.
- Manual Lifecycle: These observations are not managed by a with block and therefore must be explicitly ended by calling their .end() method.
- Nesting Children: Subsequent observations created using the global langfuse.start_as_current_observation() (or similar global methods) will not be children of these "manual" observations; they will be parented by the original active span. To create children directly under a "manual" observation, use methods on that specific observation object (e.g., manual_span.start_as_current_observation(...)).
Example with more complex nesting:
from langfuse import get_client

langfuse = get_client()

# This outer span establishes an active context.
with langfuse.start_as_current_observation(as_type="span", name="main-operation") as main_operation_span:
    # 'main_operation_span' is the current active context.

    # 1. Create a "manual" span using langfuse.start_observation().
    #    - It becomes a child of 'main_operation_span'.
    #    - Crucially, 'main_operation_span' REMAINS the active context.
    #    - 'manual_side_task' does NOT become the active context.
    manual_side_task = langfuse.start_observation(name="manual-side-task")
    manual_side_task.update(input="Data for side task")

    # 2. Start another operation that DOES become the active context.
    #    This will be a child of 'main_operation_span', NOT 'manual_side_task',
    #    because 'manual_side_task' did not alter the active context.
    with langfuse.start_as_current_observation(as_type="span", name="core-step-within-main") as core_step_span:
        # 'core_step_span' is now the active context.
        # 'manual_side_task' is still open but not active in the global context.
        core_step_span.update(input="Data for core step")
        # ... perform core step logic ...
        core_step_span.update(output="Core step finished")
    # 'core_step_span' ends. 'main_operation_span' is the active context again.

    # 3. Complete and end the manual side task.
    #    This could happen at any point after its creation, even after 'core_step_span'.
    manual_side_task.update(output="Side task completed")
    manual_side_task.end()  # Manual end is crucial for 'manual_side_task'

    main_operation_span.update(output="Main operation finished")
# 'main_operation_span' ends automatically here.

# Expected trace structure in Langfuse:
# - main-operation
#   |- manual-side-task
#   |- core-step-within-main
# (Note: 'core-step-within-main' is a sibling to 'manual-side-task', both children of 'main-operation')

startObservation gives you full control over creating observations.
You can pass the asType parameter to specify the type of observation to create.
When you call one of these functions, the new observation is automatically linked as a child of the currently active operation in the OpenTelemetry context. However, it does not make this new observation the active one. This means any further operations you trace will still be linked to the original parent, not the one you just created.
To create nested observations manually, use the methods on the returned object (e.g., parentSpan.startObservation(...)).
import { startObservation } from "@langfuse/tracing";

// Start a root span for a user request
const span = startObservation(
  // name
  "user-request",
  // params
  {
    input: { query: "What is the capital of France?" },
  }
);

// Create a nested span for, e.g., a tool call
const toolCall = span.startObservation(
  // name
  "fetch-weather",
  // params
  {
    input: { city: "Paris" },
  },
  // Specify observation type in asType.
  // This types the attributes argument accordingly; default is 'span'.
  { asType: "tool" }
);

// Simulate work and end the tool call span
await new Promise((resolve) => setTimeout(resolve, 100));
toolCall.update({ output: { temperature: "15°C" } }).end();

// Create a nested generation for the LLM call
const generation = span.startObservation(
  "llm-call",
  {
    model: "gpt-4",
    input: [{ role: "user", content: "What is the capital of France?" }],
  },
  { asType: "generation" }
);
generation.update({
  usageDetails: { input: 10, output: 5 },
  output: { content: "The capital of France is Paris." },
});
generation.end();

// End the root span
span.update({ output: "Successfully answered user request." }).end();

Manual Ending Required
If you use startObservation(), you are responsible for calling .end() on the returned observation object. Failure to do so will result in incomplete or missing observations in Langfuse.
Nesting observations
The Langfuse SDK methods automatically handle the nesting of observations.
Observe Decorator
If you use the observe wrapper, the function call hierarchy is automatically captured and reflected in the trace.
from langfuse import observe

@observe
def my_data_processing_function(data, parameter):
    # ... processing logic ...
    return {"processed_data": data, "status": "ok"}

@observe
def main_function(data, parameter):
    return my_data_processing_function(data, parameter)

Context Manager
If you use the context manager, nesting is handled automatically by OpenTelemetry's context propagation. When you create a new observation using start_as_current_observation(), it becomes a child of the observation that was active in the context when it was created.
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="span", name="outer-process") as outer_span:
    # outer_span is active
    with langfuse.start_as_current_observation(as_type="generation", name="llm-step-1") as gen1:
        # gen1 is active, child of outer_span
        gen1.update(output="LLM 1 output")

    with outer_span.start_as_current_observation(name="intermediate-step") as mid_span:
        # mid_span is active, also a child of outer_span
        # This demonstrates using the yielded span object to create children
        with mid_span.start_as_current_observation(as_type="generation", name="llm-step-2") as gen2:
            # gen2 is active, child of mid_span
            gen2.update(output="LLM 2 output")
        mid_span.update(output="Intermediate processing done")

    outer_span.update(output="Outer process finished")

Manual Observations
If you are creating observations manually, you can use the methods on the parent LangfuseSpan or LangfuseGeneration object to create children. These children will not become the current context unless their _as_current_ variants are used (see context manager).
from langfuse import get_client

langfuse = get_client()

parent = langfuse.start_observation(name="manual-parent")

child_span = parent.start_observation(name="manual-child-span")
# ... work ...
child_span.end()

child_gen = parent.start_observation(name="manual-child-generation", as_type="generation")
# ... work ...
child_gen.end()

parent.end()

Nesting happens automatically via OpenTelemetry context propagation. When you create a new observation with startActiveObservation, it becomes a child of whatever was active at the time.
import { startActiveObservation } from "@langfuse/tracing";

await startActiveObservation("outer-process", async () => {
  await startActiveObservation("llm-step-1", async (span) => {
    span.update({ output: "LLM 1 output" });
  });

  await startActiveObservation("intermediate-step", async (span) => {
    await startActiveObservation("llm-step-2", async (child) => {
      child.update({ output: "LLM 2 output" });
    });
    span.update({ output: "Intermediate processing done" });
  });
});

Update observations
You can update observations with new information as your code executes.
- For observations created via context managers or assigned to variables: use the .update() method on the object.
- To update the currently active observation in the context (without needing a direct reference to it): use langfuse.update_current_span() or langfuse.update_current_generation().
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="generation", name="llm-call", model="gpt-5-mini") as gen:
    gen.update(input={"prompt": "Why is the sky blue?"})
    # ... make LLM call ...
    response_text = "Rayleigh scattering..."
    gen.update(
        output=response_text,
        usage_details={"input_tokens": 5, "output_tokens": 50},
        metadata={"confidence": 0.9}
    )

# Alternatively, update the current observation in context:
with langfuse.start_as_current_observation(as_type="span", name="data-processing"):
    # ... some processing ...
    langfuse.update_current_span(metadata={"step1_complete": True})
    # ... more processing ...
    langfuse.update_current_span(output={"result": "final_data"})

Update the active observation with observation.update().
import { startActiveObservation } from "@langfuse/tracing";

await startActiveObservation("user-request", async (span) => {
  span.update({
    input: { path: "/api/process" },
    output: { status: "success" },
  });
});

Add attributes to observations
You can add attributes to observations to help you better understand your application and to correlate observations in Langfuse. To update the input and output of the trace, see trace-level inputs/outputs.
Use propagate_attributes() to add attributes to observations.
from langfuse import get_client, propagate_attributes

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="span", name="user-workflow"):
    with propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        metadata={"experiment": "variant_a"},
        version="1.0",
        trace_name="user-workflow",
    ):
        with langfuse.start_as_current_observation(as_type="generation", name="llm-call"):
            pass

When using the @observe() decorator:
from langfuse import observe, propagate_attributes

@observe()
def my_llm_pipeline(user_id: str, session_id: str):
    # Propagate early in the trace
    with propagate_attributes(
        user_id=user_id,
        session_id=session_id,
        metadata={"pipeline": "main"}
    ):
        # All nested @observe functions inherit these attributes
        result = call_llm()
        return result

@observe()
def call_llm():
    # This automatically has user_id, session_id, metadata from parent
    pass

Use propagateAttributes() to add attributes to observations.
import { startActiveObservation, propagateAttributes, startObservation } from "@langfuse/tracing";

await startActiveObservation("user-workflow", async () => {
  await propagateAttributes(
    {
      userId: "user_123",
      sessionId: "session_abc",
      metadata: { experiment: "variant_a", env: "prod" },
      version: "1.0",
      traceName: "user-workflow",
    },
    async () => {
      const generation = startObservation("llm-call", { model: "gpt-4" }, { asType: "generation" });
      generation.end();
    }
  );
});

- Values must be strings ≤200 characters.
- Metadata keys: alphanumeric characters only (no whitespace or special characters).
- Call propagation early in your trace so that all observations are covered and metrics in Langfuse stay accurate.
- Invalid values are dropped with a warning.
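As an illustration of these constraints, a hypothetical client-side check might look like this. The actual SDK validation may differ in details; this only mirrors the rules listed above:

```python
import warnings

MAX_VALUE_LENGTH = 200  # limit stated above

def validate_attributes(attrs: dict) -> dict:
    """Drop entries that violate the documented constraints, with a warning (hypothetical helper)."""
    valid = {}
    for key, value in attrs.items():
        if not key.isalnum():
            warnings.warn(f"Dropping attribute {key!r}: keys must be alphanumeric")
        elif not isinstance(value, str) or len(value) > MAX_VALUE_LENGTH:
            warnings.warn(f"Dropping attribute {key!r}: values must be strings of at most {MAX_VALUE_LENGTH} characters")
        else:
            valid[key] = value
    return valid

print(validate_attributes({"experiment": "variant_a", "bad key!": "x", "long": "a" * 201}))
# → {'experiment': 'variant_a'}
```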
Cross-service propagation
For distributed tracing across multiple services, use the as_baggage (Python) or asBaggage (JS/TS) parameter (see the OpenTelemetry documentation for more details) to propagate attributes via HTTP headers.
from langfuse import get_client, propagate_attributes
import requests

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="span", name="api-request"):
    with propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        as_baggage=True,
    ):
        requests.get("https://service-b.example.com/api")

import { propagateAttributes, startActiveObservation } from "@langfuse/tracing";
await startActiveObservation("api-request", async () => {
  await propagateAttributes(
    {
      userId: "user_123",
      sessionId: "session_abc",
      asBaggage: true,
    },
    async () => {
      await fetch("https://service-b.example.com/api");
    }
  );
});

Security Warning: When baggage propagation is enabled, attributes are added to all outbound HTTP headers. Only use it for non-sensitive values needed for distributed tracing.
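For intuition, here is the W3C baggage header format that carries such attributes across service boundaries. This is a simplified sketch per the W3C Baggage spec; the exact keys Langfuse writes, as well as spec details like properties and size limits, are omitted and the key names below are illustrative assumptions:

```python
from urllib.parse import quote, unquote

def to_baggage_header(attrs: dict) -> str:
    """Serialize attributes into a W3C `baggage` header value (key=value pairs, percent-encoded)."""
    return ",".join(f"{quote(k)}={quote(str(v))}" for k, v in attrs.items())

def from_baggage_header(header: str) -> dict:
    """Parse a W3C `baggage` header value back into a dict."""
    pairs = (item.split("=", 1) for item in header.split(",") if item)
    return {unquote(k): unquote(v) for k, v in pairs}

# Illustrative key names, not necessarily what the SDK emits:
header = to_baggage_header({"langfuse.user.id": "user_123", "langfuse.session.id": "session abc"})
print(header)  # → langfuse.user.id=user_123,langfuse.session.id=session%20abc
```

Because every outbound request carries this header verbatim, anything placed in baggage is visible to all downstream services, which is why the warning above applies.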
Update trace
By default, the trace input/output mirror whatever you set on the root observation, the first observation in your trace. You can customize the trace-level information when you need it for LLM-as-a-Judge evaluations, A/B tests, or UI clarity.
LLM-as-a-Judge workflows in Langfuse might rely on trace-level inputs/outputs. Make sure to set them deliberately rather than relying on the root observation if your evaluation payload differs.
Default Behavior
from langfuse import get_client

langfuse = get_client()

# Using the context manager
with langfuse.start_as_current_observation(
    as_type="span",
    name="user-request",
    input={"query": "What is the capital of France?"}  # This becomes the trace input
) as root_span:
    with langfuse.start_as_current_observation(
        as_type="generation",
        name="llm-call",
        model="gpt-4o",
        input={"messages": [{"role": "user", "content": "What is the capital of France?"}]}
    ) as gen:
        response = "Paris is the capital of France."
        gen.update(output=response)
        # LLM generation input/output are separate from trace input/output

    root_span.update(output={"answer": "Paris"})  # This becomes the trace output

Override Default Behavior
Use observation.set_trace_io() or langfuse.set_current_trace_io() if you need different trace inputs/outputs than the root observation:
set_trace_io() and set_current_trace_io() are deprecated and exist only for backward compatibility with trace-level LLM-as-a-judge evaluators that rely on trace input/output. For new code, set input/output on the root observation directly. See the Python v3 → v4 migration guide.
from langfuse import get_client, observe

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="span", name="complex-pipeline") as root_span:
    # Root span has its own input/output
    root_span.update(input="Step 1 data", output="Step 1 result")

    # But the trace should have different input/output (e.g., for LLM-as-a-judge)
    root_span.set_trace_io(
        input={"original_query": "User's actual question"},
        output={"final_answer": "Complete response", "confidence": 0.95}
    )
    # Now trace input/output are independent of root span input/output

# Using the observe decorator
@observe()
def process_user_query(user_question: str):
    # LLM processing...
    answer = call_llm(user_question)

    # Explicitly set trace input/output for evaluation features
    langfuse.set_current_trace_io(
        input={"question": user_question},
        output={"answer": answer}
    )
    return answer

Use propagateAttributes to set correlating trace attributes, and setTraceIO for trace-level input/output.
.setTraceIO() is deprecated and exists only for backward compatibility with trace-level LLM-as-a-judge evaluators. See the JS/TS v4 → v5 migration guide.
import { propagateAttributes, startObservation } from "@langfuse/tracing";

const userId = "user-123";
const sessionId = "session-abc";

propagateAttributes(
  {
    userId: userId,
    sessionId: sessionId,
    tags: ["authenticated-user"],
    metadata: { plan: "premium" },
  },
  () => {
    const rootSpan = startObservation("data-processing");
    const generation = rootSpan.startObservation(
      "llm-call",
      {},
      { asType: "generation" }
    );
    generation.end();
    rootSpan.end();
  }
);

Trace and observation IDs
Langfuse follows the W3C Trace Context standard:
- trace IDs are 32-character lowercase hex strings (16 bytes)
- observation IDs are 16-character lowercase hex strings (8 bytes)
You cannot set arbitrary observation IDs, but you can generate deterministic trace IDs to correlate with external systems.
See Trace IDs & Distributed Tracing for more information on correlating traces across services.
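As an illustration of what "deterministic" means here, a seed can be hashed down to a valid 32-character lowercase hex trace ID. This is one plausible scheme for the sake of intuition; the SDKs' create_trace_id/createTraceId may use a different derivation internally:

```python
import hashlib
import re

def deterministic_trace_id(seed: str) -> str:
    """Derive a W3C-compatible trace ID from a seed (illustrative scheme,
    not necessarily the SDK's actual algorithm)."""
    return hashlib.sha256(seed.encode("utf-8")).hexdigest()[:32]

trace_id = deterministic_trace_id("req_12345")
assert re.fullmatch(r"[0-9a-f]{32}", trace_id)          # valid trace ID format
assert trace_id == deterministic_trace_id("req_12345")  # same seed → same ID
```

The key property is that the mapping is pure: given the same external ID, every service and every run derives the same Langfuse trace ID, so traces can be looked up from external systems without storing a mapping.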
Use create_trace_id() to generate a trace ID. If a seed is provided, the ID is deterministic. Use the same seed to get the same ID. This is useful for correlating external IDs with Langfuse traces.
from langfuse import get_client

langfuse = get_client()

external_request_id = "req_12345"
deterministic_trace_id = langfuse.create_trace_id(seed=external_request_id)

Use get_current_trace_id() to get the current trace ID and get_current_observation_id() to get the current observation ID.
You can also use observation.trace_id and observation.id to access the trace and observation IDs directly from a LangfuseSpan or LangfuseGeneration object.
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="span", name="my-op") as current_op:
    trace_id = langfuse.get_current_trace_id()
    observation_id = langfuse.get_current_observation_id()
    print(trace_id, observation_id)

Use createTraceId to generate a deterministic trace ID from a seed.
import { createTraceId, startObservation } from "@langfuse/tracing";

const externalId = "support-ticket-54321";
const langfuseTraceId = await createTraceId(externalId);

const rootSpan = startObservation(
  "process-ticket",
  {},
  {
    parentSpanContext: {
      traceId: langfuseTraceId,
      spanId: "0123456789abcdef",
      traceFlags: 1,
    },
  }
);

Use getActiveTraceId to get the active trace ID and getActiveSpanId to get the current observation ID.
import { startActiveObservation, getActiveTraceId } from "@langfuse/tracing";

await startActiveObservation("run", async (span) => {
  const traceId = getActiveTraceId();
  console.log(`Current trace ID: ${traceId}`);
});

Link to existing traces
When integrating with upstream services that already have trace IDs, supply the W3C trace context so Langfuse spans join the existing tree rather than creating a new one.
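Upstream context usually arrives as a W3C traceparent header of the form version-trace_id-span_id-flags; its fields map directly onto the trace context values the SDKs accept. A sketch of parsing it per the W3C Trace Context format:

```python
import re

# version(2 hex) - trace_id(32 hex) - span_id(16 hex) - flags(2 hex)
TRACEPARENT_RE = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})"
    r"-(?P<span_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)

def parse_traceparent(header: str) -> dict:
    """Split a W3C traceparent header into its fields, or raise on malformed input."""
    match = TRACEPARENT_RE.match(header)
    if match is None:
        raise ValueError(f"Malformed traceparent header: {header!r}")
    return match.groupdict()

ctx = parse_traceparent("00-abcdef1234567890abcdef1234567890-fedcba0987654321-01")
print(ctx["trace_id"], ctx["span_id"])
```

The extracted trace_id and span_id are exactly the values to pass as trace_id/parent_span_id (Python) or traceId/spanId (JS/TS).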
Use the trace_context parameter to set custom trace context information.
from langfuse import get_client

langfuse = get_client()

existing_trace_id = "abcdef1234567890abcdef1234567890"
existing_parent_span_id = "fedcba0987654321"

with langfuse.start_as_current_observation(
    as_type="span",
    name="process-downstream-task",
    trace_context={
        "trace_id": existing_trace_id,
        "parent_span_id": existing_parent_span_id,
    },
):
    pass

Use the parentSpanContext parameter to set custom trace context information.
import { startObservation } from "@langfuse/tracing";

const span = startObservation(
  "downstream-task",
  {},
  {
    parentSpanContext: {
      traceId: "abcdef1234567890abcdef1234567890",
      spanId: "fedcba0987654321",
      traceFlags: 1,
    },
  }
);
span.end();

Client lifecycle & flushing
As the Langfuse SDKs are asynchronous, they buffer spans in the background. Always flush() or shutdown() the client in short-lived processes (scripts, serverless functions, workers) to avoid losing data.
flush() manually triggers the sending of all buffered observations (spans, generations, scores, media metadata) to the Langfuse API. This is useful in short-lived scripts or before exiting an application to ensure all data is persisted.
from langfuse import get_client

langfuse = get_client()

# ... create traces and observations ...

langfuse.flush()  # Ensures all pending data is sent

The flush() method blocks until the queued data is processed by the respective background threads.
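To illustrate these blocking semantics, here is a simplified model of a buffered exporter whose flush() waits for a background worker thread to drain the queue. This sketches the general pattern, not Langfuse's implementation:

```python
import queue
import threading

class BufferedExporter:
    """Background worker drains a queue; flush() blocks until the queue is empty."""

    def __init__(self):
        self._queue: queue.Queue = queue.Queue()
        self.exported = []
        threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            item = self._queue.get()
            self.exported.append(item)  # a real exporter would send this over HTTP
            self._queue.task_done()

    def enqueue(self, observation):
        self._queue.put(observation)  # non-blocking: callers never wait on the network

    def flush(self):
        self._queue.join()  # blocks until every queued item has been processed

exporter = BufferedExporter()
for i in range(100):
    exporter.enqueue({"span": i})
exporter.flush()
print(len(exporter.exported))  # → 100
```

This is why forgetting to flush in a short-lived process loses data: the process can exit while items are still sitting in the queue, and a daemon worker thread dies with it.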
shutdown() gracefully shuts down the Langfuse client. This includes:
- Flushing all buffered data (similar to flush()).
- Waiting for background threads (for data ingestion and media uploads) to finish their current tasks and terminate.
It's crucial to call shutdown() before your application exits to prevent data loss and ensure clean resource release. The SDK automatically registers an atexit hook to call shutdown() on normal program termination, but manual invocation is recommended in scenarios like:
- Long-running daemons or services when they receive a shutdown signal.
- Applications where atexit might not reliably trigger (e.g., certain serverless environments or forceful terminations).
from langfuse import get_client

langfuse = get_client()

# ... application logic ...

# Before exiting:
langfuse.shutdown()

Export the processor from your OTEL SDK setup file in order to call forceFlush() later.
import { NodeSDK } from "@opentelemetry/sdk-node";
import { LangfuseSpanProcessor } from "@langfuse/otel";

// Export the processor to be able to flush it
export const langfuseSpanProcessor = new LangfuseSpanProcessor({
  exportMode: "immediate", // optional: configure immediate span export in serverless environments
});

const sdk = new NodeSDK({
  spanProcessors: [langfuseSpanProcessor],
});

sdk.start();

In your serverless function handler, call forceFlush() on the LangfuseSpanProcessor before the function exits.
import { langfuseSpanProcessor } from "./instrumentation";

export async function handler(event, context) {
  // ... your application logic ...

  // Flush before exiting
  await langfuseSpanProcessor.forceFlush();
}

Export the processor from your instrumentation.ts file in order to flush it later.
import { NodeSDK } from "@opentelemetry/sdk-node";
import { LangfuseSpanProcessor } from "@langfuse/otel";

// Export the processor to be able to flush it
export const langfuseSpanProcessor = new LangfuseSpanProcessor();

const sdk = new NodeSDK({
  spanProcessors: [langfuseSpanProcessor],
});

sdk.start();

In Vercel Cloud Functions, please use the after utility to schedule a flush after the request has completed.
import { after } from "next/server";
import { langfuseSpanProcessor } from "./instrumentation.ts";

export async function POST() {
  // ... existing request logic ...

  // Schedule flush after request has completed
  after(async () => {
    await langfuseSpanProcessor.forceFlush();
  });

  // ... send response ...
}