Sampling
Sampling can be used to control the volume of traces collected by Langfuse. Sampling is handled client-side.
You can configure the sample rate by setting the LANGFUSE_SAMPLE_RATE environment variable or by passing the sample_rate/sampleRate constructor parameter. The value must be between 0 and 1.
The default value is 1, meaning that all traces are collected; a value of 0.2 means that only 20% of traces are collected. The SDK samples at the trace level: if a trace is sampled, all observations and scores within that trace are collected as well, and if it is not sampled, none of them are sent.
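Conceptually, a client-side, trace-level sampling decision can be made deterministic by deriving it from the trace ID, so every observation in the same trace reaches the same verdict. The sketch below is a simplified illustration of this idea, not the SDK's actual implementation:

```python
import hashlib

def should_sample(trace_id: str, sample_rate: float) -> bool:
    """Deterministically decide whether a trace is kept.

    Hashing the trace ID maps it to a uniform value in [0, 1);
    comparing against the sample rate keeps roughly that fraction
    of traces, and every call with the same trace ID agrees.
    """
    digest = hashlib.sha256(trace_id.encode()).digest()
    # Interpret the first 8 bytes as a value in [0, 1)
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate

# Same trace ID -> same decision, so all observations in a trace
# are kept or dropped together.
assert should_sample("trace-abc", 0.5) == should_sample("trace-abc", 0.5)
```

Because the decision depends only on the trace ID, no coordination between spans is needed: any component that knows the trace ID can independently arrive at the same keep/drop verdict.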
With the Python SDK, you can configure sampling when initializing the client:
```python
from langfuse import Langfuse, get_client
import os

# Method 1: Set environment variable
os.environ["LANGFUSE_SAMPLE_RATE"] = "0.5"  # As string in env var
langfuse = get_client()

# Method 2: Initialize with constructor parameter, then get the client
Langfuse(sample_rate=0.5)  # 50% of traces will be sampled
langfuse = get_client()
```

When using the @observe() decorator:
```python
from langfuse import observe, Langfuse, get_client

# Initialize the client with sampling
Langfuse(sample_rate=0.3)  # 30% of traces will be sampled

@observe()
def process_data():
    # Only ~30% of calls to this function will generate traces
    # The decision is made at the trace level (first span)
    pass
```

If a trace is not sampled, none of its observations (spans or generations) or associated scores will be sent to Langfuse, which can significantly reduce data volume for high-traffic applications.
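The trace-level propagation described above can be illustrated with a minimal sketch (hypothetical Trace class, not the Langfuse SDK): the keep/drop decision is made once when the trace starts, and all child observations inherit it rather than re-rolling:

```python
import random

class Trace:
    """Toy trace: the sampling decision is made once, at the root."""

    def __init__(self, name: str, sample_rate: float):
        self.name = name
        self.sampled = random.random() < sample_rate  # decided once
        self.observations = []

    def span(self, name: str):
        # Child observations never re-roll the decision; they
        # inherit the trace-level verdict.
        if self.sampled:
            self.observations.append(name)

    def export(self):
        # An unsampled trace sends nothing: no spans, no scores.
        if not self.sampled:
            return None
        return {"name": self.name, "observations": self.observations}

trace = Trace("checkout", sample_rate=1.0)
trace.span("llm-call")
trace.span("score")
```

Making the decision at the root keeps traces whole: you never see a generation without its parent trace, or a score pointing at spans that were dropped.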
Langfuse respects OpenTelemetry's sampling decisions. You can configure a sampler in your OTEL SDK to control which traces are sent to Langfuse. This is useful for managing costs and reducing noise in high-volume applications.
Here is an example of how to configure a TraceIdRatioBasedSampler to send only 20% of traces:
```typescript
import { NodeSDK } from "@opentelemetry/sdk-node";
import { LangfuseSpanProcessor } from "@langfuse/otel";
import { TraceIdRatioBasedSampler } from "@opentelemetry/sdk-trace-base";

const sdk = new NodeSDK({
  // Sample 20% of all traces
  sampler: new TraceIdRatioBasedSampler(0.2),
  spanProcessors: [new LangfuseSpanProcessor()],
});
```

See JS/TS SDK docs for more details.
The same applies when using the OpenAI integration (JS/TS): configure a sampler in your OTEL SDK, for example a TraceIdRatioBasedSampler that sends only 20% of traces:
```typescript
import { NodeSDK } from "@opentelemetry/sdk-node";
import { LangfuseSpanProcessor } from "@langfuse/otel";
import { TraceIdRatioBasedSampler } from "@opentelemetry/sdk-trace-base";

const sdk = new NodeSDK({
  // Sample 20% of all traces
  sampler: new TraceIdRatioBasedSampler(0.2),
  spanProcessors: [new LangfuseSpanProcessor()],
});
```

Initialize the OpenAI integration as usual:
```typescript
import OpenAI from "openai";
import { observeOpenAI } from "@langfuse/openai";

const openai = observeOpenAI(new OpenAI());
```

See OpenAI Integration (JS/TS) for more details.
Langfuse respects OpenTelemetry's sampling decisions for Langchain (JS/TS) as well. Configure sampling in your OTEL SDK first, then initialize the callback handler.
```typescript
import { NodeSDK } from "@opentelemetry/sdk-node";
import { LangfuseSpanProcessor } from "@langfuse/otel";
import { TraceIdRatioBasedSampler } from "@opentelemetry/sdk-trace-base";

const sdk = new NodeSDK({
  // Sample 20% of all traces
  sampler: new TraceIdRatioBasedSampler(0.2),
  spanProcessors: [new LangfuseSpanProcessor()],
});
```

After setting up tracing and sampling, initialize the Langchain callback handler as usual:
```typescript
import { CallbackHandler } from "langfuse-langchain";

const handler = new CallbackHandler();
```

See Langchain Integration (JS/TS) for more details.
When using the Vercel AI SDK integration, pass the sample rate to the LangfuseExporter:
```typescript
import { registerOTel } from "@vercel/otel";
import { LangfuseExporter } from "langfuse-vercel";

export function register() {
  registerOTel({
    serviceName: "langfuse-vercel-ai-nextjs-example",
    traceExporter: new LangfuseExporter({ sampleRate: 0.5 }),
  });
}
```