
Helicone Integration

Helicone has moved into maintenance mode following its acquisition by Mintlify. If you're looking to migrate, see our Helicone to Langfuse migration guide.

In this guide, we'll show you how to integrate Langfuse with Helicone.

What is Helicone? Helicone is an open-source AI gateway that gives you access to 100+ AI models through an OpenAI-compatible interface. It offers intelligent routing, automatic failover, caching, cost tracking, and more.

What is Langfuse? Langfuse is an open-source LLM engineering platform that helps teams trace LLM calls, monitor performance, and debug issues in their AI applications.

Since Helicone is OpenAI-compatible, we can use Langfuse's native integration with the OpenAI SDK, available in both Python and TypeScript.

Get started

Python

  1. In your terminal, install the following packages if you haven't already:

pip install langfuse openai python-dotenv

  2. Then, create a .env file in your project and add your environment variables:

HELICONE_API_KEY=sk-helicone-... # Get it from your Helicone dashboard

LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_BASE_URL=https://cloud.langfuse.com # 🇪🇺 EU region
# LANGFUSE_BASE_URL=https://us.cloud.langfuse.com # 🇺🇸 US region

TypeScript

  1. In your terminal, install the following packages if you haven't already:

npm install langfuse openai

  2. Create a .env file in your project and add your environment variables:

HELICONE_API_KEY=sk-helicone-... # Get it from your Helicone dashboard

LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_BASE_URL=https://cloud.langfuse.com # 🇪🇺 EU region
# LANGFUSE_BASE_URL=https://us.cloud.langfuse.com # 🇺🇸 US region
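Before making any calls, it can help to fail fast when a variable is missing. A minimal sketch of such a sanity check (the `check_env` helper is illustrative, not part of either SDK):

```python
import os

def check_env(required=("HELICONE_API_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_PUBLIC_KEY")):
    """Return the names of required environment variables that are not set."""
    return [name for name in required if not os.getenv(name)]

# Surface a clear error before any SDK call fails with a cryptic 401
missing = check_env()
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Run this after loading your .env (e.g. with python-dotenv's load_dotenv()) so the variables are actually in the process environment.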

Example 1: Simple LLM Call

We use Langfuse's OpenAI SDK wrapper to automatically log Helicone calls as generations in Langfuse.

  • The base_url is set to Helicone's AI Gateway endpoint.
  • You can replace "gpt-4o" with any model available in Helicone's model registry.
  • The api_key is your Helicone API key; Helicone handles authentication with the underlying model providers for you.
Python

from langfuse.openai import openai
import os
from dotenv import load_dotenv

load_dotenv()

# Create an OpenAI client with Helicone's gateway endpoint
client = openai.OpenAI(
    api_key=os.getenv("HELICONE_API_KEY"),
    base_url="https://ai-gateway.helicone.ai/"
)

# Make a chat completion request
response = client.chat.completions.create(
    model="gpt-4o", # Or any other 100+ models
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a fun fact about space."}
    ],
    name="fun-fact-request"  # Optional: Name of the generation in Langfuse
)

print(response.choices[0].message.content)
TypeScript

import { observeOpenAI } from "langfuse";
import OpenAI from "openai";

const openaiClient = new OpenAI({
    apiKey: process.env.HELICONE_API_KEY,
    baseURL: "https://ai-gateway.helicone.ai/"
});

// Create an observed client with Langfuse options
const client = observeOpenAI(openaiClient, {
    generationName: "fun-fact-request"  // Optional: Name of the generation in Langfuse
});

// Make a chat completion request
const response = await client.chat.completions.create({
    model: "gpt-4o", // Or any other 100+ models
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Tell me a fun fact about space." }
    ]
});

console.log(response.choices[0].message.content);

Example 2: Nested LLM Calls

By using the @observe() decorator, we capture execution details of any Python function, including nested LLM calls, inputs, outputs, and execution times. This provides in-depth observability with minimal code changes.

  • The @observe() decorator captures inputs, outputs, and execution details of the functions.
  • Nested functions summarize_text and analyze_sentiment are also decorated, creating a hierarchy of nested spans within a single trace.
  • Each LLM call within the functions is logged, providing a detailed trace of the execution flow.

Python

from langfuse import observe
from langfuse.openai import openai
import os
from dotenv import load_dotenv

load_dotenv()

# Create an OpenAI client with Helicone's base URL
client = openai.OpenAI(
    base_url="https://ai-gateway.helicone.ai/",
    api_key=os.getenv("HELICONE_API_KEY")
)

@observe()  # This decorator enables tracing of the function
def analyze_text(text: str):
    # First LLM call: Summarize the text
    summary_response = summarize_text(text)
    summary = summary_response.choices[0].message.content

    # Second LLM call: Analyze the sentiment of the summary
    sentiment_response = analyze_sentiment(summary)
    sentiment = sentiment_response.choices[0].message.content

    return {
        "summary": summary,
        "sentiment": sentiment
    }

@observe()  # Nested function to be traced
def summarize_text(text: str):
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You summarize texts in a concise manner."},
            {"role": "user", "content": f"Summarize the following text:\n{text}"}
        ],
        name="summarize-text"
    )

@observe()  # Nested function to be traced
def analyze_sentiment(summary: str):
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You analyze the sentiment of texts."},
            {"role": "user", "content": f"Analyze the sentiment of the following summary:\n{summary}"}
        ],
        name="analyze-sentiment"
    )

# Example usage
text_to_analyze = "OpenAI's GPT-4 model has significantly advanced the field of AI, setting new standards for language generation."
analyze_text(text_to_analyze)

By using the Langfuse SDK, we can manually create traces and generations to capture nested LLM calls.

TypeScript

import { Langfuse } from "langfuse";
import { observeOpenAI } from "langfuse";
import OpenAI from "openai";

const langfuse = new Langfuse();

const openaiClient = new OpenAI({
    baseURL: "https://ai-gateway.helicone.ai/",
    apiKey: process.env.HELICONE_API_KEY
});

async function analyzeText(text: string) {
    // Create a trace for the entire analysis
    const trace = langfuse.trace({
        name: "analyze-text",
        input: { text }
    });

    try {
        // First LLM call: Summarize the text
        const summaryResponse = await summarizeText(text, trace);
        const summary = summaryResponse.choices[0].message.content;

        // Second LLM call: Analyze the sentiment of the summary
        const sentimentResponse = await analyzeSentiment(summary!, trace);
        const sentiment = sentimentResponse.choices[0].message.content;

        const result = {
            summary,
            sentiment
        };

        // Update trace with output
        trace.update({ output: result });

        return result;
    } catch (error) {
        trace.update({ output: { error: error instanceof Error ? error.message : String(error) } });
        throw error;
    }
}

async function summarizeText(text: string, trace: any) {
    const client = observeOpenAI(openaiClient, {
        generationName: "summarize-text",
        parent: trace
    });

    return await client.chat.completions.create({
        model: "gpt-4o",
        messages: [
            { role: "system", content: "You summarize texts in a concise manner." },
            { role: "user", content: `Summarize the following text:\n${text}` }
        ]
    });
}

async function analyzeSentiment(summary: string, trace: any) {
    const client = observeOpenAI(openaiClient, {
        generationName: "analyze-sentiment",
        parent: trace
    });

    return await client.chat.completions.create({
        model: "gpt-4o",
        messages: [
            { role: "system", content: "You analyze the sentiment of texts." },
            { role: "user", content: `Analyze the sentiment of the following summary:\n${summary}` }
        ]
    });
}

// Example usage
const textToAnalyze = "OpenAI's GPT-4 model has significantly advanced the field of AI, setting new standards for language generation.";
analyzeText(textToAnalyze);

Example 3: Streaming Responses

Python

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Write a haiku about a robot."}
    ],
    stream=True,
    name="streaming-haiku"
)

print("🤖 Assistant (streaming):")
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
print("\n")
TypeScript

const stream = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
        { role: "user", content: "Write a haiku about a robot." }
    ],
    stream: true
});

console.log("🤖 Assistant (streaming):");
for await (const chunk of stream) {
    if (chunk.choices[0]?.delta?.content) {
        process.stdout.write(chunk.choices[0].delta.content);
    }
}
console.log("\n");
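If you need the full text after streaming (for logging or further processing), you can accumulate the deltas as you print them. A sketch that works with any iterable of OpenAI-style chunks (the `collect_stream` helper is illustrative, not part of either SDK):

```python
def collect_stream(stream):
    """Concatenate the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta is not None:  # some chunks (e.g. the final one) carry no content
            parts.append(delta)
    return "".join(parts)
```

Because Langfuse's wrapper logs the generation once the stream is consumed, iterating like this also completes the trace.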

Example 4: Multi-Provider Access

Helicone provides access to 100+ LLM providers through a single interface. Simply change the model name to use different providers:

Python

# Use Anthropic Claude
response = client.chat.completions.create(
    model="claude-3.5-sonnet-v2/anthropic",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Fall back to Gemini 2.5 Flash Lite if Claude 3.5 Sonnet is unavailable
response = client.chat.completions.create(
    model="claude-3.5-sonnet-v2/anthropic,gemini-2.5-flash-lite/google-ai-studio",
    messages=[{"role": "user", "content": "Hello!"}]
)
TypeScript

// Use Anthropic Claude
const response = await client.chat.completions.create({
    model: "claude-3.5-sonnet-v2/anthropic",
    messages: [{ role: "user", content: "Hello!" }]
});

// Fall back to Gemini 2.5 Flash Lite if Claude 3.5 Sonnet is unavailable
const response2 = await client.chat.completions.create({
    model: "claude-3.5-sonnet-v2/anthropic,gemini-2.5-flash-lite/google-ai-studio",
    messages: [{ role: "user", content: "Hello!" }]
});
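As the fallback syntax above shows, the model parameter is just a comma-separated list of model/provider pairs, so you can build it programmatically when the fallback chain is configurable. A small sketch (the `with_fallbacks` helper is hypothetical, not part of Helicone's SDK):

```python
def with_fallbacks(*models: str) -> str:
    """Join model/provider identifiers into Helicone's comma-separated fallback list."""
    return ",".join(models)

# Primary model first; each subsequent entry is tried if the previous one fails
model = with_fallbacks(
    "claude-3.5-sonnet-v2/anthropic",
    "gemini-2.5-flash-lite/google-ai-studio",
)
print(model)
```

Pass the resulting string as the model argument of a normal chat.completions.create call.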
