NLP (113): Using Langfuse to Improve the Observability of LLMs and Agents

This article describes how to use Langfuse to improve the observability of large language models (LLMs) and Agents.

When working with LLMs and Agents day to day, we often need to track their responses and costs. Langfuse lets us dramatically improve the observability of LLMs and Agents, enabling finer-grained monitoring and optimization.

Introduction

Langfuse is an open-source engineering platform for large language models (LLMs), focused on giving developers and researchers a flexible and efficient environment for building with LLMs. It aims to address the engineering challenges of LLM applications, covering model training, deployment, monitoring, and optimization.

Key features:

  • Highly customizable: Langfuse offers rich configuration options and a flexible API, letting users tailor LLM functionality and performance to their actual needs.

  • Efficient resource management: users can easily manage and schedule compute resources, improving utilization.

  • Complete monitoring and operations: built-in monitoring and ops tooling tracks an LLM application's runtime status and performance metrics in real time.

  • Broad functionality: LLM observability, prompt management, LLM evaluation, dataset management, LLM metrics and analytics, and more.

Use cases

Langfuse is suited to building production-grade LLM applications, especially where custom conversational systems, machine translation systems, and similar applications need to be developed and optimized quickly. It supports multiple programming languages and frameworks, lowering the barrier to entry so that even beginners can get started fast.

Below, I will look at how Langfuse improves the observability of LLMs and of Agents, in turn.

Langfuse with LLMs

First, create a Langfuse account and generate the LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY, and LANGFUSE_HOST variables. We put these, together with the LLM API keys, into environment variables.
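
A minimal .env sketch of what the examples below assume (all key values are placeholders; https://cloud.langfuse.com is Langfuse's hosted endpoint, so substitute your own URL if you self-host):

```
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```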

  • Langfuse with OpenAI

Langfuse has built-in OpenAI support, so the integration is very simple:

```python
from dotenv import load_dotenv
from langfuse.decorators import observe
from langfuse.openai import openai  # drop-in wrapper around the OpenAI SDK

load_dotenv()


@observe()
def story():
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "What is Langfuse?"}],
    )
    return response.choices[0].message.content


@observe()
def main():
    return story()


main()
```

Langfuse provides a Tracing feature that helps us track an LLM's responses and costs.

In the resulting trace view, we can see that the call path is main() -> story() -> OpenAI-generation, together with the model used, the call parameters, the response time, the input and output token counts, the token cost, and more, all of which is a great help in understanding how the LLM responds.
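
One practical note: the decorator-based SDK sends trace data to Langfuse asynchronously in the background, so short-lived scripts should flush before exiting. A minimal sketch using langfuse_context from the same module as observe:

```python
from langfuse.decorators import langfuse_context

main()
# Block until all buffered trace events have been delivered to Langfuse.
langfuse_context.flush()
```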

  • Langfuse with any LLM

More powerfully still, Langfuse can trace any LLM. Taking Anthropic's Claude models as an example:

```python
import os

import anthropic
from dotenv import load_dotenv
from langfuse.decorators import observe, langfuse_context

load_dotenv()

anthropic_client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))


# Wrap the LLM call with the decorator and mark it as a generation.
@observe(as_type="generation")
def anthropic_completion(**kwargs):
    # Optional: extract some fields from kwargs to enrich the observation.
    kwargs_clone = kwargs.copy()
    _input = kwargs_clone.pop('messages', None)
    model = kwargs_clone.pop('model', None)
    langfuse_context.update_current_observation(
        input=_input,
        model=model,
        metadata=kwargs_clone
    )

    response = anthropic_client.messages.create(**kwargs)

    # See the docs for more details on token counts and USD cost in Langfuse:
    # https://langfuse.com/docs/model-usage-and-cost
    langfuse_context.update_current_observation(
        usage_details={
            "input": response.usage.input_tokens,
            "output": response.usage.output_tokens
        }
    )
    return response.content[0].text


@observe()
def main():
    return anthropic_completion(
        model="claude-3-opus-20240229",
        max_tokens=1024,
        messages=[
            {"role": "user", "content": "Hello, Claude"}
        ]
    )


main()
```

From the code above, we can see that the trace records the messages, the model, the input and output token counts, and the remaining parameters.
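
Because the wrapper forwards **kwargs straight to messages.create, any other argument of the Anthropic API, such as a system prompt, can be passed through unchanged; everything except messages and model lands in the observation's metadata. A hypothetical call:

```python
# `system` is a standard Anthropic API parameter; since the wrapper only pops
# `messages` and `model`, it is recorded in the observation's metadata.
anthropic_completion(
    model="claude-3-opus-20240229",
    max_tokens=512,
    system="Answer in one short sentence.",
    messages=[{"role": "user", "content": "What is Langfuse?"}],
)
```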

  • Custom configuration

In Langfuse you can set the name of an LLM call, its trace_id, its session_id, and other attributes yourself. Here is an example using Langfuse's low-level SDK:

```python
import os
from uuid import uuid4

from dotenv import load_dotenv
from langfuse import Langfuse
from openai import OpenAI

load_dotenv()


def story(**kwargs):
    langfuse = Langfuse(environment="development")
    trace = langfuse.trace(
        id=kwargs.get("langfuse_observation_id"),
        name=kwargs.get("name"),
        tags=kwargs.get("tags"),
        session_id=kwargs.get("session_id")
    )

    model_name = "gpt-4o"
    max_tokens = 1000
    temperature = 0.5
    messages = [{"role": "user", "content": "What is Langfuse?"}]
    # Create a generation under the trace.
    generation = trace.generation(
        name="my-first-generation",
        model=model_name,
        model_parameters={"maxTokens": max_tokens, "temperature": temperature},
        input=messages
    )
    # Create the chat completion.
    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
    response = client.chat.completions.create(
        model=model_name,
        messages=messages
    )
    # Update the generation and set its end_time.
    generation.end(
        output=response.choices[0].message.content,
        usage_details=response.usage
    )

    return response.choices[0].message.content


def main():
    name = "My first trace v5"
    custom_observation_id = str(uuid4())
    session_id = str(uuid4())
    tags = ["langfuse", "openai", "gpt-4o"]
    print("trace_id: ", custom_observation_id)
    print("session_id: ", session_id)
    return story(langfuse_observation_id=custom_observation_id, name=name,
                 tags=tags, session_id=session_id)


main()
```

The resulting trace can then be inspected in the Langfuse UI.
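
If you prefer to stay with the @observe decorator rather than the low-level SDK, the same attributes can also be set from inside an observed function via langfuse_context.update_current_trace. A minimal sketch, reusing names from the example above (the session id here is a hypothetical placeholder):

```python
from langfuse.decorators import observe, langfuse_context
from langfuse.openai import openai


@observe()
def story():
    # Attach a custom name, session id and tags to the trace created by @observe.
    langfuse_context.update_current_trace(
        name="My first trace v5",
        session_id="your-session-id",  # hypothetical placeholder
        tags=["langfuse", "openai", "gpt-4o"],
    )
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "What is Langfuse?"}],
    )
    return response.choices[0].message.content
```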

Langfuse with Agents

A few days ago, OpenAI open-sourced its multi-agent framework openai-agents-python, which makes it easy to build multi-agent applications. Install it with:

```bash
pip install openai-agents
```
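
The tracing examples below also use Pydantic's logfire package to export OpenTelemetry spans to Langfuse, so install it as well:

```bash
pip install logfire
```
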
  • Single Agent

Starting with a single agent, the Langfuse-instrumented code is as follows:

```python
import os
import base64
import logfire
import asyncio
from dotenv import load_dotenv
from agents import Agent, Runner

load_dotenv()

# Build the Basic Auth header.
LANGFUSE_AUTH = base64.b64encode(
    f"{os.environ.get('LANGFUSE_PUBLIC_KEY')}:{os.environ.get('LANGFUSE_SECRET_KEY')}".encode()
).decode()

# Configure the OpenTelemetry endpoint & headers.
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = os.environ.get("LANGFUSE_HOST") + "/api/public/otel"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {LANGFUSE_AUTH}"

# Configure logfire instrumentation.
logfire.configure(
    service_name='my_agent_service',
    send_to_logfire=False
)
# This automatically patches the OpenAI Agents SDK to send logs via OTLP to Langfuse.
logfire.instrument_openai_agents()


async def main():
    agent = Agent(
        name="Assistant",
        instructions="You are a helpful assistant.",
    )
    result = await Runner.run(agent, "What is the capital of France?")
    print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
```

The output is:

```
13:34:39.454 OpenAI Agents trace: Agent workflow
13:34:39.455 Agent run: 'Assistant'
13:34:39.460 Responses API with 'gpt-4o'
The capital of France is Paris.
```

The resulting trace is shown in the Langfuse UI (figure: single agent).

  • Multi Agents

Moving on to multiple agents (this example builds three agents, handling Chinese-to-English translation, English-to-Chinese translation, and agent selection, respectively), the Langfuse-instrumented code is:

```python
# -*- coding: utf-8 -*-
import os
import base64
import logfire
import asyncio
from dotenv import load_dotenv
from agents import Agent, Runner


load_dotenv()

# Build the Basic Auth header.
LANGFUSE_AUTH = base64.b64encode(
    f"{os.environ.get('LANGFUSE_PUBLIC_KEY')}:{os.environ.get('LANGFUSE_SECRET_KEY')}".encode()
).decode()

# Configure the OpenTelemetry endpoint & headers.
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = os.environ.get("LANGFUSE_HOST") + "/api/public/otel"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {LANGFUSE_AUTH}"

# Configure logfire instrumentation.
logfire.configure(
    service_name='my_agent_service',
    send_to_logfire=False
)
# This automatically patches the OpenAI Agents SDK to send logs via OTLP to Langfuse.
logfire.instrument_openai_agents()

zh2en_agent = Agent(
    name="Chinese to English agent",
    instructions="You are a translator from Chinese to English.",
)

en2zh_agent = Agent(
    name="English to Chinese agent",
    instructions="You are a translator from English to Chinese.",
)

translation_agent = Agent(
    name="Translation agent",
    instructions="You are a translation agent. If the input is in Chinese, translate it to English."
                 " If the input is in English, translate it to Chinese.",
    handoffs=[zh2en_agent, en2zh_agent],
)


async def main():
    result = await Runner.run(translation_agent, input="The Shawshank Redemption")
    print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
```

The output is as follows:

```
13:38:56.614 OpenAI Agents trace: Agent workflow
13:38:56.615 Agent run: 'Translation agent'
13:38:56.620 Responses API with 'gpt-4o'
13:38:58.060 Handoff: Translation agent -> None
13:38:58.061 Agent run: 'English to Chinese agent'
13:38:58.061 Responses API with 'gpt-4o'
《肖申克的救赎》
```

The resulting trace is shown in the Langfuse UI (figure: multi agents).

  • Multi Agents with function calling

Alongside multiple agents, the openai-agents-python framework also supports tool use (function calling). As an example, a Weather Agent and a Time Agent call the get_weather and get_now_time functions, respectively:

```python
import os
import base64
import logfire
import asyncio
from datetime import datetime
from dotenv import load_dotenv
from agents import Agent, Runner, function_tool


load_dotenv()

# Build the Basic Auth header.
LANGFUSE_AUTH = base64.b64encode(
    f"{os.environ.get('LANGFUSE_PUBLIC_KEY')}:{os.environ.get('LANGFUSE_SECRET_KEY')}".encode()
).decode()

# Configure the OpenTelemetry endpoint & headers.
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = os.environ.get("LANGFUSE_HOST") + "/api/public/otel"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {LANGFUSE_AUTH}"

# Configure logfire instrumentation.
logfire.configure(
    service_name='my_agent_service',
    send_to_logfire=False
)
# This automatically patches the OpenAI Agents SDK to send logs via OTLP to Langfuse.
logfire.instrument_openai_agents()


@function_tool
def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny."


@function_tool
def get_now_time() -> str:
    now_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    return f"The current time is {now_time}."


weather_agent = Agent(
    name="Weather agent",
    instructions="You are a weather agent.",
    tools=[get_weather],
)

time_agent = Agent(
    name="Time agent",
    instructions="You are a time agent.",
    tools=[get_now_time],
)

agent = Agent(
    name="Agent",
    instructions="You are a helpful agent.",
    handoffs=[weather_agent, time_agent],
)


async def main():
    result1 = await Runner.run(agent, input="What's the weather in Tokyo?")
    print(result1.final_output)
    # The weather in Tokyo is sunny.
    result2 = await Runner.run(agent, input="What's the time now?")
    print(result2.final_output)


if __name__ == "__main__":
    asyncio.run(main())
```

The agent invocations are shown in the Langfuse UI (figure: multi agents with function calling).

The tool-call outputs are likewise captured (figure: function output).

Summary

This article showed how to use Langfuse to improve the observability of large language models (LLMs) and multi-agent systems. As an open-source LLM engineering platform, Langfuse provides powerful Tracing, letting developers monitor the key aspects of an LLM call in detail: the call path, inputs and outputs, response time, and cost. We integrated Langfuse into OpenAI and Anthropic LLM calls, then used custom configuration to enhance observability further.

We also looked at combining Langfuse with the OpenAI Agents framework, covering single-agent, multi-agent, and tool-use (function calling) scenarios. With Langfuse, developers can clearly follow an agent's decision path, its interactions, and its external API calls, making it easier to optimize the stability and performance of LLM applications.

Of course, Langfuse can do far more than what is covered here, and I will keep exploring it in future posts.

Welcome to follow my WeChat official account NLP奇幻之旅, where my original technical articles are published first.

You are also welcome to join my Knowledge Planet community 自然语言处理奇幻之旅, where I am working hard to build my own technical community.

