最快开始使用 LLM 的方式是调用 API。无需本地部署,按需付费。
| 服务 | 模型 | 输入价格 | 输出价格 | 特点 |
|---|---|---|---|---|
| OpenAI | GPT-4o | $2.50 | $10.00 | 最强综合 |
| OpenAI | GPT-4o-mini | $0.15 | $0.60 | 性价比高 |
| Anthropic | Claude 3.5 Sonnet | $3.00 | $15.00 | 编码最强 |
| Anthropic | Claude 3 Haiku | $0.25 | $1.25 | 快速便宜 |
| Google | Gemini 1.5 Pro | $1.25 | $5.00 | 1M 上下文 |
| 阿里云 | Qwen-Plus | ¥0.4 | ¥1.2 | 中文优化 |
| 智谱 | GLM-4 | ¥0.5 | ¥0.5 | 便宜 |
| DeepSeek | DeepSeek-V3 | $0.14 | $0.28 | 超低价 |
价格为每 1M tokens(美元 $ 或人民币 ¥,以表中符号为准),实际价格请以官方公布为准
from openai import OpenAI

# Create the API client. The key is passed inline here for clarity;
# in real code prefer the OPENAI_API_KEY environment variable.
client = OpenAI(api_key="your-api-key")

# --- Basic (non-streaming) chat completion ---
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "你是一个有帮助的助手。"},
        {"role": "user", "content": "什么是量子计算?"},
    ],
    max_tokens=500,
    temperature=0.7,
)
print(completion.choices[0].message.content)

# --- Streaming: print each token fragment as it arrives ---
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "讲个故事"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")
import anthropic

# Anthropic's SDK follows a similar shape, but the endpoint is
# client.messages rather than chat.completions.
client = anthropic.Anthropic(api_key="your-api-key")

reply = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
# The response content is a list of blocks; the first holds the text.
print(reply.content[0].text)
from openai import OpenAI

# DeepSeek exposes an OpenAI-compatible endpoint, so the same SDK works:
# only the API key and base_url change.
deepseek = OpenAI(
    api_key="your-deepseek-key",
    base_url="https://api.deepseek.com",
)
reply = deepseek.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "你好"}],
)
print(reply.choices[0].message.content)
import json  # needed for json.loads / json.dumps below (missing in the original)

# --- Tool ("function calling") example ---

# JSON-schema description of the function the model is allowed to call.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "获取城市天气",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "城市名"}
                },
                "required": ["city"]
            }
        }
    }
]

def get_weather(city):
    """Stub implementation — replace with a real weather-API call."""
    return {"city": city, "weather": "sunny", "temp_c": 25}

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "北京天气怎么样?"}],
    tools=tools
)

# The model decided to call the tool; extract its name and JSON arguments.
# NOTE(review): robust code should first check that tool_calls is not None —
# the model may answer directly without calling any tool.
tool_call = response.choices[0].message.tool_calls[0]
function_name = tool_call.function.name
arguments = json.loads(tool_call.function.arguments)

# Execute the function locally with the model-supplied arguments.
result = get_weather(arguments["city"])

# Send the tool result back so the model can produce the final answer.
final_response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "北京天气怎么样?"},
        response.choices[0].message,  # the assistant turn containing the tool call
        {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result)
        }
    ]
)
import tiktoken

# USD price per 1M tokens: model -> (input, output). Extend as needed and
# keep in sync with the providers' official pricing pages.
_PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}

def estimate_cost(text, model="gpt-4o"):
    """Estimate the token count and API cost of *text* for *model*.

    Returns (tokens, input_cost, output_cost), costs in USD. The same
    token count is used for both directions, so output_cost is a rough
    estimate assuming the reply is as long as the prompt.
    """
    enc = tiktoken.encoding_for_model(model)
    tokens = len(enc.encode(text))
    # The original always used GPT-4o prices regardless of `model`;
    # look the model up instead, falling back to GPT-4o prices so the
    # default behavior is unchanged.
    input_price, output_price = _PRICES.get(model, _PRICES["gpt-4o"])
    input_cost = tokens / 1e6 * input_price
    output_cost = tokens / 1e6 * output_price
    return tokens, input_cost, output_cost

tokens, input_cost, output_cost = estimate_cost("你的文本...")
print(f"Token 数: {tokens}")
print(f"输入成本: ${input_cost:.4f}")
print(f"输出成本: ${output_cost:.4f}")
1. 减少成本
2. 提升质量
3. 错误处理