AI Agent Architecture Design and Implementation: Building Agents That Act Autonomously
2026/5/17 4:15:02

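Foreword

AI agents are one of the hottest topics in large-model applications today. Unlike a simple question-answering system, an agent can plan, reason, and execute actions, which lets it complete complex multi-step tasks on its own.

In a recent project I designed and built an LLM-based code review agent. It can understand a codebase, analyze problems automatically, generate fix suggestions, and hold multi-turn conversations with the user. The experience gave me a much deeper understanding of agent architecture, and this post shares some of the lessons I took from it.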

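Core Concepts of AI Agents

What Is an AI Agent

Put simply, an AI agent is a system that perceives its environment, makes decisions, and executes actions. Unlike a traditional program, an agent's behavior is not fully scripted in advance; the large model generates it dynamically from the current state.

A typical agent includes the following components:

  • Planning: breaking a complex task into executable steps
  • Memory: storing and retrieving historical information
  • Tools: the ability to call external systems to get work done
  • Action: executing a concrete action and observing the result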

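How Agents Differ from Traditional Programs

| Aspect | Traditional program | AI Agent |
| --- | --- | --- |
| Logic | Fixed in advance | Generated by the large model |
| Branching | Explicit conditionals | Dynamic reasoning |
| Error handling | try-catch | Self-correction |
| Extension | Modify the code | Update the prompt |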

ReAct Architecture

ReAct (Reasoning + Acting) is the most basic agent architecture. It has the model alternate between reasoning and acting:

    class ReActAgent:
        def __init__(self, model, tools, max_iterations=10):
            self.model = model
            self.tools = tools
            self.max_iterations = max_iterations

        def run(self, task):
            """Run the agent on a task."""
            observations = []
            thoughts = []
            actions = []

            prompt = self._build_prompt(task, observations, thoughts, actions)

            for _ in range(self.max_iterations):
                # 1. Reason: ask the model for its next thought and action
                response = self.model.generate(prompt)
                parsed = self._parse_response(response)

                thought = parsed.get("thought")
                action = parsed.get("action")
                action_input = parsed.get("action_input")

                thoughts.append(thought)
                actions.append(f"{action}({action_input})")

                # 2. Act
                if action == "finish":
                    return parsed.get("output")

                if action == "observe":
                    observations.append(action_input)
                else:
                    tool = self.tools.get(action)
                    if tool:
                        result = tool.execute(action_input)
                        observations.append(result)
                    else:
                        observations.append(f"Unknown tool: {action}")

                # 3. Rebuild the prompt with the updated history
                prompt = self._build_prompt(task, observations, thoughts, actions)

            return "Max iterations reached"

        # Note: the helpers _parse_response() and _format_tools() are not shown in this post.

        def _build_prompt(self, task, observations, thoughts, actions):
            """Build the prompt from the task and the history so far."""
            prompt = f"""Task: {task}

    You are a helpful AI Agent. You have access to the following tools:
    {self._format_tools()}

    Solve the task by selecting and executing actions.
    Each action has a result. You will see the observation after each action.

    To begin, you should think about what to do.
    """
            if thoughts:
                prompt += "\nPrevious thoughts:\n"
                for t in thoughts:
                    prompt += f"- {t}\n"

            if actions:
                prompt += "\nPrevious actions:\n"
                for a in actions:
                    prompt += f"- {a}\n"

            if observations:
                prompt += "\nObservations:\n"
                for o in observations:
                    prompt += f"- {o}\n"

            prompt += "\nWhat should you do next? (thought, action, action_input)"
            return prompt
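The `ReActAgent` code calls `_parse_response` and `_format_tools` without showing them. Here is a minimal sketch of what they might look like as standalone functions. It assumes the model is instructed to reply with a single JSON object containing `thought`, `action`, and `action_input`; that reply format, the function names, and the fallback message are my assumptions, not something the loop itself dictates.

```python
import json
import re
from typing import Dict


def format_tools(tools: Dict[str, object]) -> str:
    """Render one prompt line per tool (assumes each tool has a `description`)."""
    return "\n".join(
        f"- {name}: {getattr(tool, 'description', 'no description')}"
        for name, tool in tools.items()
    )


def parse_response(response: str) -> dict:
    """Parse a model reply that is assumed to contain one JSON object."""
    # Fallback: turn an unparseable reply into an "observe" action so the
    # loop records the problem and the model can try again on the next turn.
    fallback = {
        "thought": response.strip(),
        "action": "observe",
        "action_input": "Reply was not valid JSON; answer with a single JSON object.",
    }
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if match is None:
        return fallback
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return fallback
    if "action" not in parsed:
        return fallback
    return parsed
```

Routing parse failures through the existing `observe` action keeps the loop alive instead of raising, at the cost of spending one iteration on the retry.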

Agent System Architecture

Overall Architecture

A complete agent system usually consists of the following layers:

    ┌─────────────────────────────────────┐
    │           User Interface            │
    ├─────────────────────────────────────┤
    │             Agent Core              │
    │  ┌─────────┬─────────┬──────────┐   │
    │  │ Planner │ Memory  │ Executor │   │
    │  └─────────┴─────────┴──────────┘   │
    ├─────────────────────────────────────┤
    │             Tool Layer              │
    │  ┌───────┬───────┬───────┬──────┐   │
    │  │Search │ Code  │ File  │ API  │   │
    │  │       │ Runner│ System│      │   │
    │  └───────┴───────┴───────┴──────┘   │
    ├─────────────────────────────────────┤
    │             LLM Engine              │
    └─────────────────────────────────────┘

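Planner Module

The Planner breaks a complex task into executable steps:

    import re

    class Planner:
        def __init__(self, model):
            self.model = model

        def plan(self, task, context=None):
            """Generate an execution plan for the task."""
            prompt = f"""Task: {task}
    Context: {context or "No additional context"}

    Break this task down into concrete execution steps. Each step should:
    1. Have a clear start and end
    2. Be executable with a tool
    3. Produce a verifiable output

    List the steps in order:
    """
            response = self.model.generate(prompt)
            return self._parse_plan(response)

        def _parse_plan(self, response):
            """Parse the plan text into structured data."""
            steps = []
            for line in response.split('\n'):
                line = line.strip()
                # Keep only lines that look like list items ("1. ...", "2) ...", "- ...")
                if line and (line[0].isdigit() or line.startswith('-')):
                    # Strip just the list marker, so a step that itself
                    # starts with a number keeps that number
                    step_text = re.sub(r'^(\d+[.)]?|-)\s*', '', line)
                    steps.append({
                        "description": step_text,
                        "status": "pending"
                    })
            return steps

        def _format_plan(self, plan):
            """Render the plan as a numbered list (not shown in the original; assumed here)."""
            return "\n".join(
                f"{i}. [{step['status']}] {step['description']}"
                for i, step in enumerate(plan, 1)
            )

        def refine_plan(self, plan, feedback):
            """Adjust the plan based on execution feedback."""
            prompt = f"""Current plan:
    {self._format_plan(plan)}

    Execution feedback:
    {feedback}

    Analyze the feedback and update the plan. If a step failed, explain why and propose an alternative.
    """
            response = self.model.generate(prompt)
            return self._parse_plan(response)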
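The Planner exposes `plan` and `refine_plan`, but the post does not show the loop that ties them to execution. Below is one way that loop could look, as a sketch under my own assumptions: `run_plan`, `execute_step`, and `refine` are hypothetical names, and the policy of keeping finished steps and re-planning only the remaining ones is a design choice, not something the original prescribes.

```python
from typing import Callable, Dict, List, Tuple

Step = Dict[str, str]


def run_plan(
    plan: List[Step],
    execute_step: Callable[[Step], Tuple[bool, str]],
    refine: Callable[[List[Step], str], List[Step]],
    max_refinements: int = 2,
) -> List[Step]:
    """Run steps in order; on failure, re-plan the remaining work."""
    finished: List[Step] = []
    pending = list(plan)
    refinements = 0

    while pending:
        step = pending.pop(0)
        ok, feedback = execute_step(step)
        if ok:
            finished.append({**step, "status": "done"})
        elif refinements < max_refinements:
            # Keep completed steps; replace the failed step and everything after it.
            pending = refine([step] + pending, feedback)
            refinements += 1
        else:
            finished.append({**step, "status": "failed"})
            break

    # Steps left in `pending` after a hard failure keep their "pending" status.
    return finished + pending
```

With the `Planner` from this section, `refine` could simply be `planner.refine_plan`, and `execute_step` would wrap whatever executes a single step and reports success plus feedback.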

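Memory Module

The Memory module manages the agent's history and state:

    from dataclasses import dataclass, field
    from typing import Dict, List, Optional
    from datetime import datetime
    import json

    @dataclass
    class MemoryItem:
        """A single memory entry."""
        type: str  # "task", "observation", "thought", "action", "result"
        content: str
        timestamp: datetime = field(default_factory=datetime.now)
        metadata: Dict = field(default_factory=dict)

    class Memory:
        def __init__(self, max_items=1000, model=None):
            self.items: List[MemoryItem] = []
            self.max_items = max_items
            # Assumption: summarize() needs a model; the original does not show where it comes from
            self.model = model

        def add(self, memory_type: str, content: str, metadata=None):
            """Add a memory."""
            item = MemoryItem(
                type=memory_type,
                content=content,
                metadata=metadata or {}
            )
            self.items.append(item)

            # Drop the oldest entries once the limit is exceeded
            if len(self.items) > self.max_items:
                self.items = self.items[-self.max_items:]

        def get_recent(self, n: int = 10, memory_type: Optional[str] = None):
            """Return the most recent n memories, optionally filtered by type."""
            items = self.items
            if memory_type:
                items = [i for i in items if i.type == memory_type]
            return items[-n:]

        def search(self, query: str) -> List[MemoryItem]:
            """Search memories."""
            # Simplified: a case-insensitive substring match.
            # A real implementation should use embedding similarity.
            query_lower = query.lower()
            return [
                item for item in self.items
                if query_lower in item.content.lower()
            ]

        def summarize(self, lookback: int = 50) -> str:
            """Summarize recent experience."""
            recent = self.get_recent(lookback)
            if not recent:
                return "No recent history."

            history = "\n".join(f"[{m.type}] {m.content}" for m in recent)
            prompt = f"""Summarize the key information in the following agent history:

    {history}

    Summary:
    """
            return self._summarize_text(prompt)

        def _summarize_text(self, prompt: str) -> str:
            """Not shown in the original; assumed here to be a plain model call."""
            if self.model is None:
                raise RuntimeError("Memory.summarize() requires a model")
            return self.model.generate(prompt)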
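The comment in `search` says a real implementation should rank by embedding similarity rather than substring match. The sketch below shows the shape of that ranking using cosine similarity. To stay dependency-free it uses word counts as a toy stand-in for embeddings, so treat `embed` as a placeholder for a call to a real embedding model; `semantic_search` and its parameters are hypothetical names of mine.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts. Swap in a real embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def semantic_search(query: str, contents: List[str], top_k: int = 3) -> List[Tuple[float, str]]:
    """Return up to top_k (score, content) pairs, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in contents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

Unlike the substring version, this ranks partial matches, so a query such as "sql injection risk" can still find an entry that mentions only "SQL injection".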

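Executor Module

The Executor carries out concrete actions:

    from typing import Dict

    class Executor:
        def __init__(self, tools: Dict):
            self.tools = tools

        def execute(self, action: str, params: dict) -> str:
            """Execute an action and return the result."""
            tool = self.tools.get(action)
            if not tool:
                return f"Error: Unknown tool '{action}'"

            try:
                result = tool.execute(**params)
                return str(result)
            except Exception as e:
                # Return the error as text so the model can see it and react
                return f"Error executing {action}: {str(e)}"

        def validate_action(self, action: str, params: dict) -> bool:
            """Check whether the action's parameters are valid."""
            tool = self.tools.get(action)
            if not tool:
                return False

            # A parameter without a "default" in the tool's schema is treated as required
            required = [
                name for name, info in tool.parameters.items()
                if "default" not in info
            ]
            return all(p in params for p in required)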
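`validate_action` returns only a boolean, which tells the model that something is wrong but not what. A variant I find useful is to fold validation into execution and return a message naming the missing or unexpected parameters. The sketch below assumes each tool exposes a `parameters` dict in which a parameter without a `"default"` key is required; `EchoTool`, `missing_params`, and `checked_execute` are hypothetical names used only for illustration.

```python
from typing import Dict, List


class EchoTool:
    """Hypothetical tool used only for this example."""

    name = "echo"
    parameters = {
        "text": {"type": "string", "description": "Text to echo"},
        "times": {"type": "integer", "description": "Repeat count", "default": 1},
    }

    def execute(self, text: str, times: int = 1) -> str:
        return " ".join([text] * times)


def missing_params(tool, params: dict) -> List[str]:
    """Names of required parameters (those without a default) that are absent."""
    return [
        name for name, info in tool.parameters.items()
        if "default" not in info and name not in params
    ]


def checked_execute(tools: Dict[str, object], action: str, params: dict) -> str:
    """Validate first, then execute; always return text the model can read."""
    tool = tools.get(action)
    if tool is None:
        return f"Error: Unknown tool '{action}'"
    missing = missing_params(tool, params)
    if missing:
        return f"Error: missing required parameter(s) for '{action}': {', '.join(missing)}"
    unknown = [p for p in params if p not in tool.parameters]
    if unknown:
        return f"Error: unexpected parameter(s) for '{action}': {', '.join(unknown)}"
    try:
        return str(tool.execute(**params))
    except Exception as e:
        return f"Error executing {action}: {e}"
```

Because every outcome is a string, the result can go straight back into the agent's observations regardless of whether the call succeeded.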

Tool Design

Tool Interface

    import io
    import json
    from abc import ABC, abstractmethod
    from contextlib import redirect_stdout

    class Tool(ABC):
        """Base class for tools."""

        @property
        @abstractmethod
        def name(self) -> str:
            """Tool name."""
            pass

        @property
        @abstractmethod
        def description(self) -> str:
            """Tool description."""
            pass

        @property
        @abstractmethod
        def parameters(self) -> dict:
            """Parameter schema."""
            pass

        @abstractmethod
        def execute(self, **kwargs) -> str:
            """Run the tool."""
            pass

    class SearchTool(Tool):
        """Search tool."""

        @property
        def name(self) -> str:
            return "search"

        @property
        def description(self) -> str:
            return "Search the web for information. Returns top search results."

        @property
        def parameters(self) -> dict:
            return {
                "query": {
                    "type": "string",
                    "description": "The search query"
                },
                "num_results": {
                    "type": "integer",
                    "description": "Number of results to return",
                    "default": 5
                }
            }

        def execute(self, query: str, num_results: int = 5) -> str:
            # A real implementation would call a search API;
            # `search_engine` is a placeholder that is not defined in this post
            results = search_engine.query(query, num_results)
            return json.dumps(results)

    class CodeExecutorTool(Tool):
        """Code execution tool."""

        @property
        def name(self) -> str:
            return "execute_code"

        @property
        def description(self) -> str:
            return "Execute Python code and return the result."

        @property
        def parameters(self) -> dict:
            return {
                "code": {
                    "type": "string",
                    "description": "Python code to execute"
                },
                "timeout": {
                    "type": "integer",
                    "description": "Execution timeout in seconds",
                    "default": 30
                }
            }

        def execute(self, code: str, timeout: int = 30) -> str:
            # Caution: exec() runs the code in-process with no sandbox,
            # and `timeout` is not enforced in this simplified version
            output = io.StringIO()
            try:
                with redirect_stdout(output):
                    exec(code, {})  # fresh globals so the snippet cannot see our locals
                return output.getvalue() or "Code executed successfully (no output)"
            except Exception as e:
                return f"Error: {type(e).__name__}: {str(e)}"
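`CodeExecutorTool` declares a `timeout` parameter, but the in-process `exec` call has no way to honor it. One common way to enforce it is to run the snippet in a child interpreter. The sketch below does that with the standard library's `subprocess.run`; `run_python` is a hypothetical helper name, and note that a separate process gives you a timeout and a clean interpreter state, not a security sandbox.

```python
import subprocess
import sys


def run_python(code: str, timeout: int = 30) -> str:
    """Run a Python snippet in a child interpreter and return its output as text."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,  # collect stdout and stderr instead of inheriting them
            text=True,            # decode bytes to str
            timeout=timeout,      # the child is killed if it runs longer than this
        )
    except subprocess.TimeoutExpired:
        return f"Error: execution exceeded {timeout} seconds"

    if proc.returncode != 0:
        # The traceback from the child process arrives on stderr
        return f"Error: {proc.stderr.strip()}"
    return proc.stdout or "Code executed successfully (no output)"
```

For untrusted code you would still want real isolation on top of this, such as a container or a restricted user account.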

Tool Registration

    from typing import Dict, List, Optional

    class ToolRegistry:
        """Central registry for tools."""

        def __init__(self):
            self._tools: Dict[str, Tool] = {}

        def register(self, tool: Tool):
            """Register a tool under its name."""
            self._tools[tool.name] = tool

        def get(self, name: str) -> Optional[Tool]:
            """Look up a tool; returns None if it is not registered."""
            return self._tools.get(name)

        def list_tools(self) -> List[dict]:
            """List all tools."""
            return [
                {
                    "name": tool.name,
                    "description": tool.description,
                    "parameters": tool.parameters
                }
                for tool in self._tools.values()
            ]

        def to_prompt_format(self) -> str:
            """Render all tools as text for a prompt."""
            lines = []
            for tool in self._tools.values():
                lines.append(f"## {tool.name}")
                lines.append(f"Description: {tool.description}")
                lines.append("Parameters:")
                for param_name, param_info in tool.parameters.items():
                    # A parameter without a default is marked as required
                    required = "" if "default" in param_info else " (required)"
                    lines.append(f"  - {param_name}: {param_info['description']}{required}")
                lines.append("")
            return "\n".join(lines)
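`to_prompt_format` targets plain-text prompts. Many function-calling interfaces instead take a JSON Schema description of a tool's parameters. The envelope around that schema differs between providers, so the sketch below produces only the schema part and leaves the wrapper to you. `to_json_schema` is a hypothetical helper, and it assumes the same convention as the registry: a parameter without a `"default"` is required.

```python
def to_json_schema(parameters: dict) -> dict:
    """Convert a {name: {type, description, default?}} dict to a JSON Schema object."""
    properties = {}
    required = []
    for name, info in parameters.items():
        prop = {"type": info["type"], "description": info["description"]}
        if "default" in info:
            prop["default"] = info["default"]
        else:
            required.append(name)
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


# Example with the search tool's parameters from this post
search_params = {
    "query": {"type": "string", "description": "The search query"},
    "num_results": {
        "type": "integer",
        "description": "Number of results to return",
        "default": 5,
    },
}
schema = to_json_schema(search_params)
```

Keeping a single `parameters` dict as the source of truth means the text prompt and the structured schema cannot drift apart.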

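Agent Implementation Example: A Code Review Agent

    class CodeReviewAgent:
        """Code review agent."""

        def __init__(self, llm):
            self.llm = llm
            self.memory = Memory()
            self.planner = Planner(llm)

            # Register tools
            self.tools = ToolRegistry()
            self.tools.register(SearchTool())
            self.tools.register(CodeExecutorTool())
            self.tools.register(FileReadTool())  # FileReadTool is not defined in this post

            # The registry exposes get(), which is all Executor needs
            self.executor = Executor(self.tools)

        def review(self, code: str, language: str = "python") -> dict:
            """Review the code."""
            self.memory.add("task", f"Review {language} code")
            self.memory.add("observation", f"Code to review: {len(code)} chars")

            # 1. Plan the review steps
            plan_prompt = f"""Create a review plan for the following {language} code.
    Focus on: security, performance, readability, and best practices.

    Code:
    ```{language}
    {code}
    ```

    Review steps:
    """
            plan_response = self.llm.generate(plan_prompt)
            plan = self.planner._parse_plan(plan_response)

            findings = []

            # 2. Execute the review according to the plan
            for step in plan:
                description = step["description"]
                self.memory.add("thought", description)

                # Pick a check based on keywords in the step description
                lowered = description.lower()
                if "security" in lowered:
                    result = self._check_security(code, language)
                elif "performance" in lowered:
                    result = self._check_performance(code, language)
                else:
                    result = self.llm.generate(
                        f"Analyze the following code with respect to: {description}\n```{language}\n{code}\n```"
                    )

                self.memory.add("result", result)
                findings.append({
                    "category": description,
                    "finding": result
                })

            # 3. Generate the final report
            report = self._generate_report(findings)

            return {
                "findings": findings,
                "report": report
            }

        def _check_security(self, code: str, language: str) -> str:
            """Security check."""
            prompt = f"""Check the following {language} code for security issues:
    1. SQL injection
    2. XSS attacks
    3. Sensitive information leaks
    4. Authentication and authorization problems

    Code:
    ```{language}
    {code}
    ```

    Security analysis:
    """
            return self.llm.generate(prompt)

        def _check_performance(self, code: str, language: str) -> str:
            """Performance check."""
            prompt = f"""Check the following {language} code for performance issues:
    1. Algorithmic complexity
    2. Resource leaks
    3. Cache usage
    4. Database query optimization

    Code:
    ```{language}
    {code}
    ```

    Performance analysis:
    """
            return self.llm.generate(prompt)

        def _generate_report(self, findings: List[dict]) -> str:
            """Generate the review report."""
            # Build the findings text first: backslashes inside f-string
            # expressions are a syntax error before Python 3.12
            findings_text = "\n".join(
                f"[{item['category']}]\n{item['finding']}" for item in findings
            )
            prompt = f"""Based on the following review findings, generate a structured code review report:

    {findings_text}

    Report requirements:
    1. Sort issues by severity (high, medium, low)
    2. For each issue, give the specific code location and a recommendation
    3. Provide an overall assessment and improvement suggestions
    """
            return self.llm.generate(prompt)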
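The agent registers a `FileReadTool`, but the post never defines it. Here is a minimal sketch of what such a tool might look like. Everything about it is my assumption: the tool name, the parameters, and the decision to confine reads to one root directory. In the article's framework it would subclass `Tool` and expose `name`, `description`, and `parameters` as properties; plain class attributes are used here to keep the example self-contained.

```python
from pathlib import Path


class FileReadTool:
    """Hypothetical file-reading tool restricted to a single root directory."""

    name = "read_file"
    description = "Read a UTF-8 text file located under the project root."
    parameters = {
        "path": {"type": "string", "description": "File path relative to the root"},
        "max_chars": {
            "type": "integer",
            "description": "Maximum number of characters to return",
            "default": 20000,
        },
    }

    def __init__(self, root: str = "."):
        self.root = Path(root).resolve()

    def execute(self, path: str, max_chars: int = 20000) -> str:
        target = (self.root / path).resolve()
        # Refuse paths that escape the root, such as "../secrets" or absolute paths
        if target != self.root and self.root not in target.parents:
            return f"Error: '{path}' is outside the allowed directory"
        if not target.is_file():
            return f"Error: '{path}' is not a file"
        text = target.read_text(encoding="utf-8", errors="replace")
        # Truncate so a large file cannot flood the model's context
        return text[:max_chars]
```

Resolving the path before checking it is what blocks `..` tricks, since the comparison happens on the normalized absolute path rather than the raw string.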
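Self-Correction Mechanism

An agent needs to be able to recognize its own errors and correct them:

    class SelfCorrectionAgent:
        """Agent with self-correction."""

        def __init__(self, model, max_retries=3):
            self.model = model
            self.max_retries = max_retries

        def run_with_correction(self, task: str) -> str:
            """Run the task, self-correcting when validation fails."""
            result = self.execute_attempt(task)

            for attempt in range(self.max_retries):
                # Validate the current candidate
                is_valid, feedback = self.validate(task, result)
                if is_valid:
                    return result

                # Self-correct; the corrected result is validated on the next pass
                if attempt < self.max_retries - 1:
                    result = self.self_correct(task, result, feedback)

            return f"Failed after {self.max_retries} attempts"

        def execute_attempt(self, task: str) -> str:
            """First attempt (not shown in the original; a plain model call is assumed)."""
            return self.model.generate(f"Task: {task}\n\nSolve the task:")

        def validate(self, task: str, result: str) -> tuple:
            """Check whether the result is correct."""
            prompt = f"""Verify whether the following result correctly solves the task:

    Task: {task}

    Result:
    {result}

    Please judge:
    1. Is the task fully solved?
    2. Is the result accurate?
    3. Is anything missing?

    Start your reply with VALID or INVALID on the first line, then give your feedback:
    """
            response = self.model.generate(prompt)

            # Simplified verdict parsing: rely on the leading keyword, because a
            # substring test for "correct" would also match "incorrect"
            is_valid = response.strip().upper().startswith("VALID")
            return is_valid, response

        def self_correct(self, task: str, previous_result: str, feedback: str) -> str:
            """Self-correct based on the feedback."""
            prompt = f"""Task: {task}

    Previous attempt:
    {previous_result}

    Validation feedback:
    {feedback}

    Analyze what went wrong and give a corrected solution:
    """
            return self.model.generate(prompt)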
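To see the control flow in isolation, here is a function-style sketch of the retry loop with plain callables standing in for the model calls. The function name, the callables, and the returned history are my own illustrative choices. The point it shows is that each corrected result goes back through validation instead of being discarded, and that `max_retries` bounds the number of corrections.

```python
from typing import Callable, List, Tuple


def run_with_correction(
    task: str,
    attempt: Callable[[str], str],
    validate: Callable[[str, str], Tuple[bool, str]],
    correct: Callable[[str, str, str], str],
    max_retries: int = 3,
) -> Tuple[bool, str, List[str]]:
    """Return (succeeded, final_result, every candidate produced along the way)."""
    result = attempt(task)
    history = [result]

    for retry in range(max_retries + 1):
        ok, feedback = validate(task, result)
        if ok:
            return True, result, history
        if retry == max_retries:
            break  # correction budget exhausted
        # Feed the validator's feedback into the next candidate, then re-validate it
        result = correct(task, result, feedback)
        history.append(result)

    return False, result, history


# Stand-in callables: the first answer is wrong, the correction fixes it
outcome = run_with_correction(
    task="add 2 and 3",
    attempt=lambda task: "6",
    validate=lambda task, result: (result == "5", "expected 5"),
    correct=lambda task, previous, feedback: "5",
)
```

Returning the history alongside the verdict makes it easy to log how many corrections a task needed, which is useful when tuning the prompts.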

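Summary

Building a reliable AI agent is a complex systems-engineering effort that has to account for planning, memory, tool calling, self-correction, and more.

In real projects, these are the points I found most important:

  1. Design tools clearly: each tool should have a single responsibility and an explicit interface
  2. Manage memory efficiently: compress and retrieve history sensibly
  3. Handle errors thoroughly: agents will inevitably make mistakes, so a solid correction mechanism is needed
  4. Iterate: keep improving prompts and workflows based on real runs

I hope this post helps you understand agent architecture.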
