
# Agents

## Introduction

Agents are PydanticAI's primary interface for interacting with LLMs. In some use cases a single agent will control an entire application or component, but multiple agents can also interact to embody more complex workflows.

The Agent class has full API documentation, but conceptually you can think of an agent as a container for a system prompt, the function tools registered on it, a structured result type, a dependency type constraint, and optional default LLM model and model settings.

In typing terms, agents are generic in their dependency and result types: an agent that requires dependencies of type `Foobar` and returns results of type `list[str]` has type `Agent[Foobar, list[str]]`. In practice, you shouldn't need to care about this; it just means your IDE can tell you when you have the right type, and if you choose to use static type checking it should work well with PydanticAI.

Here's a toy example of an agent that simulates a roulette wheel:

roulette_wheel.py
```python
from pydantic_ai import Agent, RunContext

# Create an agent which expects an integer dependency and returns a boolean
# result; this agent will have type Agent[int, bool].
roulette_agent = Agent(
    'openai:gpt-4o',
    deps_type=int,
    result_type=bool,
    system_prompt=(
        'Use the `roulette_wheel` function to see if the '
        'customer has won based on the number they provide.'
    ),
)


# A tool that checks if the square is a winner. RunContext is parameterized
# with the dependency type int; if you got the dependency type wrong here,
# you'd get a typing error.
@roulette_agent.tool
async def roulette_wheel(ctx: RunContext[int], square: int) -> str:
    """check if the square is a winner"""
    return 'winner' if square == ctx.deps else 'loser'


# Run the agent. In reality, you might want to use a random number here,
# e.g. random.randint(0, 36).
success_number = 18
result = roulette_agent.run_sync('Put my money on square eighteen', deps=success_number)
# result.data will be a boolean indicating if the square is a winner; Pydantic
# performs the result validation, and it's typed as bool since its type is
# derived from the agent's result_type generic parameter.
print(result.data)
#> True

result = roulette_agent.run_sync('I bet five is the winner', deps=success_number)
print(result.data)
#> False
```

**Agents are designed for reuse, like FastAPI apps.** Agents are intended to be instantiated once (frequently as module globals) and reused throughout your application, similar to a small FastAPI app or an APIRouter.
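For instance, a minimal sketch of that pattern: the agent is created once at module level and shared by ordinary functions. The helper names and prompts here are illustrative, not part of the PydanticAI API:

```python
from pydantic_ai import Agent

# Instantiated once at import time, like a FastAPI app or an APIRouter.
summary_agent = Agent(
    'openai:gpt-4o',
    system_prompt='Summarize the text you are given in one sentence.',
)


def summarize_ticket(ticket_body: str) -> str:
    # Each call is an independent run against the shared agent.
    return summary_agent.run_sync(ticket_body).data


def summarize_email(email_body: str) -> str:
    return summary_agent.run_sync(email_body).data
```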
## Running Agents

There are four ways to run an agent:

1. `agent.run()`: a coroutine which returns a completed response.
2. `agent.run_sync()`: a plain, synchronous function which returns a completed response.
3. `agent.run_stream()`: a coroutine which returns an async context manager for streaming the response.
4. `agent.iter()`: a context manager which returns an `AgentRun`, an async-iterable over the nodes of the agent's underlying graph (covered in the next section).

Here's a simple example demonstrating the first three:

run_agent.py
```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

result_sync = agent.run_sync('What is the capital of Italy?')
print(result_sync.data)
#> Rome


async def main():
    result = await agent.run('What is the capital of France?')
    print(result.data)
    #> Paris

    async with agent.run_stream('What is the capital of the UK?') as response:
        print(await response.get_data())
        #> London
```

You can also pass messages from previous runs to continue a conversation or provide context, as described in Messages and Chat History.

## Iterating Over an Agent's Graph

Under the hood, each Agent in PydanticAI uses pydantic-graph to manage its execution flow. pydantic-graph is a generic, type-centric library for building and running finite state machines in Python. It doesn't actually depend on PydanticAI (you can use it standalone for workflows that have nothing to do with GenAI), but PydanticAI makes use of it to orchestrate the handling of model requests and model responses in an agent's run.

In many scenarios, you don't need to worry about pydantic-graph at all; calling `agent.run(...)` simply traverses the underlying graph from start to finish. However, if you need deeper insight or control, for example to capture each tool invocation or to inject your own logic at specific stages, PydanticAI exposes the lower-level iteration process via `Agent.iter`. This method returns an `AgentRun`, which you can async-iterate over, or manually drive node-by-node via the `next` method. Once the agent's graph returns an `End`, you have the final result along with a detailed history of all steps.

### async for iteration

Here's an example of using `async for` with `iter` to record each node the agent executes:

agent_iter_async_for.py
```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')


async def main():
    nodes = []
    # Begin an AgentRun, which is an async-iterable over the nodes of the agent's graph
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            # Each node represents a step in the agent's execution
            nodes.append(node)
        print(nodes)
        """
        [
            ModelRequestNode(
                request=ModelRequest(
                    parts=[
                        UserPromptPart(
                            content='What is the capital of France?',
                            timestamp=datetime.datetime(...),
                            part_kind='user-prompt',
                        )
                    ],
                    kind='request',
                )
            ),
            CallToolsNode(
                model_response=ModelResponse(
                    parts=[TextPart(content='Paris', part_kind='text')],
                    model_name='gpt-4o',
                    timestamp=datetime.datetime(...),
                    kind='response',
                )
            ),
            End(data=FinalResult(data='Paris', tool_name=None, tool_call_id=None)),
        ]
        """
        print(agent_run.result.data)
        #> Paris
```

### Using .next(...) manually

You can also drive the iteration manually by passing the node you want to run next to the `AgentRun.next(...)` method. This allows you to inspect or modify a node before it executes, skip nodes based on your own logic, and catch errors raised by `next()` more easily:

agent_iter_next.py
```python
from pydantic_ai import Agent
from pydantic_graph import End

agent = Agent('openai:gpt-4o')


async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        # Start by grabbing the first node that will be run in the agent's graph.
        node = agent_run.next_node

        all_nodes = [node]

        # Drive the iteration manually; the run is finished once an End node
        # has been produced, and instances of End cannot be passed to next().
        while not isinstance(node, End):
            # next() executes the given node in the agent's graph, updates the
            # run's history, and returns the next node to run.
            node = await agent_run.next(node)
            # You could also inspect or mutate the new node here as needed.
            all_nodes.append(node)

        print(all_nodes)
        """
        [
            UserPromptNode(
                user_prompt='What is the capital of France?',
                system_prompts=(),
                system_prompt_functions=[],
                system_prompt_dynamic_functions={},
            ),
            ModelRequestNode(
                request=ModelRequest(
                    parts=[
                        UserPromptPart(
                            content='What is the capital of France?',
                            timestamp=datetime.datetime(...),
                            part_kind='user-prompt',
                        )
                    ],
                    kind='request',
                )
            ),
            CallToolsNode(
                model_response=ModelResponse(
                    parts=[TextPart(content='Paris', part_kind='text')],
                    model_name='gpt-4o',
                    timestamp=datetime.datetime(...),
                    kind='response',
                )
            ),
            End(data=FinalResult(data='Paris', tool_name=None, tool_call_id=None)),
        ]
        """
```

### Accessing usage and the final result

You can retrieve usage statistics (tokens, requests, etc.) at any time from the `AgentRun` object via `agent_run.usage()`. This method returns a `Usage` object containing the usage data. Once the run finishes, `agent_run.final_result` becomes an `AgentRunResult` object containing the final output (and related metadata).
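For instance, a minimal sketch assembled from the calls shown above; the exact `Usage` values printed will vary by model and run:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')


async def main():
    async with agent.iter('What is the capital of France?') as agent_run:
        async for node in agent_run:
            # usage() can be called at any point mid-run to see the
            # tokens/requests consumed so far.
            print(agent_run.usage())
        # Once the graph has produced End, the final output is available.
        print(agent_run.result.data)
        #> Paris
```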
### Streaming

Here is an example of streaming an agent run in combination with `async for` iteration:

streaming.py
```python
import asyncio
from dataclasses import dataclass
from datetime import date

from pydantic_ai import Agent
from pydantic_ai.messages import (
    FinalResultEvent,
    FunctionToolCallEvent,
    FunctionToolResultEvent,
    PartDeltaEvent,
    PartStartEvent,
    TextPartDelta,
    ToolCallPartDelta,
)
from pydantic_ai.tools import RunContext


@dataclass
class WeatherService:
    async def get_forecast(self, location: str, forecast_date: date) -> str:
        # In real code: call weather API, DB queries, etc.
        return f'The forecast in {location} on {forecast_date} is 24°C and sunny.'

    async def get_historic_weather(self, location: str, forecast_date: date) -> str:
        # In real code: call a historical weather API or DB
        return (
            f'The weather in {location} on {forecast_date} was 18°C and partly cloudy.'
        )


weather_agent = Agent[WeatherService, str](
    'openai:gpt-4o',
    deps_type=WeatherService,
    result_type=str,  # We'll produce a final answer as plain text
    system_prompt='Providing a weather forecast at the locations the user provides.',
)


@weather_agent.tool
async def weather_forecast(
    ctx: RunContext[WeatherService],
    location: str,
    forecast_date: date,
) -> str:
    if forecast_date >= date.today():
        return await ctx.deps.get_forecast(location, forecast_date)
    else:
        return await ctx.deps.get_historic_weather(location, forecast_date)


output_messages: list[str] = []


async def main():
    user_prompt = 'What will the weather be like in Paris on Tuesday?'

    # Begin a node-by-node, streaming iteration
    async with weather_agent.iter(user_prompt, deps=WeatherService()) as run:
        async for node in run:
            if Agent.is_user_prompt_node(node):
                # A user prompt node => The user has provided input
                output_messages.append(f'=== UserPromptNode: {node.user_prompt} ===')
            elif Agent.is_model_request_node(node):
                # A model request node => We can stream tokens from the model's request
                output_messages.append(
                    '=== ModelRequestNode: streaming partial request tokens ==='
                )
                async with node.stream(run.ctx) as request_stream:
                    async for event in request_stream:
                        if isinstance(event, PartStartEvent):
                            output_messages.append(
                                f'[Request] Starting part {event.index}: {event.part!r}'
                            )
                        elif isinstance(event, PartDeltaEvent):
                            if isinstance(event.delta, TextPartDelta):
                                output_messages.append(
                                    f'[Request] Part {event.index} text delta: {event.delta.content_delta!r}'
                                )
                            elif isinstance(event.delta, ToolCallPartDelta):
                                output_messages.append(
                                    f'[Request] Part {event.index} args_delta={event.delta.args_delta}'
                                )
                        elif isinstance(event, FinalResultEvent):
                            output_messages.append(
                                f'[Result] The model produced a final result (tool_name={event.tool_name})'
                            )
            elif Agent.is_call_tools_node(node):
                # A handle-response node => The model returned some data, potentially calls a tool
                output_messages.append(
                    '=== CallToolsNode: streaming partial response & tool usage ==='
                )
                async with node.stream(run.ctx) as handle_stream:
                    async for event in handle_stream:
                        if isinstance(event, FunctionToolCallEvent):
                            output_messages.append(
                                f'[Tools] The LLM calls tool={event.part.tool_name!r} with args={event.part.args} (tool_call_id={event.part.tool_call_id!r})'
                            )
                        elif isinstance(event, FunctionToolResultEvent):
                            output_messages.append(
                                f'[Tools] Tool call {event.tool_call_id!r} returned => {event.result.content}'
                            )
            elif Agent.is_end_node(node):
                assert run.result.data == node.data.data
                # Once an End node is reached, the agent run is complete
                output_messages.append(f'=== Final Agent Output: {run.result.data} ===')


if __name__ == '__main__':
    asyncio.run(main())

    print(output_messages)
    """
    [
        '=== ModelRequestNode: streaming partial request tokens ===',
        '[Request] Starting part 0: ToolCallPart(tool_name=\'weather_forecast\', args=\'{"location":"Pa\', tool_call_id=\'0001\', part_kind=\'tool-call\')',
        '[Request] Part 0 args_delta=ris","forecast_',
        '[Request] Part 0 args_delta=date":"2030-01-',
        '[Request] Part 0 args_delta=01"}',
        '=== CallToolsNode: streaming partial response & tool usage ===',
        '[Tools] The LLM calls tool=\'weather_forecast\' with args={"location":"Paris","forecast_date":"2030-01-01"} (tool_call_id=\'0001\')',
        "[Tools] Tool call '0001' returned => The forecast in Paris on 2030-01-01 is 24°C and sunny.",
        '=== ModelRequestNode: streaming partial request tokens ===',
        "[Request] Starting part 0: TextPart(content='It will be ', part_kind='text')",
        '[Result] The model produced a final result (tool_name=None)',
        "[Request] Part 0 text delta: 'warm and sunny '",
        "[Request] Part 0 text delta: 'in Paris on '",
        "[Request] Part 0 text delta: 'Tuesday.'",
        '=== CallToolsNode: streaming partial response & tool usage ===',
        '=== Final Agent Output: It will be warm and sunny in Paris on Tuesday. ===',
    ]
    """
```
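The example above works at the graph level. If you only need the response text as it arrives, the higher-level `run_stream` API shown earlier streams too; here's a hedged sketch, assuming the streamed result exposes `stream_text(delta=True)` as described in the pydantic_ai.result API reference:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')


async def main():
    async with agent.run_stream('What is the capital of the UK?') as response:
        # Assumption: stream_text(delta=True) yields each new chunk of text
        # as it arrives (see pydantic_ai.result).
        async for chunk in response.stream_text(delta=True):
            print(chunk, end='', flush=True)
```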
## Additional Configuration

### Usage Limits

PydanticAI offers a `UsageLimits` structure to help you limit your usage (tokens and/or requests) on model runs. You can apply these settings by passing the `usage_limits` argument to the `run{_sync,_stream}` functions.

Consider the following example, where we limit the number of response tokens:

```python
from pydantic_ai import Agent
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits

agent = Agent('anthropic:claude-3-5-sonnet-latest')

result_sync = agent.run_sync(
    'What is the capital of Italy? Answer with just the city.',
    usage_limits=UsageLimits(response_tokens_limit=10),
)
print(result_sync.data)
#> Rome
print(result_sync.usage())
"""
Usage(requests=1, request_tokens=62, response_tokens=1, total_tokens=63, details=None)
"""

try:
    result_sync = agent.run_sync(
        'What is the capital of Italy? Answer with a paragraph.',
        usage_limits=UsageLimits(response_tokens_limit=10),
    )
except UsageLimitExceeded as e:
    print(e)
    #> Exceeded the response_tokens_limit of 10 (response_tokens=32)
```

Restricting the number of requests can be useful in preventing infinite loops or excessive tool calling:

```python
from typing_extensions import TypedDict

from pydantic_ai import Agent, ModelRetry
from pydantic_ai.exceptions import UsageLimitExceeded
from pydantic_ai.usage import UsageLimits


class NeverResultType(TypedDict):
    """
    Never ever coerce data to this type.
    """

    never_use_this: str


agent = Agent(
    'anthropic:claude-3-5-sonnet-latest',
    retries=3,
    result_type=NeverResultType,
    system_prompt='Any time you get a response, call the `infinite_retry_tool` to produce another response.',
)


# This tool is able to retry 5 times before erroring, simulating a tool that
# might get stuck in a loop.
@agent.tool_plain(retries=5)
def infinite_retry_tool() -> int:
    raise ModelRetry('Please try again.')


try:
    result_sync = agent.run_sync(
        'Begin infinite retry loop!',
        # This run will error after 3 requests, preventing the infinite tool calling.
        usage_limits=UsageLimits(request_limit=3),
    )
except UsageLimitExceeded as e:
    print(e)
    #> The next request would exceed the request_limit of 3
```

**Note:** This is especially relevant if you've registered many tools. The `request_limit` can be used to prevent the model from calling them in a loop too many times.

### Model (Run) Settings

PydanticAI offers a `settings.ModelSettings` structure to help you fine-tune your requests. It lets you configure common parameters that influence the model's behavior, such as `temperature`, `max_tokens`, `timeout`, and more.

There are two ways to apply these settings:

1. Passing them to the `run{_sync,_stream}` functions via the `model_settings` argument, which allows for fine-tuning on a per-request basis.
2. Setting them during `Agent` initialization via the `model_settings` argument; these become the defaults for all subsequent run calls on that agent, and `model_settings` provided during a specific run call override the agent's defaults.

For example, if you'd like to set the `temperature` setting to `0.0` to ensure less random behavior, you can do the following:

```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

result_sync = agent.run_sync(
    'What is the capital of Italy?', model_settings={'temperature': 0.0}
)
print(result_sync.data)
#> Rome
```
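And a minimal sketch of the second mechanism, agent-level defaults with a per-run override (the prompts here are illustrative):

```python
from pydantic_ai import Agent

# Settings passed at construction time become the agent's defaults.
agent = Agent('openai:gpt-4o', model_settings={'temperature': 0.0})

# This run uses the default temperature of 0.0...
result = agent.run_sync('What is the capital of Italy?')

# ...while settings passed to a specific run override the defaults
# for that call only.
result = agent.run_sync(
    'Suggest a name for a cat.', model_settings={'temperature': 1.0}
)
```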
#### Model specific settings

If you wish to further customize model behavior, you can use a subclass of `ModelSettings` associated with your model of choice, like `GeminiModelSettings`. For example:

```python
from pydantic_ai import Agent, UnexpectedModelBehavior
from pydantic_ai.models.gemini import GeminiModelSettings

agent = Agent('google-gla:gemini-1.5-flash')

try:
    result = agent.run_sync(
        'Write a list of 5 very rude things that I might say to the universe after stubbing my toe in the dark:',
        model_settings=GeminiModelSettings(
            temperature=0.0,  # general model settings can also be specified
            gemini_safety_settings=[
                {
                    'category': 'HARM_CATEGORY_HARASSMENT',
                    'threshold': 'BLOCK_LOW_AND_ABOVE',
                },
                {
                    'category': 'HARM_CATEGORY_HATE_SPEECH',
                    'threshold': 'BLOCK_LOW_AND_ABOVE',
                },
            ],
        ),
    )
except UnexpectedModelBehavior as e:
    # This error is raised because the safety thresholds were exceeded;
    # generally, `result` would instead contain a normal ModelResponse.
    print(e)
    """
    Safety settings triggered, body:
    """
```

## Runs vs. Conversations

An agent run might represent an entire conversation; there's no limit to how many messages can be exchanged in a single run. However, a conversation might also be composed of multiple runs, especially if you need to maintain state between separate interactions or API calls.

Here's an example of a conversation comprised of multiple runs:

conversation_example.py
```python
from pydantic_ai import Agent

agent = Agent('openai:gpt-4o')

# First run
result1 = agent.run_sync('Who was Albert Einstein?')
print(result1.data)
#> Albert Einstein was a German-born theoretical physicist.

# Second run, passing previous messages to continue the conversation; without
# message_history the model would not know who "his" refers to.
result2 = agent.run_sync(
    'What was his most famous equation?',
    message_history=result1.new_messages(),
)
print(result2.data)
#> Albert Einstein's most famous equation is (E = mc^2).
```

(This example is complete, it can be run "as is")

## Type safe by design

PydanticAI is designed to work well with static type checkers, like mypy and pyright.

**Typing is (somewhat) optional.** PydanticAI is designed to make type checking as useful as possible for you if you choose to use it, but you don't have to use types everywhere all the time. That said, because PydanticAI uses Pydantic, and Pydantic uses type hints as the definition for schema and validation, some types (specifically type hints on parameters to tools, and the `result_type` argument to `Agent`) are used at runtime. We (the library developers) have messed up if type hints are confusing you more than helping you; if you find this, please create an issue explaining what's annoying you!

In particular, agents are generic in both the type of their dependencies and the type of the results they return, so you can use type hints to ensure you're using the right types. Consider the following script with type mistakes:

type_mistakes.py
```python
from dataclasses import dataclass

from pydantic_ai import Agent, RunContext


@dataclass
class User:
    name: str


agent = Agent(
    'test',
    deps_type=User,  # the agent is defined as expecting an instance of User as deps
    result_type=bool,
)


# But here add_user_name is defined as taking a str as the dependency, not a User.
@agent.system_prompt
def add_user_name(ctx: RunContext[str]) -> str:
    return f"The user's name is {ctx.deps}."


def foobar(x: bytes) -> None:
    pass


result = agent.run_sync('Does their name start with "A"?', deps=User('Anne'))
# Since the agent is defined as returning a bool, this will raise a type
# error: foobar expects bytes.
foobar(result.data)
```

Running mypy on this gives output like the following:

```
error: Argument 1 to "system_prompt" of "Agent" has incompatible type "Callable[[RunContext[str]], str]"; expected "Callable[[RunContext[User]], str]"  [arg-type]
error: Argument 1 to "foobar" has incompatible type "bool"; expected "bytes"  [arg-type]
Found 2 errors in 1 file (checked 1 source file)
```

Running pyright would identify the same issues.

## System Prompts

System prompts might seem simple at first glance since they're just strings (or sequences of strings that are concatenated), but crafting the right system prompt is key to getting the model to behave as you want.

Generally, system prompts fall into two categories:

1. Static system prompts: known when writing the code, defined via the `system_prompt` parameter of the `Agent` constructor.
2. Dynamic system prompts: dependent on context that is only known at runtime, defined via functions decorated with `@agent.system_prompt`.

You can add both to a single agent; they're appended in the order they're defined at runtime.

Here's an example using both types of system prompts:

system_prompts.py
```python
from datetime import date

from pydantic_ai import Agent, RunContext

agent = Agent(
    'openai:gpt-4o',
    deps_type=str,  # the agent expects a string dependency
    # Static system prompt, defined at agent creation time:
    system_prompt="Use the customer's name while replying to them.",
)


# Dynamic system prompt defined via a decorator with RunContext; this is
# called just after run_sync, not when the agent is created, so it can
# benefit from runtime information such as the dependencies used on that run.
@agent.system_prompt
def add_the_users_name(ctx: RunContext[str]) -> str:
    return f"The user's name is {ctx.deps}."


# Another dynamic system prompt; system prompts don't have to have the
# RunContext parameter.
@agent.system_prompt
def add_the_date() -> str:
    return f'The date is {date.today()}.'


result = agent.run_sync('What is the date?', deps='Frank')
print(result.data)
#> Hello Frank, the date today is 2032-01-02.
```

(This example is complete, it can be run "as is")
## Reflection and self-correction

Validation errors from both function tool parameter validation and structured result validation can be passed back to the model with a request to retry. You can also raise `ModelRetry` from within a tool or result validator function to tell the model it should retry generating a response.

Here's an example:

tool_retry.py
```python
from pydantic import BaseModel

from pydantic_ai import Agent, RunContext, ModelRetry

from fake_database import DatabaseConn


class ChatResult(BaseModel):
    user_id: int
    message: str


agent = Agent(
    'openai:gpt-4o',
    deps_type=DatabaseConn,
    result_type=ChatResult,
)


@agent.tool(retries=2)
def get_user_by_name(ctx: RunContext[DatabaseConn], name: str) -> int:
    """Get a user's ID from their full name."""
    print(name)
    #> John
    #> John Doe
    user_id = ctx.deps.users.get(name=name)
    if user_id is None:
        raise ModelRetry(
            f'No user found with name {name!r}, remember to provide their full name'
        )
    return user_id


result = agent.run_sync(
    'Send a message to John Doe asking for coffee next week', deps=DatabaseConn()
)
print(result.data)
"""
user_id=123 message='Hello John, would you be free for coffee sometime next week? Let me know what works for you!'
"""
```
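The same `ModelRetry` mechanism works in result validators. Here's a hedged sketch: the `@agent.result_validator` decorator is taken from the pydantic_ai.agent API reference, and the validation rule is purely illustrative:

```python
from pydantic_ai import Agent, ModelRetry, RunContext

agent = Agent('openai:gpt-4o', result_type=str)


@agent.result_validator
async def must_cite_source(ctx: RunContext[None], result: str) -> str:
    # Illustrative rule: make the model retry until it cites a source.
    if 'source:' not in result.lower():
        raise ModelRetry('Please cite your source, prefixed with "source:".')
    return result
```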
## Model errors

If models behave unexpectedly (e.g., the retry limit is exceeded, or their API returns 503), agent runs will raise `UnexpectedModelBehavior`. In these cases, `capture_run_messages` can be used to access the messages exchanged during the run to help diagnose the issue.

agent_model_errors.py
```python
from pydantic_ai import Agent, ModelRetry, UnexpectedModelBehavior, capture_run_messages

agent = Agent('openai:gpt-4o')


# A tool that will raise ModelRetry repeatedly in this case.
@agent.tool_plain
def calc_volume(size: int) -> int:
    if size == 42:
        return size**3
    else:
        raise ModelRetry('Please try again.')


# capture_run_messages captures the messages exchanged during the run.
with capture_run_messages() as messages:
    try:
        result = agent.run_sync('Please get me the volume of a box with size 6.')
    except UnexpectedModelBehavior as e:
        print('An error occurred:', e)
        #> An error occurred: Tool exceeded max retries count of 1
        print('cause:', repr(e.__cause__))
        #> cause: ModelRetry('Please try again.')
        print('messages:', messages)
        """
        messages:
        [
            ModelRequest(
                parts=[
                    UserPromptPart(
                        content='Please get me the volume of a box with size 6.',
                        timestamp=datetime.datetime(...),
                        part_kind='user-prompt',
                    )
                ],
                kind='request',
            ),
            ModelResponse(
                parts=[
                    ToolCallPart(
                        tool_name='calc_volume',
                        args={'size': 6},
                        tool_call_id=None,
                        part_kind='tool-call',
                    )
                ],
                model_name='gpt-4o',
                timestamp=datetime.datetime(...),
                kind='response',
            ),
            ModelRequest(
                parts=[
                    RetryPromptPart(
                        content='Please try again.',
                        tool_name='calc_volume',
                        tool_call_id=None,
                        timestamp=datetime.datetime(...),
                        part_kind='retry-prompt',
                    )
                ],
                kind='request',
            ),
            ModelResponse(
                parts=[
                    ToolCallPart(
                        tool_name='calc_volume',
                        args={'size': 6},
                        tool_call_id=None,
                        part_kind='tool-call',
                    )
                ],
                model_name='gpt-4o',
                timestamp=datetime.datetime(...),
                kind='response',
            ),
        ]
        """
    else:
        print(result.data)
```

(This example is complete, it can be run "as is")
**Note:** If you call `run`, `run_sync`, or `run_stream` more than once within a single `capture_run_messages` context, `messages` will represent the messages exchanged during the first call only.
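In that situation, a minimal sketch of the workaround is one context per call, so each captured list corresponds to exactly one run (the prompts here are illustrative):

```python
from pydantic_ai import Agent, capture_run_messages

agent = Agent('openai:gpt-4o')

# One capture_run_messages context per run.
with capture_run_messages() as first_run_messages:
    agent.run_sync('What is the capital of Italy?')

with capture_run_messages() as second_run_messages:
    agent.run_sync('What is the capital of France?')
```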
