LLMs natively output unstructured text. Tool Calling (originally introduced as Function Calling by OpenAI) is the process of providing an LLM with a schema of available tools (like a weather API or a database query function). The model evaluates the user's prompt and determines whether one of the tools is needed to fulfill the request; if so, instead of replying with conversational text, it halts generation and outputs a structured JSON object matching the tool's expected arguments. The application executes the external code with those arguments and feeds the result back into the model, which then generates the final human-readable response.
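Tools are typically described with JSON Schema. As a minimal sketch, a weather tool in the OpenAI-style `function` wrapper might look like the following (the exact envelope varies by provider, and the field names here follow one common convention):

```python
# Hypothetical tool definition in an OpenAI-style "function" format;
# other providers accept similar but not identical shapes.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name, e.g. Seattle",
                },
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    },
}
```

The `description` fields matter: they are the only signal the model has for deciding when this tool applies and how to fill its arguments.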
How It Works
- Schema Injection: The developer passes a JSON Schema outlining the available tools (e.g., `get_weather(location, unit)`) alongside the system prompt.
- Model Decision: The LLM determines that it cannot answer the user's query from its parameters alone and chooses to invoke a tool.
- JSON Output: The model emits a special tool-call token, followed by the arguments as JSON, e.g. `{"location": "Seattle", "unit": "celsius"}`.
- Execution: The developer's backend intercepts this, runs the actual Python code, and appends the result (e.g., `15°C`) to the message history for the LLM to read.
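The steps above can be sketched as a loop. In this illustration the model is faked with a stub function so the example is self-contained; in a real system the tool-call JSON would come from the LLM provider's API, and the message roles shown (`user`, `assistant`, `tool`) are one common convention:

```python
import json

def get_weather(location: str, unit: str = "celsius") -> str:
    # Stand-in for a real weather API call.
    return f"15°{'C' if unit == 'celsius' else 'F'} in {location}"

TOOLS = {"get_weather": get_weather}

def fake_model(messages):
    """Stub LLM: requests a tool call on the first turn, then answers
    in plain text once a tool result appears in the history."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if tool_results:
        return {"role": "assistant",
                "content": f"It is currently {tool_results[-1]['content']}."}
    return {
        "role": "assistant",
        "content": None,
        "tool_call": {
            "name": "get_weather",
            "arguments": json.dumps({"location": "Seattle", "unit": "celsius"}),
        },
    }

messages = [{"role": "user", "content": "What's the weather in Seattle?"}]
reply = fake_model(messages)
while reply.get("tool_call"):                 # model halted to request a tool
    call = reply["tool_call"]
    args = json.loads(call["arguments"])      # parse the structured arguments
    result = TOOLS[call["name"]](**args)      # execute the actual code
    messages += [reply, {"role": "tool", "content": result}]
    reply = fake_model(messages)              # feed the result back to the model

print(reply["content"])
```

The dispatch table (`TOOLS`) is the key safety boundary: the model only ever names a tool and supplies arguments; the backend decides what code actually runs.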
Common Use Cases
- Empowering autonomous agents to interact with the real world (sending emails, modifying databases).
- Fetching real-time data that is absent from an LLM's training data (stock prices, current news).