
# Tiny Python Agents: Harnessing MCP for Tool Integration in 70 Lines of Code


Inspired by the concept of Tiny Agents in JavaScript, this project ports and extends the idea to Python, building an MCP (Model Context Protocol) client on top of the Hugging Face hub client SDK. MCP standardizes how Large Language Models (LLMs) interact with external tools and APIs, significantly reducing the complexity of integrating new functionality into an LLM. This article walks through setting up and using Tiny Agents in Python, demonstrating them on web-browsing and image-generation tasks.

## How to Run the Demo

First, install the latest version of `huggingface_hub` with the `mcp` extra:

```sh
pip install "huggingface_hub[mcp]>=0.32.0"
```

You can then run an agent from the CLI. If you don't provide a path to a specific agent configuration, it defaults to the Qwen/Qwen2.5-72B-Instruct model served via the Nebius inference provider, connected to a Playwright MCP server. Agent configs can also be loaded from the Hugging Face Hub's tiny-agents/tiny-agents dataset. Here's an example of a web-browsing agent:

```sh
tiny-agents run --path tiny-agents/tiny-agents/web_browser_agent
```

When you run this command, the agent loads, lists the tools it discovers from its connected MCP servers, and waits for a prompt, for instance:

> do a Web Search for HF inference providers on Brave Search and open the first result and then give me the list of the inference providers supported on Hugging Face

Alternatively, Gradio Spaces can be used as MCP servers. For example, you can connect to an image-generation HF Space:

```sh
tiny-agents run --path tiny-agents/tiny-agents/image_generator_agent
```

and use a prompt like:

> Generate a 1024x1024 image of a tiny astronaut hatching from an egg on the surface of the moon.

## Agent Configuration

The behavior of each agent is defined by an `agent.json` file, which specifies the LLM model, the inference provider, and the MCP servers to connect to. Here's a sample configuration:

```json
{
  "model": "Qwen/Qwen2.5-72B-Instruct",
  "provider": "nebius",
  "servers": [
    {
      "type": "stdio",
      "config": {
        "command": "npx",
        "args": ["@playwright/mcp@latest"]
      }
    }
  ]
}
```

An accompanying `PROMPT.md` file can provide a more detailed system prompt. This setup gives agents access to a variety of tools, including web browsers and image generators, through a standardized protocol.

## LLMs Can Use Tools

Modern LLMs are designed to call functions, which lets users build applications for specific tasks. Each function has a schema that defines its purpose and expected input arguments; the agent orchestrates the execution of these functions and feeds the results back to the LLM. For example, a tool to get the weather might look like this:

```json
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get current temperature for a given location.",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "City and country e.g. Paris, France"
        }
      },
      "required": ["location"]
    }
  }
}
```

The inference client communicates with LLMs using the OpenAI Chat Completions API standard, ensuring compatibility and ease of use.
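Because of that compatibility, a single tool-calling round trip can be sketched with `huggingface_hub`'s `InferenceClient`. This is a minimal sketch, not part of the Tiny Agents code itself: it assumes a valid Hugging Face token is configured, and the model/provider choices simply mirror the defaults above.

```python
from huggingface_hub import InferenceClient

client = InferenceClient(model="Qwen/Qwen2.5-72B-Instruct", provider="nebius")

# The get_weather schema from above, passed verbatim as a tool definition.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current temperature for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country e.g. Paris, France",
                }
            },
            "required": ["location"],
        },
    },
}

response = client.chat_completion(
    messages=[{"role": "user", "content": "What's the weather in Paris, France?"}],
    tools=[weather_tool],
    tool_choice="auto",
)

# Instead of free text, the model is expected to answer with a structured
# tool call, which the caller must execute and feed back to the model.
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)
```

An agent is essentially this round trip in a loop: execute the requested function, append its result to the conversation as a `tool` message, and call the model again.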
## Building the Python MCP Client

The `MCPClient` is at the core of the tool-use functionality and is now integrated into `huggingface_hub`. It uses `AsyncInferenceClient` to communicate with LLMs asynchronously. Its `add_mcp_server` method establishes connections to the different server types (`stdio`, `sse`, `http`) and populates the list of available tools.

### Key Responsibilities of the MCPClient

- **Establishing the connection:** the method sets up the transport based on the server type (local or remote).
- **Creating an MCP `ClientSession`:** an asynchronous session is created to handle communication with the MCP server.
- **Listing tools:** the client requests the list of available tools from the server and stores them for later use.

### Using the Tools: Streaming and Processing

The `process_single_turn_with_tools` method handles one interaction with the LLM. It prepares the list of available tools, including special "exit loop" tools for agent control, and sends a streaming request to the LLM. As the LLM generates its response, each chunk is processed and yielded to the caller; the method also reconstructs complete tool-call requests from the streamed deltas and processes their results.

### Executing Tools

Once the LLM requests a tool call, the `MCPClient` processes it. If an "exit loop" tool is called, the method yields a message and terminates. Otherwise, it finds the MCP session that owns the tool, executes the call, formats the result, and adds it to the conversation history.

## Our Tiny Python Agent: It's (Almost) Just a Loop!

The `Agent` class, which inherits from `MCPClient`, manages the conversational loop. Initialization sets up the conversation history, and the `load_tools` method connects to the configured MCP servers.

### Initializing the Agent

The `Agent` constructor takes the model, provider, server configurations, and system prompt. It initializes the conversation history and loads the tools:

```python
class Agent(MCPClient):
    def __init__(
        self,
        *,
        model: str,
        servers: Iterable[Dict],
        provider: Optional[PROVIDER_OR_POLICY_T] = None,
        api_key: Optional[str] = None,
        prompt: Optional[str] = None,
    ):
        super().__init__(model=model, provider=provider, api_key=api_key)
        self._servers_cfg = list(servers)
        # Seed the conversation with the system prompt.
        self.messages = [
            {"role": "system", "content": prompt or DEFAULT_SYSTEM_PROMPT}
        ]

    async def load_tools(self) -> None:
        # Connect to every configured MCP server and discover its tools.
        for cfg in self._servers_cfg:
            await self.add_mcp_server(cfg["type"], **cfg["config"])
```

### The Agent's Core: The Loop

The `run` method processes user input asynchronously. It repeatedly calls `process_single_turn_with_tools`, managing the conversation turns and exit conditions (shown here abridged):

```python
async def run(
    self, user_input: str, *, abort_event: Optional[asyncio.Event] = None, ...
) -> AsyncGenerator[...]:
    # (setup omitted: append the user message, initialize num_turns, etc.)
    while True:
        async for item in self.process_single_turn_with_tools(self.messages, ...):
            yield item

        last = self.messages[-1]

        # Exit if the LLM invoked one of the special "exit loop" control tools.
        if last.get("role") == "tool" and last.get("name") in {t.function.name for t in EXIT_LOOP_TOOLS}:
            return
        # Safety valve: stop after too many turns.
        if last.get("role") != "tool" and num_turns > MAX_NUM_TURNS:
            return
        # Stop once the LLM answers in plain text instead of requesting a tool.
        if last.get("role") != "tool" and next_turn_should_call_tools:
            return

        next_turn_should_call_tools = last.get("role") != "tool"
```

This loop keeps the agent processing user queries until the task is completed or an exit condition is met.
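For reference, the "exit loop" tools checked above are ordinary function schemas with no real implementation: when the model calls one, the agent intercepts it and stops the loop instead of executing anything. A hypothetical minimal definition is sketched below; the actual constants ship inside `huggingface_hub`, and the exact names may differ.

```python
# Hypothetical sketch of an exit-loop control tool. It looks like any other
# tool to the LLM, but the Agent treats a call to it as a stop signal.
task_complete_tool = {
    "type": "function",
    "function": {
        "name": "task_complete",
        "description": "Call this tool when the task given by the user is complete.",
        "parameters": {"type": "object", "properties": {}},
    },
}
```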
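Putting it all together, here is a minimal sketch of driving the `Agent` end to end. The configuration mirrors the `agent.json` shown earlier; the import path and the exact shape of the streamed chunks are assumptions that may vary between `huggingface_hub` versions.

```python
import asyncio

from huggingface_hub import Agent  # assumed export; may vary by version


async def main():
    agent = Agent(
        model="Qwen/Qwen2.5-72B-Instruct",
        provider="nebius",
        servers=[
            {
                "type": "stdio",
                "config": {"command": "npx", "args": ["@playwright/mcp@latest"]},
            }
        ],
    )
    # Connect to the MCP servers and discover their tools before chatting.
    await agent.load_tools()

    # Stream text deltas and tool results as the agent works on the task.
    async for chunk in agent.run("Open https://huggingface.co and summarize the homepage."):
        print(chunk)


asyncio.run(main())
```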
## Next Steps

The possibilities with the MCP client and Tiny Agents are vast. You can extend their capabilities by adding more tools, creating new agent configurations, or contributing to the open-source project. Pull requests and contributions are welcome, allowing the community to enrich and enhance these tools further.

## Industry Evaluation and Company Profiles

Industry experts have praised the MCP protocol for simplifying the integration between LLMs and external tools, making the development of intelligent agents more accessible. The Hugging Face platform, known for its extensive library of models and datasets, has taken a significant step forward by incorporating MCP into its hub client SDK. This move not only accelerates the adoption of advanced AI applications but also fosters innovation by letting developers focus on creating value rather than wrestling with complex integrations. The simplicity and flexibility of Tiny Agents make them a valuable addition to the Python ecosystem for anyone looking to harness the power of LLMs.
