Hub Python Library documentation

MCP Client

You are viewing v0.32.3 version. A newer version v1.0.0.rc7 is available.
Hugging Face's logo
Join the Hugging Face community

and get access to the augmented documentation experience

to get started

MCP Client

The huggingface_hub library now includes an MCPClient, designed to empower Large Language Models (LLMs) with the ability to interact with external Tools via the Model Context Protocol (MCP). This client extends an AsyncInferenceClient to seamlessly integrate Tool usage.

The MCPClient connects to MCP servers (local stdio scripts or remote http/sse services) that expose tools. It feeds these tools to an LLM (via AsyncInferenceClient). If the LLM decides to use a tool, MCPClient manages the execution request to the MCP server and relays the Tool’s output back to the LLM, often streaming results in real-time.

We also provide a higher-level Agent class. This ‘Tiny Agent’ simplifies creating conversational Agents by managing the chat loop and state, acting as a wrapper around MCPClient.

MCP Client

class huggingface_hub.MCPClient

< >

( model: typing.Optional[str] = None provider: typing.Union[typing.Literal['black-forest-labs', 'cerebras', 'cohere', 'fal-ai', 'fireworks-ai', 'hf-inference', 'hyperbolic', 'nebius', 'novita', 'nscale', 'openai', 'replicate', 'sambanova', 'together'], typing.Literal['auto'], NoneType] = None base_url: typing.Optional[str] = None api_key: typing.Optional[str] = None )

Parameters

  • model (str, optional) — The model to run inference with. Can be a model id hosted on the Hugging Face Hub, e.g. meta-llama/Meta-Llama-3-8B-Instruct or a URL to a deployed Inference Endpoint or other local or remote endpoint.
  • provider (str, optional) — Name of the provider to use for inference. Defaults to “auto” i.e. the first of the providers available for the model, sorted by the user’s order in https://hf.co/settings/inference-providers. If model is a URL or base_url is passed, then provider is not used.
  • base_url (str, optional) — The base URL to run inference. Defaults to None.
  • api_key (str, optional) — Token to use for authentication. Will default to the locally Hugging Face saved token if not provided. You can also use your own provider API key to interact directly with the provider’s service.

Client for connecting to one or more MCP servers and processing chat completions with tools.

This class is experimental and might be subject to breaking changes in the future without prior notice.

add_mcp_server

< >

( type: typing.Literal['stdio', 'sse', 'http'] **params: typing.Any )

Parameters

  • type (str) — Type of the server to connect to. Can be one of:
    • “stdio”: Standard input/output server (local)
    • “sse”: Server-sent events (SSE) server
    • “http”: StreamableHTTP server
  • **params (Dict[str, Any]) — Server parameters that can be either:
    • For stdio servers:
      • command (str): The command to run the MCP server
      • args (List[str], optional): Arguments for the command
      • env (Dict[str, str], optional): Environment variables for the command
      • cwd (Union[str, Path, None], optional): Working directory for the command
    • For SSE servers:
      • url (str): The URL of the SSE server
      • headers (Dict[str, Any], optional): Headers for the SSE connection
      • timeout (float, optional): Connection timeout
      • sse_read_timeout (float, optional): SSE read timeout
    • For StreamableHTTP servers:
      • url (str): The URL of the StreamableHTTP server
      • headers (Dict[str, Any], optional): Headers for the StreamableHTTP connection
      • timeout (timedelta, optional): Connection timeout
      • sse_read_timeout (timedelta, optional): SSE read timeout
      • terminate_on_close (bool, optional): Whether to terminate on close

Connect to an MCP server

cleanup

< >

( )

Clean up resources

process_single_turn_with_tools

< >

( messages: typing.List[typing.Union[typing.Dict, huggingface_hub.inference._generated.types.chat_completion.ChatCompletionInputMessage]] exit_loop_tools: typing.Optional[typing.List[huggingface_hub.inference._generated.types.chat_completion.ChatCompletionInputTool]] = None exit_if_first_chunk_no_tool: bool = False )

Parameters

  • messages (List[Dict]) — List of message objects representing the conversation history
  • exit_loop_tools (List[ChatCompletionInputTool], optional) — List of tools that should exit the generator when called
  • exit_if_first_chunk_no_tool (bool, optional) — Exit if no tool is present in the first chunks. Default to False.

Process a query using self.model and available tools, yielding chunks and tool outputs.

Agent

class huggingface_hub.Agent

< >

( model: Optional[str] = None servers: Iterable[Dict] provider: Optional[PROVIDER_OR_POLICY_T] = None base_url: Optional[str] = None api_key: Optional[str] = None prompt: Optional[str] = None )

Parameters

  • model (str, optional) — The model to run inference with. Can be a model id hosted on the Hugging Face Hub, e.g. meta-llama/Meta-Llama-3-8B-Instruct or a URL to a deployed Inference Endpoint or other local or remote endpoint.
  • servers (Iterable[Dict]) — MCP servers to connect to. Each server is a dictionary containing a type key and a config key. The type key can be "stdio" or "sse", and the config key is a dictionary of arguments for the server.
  • provider (str, optional) — Name of the provider to use for inference. Defaults to “auto” i.e. the first of the providers available for the model, sorted by the user’s order in https://hf.co/settings/inference-providers. If model is a URL or base_url is passed, then provider is not used.
  • base_url (str, optional) — The base URL to run inference. Defaults to None.
  • api_key (str, optional) — Token to use for authentication. Will default to the locally Hugging Face saved token if not provided. You can also use your own provider API key to interact directly with the provider’s service.
  • prompt (str, optional) — The system prompt to use for the agent. Defaults to the default system prompt in constants.py.

Implementation of a Simple Agent, which is a simple while loop built right on top of an MCPClient.

This class is experimental and might be subject to breaking changes in the future without prior notice.

run

< >

( user_input: str abort_event: Optional[asyncio.Event] = None )

Parameters

  • user_input (str) — The user input to run the agent with.
  • abort_event (asyncio.Event, optional) — An event that can be used to abort the agent. If the event is set, the agent will stop running.

Run the agent with the given user input.

< > Update on GitHub