Do not leave out --tool-server when you want to use MCP tools with the Responses API. The flag is required to register your MCP tool server(s) with vLLM at startup, so the backend knows how to route the tool calls specified in your JSON payload: each MCP tool definition in a request (such as {"type": "mcp", ...}) is matched against the tool servers registered via --tool-server at launch. If you omit the flag, vLLM cannot connect to or invoke the MCP tools you specify, and tool calls will fail or be silently ignored. This behavior is confirmed by the vllm/entrypoints/tool_server.py implementation and the official documentation.
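To make the pairing concrete, here is a minimal sketch of a Responses API request body whose tools entry refers to an MCP server registered at launch. The model name, server label, and the comment's launch command are illustrative assumptions, not exact vLLM invocations; check your deployment for the real values.

```python
# Sketch: a Responses API payload with one MCP tool entry. The entry is
# matched against the tool server(s) vLLM registered at startup, e.g. via
#   vllm serve <model> --tool-server <host:port>   (illustrative)

def mcp_payload(model: str, prompt: str, server_label: str) -> dict:
    """Build a Responses API request body that enables one MCP tool server."""
    return {
        "model": model,
        "input": prompt,
        "tools": [
            {
                "type": "mcp",                 # routed to a registered MCP server
                "server_label": server_label,  # must correspond to a server
                                               # registered via --tool-server
            }
        ],
    }

payload = mcp_payload("my-model", "What is 2 + 2?", "my-tools")
```

If no registered server corresponds to the tool entry, this is exactly the failure mode described above: the request is well-formed JSON, but the backend has nowhere to route the call.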
Regarding environment variables: MCP tools require no special environment variables in general, unless your specific tool server needs authentication (e.g., API keys) or other configuration. For built-in tools like the Python interpreter, you may need to set PYTHON_EXECUTION_BACKEND=dangerously_use_uv to avoid Docker-based execution, as described in the gpt-oss tool use guide. For MCP tools, just make sure your tool server is running and reachable at the address you pass to --tool-server.
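For example, if you launch vLLM from a wrapper script, the interpreter-backend setting can be injected into the child process environment like this. The variable name comes from the gpt-oss tool use guide; the commented-out launch command is an illustrative assumption.

```python
import os

# Sketch: prepare an environment for a vLLM launch that runs the built-in
# Python tool without Docker. Only PYTHON_EXECUTION_BACKEND is from the
# gpt-oss guide; the launch command below is illustrative.
env = os.environ.copy()
env["PYTHON_EXECUTION_BACKEND"] = "dangerously_use_uv"  # skip Docker-based execution

# import subprocess
# subprocess.run(["vllm", "serve", "<model>", "--tool-server", "<host:port>"], env=env)
```

As the backend name suggests, this trades Docker's sandboxing for convenience, so treat it as a development-only setting.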
Yes, the mcp Python package (installed via pip install mcp) is required for MCP tool support in vLLM. If it is missing, vLLM raises an ImportError instructing you to install it, as shown in the MCPToolServer class in vllm/entrypoints/tool_server.py and the official API docs.
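The guard that MCPToolServer effectively performs can be sketched as a fail-fast optional-import check. The helper names and message wording below are illustrative, not vLLM's exact code.

```python
import importlib.util

def require_package(name: str) -> None:
    """Raise an actionable ImportError if a package is not importable."""
    if importlib.util.find_spec(name) is None:
        raise ImportError(
            f"The '{name}' package is required; install it with `pip install {name}`."
        )

def require_mcp() -> None:
    # Mirrors the check vLLM performs before constructing an MCP tool server.
    require_package("mcp")
```

Failing at construction time, rather than on the first tool call, surfaces the missing dependency immediately at server startup.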