api

gradio.api(···)

Description

Sets up an API or MCP endpoint for a generic function without needing define events listeners or components. Derives its typing from type hints in the provided function's signature rather than the components.

Example Usage

import gradio as gr
with gr.Blocks() as demo:
    with gr.Row():
        input = gr.Textbox()
        button = gr.Button("Submit")
    output = gr.Textbox()
    def fn(a: int, b: int, c: list[str]) -> tuple[int, str]:
        return a + b, c[a:b]
    gr.api(fn, api_name="add_and_slice")
_, url, _ = demo.launch()

from gradio_client import Client
client = Client(url)
result = client.predict(
        a=3,
        b=5,
        c=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        api_name="/add_and_slice"
)
print(result)

Initialization

🔗

fn: Callable | Literal['decorator']

default = "decorator"

the function to call when this event is triggered. Often a machine learning model's prediction function. The function should be fully typed, and the type hints will be used to derive the typing information for the API/MCP endpoint.

🔗

api_name: str | None

default = None

defines how the endpoint appears in the API docs. Can be a string or None. If set to a string, the endpoint will be exposed in the API docs with the given name. If None (default), the name of the function will be used as the API endpoint.

🔗

api_description: str | None

default = None

Description of the API endpoint. Can be a string, None, or False. If set to a string, the endpoint will be exposed in the API docs with the given description. If None, the function's docstring will be used as the API endpoint description. If False, then no description will be displayed in the API docs.

🔗

queue: bool

default = True

If True, will place the request on the queue, if the queue has been enabled. If False, will not put this event on the queue, even if the queue has been enabled. If None, will use the queue setting of the gradio app.

🔗

batch: bool

default = False

If True, then the function should process a batch of inputs, meaning that it should accept a list of input values for each parameter. The lists should be of equal length (and be up to length `max_batch_size`). The function is then *required* to return a tuple of lists (even if there is only 1 output component), with each list in the tuple corresponding to one output component.

🔗

max_batch_size: int

default = 4

Maximum number of inputs to batch together if this is called from the queue (only relevant if batch=True)

🔗

concurrency_limit: int | None | Literal['default']

default = "default"

If set, this is the maximum number of this event that can be running simultaneously. Can be set to None to mean no concurrency_limit (any number of this event can be running simultaneously). Set to "default" to use the default concurrency limit (defined by the `default_concurrency_limit` parameter in `Blocks.queue()`, which itself is 1 by default).

🔗

concurrency_id: str | None

default = None

If set, this is the id of the concurrency group. Events with the same concurrency_id will be limited by the lowest set concurrency_limit.

🔗

api_visibility: Literal['public', 'private', 'undocumented']

default = "public"

controls the visibility and accessibility of this endpoint. Can be "public" (shown in API docs and callable by clients), "private" (hidden from API docs and not callable by clients), or "undocumented" (hidden from API docs but callable by clients and via gr.load). If fn is None, api_visibility will automatically be set to "private".

🔗