skip to content
TJ Miller

Sparkle: Streaming with Tools

/ 3 min read

Streaming chat with tools has been one of the features I’ve been chasing for a while now and I’ve FINALLY got it working… at least for OpenAI (Anthropic and Ollama next). One of the biggest challenges was that OpenAI chat streaming tool responses spread the function arguments across multiple JSON objects.

In the example response below you can see the arguments for a weather tool being split as partial JSON objects that I had to concatenate together to pass to the function.

Example tool stream response
data: {"id":"chatcmpl-9xEIYwtItHYY0VfVb4pqDEaDnoacr","object":"chat.completion.chunk","created":1723903342,"model":"gpt-4-0613","choices":[{"index":0,"delta":{"role":"assistant","tool_calls":[{"id":"call_uNvzUajPsJAH0mGzETpJqgUh","type":"function","function":{"name":"weather","arguments":""}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-9xEIYwtItHYY0VfVb4pqDEaDnoacr","object":"chat.completion.chunk","created":1723903342,"model":"gpt-4-0613","choices":[{"index":0,"delta":{"tool_calls":[{"function":{"arguments":"{\n"}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-9xEIYwtItHYY0VfVb4pqDEaDnoacr","object":"chat.completion.chunk","created":1723903342,"model":"gpt-4-0613","choices":[{"index":0,"delta":{"tool_calls":[{"function":{"arguments":" "}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-9xEIYwtItHYY0VfVb4pqDEaDnoacr","object":"chat.completion.chunk","created":1723903342,"model":"gpt-4-0613","choices":[{"index":0,"delta":{"tool_calls":[{"function":{"arguments":" \""}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-9xEIYwtItHYY0VfVb4pqDEaDnoacr","object":"chat.completion.chunk","created":1723903342,"model":"gpt-4-0613","choices":[{"index":0,"delta":{"tool_calls":[{"function":{"arguments":"city"}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-9xEIYwtItHYY0VfVb4pqDEaDnoacr","object":"chat.completion.chunk","created":1723903342,"model":"gpt-4-0613","choices":[{"index":0,"delta":{"tool_calls":[{"function":{"arguments":"\":"}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-9xEIYwtItHYY0VfVb4pqDEaDnoacr","object":"chat.completion.chunk","created":1723903342,"model":"gpt-4-0613","choices":[{"index":0,"delta":{"tool_calls":[{"function":{"arguments":" \""}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-9xEIYwtItHYY0VfVb4pqDEaDnoacr","object":"chat.completion.chunk","created":1723903342,"model":"gpt-4-0613","choices":[{"index":0,"delta":{"tool_calls":[{"function":{"arguments":"Detroit"}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-9xEIYwtItHYY0VfVb4pqDEaDnoacr","object":"chat.completion.chunk","created":1723903342,"model":"gpt-4-0613","choices":[{"index":0,"delta":{"tool_calls":[{"function":{"arguments":"\"\n"}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-9xEIYwtItHYY0VfVb4pqDEaDnoacr","object":"chat.completion.chunk","created":1723903342,"model":"gpt-4-0613","choices":[{"index":0,"delta":{"tool_calls":[{"function":{"arguments":"}"}}]},"finish_reason":null}]}
data: {"id":"chatcmpl-9xEIYwtItHYY0VfVb4pqDEaDnoacr","object":"chat.completion.chunk","created":1723903342,"model":"gpt-4-0613","choices":[{"index":0,"delta":[],"finish_reason":"tool_calls"}]}
data: [DONE]

Let’s look at what this looks like when we put it together!

First, we create the agent and define the tools. We are defining two tools, a weather tool and a search tool. To keep things simple I am simulating the response.

app/Agents/MultiToolAgent.php
<?php
class MultiToolAgent
{
public function __invoke(): Agent
{
return Agent::create(
llm: OpenAI::create(
Model::GPT4,
[
'top_p' => 1,
'temperature' => 0.8,
'max_tokens' => 2048,
]
),
prompt: $this->prompt(),
tools: $this->tools(),
);
}
private function prompt(): string
{
$prompt = "MODEL ADOPTS ROLE of [PERSONA: Nyx the Cthulhu]! \r\n";
$prompt .= "Nyx is the cutest, most friendly, Cthulhu around. \r\n";
$prompt .= 'The current datetime is '.now()->toDateTimeString();
return $prompt;
}
/** @return array<int, Tool> */
public function tools(): array
{
return [
Tool::create(
'search',
'useful when you need to search for current events',
[
'query' => [
'type' => 'string',
'description' => 'the search query string',
],
],
function (string $query): string {
// simulate API request
sleep(3);
return 'The tigers game is at 3pm eastern in Detroit';
}
),
Tool::create(
'weather',
'useful when you need to search for current weather conditions',
[
'city' => [
'type' => 'string',
'description' => 'The city that you want the weather for',
],
'datetime' => [
'type' => 'string',
'description' => 'the datetime for the weather conditions. format 2022-08-14 20:24:38',
],
],
function (string $city, string $datetime): string {
// simulate API request
sleep(3);
return 'The weather will be 75° and sunny';
}
),
];
}
}

Next, we have to register the agent with Sparkle Server.

app/Providers/AppServiceProvider.php
<?php
class AppServiceProvider extends ServiceProvider
{
public function register(): void
{
SparkleServer::register(
'agent-with-tools',
new MultiToolAgent
);
}
}

Finally, we start our application server php artisan serve.

You can see the chat in real-time use Open WebUI:

When we ask for an introduction there are no requests for tools, because there isn’t a reason to use one. The agent will choose what tools and when to use them. In subsequent messages, we ask the agent questions that involve using the tools we defined.

For easier viewing here are the results as static images: chat observer

I’m super stoked on this, now its time for tests and refactors!

I couldn’t leave ya’ll hanging, here’s a picture of Nyx: Nyx

Subscribe

For more thoughts, links, project and personal updates.