Commit Graph

8 Commits

Author SHA1 Message Date
Parth Sareen c114987523
logprob: add bytes to logprobs (#13068) 2025-11-13 13:49:25 -08:00
Baptiste Jamin 59241c5bee
server: add logprobs and top_logprobs support to Ollama's API (#12899)
Adds logprobs support to Ollama's API including support for Ollama's
OpenAI-compatible API. By specifying the new 'logprobs' boolean parameter
in the API, Ollama will return the log probabilities for each token generated.
'top_logprobs', an integer value can also be specified up to the value 20.
When specified, the API will also provide the number of most likely tokens to
return at each token position

Co-authored-by: Baptiste Jamin <baptiste@crisp.chat>
2025-11-11 08:49:50 -08:00
nicole pardal 7dd4862a89
embeddings: removed redundant TestAPIEmbeddings test (#12863)
This PR removes a redundant test from TestAPIEmbeddings
Contents of this test already exists in embed_test.go and model_arch_test.go
2025-10-30 17:12:33 -07:00
Patrick Devine 76eb7d0fff
testing: test more models with tool calling (#12867) 2025-10-30 13:19:21 -07:00
Daniel Hiltgen c23e6f4cae
tests: add single threaded history test (#12295)
* tests: add single threaded history test

Also tidies up some existing tests to handle more model output variation

* test: add support for testing specific architectures
2025-09-22 11:23:14 -07:00
Parth Sareen 20b53eaa72
tests: add tool calling integration test (#12232) 2025-09-09 14:01:11 -07:00
Daniel Hiltgen 517807cdf2
perf: build graph for next batch async to keep GPU busy (#11863)
* perf: build graph for next batch in parallel to keep GPU busy

This refactors the main run loop of the ollama runner to perform the main GPU
intensive tasks (Compute+Floats) in a go routine so we can prepare the next
batch in parallel to reduce the amount of time the GPU stalls waiting for the
next batch of work.

* tests: tune integration tests for ollama engine

This tunes the integration tests to focus more on models supported
by the new engine.
2025-08-29 14:20:28 -07:00
Daniel Hiltgen ed4e139314
Integration test improvements (#9654)
Add some new test coverage for various model architectures,
and switch from orca-mini to the small llama model.
2025-04-16 14:25:55 -07:00