ollama/ml
Latest commit: c146a138e3 by Daniel Hiltgen, 2025-12-05 16:10:33 -08:00
ggml: handle all streams (#13350)
Follow-up to #12992.

Free all streams, and keep the alloc logic aligned across streams.
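The commit description only states the intent: free every stream, and keep allocation logic aligned across streams rather than special-casing the first one. As a rough, hedged illustration of that pattern, the Go sketch below uses hypothetical stream and device types (none of these names come from the ollama codebase or the ggml backend) to show allocation and teardown loops that treat all streams uniformly.

```go
package main

import "fmt"

// stream is a stand-in for a per-device compute stream; the real backend's
// stream type and its allocation/free entry points are not shown here.
type stream struct {
	id     int
	buffer []byte
	freed  bool
}

// device tracks every stream created for it, not just the primary one.
type device struct {
	name    string
	streams []*stream
}

// allocAll applies the same allocation logic to every stream so buffer
// sizing stays aligned across streams on a device.
func (d *device) allocAll(size int) {
	for _, s := range d.streams {
		s.buffer = make([]byte, size)
	}
}

// freeAll releases every stream, mirroring the "free all streams" intent
// from the commit description instead of freeing only the first stream.
func (d *device) freeAll() {
	for _, s := range d.streams {
		s.buffer = nil
		s.freed = true
	}
}

func main() {
	d := &device{
		name:    "gpu0",
		streams: []*stream{{id: 0}, {id: 1}, {id: 2}},
	}
	d.allocAll(1 << 10)
	d.freeAll()
	for _, s := range d.streams {
		fmt.Printf("%s stream %d freed=%v\n", d.name, s.id, s.freed)
	}
}
```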
backend     ggml: handle all streams (#13350)                      2025-12-05 16:10:33 -08:00
nn          ggml: Enable flash attention for vision encoders       2025-12-04 15:19:06 -08:00
backend.go  ggml: Enable flash attention for vision encoders       2025-12-04 15:19:06 -08:00
device.go   CUDA: filter devices on secondary discovery (#13317)   2025-12-03 12:58:16 -08:00
path.go     cpu: always ensure LibOllamaPath included (#12890)     2025-10-31 14:37:29 -07:00