ollama/ml
Daniel Hiltgen a4770107a6
vulkan: enable flash attention (#12937)
Also adjusts the Vulkan Windows build pattern to match recent changes in other backends,
so incremental builds are faster.
2025-11-04 10:31:22 -08:00
Name        Last commit                                                     Date
backend     ggml: Increase maximum graph size                               2025-11-03 16:05:37 -08:00
nn          interleaved mrope (#12807)                                      2025-10-30 11:29:00 -07:00
backend.go  ggml: Enable op_offload to improve partial offload performance  2025-10-30 13:53:10 -07:00
device.go   vulkan: enable flash attention (#12937)                         2025-11-04 10:31:22 -08:00
path.go     cpu: always ensure LibOllamaPath included (#12890)              2025-10-31 14:37:29 -07:00