Grace
a013693f80
DeepseekV3 Family Parser ( #13484 )
2025-12-16 18:56:30 -08:00
Michael Yang
f6a016f49d
revert granite-embedding ( #13505 )
2025-12-16 15:44:52 -08:00
Michael Yang
2dd029de12
remove unnecessary code ( #13502 )
...
slog is already lazily evaluated, so this code is completely redundant (see the sketch below)
2025-12-16 15:11:26 -08:00
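One mechanism behind the lazy evaluation the commit refers to is Go's slog.LogValuer interface: expensive rendering runs only when a record actually passes the level check, so explicit guards around log calls buy nothing. A minimal sketch, with hypothetical types:

```go
package main

import "log/slog"

// expensiveReport defers its rendering via slog.LogValuer: LogValue
// runs only if the record passes the handler's level check, so no
// explicit "is debug enabled" guard is needed around the call site.
type expensiveReport struct{ data []byte }

func (r expensiveReport) LogValue() slog.Value {
	// Hypothetical expensive summary, computed lazily.
	return slog.StringValue(string(r.data[:min(8, len(r.data))]))
}

func main() {
	// At the default Info level this line does no summarizing work.
	slog.Debug("cache state", "report", expensiveReport{data: []byte("0123456789")})
}
```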
Michael Yang
903b1fc97f
use ollama engine for bert models ( #13501 )
...
register bpe tokenizer, which enables granite-embedding
2025-12-16 11:29:19 -08:00
Parth Sareen
89eb795293
parsers/renderers: use think from user for nemotron ( #13492 )
2025-12-15 18:55:17 -08:00
Parth Sareen
7e3ea813c1
llama/parsers/renderers: nemotron 3 nano ( #13489 )
...
---------
Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2025-12-15 18:00:08 -08:00
Grace
7b95087b9d
Adding tool definitions to DeepseekV3 renderer ( #13491 )
2025-12-15 17:57:06 -08:00
Michael Yang
971d62595a
fix: qwen2.5 vl rope ( #13486 )
...
* qwen25vl: bump max pixels
* qwen25vl: mrope
* qwen25vl: fix qwen2.5vl window
* qwen25vl: vision rope
2025-12-15 17:30:33 -08:00
Parth Sareen
ffbe8e076d
model: add olmo3 and olmo3.1 ( #13415 )
2025-12-15 15:20:04 -08:00
Grace
2c639431b1
DeepseekV3 family renderer ( #13180 )
2025-12-15 14:50:52 -08:00
Parth Sareen
e3731fb160
renderers: add olmo3.1 and olmo3 fixes ( #13447 )
2025-12-15 11:26:43 -08:00
Jeffrey Morgan
4ff8a691bc
model: default gemma 3 rope scale to 1.0, apply corrections based on layer counts ( #13453 )
2025-12-12 17:51:56 -08:00
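A minimal sketch of defaulting a missing rope scale as the title describes; the key name and config shape here are hypothetical, not Ollama's:

```go
package main

import "fmt"

// ropeScale falls back to 1.0 when the checkpoint omits the value
// (or stores a zero), per the behavior described in the commit title.
func ropeScale(kv map[string]float32) float32 {
	if s, ok := kv["rope.scaling.factor"]; ok && s != 0 {
		return s
	}
	return 1.0
}

func main() {
	fmt.Println(ropeScale(map[string]float32{}))                         // 1
	fmt.Println(ropeScale(map[string]float32{"rope.scaling.factor": 8})) // 8
}
```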
Jeffrey Morgan
1b308e1d2a
model: fix global layer rope scale values for gemma 3 ( #13452 )
2025-12-12 16:29:01 -08:00
Jeffrey Morgan
3af5d3b738
model: force rope factor 1.0 for Gemma 3 ( #13445 )
2025-12-12 13:27:08 -08:00
Jeffrey Morgan
2dfb74410d
model: fix rotary embeddings for ministral 3 ( #13432 )
2025-12-11 16:02:05 -08:00
Jeffrey Morgan
a838421ea3
model: conversion and hyperparameter fixes for ministral and devstral ( #13424 )
2025-12-11 13:04:00 -08:00
nicole pardal
76f88caf43
nomic-embed-text:v2: model implementation ( #13162 )
2025-12-09 14:24:51 -08:00
Parth Sareen
2bccf8c624
renderers/parsers: olmo3 instruct ( #13383 )
2025-12-09 11:12:27 -08:00
Parth Sareen
0c5e5f6630
parsers/renderers: olmo3 think ( #13290 )
2025-12-09 10:41:47 -08:00
Jeffrey Morgan
d2f334c1f7
model: add rnj-1 inference support ( #13354 )
2025-12-08 16:49:17 -08:00
Michael Yang
603ceefaa6
refactor rope
...
change to a flatter directory structure and group the options with the function
update models to call rope in one place (see the sketch below)
2025-12-08 14:42:22 -08:00
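A sketch of what "group the options with the function" can look like as Go functional options; the names are illustrative, not Ollama's actual rope API:

```go
package main

import "fmt"

type ropeOpts struct {
	base  float32
	scale float32
}

// RopeOption lets callers override defaults at the call site, so the
// options live next to the one function that uses them.
type RopeOption func(*ropeOpts)

func WithBase(b float32) RopeOption  { return func(o *ropeOpts) { o.base = b } }
func WithScale(s float32) RopeOption { return func(o *ropeOpts) { o.scale = s } }

// Rope stands in for the rotary-embedding call; models invoke it in
// one place with whatever options they need.
func Rope(pos int, options ...RopeOption) ropeOpts {
	o := ropeOpts{base: 10000, scale: 1}
	for _, opt := range options {
		opt(&o)
	}
	return o
}

func main() {
	fmt.Printf("%+v\n", Rope(7, WithScale(0.5)))
}
```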
Patrick Devine
d3e0a0dee4
model: ministral w/ llama4 scaling ( #13292 )
...
This change:
* fixes rope scaling in the mistral converter
* updates ministral to include llama4 scaling
* includes a new ministral parser for parsing reasoning and tool calling
---------
Co-authored-by: jmorganca <jmorganca@gmail.com>
2025-12-01 23:20:14 -08:00
Grace
d70e935526
Parser for Cogito v2 ( #13145 )
2025-11-19 17:21:07 -08:00
Michael Yang
5c1063df7f
deepseek2: upgrade to run v3+ models ( #13166 )
...
the check for mla omits v3 and r1, which should not return unsupported.
instead, check the tokenizer for compatibility (see the sketch below)
2025-11-19 17:05:39 -08:00
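A sketch of the gating described above, keyed on tokenizer compatibility rather than an MLA check; the names and the compatibility set are hypothetical:

```go
package main

import "fmt"

// compatibleTokenizers stands in for whatever tokenizer property the
// real check inspects; the point is that v3 and r1 pass.
var compatibleTokenizers = map[string]bool{
	"deepseek-v3": true,
	"deepseek-r1": true,
}

func checkSupported(tokenizer string) error {
	if !compatibleTokenizers[tokenizer] {
		return fmt.Errorf("tokenizer %q is not supported", tokenizer)
	}
	return nil
}

func main() {
	fmt.Println(checkSupported("deepseek-v3")) // <nil>
}
```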
Patrick Devine
604e43b28d
models: enable deepseek2 (deepseek v3.1 w/ MLA) on the new engine ( #13151 )
2025-11-18 22:03:50 -08:00
Grace
91935631ac
Renderer for Cogito v2 ( #13139 )
2025-11-18 19:06:34 -08:00
nicole pardal
8de30b568a
nomic-embed-text model implementation ( #13071 )
2025-11-18 18:28:10 -08:00
Michael Yang
92981ae3f2
deepseekocr
2025-11-18 16:11:37 -08:00
Michael Yang
440a3823a6
fix(tokenizer): add special tokens to empty inputs ( #13091 )
2025-11-18 11:16:56 -08:00
Grace
584e2d646f
Add deepseek v3.1 ( #13063 )
...
* Add mla for flash attention
* Revert to using chunks
2025-11-17 18:03:21 -08:00
Michael Yang
333203d871
chore: update models to use slice/chunk/chunksections ( #12934 )
...
* use slice/chunks
* bert
* llama4
* gemma3n
* gptoss
* mistral3
* qwen3vl
* qwen25vl
* deepseek2
* remove unused ops
2025-11-13 15:20:12 -08:00
Daniel Hiltgen
544b6739dd
ggml update to b6840 ( #12791 )
2025-11-06 10:19:22 -08:00
Michael Yang
ce3eb0a315
chore(gptoss): cleanup dead code ( #12932 )
2025-11-03 11:27:15 -08:00
Michael Yang
f67a6df110
interleaved mrope ( #12807 )
...
* ml(ggml): mrope
* interleave mrope
2025-10-30 11:29:00 -07:00
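A conceptual sketch of the interleaving: rotary dimension pairs are assigned to the temporal/height/width position streams round-robin rather than in contiguous blocks. This illustrates the idea only; the real kernel lives in ggml:

```go
package main

import "fmt"

// interleavedStreams maps each rotary dimension pair to one of the
// t/h/w position streams in round-robin order.
func interleavedStreams(dimPairs int) []string {
	names := []string{"t", "h", "w"}
	out := make([]string, dimPairs)
	for i := range out {
		out[i] = names[i%len(names)]
	}
	return out
}

func main() {
	fmt.Println(interleavedStreams(8)) // [t h w t h w t h]
}
```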
Michael Yang
d432ade714
fix: qwen2.5vl, qwen3vl composite image ( #12841 )
...
this change fixes images with an alpha channel by overlaying the image
onto a white background
2025-10-30 10:33:19 -07:00
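The described fix is a standard compositing pattern; a minimal sketch with Go's image/draw (the actual code in the change may differ):

```go
package main

import (
	"image"
	"image/color"
	"image/draw"
)

// flattenAlpha composites an image onto an opaque white background,
// so transparent pixels become white instead of black.
func flattenAlpha(src image.Image) *image.RGBA {
	b := src.Bounds()
	dst := image.NewRGBA(b)
	// Fill with opaque white first...
	draw.Draw(dst, b, image.NewUniform(color.White), image.Point{}, draw.Src)
	// ...then blend the source over it using its alpha channel.
	draw.Draw(dst, b, src, b.Min, draw.Over)
	return dst
}

func main() {
	src := image.NewNRGBA(image.Rect(0, 0, 2, 2)) // fully transparent input
	_ = flattenAlpha(src)
}
```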
Grace
0a2d92081b
Removing whitespace between Thinking and Content in Qwen3VL ( #12838 )
...
Eats extra whitespace at the end/beginning of content
2025-10-29 15:14:28 -07:00
Michael Yang
7d25b9e194
feat(model): add qwen3vl ( #12665 )
2025-10-28 17:39:47 -07:00
Michael Yang
1188f408dd
s/From*Slice/From*s/ ( #12255 )
2025-10-28 12:08:49 -07:00
Michael Yang
ec9eb28f4c
gemma3: make embedding non-causal ( #12297 )
2025-10-27 19:54:08 -07:00
Jeffrey Morgan
94f110b35a
model/parsers: remove warning for missing <think> tag for qwen3-vl ( #12713 )
2025-10-20 16:03:43 -07:00
Daniel Hiltgen
bc1a818fdc
contiguous input per layer ( #12686 )
...
Co-authored-by: Michael Yang <git@mxy.ng>
2025-10-17 18:39:18 -07:00
Jeffrey Morgan
65fb3ff49d
renderers: add global flag for setting [img] tags ( #12669 )
...
Adds a temporary global flag that causes renderers to always render
images as [img]. In a follow-up change, we will consider making this
the default, and this flag could eventually be removed
2025-10-16 16:37:32 -07:00
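A sketch of such a temporary global flag; the names are illustrative, not the actual renderer API:

```go
package main

import "fmt"

// renderImgTags stands in for the temporary global described above:
// when set, every renderer emits images as a literal [img] tag.
var renderImgTags bool

func renderImage(placeholder string) string {
	if renderImgTags {
		return "[img]"
	}
	return placeholder
}

func main() {
	renderImgTags = true
	fmt.Println(renderImage("<|image|>")) // [img]
}
```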
Grace
e2a0b24435
Grace/qwen3 thinking ( #12647 )
...
* changing initial status to take prefill into consideration
* Add separate strings for content and thinking builder
* thinking tests
* remove whitespace from the string before the closing think tag (see the sketch below)
2025-10-16 15:29:41 -07:00
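A minimal sketch of the whitespace handling described in the last bullet, assuming a </think> closing tag; the function name is hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// splitThinking separates thinking from content on the closing
// </think> tag, dropping trailing whitespace before the tag and
// leading whitespace on the remaining content.
func splitThinking(s string) (thinking, content string) {
	if before, after, ok := strings.Cut(s, "</think>"); ok {
		return strings.TrimRight(before, " \t\n"), strings.TrimLeft(after, " \t\n")
	}
	return "", s
}

func main() {
	th, c := splitThinking("plan steps \n</think>\n Answer.")
	fmt.Printf("%q %q\n", th, c) // "plan steps" "Answer."
}
```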
Devon Rifkin
08fbb60bb2
qwen3-coder: support anyOf when parsing tool calls
2025-10-14 15:33:05 -07:00
Devon Rifkin
ddaca643d0
add registries for parsers/renderers
2025-10-14 01:13:54 -07:00
Grace
05982a95cb
Qwen3VL Cloud Parser and Renderer ( #12526 )
...
* working for tool calls and tools (other than tool calls being in the incorrect order)
* Tests work, other than image tags (tests do not go through server) and tools (not in the correct order, but contents are the same)
* testing for qwen3vl parser - tool parser is working
* made changes to the JSON tool parser, which wraps the ToolCallFunction with a ToolCall object (see the sketch below)
* Working parser for thinking models - assumes a state of thinking, emits unambiguous content in thinking, does not emit tool calls while thinking
* changed the parser to start with collecting content
* thinking prefill
* add hasThinkingSupport parameter to parser
* qwen3-vl -> qwen3-vl-instruct for renderer/parser
* Add hasThinkingSupport=false to QwenVLParser
---------
Co-authored-by: Devon Rifkin <drifkin@drifkin.net>
2025-10-13 16:52:33 -07:00
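A sketch of the wrapping described above, with illustrative struct names:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCallFunction is the bare payload the parser extracts; ToolCall
// is the envelope it gets wrapped in before being returned.
type ToolCallFunction struct {
	Name      string          `json:"name"`
	Arguments json.RawMessage `json:"arguments"`
}

type ToolCall struct {
	Function ToolCallFunction `json:"function"`
}

func main() {
	fn := ToolCallFunction{Name: "get_weather", Arguments: json.RawMessage(`{"city":"Oslo"}`)}
	b, _ := json.Marshal(ToolCall{Function: fn})
	fmt.Println(string(b)) // {"function":{"name":"get_weather","arguments":{"city":"Oslo"}}}
}
```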
Michael Yang
6c833d5f8d
fix(qwen3): deepseek distill
...
deepseek's qwen3 distill uses a different rope scheme, so support both
2025-10-13 13:30:30 -07:00
yajianggroup
df411c4b02
refactor: using testing.B.Loop
...
Signed-off-by: yajianggroup <yajianggroup@outlook.com>
2025-10-10 13:25:29 -07:00
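testing.B.Loop (Go 1.24+) replaces the classic b.N loop; a minimal example of the refactor the title names:

```go
package bench

import "testing"

// Before: for i := 0; i < b.N; i++ { ... }
// After (Go 1.24+): b.Loop() handles iteration counting and keeps
// the benchmarked work from being optimized away.
func BenchmarkSum(b *testing.B) {
	nums := []int{1, 2, 3, 4}
	for b.Loop() {
		total := 0
		for _, n := range nums {
			total += n
		}
		_ = total
	}
}
```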
shengxinjing
47298fce39
refactor: use builtin max and min
2025-10-09 16:17:52 -07:00
shengxinjing
4a48937ef1
refactor: use builtin max and min
2025-10-09 16:17:52 -07:00
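Go 1.21 added builtin min and max, replacing hand-rolled helpers; a minimal example of the refactor:

```go
package main

import "fmt"

// The builtins subsume helpers like:
//   func max(a, b int) int { if a > b { return a }; return b }
func main() {
	fmt.Println(max(3, 7), min(3, 7)) // 7 3
}
```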