Commit Graph

564 Commits

Author SHA1 Message Date
Georgi Gerganov b12abefa9b sync : llama.cpp 2025-11-17 21:05:46 +02:00
KITAITI Makoto 27f485a14c
vad : Silero VAD v6.2.0 (#3524)
* Add ggml-silero-v6.2.0 to download candidates

* Make default VAD model ggml-silero-v6.2.0

* Make VAD model in documentations ggml-silero-v6.2.0
2025-11-17 22:26:17 +09:00
Georgi Gerganov a1867e0dad sync : llama.cpp 2025-11-09 23:38:03 +02:00
Orel-A f16c12f3f5
wasm : fix Hebrew ID (#3487)
whisper_lang_id: unknown language 'iw'
2025-10-27 08:49:32 +02:00
Georgi Gerganov 322c2adb75 talk-llama : sync llama.cpp 2025-10-22 12:58:11 +03:00
Georgi Gerganov 23c19308d8
server : set no_context == true (#3482) 2025-10-20 15:39:48 +03:00
Georgi Gerganov 8ba3c13b0c talk-llama : sync llama.cpp 2025-10-15 09:29:17 +03:00
Georgi Gerganov ff4c1a5a53 talk-llama : sync llama.cpp 2025-10-12 11:16:23 +03:00
Andreas Lubbe 85871a9469
whisper : add support for --carry-initial-prompt (#3395)
* Add support for --carry-initial-prompt

* PR fixes for ruby and go

* Refactoring for readability

* WIP 1

* WIP 2

* PR fixes

* More PR fixes

* PR fix

* Further simplification

* d'oh

* One more logic fix

* Update src/whisper.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Truncate prompt_past0 upon initialization

* Slight simplification

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-10-10 19:51:15 +03:00
Andreas Lubbe a0ca50f3b9
cli: Fix assignment for vad_min_silence_duration_ms (#3467)
* cli: Fix assignment for vad_min_silence_duration_ms

Found and fixed this simple copy/paste error

* server : fix vad_min_silence_duration_ms assignment

---------

Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2025-10-10 15:21:03 +02:00
Georgi Gerganov 8a67c55c8a
wchess : fix link [no ci] 2025-09-30 21:28:03 +03:00
Daniel Bevenius 5904d00dbb
examples : add wchess.wasm to wasm examples build (#3443)
* examples : add wchess.wasm to wasm examples build

This commit add the wchess.wasm example to the wasm examples that are
deployed to https://ggml.ai/whisper.cpp.

Refs: https://github.com/ggml-org/whisper.cpp/issues/3434#issuecomment-3346980420
2025-09-30 16:23:01 +02:00
Georgi Gerganov 0b3587acdd
whisper : enable flash attention by default (#3441) 2025-09-30 15:47:20 +03:00
Georgi Gerganov a77d11d91e
bench : warm-up all kernels (#3438) 2025-09-29 17:27:53 +03:00
Georgi Gerganov fcf0181ee2
talk-llama : sync llama.cpp 2025-09-29 15:18:41 +03:00
Georgi Gerganov 36778bd8b8
talk-llama : sync llama.cpp 2025-09-20 13:58:28 +03:00
Georgi Gerganov fc45bb8625 talk-llama : sync llama.cpp
ggml-ci
2025-08-18 20:30:45 +03:00
Georgi Gerganov 7fd2fbde45 common : handle mxfp4 enum
ggml-ci
2025-08-18 20:30:45 +03:00
Daniel Bevenius 040510a132
node : add win platform check for require path (#3363)
This commit adds a check to the platform in use and adjust the path to
the addon.node shared library.

The motivation for this change is that on windows addon.node library is
built into build\bin\Release and on linux into build/Release.

Resolves: https://github.com/ggml-org/whisper.cpp/issues/3360
2025-08-15 14:54:23 +02:00
Georgi Gerganov b02242d0ad
wasm : change ggml model host to HF (#3369) 2025-08-10 13:00:17 +03:00
Daniel Bevenius 0becabc8d6
stream.wasm : add language selection support (#3354)
* stream.wasm : add language selection support

This commit adds support for selecting the language in the stream.wasm
example. This is includes adding the model `base` which supports
multilingual transcription, and allowing the user to select a language
from a dropdown menu in the HTML interface.

The motivation for this is that it allows users to transcribe audio in
various languages.

Refs: https://github.com/ggml-org/whisper.cpp/issues/3347

* squash! stream.wasm : add language selection support

Remove strdup() for language in stream.wasm and update butten text for
base (should not be "base.en" but just "base").
2025-08-02 07:03:04 +02:00
Georgi Gerganov d0a9d8c7f8 talk-llama : sync llama.cpp 2025-07-28 13:02:32 +03:00
Daniel Bevenius 7de8dd783f
examples : add note about WHISPER_WASM_SINGLE_FILE [no ci] (#3332)
This commit adds a note to the README files of the WASM examples
about the `WHISPER_WASM_SINGLE_FILE` option.

The motivation for this is that currently this option is not documented
and might be surprising to users who expect a separate .wasm file to be
generated.

Refs: https://github.com/ggml-org/whisper.cpp/issues/3290
2025-07-24 16:06:48 +02:00
Sacha Arbonel 1f5cf0b288
server : hide language probabilities option behind flag (#3328)
* examples/server: hide language probabilities option behind flag

* code review

* fix
2025-07-21 13:03:54 +02:00
Greg Sadetsky a16da91365
examples : update links in wasm examples (#3318)
* fix 404 link

* update link in whisper.wasm example

* update example in command.wasm

* update link in bench.wasm example

* update link in stream.wasm example
2025-07-12 23:22:35 +02:00
Georgi Gerganov 6ddff4d96a talk-llama : sync llama.cpp
ggml-ci
2025-07-12 19:23:56 +03:00
accessiblepixel 869335f2d5
server : add dtw.params for v3-large-turbo (#3307)
* Add DTW model large-v3-turbo parameters to server.cpp example

DTW support is available in whispercpp and the large-v3-turbo model has already been added to the sources, but the large-v3-turbo model hasn't been added to the server.cpp file to make use of it. This commit hopefully corrects that issue.

* match original linebreak of original server.cpp file after adding large.v3.turbo dtw
2025-07-07 12:51:15 +03:00
Lin Xiaodong d9999d54c8
feat: support vad for addon.node (#3301)
Co-authored-by: linxiaodong <calm.lin@wukongsch.com>
2025-07-02 13:14:29 +03:00
Georgi Gerganov 1f816de7da talk-llama : sync llama.cpp 2025-07-01 17:54:53 +03:00
Daniel Bevenius 32cf4e2aba
whisper : add version function (#3289)
* whisper : add version function

This commit adds a version function to the whisper API.

The motivation for this is that it might be convenient to have a way to
programmatically check the version.

Example usage:
```c++
printf("Using whisper version: %s\n", whisper_version());
```
Will output:
```console
Using whisper version: 1.7.6
```

* examples : add version to android example CMakeLists.txt
2025-06-26 18:09:42 +02:00
Georgi Gerganov dc8dda60ee
bench : print system info before ctx check 2025-06-25 16:01:32 +03:00
Daniel Bevenius 1ad258ca31
stream : add nullptr check of whisper_context (#3283)
* stream : add nullptr check of whisper_context

This commit adds a check to ensure that the `whisper_context` is not
null after initialization.

The motivation for this is that currently, if the initialization fails,
the program continues to run leading to a segmentation fault. This sort
of check is performed by others examples like whisper-cli.

Refs: https://github.com/ggml-org/whisper.cpp/issues/3280#issuecomment-3003778035

* examples : add nullptr check for whisper_context
2025-06-25 14:16:31 +02:00
Aaron Ang 4d6ae52ed3
command: output commands to text file (#3273)
This commit implements code for the command line argument `-f --file FNAME` which is currently missing.
2025-06-24 06:41:21 +02:00
Georgi Gerganov e6c10cf3d5 talk-llama : sync llama.cpp
ggml-ci
2025-06-21 07:34:17 +03:00
Daniel Bevenius 3e65f518dd
android : update CMakeLists.txt to use FetchContent for ggml (#3268)
* android : update CMakeLists.txt to use FetchContent for ggml

This commit updates the CMakeLists.txt file for the Android Whisper
example to use FetchContent for managing the ggml library.

The motivation for this change is avoid having to make manual changes to
the CMakeLists.txt file after syncing the ggml library.

I've built and run the example locally to verify that it works as
expected.

Refs: https://github.com/ggml-org/whisper.cpp/pull/3265#issuecomment-2986715717

* android.java : update cmake to use FetchContent for ggml

This commit updates the CMake configuration for the Android Java example
to use `FetchContent` for including the `ggml` library. Do be able to
use FetchContent we also update the `compileSdkVersion` and
`targetSdkVersion` to 31, and the `buildToolsVersion` to '30.0.3'.
This also required a an update to the Gradle plugin version to 7.4.0.

The motivation for this change is avoid having to make manual changes to
the CMakeLists.txt file after syncing the ggml library.
2025-06-19 16:06:42 +02:00
Georgi Gerganov 17bece1885
cmake : fix android build (#3265)
* cmake : fix android build

---------

Co-authored-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2025-06-19 08:24:41 +02:00
Daniel Bevenius ecb8f3c2b4
examples : add stereo to mono conversion in read_audio_data (#3266)
This commit adds a conversion from stereo to mono in the
`read_audio_data` function of `common-whisper.cpp`.

The motivation for this change is prior to Commit
7d3da68f79 ("examples : use miniaudio for
direct decoding flac, mp3, ogg and wav (#2759)", there was a step that
read stereo int16 data -> pcm16 (448512 samples), and then converted to
mono (224256 samples), and then also convert to stereo in `pcmf32s.

The middle step here seems to have been missed when rewriting the code to
use Miniaudio and caused issues then transcribing stereo audio files.

For example, currently using the audio sample in the linked issue the
output is:
```console
[00:00:00.000 --> 00:00:03.000]  (speaker 1) Sous-titres réalisés para la communauté d'Amara.org
```

And with the change in this commit the output is:
```
[00:00:00.000 --> 00:00:01.500]  (speaker 1) *sonnerie de téléphone*
[00:00:01.500 --> 00:00:07.000]  (speaker 1) Salut jeune homme !
[00:00:07.000 --> 00:00:08.500]  (speaker 0) C'est vrai que je te dérange ?
[00:00:08.500 --> 00:00:10.500]  (speaker 1) Ah pas du tout, pas du tout, pas du tout !
[00:00:10.500 --> 00:00:12.500]  (speaker 1) J'étais en train de...
[00:00:12.500 --> 00:00:14.500]  (speaker 1) de préparer un courrier
```

Resolves: https://github.com/ggml-org/whisper.cpp/issues/3092
2025-06-18 17:41:43 +02:00
Georgi Gerganov 2f60ebc3c2 talk-llama : sync llama.cpp
ggml-ci
2025-06-18 12:40:34 +03:00
Daniel Bevenius f3ff80ea8d
examples : set the C++ standard to C++17 for server (#3261)
This commit updates the server example to use C++17 as the standard.

The motivation for this change is that currently the ci-run
`ggml-100-mac-m4` is failing when compiling the server example on
macOS. The `talk-llama` example also has this setting so it looks like
an alright change to make.

ggml-ci

Refs: https://github.com/ggml-org/ci/tree/results/whisper.cpp/2a/4d6db7d90899aff3d58d70996916968e4e0d27/ggml-100-mac-m4
2025-06-17 11:29:48 +02:00
w1redch4d 2a4d6db7d9
examples : update usage/help in yt-wsp.sh (#3251)
This commit updates the usage/help message to be more readable and include the environment variables available to set options.
2025-06-16 12:21:16 +02:00
Sacha Arbonel 107c303e69
server : graceful shutdown, atomic server state, and health endpoint Improvements (#3243)
* feat(server): implement graceful shutdown and server state management

* refactor(server): use lambda capture by reference in server.cpp
2025-06-16 10:14:26 +02:00
Daniel Bevenius 0a4d85cf8a
server : add Voice Activity Detection (VAD) support (#3246)
* server : add Voice Activity Detection (VAD) support

This commit adds support for Voice Activity Detection (VAD) in the
server example.

The motivation for this is to enable VAD processing when using
whisper-server.

Resolves: https://github.com/ggml-org/whisper.cpp/issues/3089

* server : add VAD parameters to usage in README.md [no ci]

This commit also adds a few missing parameters.

* server : fix conflicting short options [no ci]
2025-06-13 13:24:03 +02:00
Daniel Bevenius 9df8d54bcb
cli : fix short name conflict for vad options [no ci] (#3247)
This commit fixes a short name conflict whisper-cli for
`--vad-min-speech-duration-ms` and `--vad-min-silence-duration-ms` which
currently have the same short name `-vsd`.

Refs: https://github.com/ggml-org/whisper.cpp/pull/3246#pullrequestreview-2923800114
2025-06-13 10:25:25 +02:00
Georgi Gerganov 962361bd79 android : fix builds (#0)
ggml-ci
2025-06-10 12:40:33 +03:00
Georgi Gerganov db264d6220 talk-llama : sync llama.cpp
ggml-ci
2025-06-10 12:40:33 +03:00
Daniel Bevenius b505539670
node : add language detection support (#3190)
This commit add support for language detection in the Whisper Node.js
addon example. It also updates the node addon to return an object
instead of an array as the results.

The motivation for this change is to enable the inclusion of the
detected language in the result, in addition to the transcription
segments.

For example, when using the `detect_language` option, the result will
now be:
```console
{ language: 'en' }
```

And if the `language` option is set to "auto", it will also return:
```console
{
  language: 'en',
  transcription: [
    [
      '00:00:00.000',
      '00:00:07.600',
      ' And so my fellow Americans, ask not what your country can do for you,'
    ],
    [
      '00:00:07.600',
      '00:00:10.600',
      ' ask what you can do for your country.'
    ]
  ]
}
```
2025-06-02 14:58:05 +02:00
Georgi Gerganov 7fd6fa8097 talk-llama : sync llama.cpp
ggml-ci
2025-06-01 15:14:44 +03:00
Daniel Bevenius 73a8c5fb94
whisper : remove whisper_load_backends function (#3196)
* whisper : remove whisper_load_backends function

This commit removes the `whisper_load_backends` function, which was used
to load all GGML backends.

The motivation for this change push the responsibility of loading
backends to user applications to give them more control over which
backends to load and when. See the references below for more context.

Resolves: https://github.com/ggml-org/whisper.cpp/issues/3182
Refs: https://github.com/ggml-org/whisper.cpp/pull/3042#issuecomment-2801778733
Refs: https://github.com/ggml-org/whisper.cpp/pull/3042#issuecomment-2801928990

* ruby : add check for rwc is NULL

This commit adds a check to ensure that the `rwc` pointer is not NULL
before attempting to mark its members in the garbage collector.

The motivation for this is an attempt to see if this fixed the CI build
as I'm not able to reproduce the issue locally.

Refs: https://github.com/ggml-org/whisper.cpp/actions/runs/15299612277/job/43036694928?pr=3196
2025-05-29 08:03:17 +02:00
Georgi Gerganov 26eb48cb08 talk-llama : sync llama.cpp
ggml-ci
2025-05-27 18:03:00 +03:00
Daniel Bevenius 450de0787e
node : enable no_prints to suppress all output (#3189)
This commit enable the node addon to suppress all output, even the
result of the transcription if the no_prints parameter is set to true.

The motivation for this is that for the node addon there is a
fullfilment handler/success callback to process the transcription
result. And it might be useful to be able to disable the printing of
the transcription result to the console, so that the user can handle
the result in their own way.

Refs: https://github.com/ggml-org/whisper.cpp/issues/3176
2025-05-27 05:51:47 +02:00