valkey

Commit Graph

Author	SHA1	Message	Date
Binbin	8ea7f1330c	Update dual channel replication conf to mention the local buffer is limited by COB (#2824 ) After introducing the dual channel replication in #60, we decided in #915 not to add a new configuration item to limit the replica's local replication buffer, and instead to use "client-output-buffer-limit replica hard" to limit it. We need to document this behavior and mention that once the limit is reached, all future data will accumulate on the primary side. Signed-off-by: Binbin <binloveplay1314@qq.com>	2025-11-23 23:27:50 +08:00
Binbin	8189fe5c42	Add rdb_transmitted to replstateToString so that we can see it in INFO (#2833 ) In dual channel replication, when the rdb channel client finish the RDB transfer, it will enter REPLICA_STATE_RDB_TRANSMITTED state. During this time, there will be a brief window that we are not able to see the connection in the INFO REPLICATION. In the worst case, we might not see the connection for the DEFAULT_WAIT_BEFORE_RDB_CLIENT_FREE seconds. I guess there is no harm to list this state, showing connected_slaves but not showing the connection is bad when troubleshooting. Note that this also affects the `valkey-cli --rdb` and `--functions-rdb` options. Before the client is in the `rdb_transmitted` state and is released, we will now see it in the info (see the example later). Before, not showing the replica info ``` role:master connected_slaves:1 ``` After, for dual channel replication: ``` role:master connected_slaves:1 slave0:ip=xxx,port=xxx,state=rdb_transmitted,offset=0,lag=0,type=rdb-channel ``` After, for valkey-cli --rdb-only and --functions-rdb: ``` role:master connected_slaves:1 slave0:ip=xxx,port=xxx,state=rdb_transmitted,offset=0,lag=0,type=replica ``` Signed-off-by: Binbin <binloveplay1314@qq.com>	2025-11-21 18:31:31 +08:00
Ricardo Dias	05540af405	Add script function flags in the module API (#2836 ) This commit adds script function flags to the module API, which allows function scripts to specify the function flags programmatically. When the scripting engine compiles the script code can extract the flags from the code and set the flags on the compiled function objects. --------- Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>	2025-11-20 10:23:00 +00:00
Hanxi Zhang	ed8856bdfc	Fix cluster slot migration flaky test (#2756 ) The original test code only checks: 1. wait_for_cluster_size 4, which calls cluster_size_consistent for every node. Inside that function, for each node, cluster_size_consistent queries cluster_known_nodes, which is calculated as (unsigned long long)dictSize(server.cluster->nodes). However, when a new node is added to the cluster, it is first created in the HANDSHAKE state, and clusterAddNode adds it to the nodes hash table. Therefore, it is possible for the new node to still be in HANDSHAKE status (processed asynchronously) even though it appears that all nodes “know” there are 4 nodes in the cluster. 2. cluster_state for every node, but when a new node is added, server.cluster->state remains FAIL. Some handshake processes may not have completed yet, which likely causes the flakiness. To address this, added a --cluster check to ensure that the config state is consistent. Fixes #2693. Signed-off-by: Hanxi Zhang <hanxizh@amazon.com> Co-authored-by: Binbin <binloveplay1314@qq.com>	2025-11-20 15:07:16 +08:00
aradz44	e19ceb7a6d	deflake "Hash field TTL and active expiry propagates correctly" (#2856 ) Fix a small oversight in the "Hash field TTL and active expiry propagates correctly through chain replication" test in `hashexpire.tcl`. The test did not wait for the initial sync of the chained replica, which made the test flaky. Signed-off-by: Arad Zilberstein <aradz@amazon.com>	2025-11-19 11:33:55 +02:00
Venkat Pamulapati	3c3a1966ec	Perform data cleanup during RDB load on successful version/signature validation (#2600 ) Addresses: https://github.com/valkey-io/valkey/issues/2588 ## Overview Previously we called `emptyData()` during a full sync before validating that the RDB version is compatible. This change adds an rdb flag that allows us to flush the database from within `rdbLoadRioWithLoadingCtx`. This provides the option to only flush the data if the rdb has a valid version and signature. In the case where we do have an invalid version and signature, we don't emptyData, so if a full sync fails for that reason a replica can still serve stale data instead of clients experiencing cache misses. ## Changes - Added a new flag `RDBFLAGS_EMPTY_DATA` that signals to flush the database after rdb validation - Added logic to call `emptyData` in `rdbLoadRioWithLoadingCtx` in `rdb.c` - Added logic to not clear data if the RDB validation fails in `replication.c` using new return type `RDB_INCOMPATIBLE` - Modified the signature of `rdbLoadRioWithLoadingCtx` to return RDB success codes and updated all calling sites. ## Testing Added a tcl test that uses the debug command `reload nosave` to load from an RDB that has a future version number. This triggers the same code path that full syncs will use, and verifies that we don't flush the data until after the validation is complete. A test already exists that checks that the data is flushed: https://github.com/valkey-io/valkey/blob/unstable/tests/integration/replication.tcl#L1504 --------- Signed-off-by: Venkat Pamulapati <pamuvenk@amazon.com> Signed-off-by: Venkat Pamulapati <33398322+ChiliPaneer@users.noreply.github.com> Co-authored-by: Venkat Pamulapati <pamuvenk@amazon.com> Co-authored-by: Harkrishn Patro <bunty.hari@gmail.com>	2025-11-18 17:08:10 -08:00
yzc-yzc	57892663be	Fix SCAN consistency test to only test what we guarantee (#2853 ) Test the SCAN consistency by alternating SCAN calls to primary and replica. We cannot rely on the exact order of the elements and the returned cursor number. --------- Signed-off-by: yzc-yzc <96833212+yzc-yzc@users.noreply.github.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2025-11-18 16:06:20 +01:00
chzhoo	33bfac37ba	Optimize zset memory usage by embedding element in skiplist (#2508 ) By default, when the number of elements in a zset exceeds 128, the underlying data structure adopts a skiplist. We can reduce memory usage by embedding elements into the skiplist nodes. Change the `zskiplistNode` memory layout as follows: ``` Before +-------------+ +-----> \| element-sds \| \| +-------------+ \| +------------------+-------+------------------+---------+-----+---------+ \| element--pointer \| score \| backward-pointer \| level-0 \| ... \| level-N \| +------------------+-------+------------------+---------+-----+---------+ After +-------+------------------+---------+-----+---------+-------------+ + score \| backward-pointer \| level-0 \| ... \| level-N \| element-sds \| +-------+------------------+---------+-----+---------+-------------+ ``` Before the embedded SDS representation, we include one byte representing the size of the SDS header, i.e. the offset into the SDS representation where that actual string starts. The memory saving is therefore one pointer minus one byte = 7 bytes per element, regardless of other factors such as element size or number of elements. ### Benchmark step I generated the test data using the following Lua script and CLI command, and checked memory usage using the `info` command. lua script ``` local start_idx = tonumber(ARGV[1]) local end_idx = tonumber(ARGV[2]) local elem_count = tonumber(ARGV[3]) for i = start_idx, end_idx do local key = "zset:" .. string.format("%012d", i) local members = {} for j = 0, elem_count - 1 do table.insert(members, j) table.insert(members, "member:" .. j) end redis.call("ZADD", key, unpack(members)) end return "OK: Created " .. (end_idx - start_idx + 1) .. " zsets" ``` valkey-cli command `valkey-cli EVAL "$(cat create_zsets.lua)" 0 0 100000 ${ZSET_ELEMENT_NUM}` ### Benchmark result \|number of elements in a zset \| memory usage before optimization \| memory usage after optimization \| change \| \|-------\|-------\|-------\|-------\| \| 129 \| 1047MB \| 943MB \| -9.9% \| \| 256 \| 2010MB\| 1803MB\| -10.3%\| \| 512 \| 3904MB\|3483MB\| -10.8%\| --------- Signed-off-by: chzhoo <czawyx@163.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2025-11-18 14:27:15 +01:00
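
The "one pointer minus one byte" arithmetic in the zset commit above can be checked with a rough estimate. This is only a sketch: it assumes a 64-bit build (8-byte pointers), the function names are made up for illustration, and it does not model allocator behavior, so it will not reproduce the measured numbers exactly (the benchmark reports a somewhat larger drop than this simple model predicts).

```python
# Back-of-the-envelope estimate of the memory saved by embedding the element
# in the skiplist node. Names are illustrative and are not Valkey code.

POINTER_SIZE = 8      # bytes; assumption: 64-bit build
HEADER_SIZE_BYTE = 1  # the extra byte that stores the SDS header size


def saved_bytes(num_zsets: int, elements_per_zset: int) -> int:
    """Bytes saved when each element drops one pointer and gains one byte."""
    per_element = POINTER_SIZE - HEADER_SIZE_BYTE  # 7 bytes per element
    return num_zsets * elements_per_zset * per_element


def percent_change(before_mb: float, after_mb: float) -> float:
    """Relative change in percent, rounded to one decimal like the table."""
    return round((after_mb - before_mb) / before_mb * 100, 1)


# The Lua loop runs from index 0 to 100000 inclusive, i.e. 100001 zsets.
NUM_ZSETS = 100_001

if __name__ == "__main__":
    for elems in (129, 256, 512):
        mb = saved_bytes(NUM_ZSETS, elems) / (1024 * 1024)
        print(f"{elems} elements per zset: about {mb:.0f} MB saved by the pointer alone")
```

The `percent_change` helper recomputes the "change" column of the benchmark table from its "before" and "after" columns.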
Roshan Khatri	616fccb4c5	Fix the failing warmup and duration are cumulative (#2854 ) We need to verify total duration was at least 2 seconds, elapsed time can be quite variable to check upper-bound Fixes https://github.com/valkey-io/valkey/issues/2843 Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>	2025-11-17 21:26:12 +01:00
Binbin	aef56e52f5	Fix timing issue in dual channel replication COB test (#2847 ) After #2829, valgrind reports a test failure; it seems that the time is not enough to reach the COB limit under valgrind. Signed-off-by: Binbin <binloveplay1314@qq.com>	2025-11-17 17:25:19 +08:00
Binbin	a06cf15b20	Allow dual channel full sync in plain failover (#2659 ) PSYNC_FULLRESYNC_DUAL_CHANNEL is also a full sync, so, as the comment says, we need to allow it. We have not yet identified the exact edge case that leads to this line, but during a failover there should be no difference between different sync strategies. Signed-off-by: Binbin <binloveplay1314@qq.com>	2025-11-15 12:57:27 +08:00
Harkrishn Patro	86db609219	Print node name on a best effort basis if light weight message is received before link stabilization (#2825 ) fixes: #2803 --------- Signed-off-by: Harkrishn Patro <harkrisp@amazon.com> Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Binbin <binloveplay1314@qq.com>	2025-11-14 14:33:16 -08:00
yzc-yzc	b93cfcc332	Attempt to fix flaky SCAN consistency test (#2834 ) Related test failures: https://github.com/valkey-io/valkey/actions/runs/19282092345/job/55135193394 https://github.com/valkey-io/valkey/actions/runs/19200556305/job/54887767594 > *** [err]: scan family consistency with configured hash seed in tests/integration/scan-family-consistency.tcl > Expected '5 {k:1 k:25 z k:11 k:18 k:27 k:45 k:7 k:12 k:19 k:29 k:40 k:41 k:43}' to be equal to '5 {k:1 k:25 k:11 z k:18 k:27 k:45 k:7 k:12 k:19 k:29 k:40 k:41 k:43}' (context: type eval line 26 cmd {assert_equal $primary_cursor_next $replica_cursor_next} proc ::start_server) The reason is that the RDB part of the primary-replica synchronization affects the resize policy of the hashtable. See `b835463a73/src/server.c (L807-L818)` Signed-off-by: yzc-yzc <96833212+yzc-yzc@users.noreply.github.com>	2025-11-14 10:55:05 +01:00
Binbin	331a852821	Change DEFAULT_WAIT_BEFORE_RDB_CLIENT_FREE from 60s to 5s (#2829 ) Consider this scenario: 1. Replica starts loading the RDB using the rdb connection 2. Replica finishes loading the RDB before the replica main connection has initiated the PSYNC request 3. Replica stops replicating after receiving replicaof no one 4. Primary can't know that the replica main connection will never ask for PSYNC, so it keeps the reference to the replica's replication buffer block 5. Primary has a shutdown-timeout configured and requires to wait for the rdb connection to close before it can shut down. The current 60-second wait time (DEFAULT_WAIT_BEFORE_RDB_CLIENT_FREE) is excessive and leads to prolonged resource retention in edge cases. Reducing this timeout to 5 seconds would provide adequate time for legitimate PSYNC requests while mitigating the issue described above. Signed-off-by: Binbin <binloveplay1314@qq.com>	2025-11-14 11:32:29 +08:00
Ricardo Dias	8e0b375da4	Fix cluster slot stats for scripts with cross-slot keys (#2835 ) This commit fixes the cluster slot stats for scripts executed by scripting engines when the scripts access cross-slot keys. This was not a bug in Lua scripting engine, but `VM_Call` was missing a call to invalidate the script caller client slot to prevent the accumulation of stats. Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>	2025-11-13 12:05:52 -08:00
Rain Valentine	01a7657b83	Add --warmup and --duration parameters to valkey-benchmark (#2581 ) It's handy to be able to automatically do a warmup and/or test by duration rather than request count. 🙂 I changed the real-time output a bit - not sure if that's wanted or not. (Like, would it break people's weird scripts? It'll break my weird scripts, but I know the price of writing weird fragile scripts.) ``` Prepended "Warming up " when in warmup phase: Warming up SET: rps=69211.2 (overall: 69747.5) avg_msec=0.425 (overall: 0.393) 3.8 seconds ^^^^^^^^^^ Appended running request counter when based on -n: SET: rps=70892.0 (overall: 69878.1) avg_msec=0.385 (overall: 0.398) 612482 requests ^^^^^^^^^^^^^^^ Appended running second counter when in warmup or based on --duration: SET: rps=61508.0 (overall: 61764.2) avg_msec=0.430 (overall: 0.426) 4.8 seconds ^^^^^^^^^^^ ``` To be clear, the report at the end remains unchanged. --------- Signed-off-by: Rain Valentine <rsg000@gmail.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2025-11-13 12:57:46 +01:00
Sarthak Aggarwal	b835463a73	Fixes test-freebsd workflow in daily (package lang/tclX) (#2832 ) This PR fixes the freebsd daily job that has been failing consistently for the last days with the error "pkg: No packages available to install matching 'lang/tclx' have been found in the repositories". The package name is corrected from `lang/tclx` to `lang/tclX`. The lowercase version worked previously but appears to have stopped working in an update of freebsd's pkg tool to 2.4.x. Example of failed job: https://github.com/valkey-io/valkey/actions/runs/19282092345/job/55135193499 Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>	2025-11-13 08:24:37 +01:00
Binbin	7ffe4dcec4	Remove the EXAT and PXAT from some HFE notifications tests (#2831 ) As we can see, we expected to get hexpired, but we got hexpire instead, which means that the expiration time has expired during execution. ``` *** [err]: HGETEX EXAT keyspace notifications for active expiry in tests/unit/hashexpire.tcl Expected 'pmessage __keyevent@* __keyevent@9__:hexpired myhash' to match 'pmessage __keyevent@* __keyevent@*:hexpire myhash' ``` We should remove the EXAT and PXAT from these fixtures. And we indeed have the dedicated tests that verify that we get 'expired' when EX,PX are set to 0 or EXAT,PXAT are in the past. Signed-off-by: Binbin <binloveplay1314@qq.com>	2025-11-12 14:32:13 +02:00
eifrah-aws	1b0b5c0cfd	New module API to perform prefix‑aware ACL permission check (#2796 ) ## Description This change introduces the ability for modules to check ACL permissions against a key prefix. The update adds a dedicated `prefixmatchlen` helper and extends the core ACL selector logic to support a prefix‑matching mode. The new API `ValkeyModule_ACLCheckPrefixPermissions` is registered and exposed to modules, and a corresponding implementation is added in `module.c`. Existing internal callers that already perform prefix checks (e.g., `VM_ACLCheckKeyPermissions`) are updated to use the new flag, while all legacy paths remain unchanged. The change also modifies the `aclcheck` test module that exercises the new prefix‑checking API, ensuring that read/write operations are correctly allowed or denied based on the ACL configuration. Key areas touched: * ACL logic * Module API * Testing # Motivation The search module presently makes costly calls to verify index permissions (see https://github.com/valkey-io/valkey-search/blob/main/src/acl.cc#L295). This PR introduces a more efficient approach for that. --------- Signed-off-by: Eran Ifrah <eifrah@amazon.com> Signed-off-by: Madelyn Olson <madelyneolson@gmail.com> Signed-off-by: Ran Shidlansik <ranshid@amazon.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com> Co-authored-by: Binbin <binloveplay1314@qq.com>	2025-11-12 10:51:58 +02:00
Daniil Kashapov	3c378862c3	Cluster: Avoid usage of light weight messages to nodes with not ready bidirectional links (#2817 ) After a network failure, nodes that come back to the cluster do not always send and/or receive messages from other nodes in the shard. This fix avoids using light weight messages to nodes whose bidirectional links are not ready. When a light message arrives before any normal message, the cluster link is freed because on the just established connection link->node is not assigned yet. It is assigned in getNodeFromLinkAndMsg right after the condition if (is_light). So on a cluster with heavy pubsub load a long loop of disconnects is possible, and we got this. 1. node A establishes cluster link to node B 2. node A propagates PUBLISH to node B 3. node B frees cluster link because of link->node == null as it has not received non-light messages yet 4. go to 1. During this loop subscribers of node B do not receive any messages published to node A. So here we want to make sure that PING was sent (and link->node was initialized) on this connection before using lightweight messages. --------- Signed-off-by: Daniil Kashapov <daniil.kashapov.ykt@gmail.com> Co-authored-by: Harkrishn Patro <bunty.hari@gmail.com>	2025-11-11 20:03:24 -08:00
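
The disconnect loop described in the commit above can be modeled with a toy state machine. This is a sketch under stated assumptions, not the cluster bus code: the `Link` class, `receive`, `send_publish`, and the `guard_enabled` switch are hypothetical names invented here to show why a light message on a fresh link gets the link freed, and why sending a normal message first avoids that.

```python
class Link:
    """Toy stand-in for a cluster link on the receiving node (hypothetical)."""

    def __init__(self):
        self.node = None  # assigned when the first normal (non-light) message arrives


def receive(link: Link, sender: str, is_light: bool) -> bool:
    """Receiver side. Returns True if the link survives, False if it is freed."""
    if is_light:
        # A light message carries no sender identity the receiver can use,
        # so an unassigned link->node means the link is dropped.
        return link.node is not None
    link.node = sender  # a normal message (such as PING) initializes the link
    return True


def send_publish(peer_link: Link, sender: str, link_ready: bool, guard_enabled: bool) -> bool:
    """Sender side. With the guard, light messages are used only on ready links."""
    use_light = link_ready or not guard_enabled
    return receive(peer_link, sender, is_light=use_light)


if __name__ == "__main__":
    # Without the guard: a light PUBLISH on a fresh link frees the link (steps 1 to 4).
    print(send_publish(Link(), "A", link_ready=False, guard_enabled=False))
    # With the guard: the first message is a normal one, so the link is kept.
    link = Link()
    print(send_publish(link, "A", link_ready=False, guard_enabled=True))
```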
Jim Brunner	047080a622	shared zadd for geoadd (#2828 ) GEOADD is allocating/destroying a string object for "ZADD" each time it is called. Created a shared string instead. Signed-off-by: Jim Brunner <brunnerj@amazon.com>	2025-11-11 15:26:53 -08:00
Roshan Khatri	b7a3fc988a	Fix Test dual-channel: primary tracking replica backlog refcount (#2827 ) This increases the number of times we check the logs from 20 to 40. I found that every `wait-for` check takes about 1.5 to 1.57 milliseconds, so when we were checking 2000 times with a 1ms delay we were actually spending (2000 * 1) + (2000 * 1.75) = 5500ms waiting. This can be seen in the output below: for 20 checks we took about 35 ms more, so that's around 1.75 ms per check ``` Execution time: 2034 ms (failed) [err]: 20 100 - Test dual-channel: primary tracking replica backlog refcount - start with empty backlog in tests/integration/dual-channel-replication-flaky.tcl ``` That is why increasing it to 40 100 will check for approx 4,070 ms, which is still less than the original 5500ms but should pass every single time, as seen here: https://github.com/roshkhatri/valkey/actions/runs/19279424967/job/55126976882 Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>	2025-11-12 00:03:50 +01:00
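
The timing arithmetic in the commit above can be reproduced with a short calculation. This is a minimal sketch: `total_wait_ms` is a hypothetical helper, and the 1.75 ms per-check overhead is the figure the commit message derives from its own measurement, not a constant of the test framework.

```python
def total_wait_ms(retries: int, delay_ms: float, check_overhead_ms: float = 1.75) -> float:
    """Total time spent waiting: the configured delays plus the cost of each check."""
    return retries * delay_ms + retries * check_overhead_ms


if __name__ == "__main__":
    print(total_wait_ms(2000, 1))   # original setting: 2000 checks, 1 ms apart
    print(total_wait_ms(20, 100))   # failing setting: 20 checks, 100 ms apart
    print(total_wait_ms(40, 100))   # new setting: 40 checks, 100 ms apart
```

The three calls correspond to the 5500 ms, roughly 2034 ms, and approximately 4,070 ms figures quoted in the message.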
Arthur Lee	2da21d9def	Allow partial sync after loading AOF with preamble (#2366 ) The AOF preamble mechanism replaces the traditional AOF base file with an RDB snapshot during rewrite operations, which reduces I/O overhead and improves loading performance. However, when valkey loads the RDB-formatted preamble from the base AOF file, it does not process the replication ID (replid) information within the RDB AUX fields. This omission has two limitations: * On a primary, it prevents the primary from accepting PSYNC continue requests after restarting with a preamble-enabled AOF file. * On a replica, it prevents the replica from successfully performing partial sync requests (avoiding full sync) after restarting with a preamble-enabled AOF file. To resolve this, this commit aligns the AOF preamble handling with the logic used for standalone RDB files, by storing the replication ID and replication offset in the AOF preamble and restoring them when loading the AOF file. Resolves #2677 --------- Signed-off-by: arthur.lee <liziang.arthur@bytedance.com> Signed-off-by: Arthur Lee <arthurkiller@users.noreply.github.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2025-11-11 12:41:27 +01:00
Ricardo Dias	7fbd4cb260	Expose SIMPLE_STRING and ARRAY_NULL reply type to the Module API (#2804 ) This commit extends the Module API to expose the `SIMPLE_STRING` and `ARRAY_NULL` reply types to modules, by passing the new flag `X` to the `ValkeyModule_Call` function. By only exposing the new reply types behind a flag we keep backward compatibility with existing module implementations and allow new modules to work with these reply types, which are required for scripts to correctly process the reply type of commands called inside scripts. Before this change, commands like `PING` or `SET`, which return `"OK"` as a simple string reply, would be returned as string replies to scripts. To allow the support of the Lua engine as an external module, we need to distinguish between simple string and string replies to keep backward compatibility. --------- Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>	2025-11-10 15:05:26 +00:00
Ricardo Dias	bb8989cfde	Adds new module context flag `VALKEYMODULE_CTX_SCRIPT_EXECUTION` (#2818 ) The new module context flag `VALKEYMODULE_CTX_SCRIPT_EXECUTION` denotes that the module API function is being called in the context of a scripting engine execution. Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>	2025-11-10 10:29:40 +00:00
Vadym Khoptynets	65ab07dde7	Leverage zfree_with_size for client reply blocks (#2624 ) clientReplyBlock stores the size of the actual allocation in its size field (minus the header size). This can be used for more effective deallocation with zfree_with_size. Signed-off-by: Vadym Khoptynets <vadymkh@amazon.com>	2025-11-09 20:46:27 +02:00
Roshan Khatri	2288657a05	[DEFLAKE] Psync established after rdb load - beyond grace period (#2748 ) Resolves: https://github.com/valkey-io/valkey/issues/2695 Increase the wait time for periodic log check for rdb load time. Also, increases the delay of log check frequency. --------- Signed-off-by: Roshan Khatri <rvkhatri@amazon.com> Signed-off-by: Roshan Khatri <117414976+roshkhatri@users.noreply.github.com> Co-authored-by: Harkrishn Patro <bunty.hari@gmail.com>	2025-11-07 15:11:37 -08:00
Harkrishn Patro	7f8c5b6f0c	[flaky-failure-fix] Increase the cluster-node-timeout to have longer delay between failover of each shard (#2793 )	2025-11-06 16:14:45 -08:00
yzc-yzc	37d08d3866	Fix flaky DBSIZE test for atomic slot migration (#2805 ) Related test failures: *** [err]: Replica importing key containment (slot 0 from node 0 to 2) - DBSIZE command excludes importing keys in tests/unit/cluster/cluster-migrateslots.tcl Expected '1' to match '0' (context: type eval line 2 cmd {assert_match "0" [R $node_idx DBSIZE]} proc ::test) The reason is that we don't wait for the primary-replica synchronization to complete before starting the next testcase. --------- Signed-off-by: yzc-yzc <96833212+yzc-yzc@users.noreply.github.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2025-11-06 18:02:27 +01:00
Ricardo Dias	7a1d989696	Add "script" context to ACL log entries (#2798 ) In this commit we add a new context for the ACL log entries that is used to log ACL failures that occur during scripts execution. To maintain backward compatibility we still maintain the "lua" context for failures that happen when running Lua scripts. For other scripting engines the context description will be just "script". --------- Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>	2025-11-06 09:46:22 +00:00
hieu2102	cf7a628ada	Add instruction to build Valkey with fast_float (#2810 ) The `README.md` file is currently missing a section to build Valkey with `fast_float`, which was introduced in Valkey 8.1 as an optional dependency (#1260) Signed-off-by: hieu2102 <hieund2102@gmail.com>	2025-11-06 09:45:12 +00:00
Sarthak Aggarwal	32844b8b0a	Configurable DB hash seed for SCAN family commands consistency (#2608 ) Introduce a new config `hash-seed` which can be set only at startup and controls the hash seed for the server. This includes all hash tables. This change makes it so that both primaries and replicas will return the same results for SCAN/HSCAN/ZSCAN/SSCAN cursors. This is useful in order to make sure SCAN behaves correctly after a failover. Resolves #4 --------- Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com> Signed-off-by: Sarthak Aggarwal <sarthakaggarwal97@gmail.com> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2025-11-05 08:45:52 -08:00
xbasel	c88c94e326	Reuse dbHasNoKeys() inside dbsHaveNoKeys() to remove duplicate logic (#2800 ) Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>	2025-11-04 11:28:39 -08:00
Sarthak Aggarwal	a49d469f48	Reverts hashHashtableTypeValidate signature (#2799 ) Fixes https://github.com/valkey-io/valkey/actions/runs/19053371057/job/54418411647#step:6:202 Matched hashHashtableTypeValidate to the [generic hashtable callback signature ](https://github.com/valkey-io/valkey/blob/unstable/src/hashtable.h#L62)and performed the entry cast internally to preserve expiry checks. --------- Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com> Signed-off-by: Ran Shidlansik <ranshid@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com> Co-authored-by: Jim Brunner <brunnerj@amazon.com>	2025-11-04 20:07:57 +02:00
Jim Brunner	a99c636321	Improve header comment and strengthen type checking for entry (#2794 ) In `entry.c`, the `entry` is a block of memory with variable contents. The structure can be difficult to understand. A new header comment more clearly documents the contents/layout of the `entry`. Also, in `entry.h`, the `entry` was defined by `typedef void entry`. This allows blind casting to the `entry` type. It defeats compiler type checking. Even though the `entry` has a variable definition, we can define entry as a legitimate type which allows the compiler to perform type checking. By performing `typedef struct _entry entry`, now the `entry` is understood to be a pointer to some type of undefined structure. We can pass a pointer and the compiler can typecheck the pointer. (Of course we can't dereference it, because we haven't actually defined the struct.) Signed-off-by: Jim Brunner <brunnerj@amazon.com>	2025-11-03 19:39:36 +02:00
Hanxi Zhang	4d78d36bff	HSETEX: Support NX/XX Flags (#2668 ) ### Summary Addresses https://github.com/valkey-io/valkey/issues/2619. This PR extends the `HSETEX` command to support optional key-level `NX` and `XX` flags, allowing operations conditional on the existence of the hash key. ### Changes - Updated `hsetex.json` and regenerated `commands.def`. - Extended argument parsing for NX/XX. - Added key-level `NX`/`XX` support in `HSETEX`. - Added tests covering all four NX/XX scenarios. --------- Signed-off-by: Hanxi Zhang <hanxizh@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>	2025-11-03 09:43:48 +02:00
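
The key-level `NX`/`XX` semantics described in the HSETEX commit above can be illustrated with a small in-memory model covering the four scenarios. This is a sketch, not the server implementation: `hsetex_model`, the dictionary-based store, and the 1/0 return values are assumptions made for illustration, and field expiration is deliberately left out.

```python
from typing import Dict, Optional

Store = Dict[str, Dict[str, str]]


def hsetex_model(db: Store, key: str, fields: Dict[str, str],
                 condition: Optional[str] = None) -> int:
    """Write fields into the hash at key, subject to a key-level condition.

    condition is None, "NX" (only if the key does not exist) or
    "XX" (only if the key already exists). Returns 1 if the write
    happened and 0 if the condition blocked it (assumed convention).
    """
    exists = key in db
    if condition == "NX" and exists:
        return 0
    if condition == "XX" and not exists:
        return 0
    db.setdefault(key, {}).update(fields)
    return 1


if __name__ == "__main__":
    db: Store = {}
    print(hsetex_model(db, "h", {"f": "1"}, "XX"))  # key missing, XX blocks the write
    print(hsetex_model(db, "h", {"f": "1"}, "NX"))  # key missing, NX allows the write
    print(hsetex_model(db, "h", {"f": "2"}, "NX"))  # key exists, NX blocks the write
    print(hsetex_model(db, "h", {"f": "3"}, "XX"))  # key exists, XX allows the write
```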
Simon Baatz	6cbc1a31d7	Sentinel: fix regression requiring "+failover" ACL in failover path (#2780 ) Since Valkey Sentinel 9.0, sentinel tries to abort an ongoing failover when changing the role of a monitored instance. Since the result of the command is ignored, the "FAILOVER ABORT" command is sent irrespective of the actual failover status. However, when using the documented pre 9.0 ACLs for a dedicated sentinel user, the FAILOVER command is not allowed and _all_ failover cases fail. (Additionally, the necessary ACL adaptation was not communicated well.) Address this by: - Updating the documentation in "sentinel.conf" to reflect the need for an adapted ACL - Only abort a failover when sentinel detected an ongoing (probably stuck) failover. This means that standard failover and manual failover continue to work with unchanged pre 9.0 ACLs. Only the new "SENTINEL FAILOVER COORDINATED" requires to adapt the ACL on all Valkey nodes. - Actually use a dedicated sentinel user and ACLs when testing standard failover, manual failover, and manual coordinated failover. Fixes #2779 Signed-off-by: Simon Baatz <gmbnomis@gmail.com>	2025-10-31 14:46:53 -04:00
harrylin98	189c69e315	Fix: ltrim should not call signalModifiedKey when no elements are removed (#2787 ) There’s an issue with the LTRIM command. When LTRIM does not actually modify the key — for example, with `LTRIM key 0 -1` — the server.dirty counter is not updated because both ltrim and rtrim values are 0. As a result, the command is not propagated. However, `signalModifiedKey` is still called regardless of whether server.dirty changes. This behavior is unexpected and can cause a mismatch between the source and target during propagation, since the LTRIM command is not sent. Signed-off-by: Harry Lin <harrylhl@amazon.com> Co-authored-by: Harry Lin <harrylhl@amazon.com>	2025-10-31 14:46:36 -04:00
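
The condition discussed in the LTRIM commit above can be sketched as follows. This models the idea only: `ltrim_model` is a hypothetical function, the index normalization follows the documented LTRIM semantics for negative and out-of-range indexes, and the "signal only when something was removed" rule is this sketch's reading of the fix rather than a copy of the server code.

```python
from typing import List, Tuple


def ltrim_model(items: List[str], start: int, stop: int) -> Tuple[List[str], int, bool]:
    """Return (kept_items, removed_count, should_signal_modified)."""
    llen = len(items)
    # Negative indexes count from the end of the list.
    if start < 0:
        start += llen
    if stop < 0:
        stop += llen
    if start < 0:
        start = 0

    if start > stop or start >= llen:
        # Out-of-range request: everything is trimmed away.
        ltrim, rtrim = llen, 0
    else:
        if stop >= llen:
            stop = llen - 1
        ltrim = start             # elements removed from the head
        rtrim = llen - stop - 1   # elements removed from the tail

    kept = items[ltrim:llen - rtrim]
    removed = ltrim + rtrim
    # Only treat the key as modified when elements were actually removed,
    # so the notification stays consistent with what gets propagated.
    return kept, removed, removed > 0


if __name__ == "__main__":
    print(ltrim_model(["a", "b", "c"], 0, -1))  # nothing removed, no signal
    print(ltrim_model(["a", "b", "c"], 1, -1))  # head element removed, signal
```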
Jacob Murphy	43ee46da33	Authenticate slot migration client on source node to internal user (#2785 ) Just setting the authenticated flag actually authenticates to the default user in this case. The default user may be granted no permission to use CLUSTER SYNCSLOTS. Instead, we now authenticate to the NULL/internal user, which grants access to all commands. This is the same as what we do for replication: `864de555ce/src/replication.c (L4717)` Add a test for this case as well. Closes #2783 Signed-off-by: Jacob Murphy <jkmurphy@google.com>	2025-10-31 10:57:05 -07:00
Ricardo Dias	84eb459cd4	Add ValkeyModule_ReplyWithCustomErrorFormat to module API (#2791 ) Note: these changes are part of the effort to run Lua engine as an external scripting engine module. The new function `ValkeyModule_ReplyWithCustomErrorFormat` is being added to the module API to allow scripting engines to return errors that originated from running commands within the script code, without counting twice in the error stats counters. More details on why this is needed by scripting engines can be read in the message of an older commit, `aa856b39f2`. This PR also adds a new test to ensure the correctness of the newly added function. --------- Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>	2025-10-31 16:02:27 +00:00
xbasel	f54818cc60	Bug fix: reset io_last_written on c->buf resize to prevent stale pointers (#2786 ) Fixes an assert crash in _writeToClient(): serverAssert(c->io_last_written.data_len == 0 \|\| c->io_last_written.buf == c->buf); The issue occurs when clientsCronResizeOutputBuffer() grows or reallocates c->buf while io_last_written still points to the old buffer and data_len is non-zero. On the next write, both conditions in the assertion become false. Reset io_last_written when resizing the output buffer to prevent stale pointers and keep state consistent. fixes https://github.com/valkey-io/valkey/issues/2769 Signed-off-by: xbasel <103044017+xbasel@users.noreply.github.com>	2025-10-30 13:51:01 -07:00
Ricardo Dias	864de555ce	Make ValkeyModule_Call compatible with calling commands from scripting engines (#2782 ) Note: these changes are another step towards being able to run Lua engine as an external scripting engine module. In this commit we improve the `ValkeyModule_Call` API function code to match the validations and behavior of the `scriptCall` function, currently used by the Lua engine to run commands using `server.call` Lua Valkey API. The changes made are backward compatible. The new behavior/validations are only enabled when calling `ValkeyModule_Call` while running a script using `EVAL` or `FCALL`. To test these changes, we improved the `HELLO` dummy scripting engine module to support calling commands, and compare the behavior with calling the same command from a Lua script. Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>	2025-10-30 16:19:46 +00:00
Ricardo Dias	ea103da5d6	New INFO section for scripting engines (#2738 ) This commit adds a new `INFO` section called "Scripting Engines" that shows the information of the current scripting engines available in the server. Here's an output example: ``` > info scriptingengines # Scripting Engines engines_count:2 engines_total_used_memory:68608 engines_total_memory_overhead:56 engine_0:name=LUA,module=built-in,abi_version=4,used_memory=68608,memory_overhead=16 engine_1:name=HELLO,module=helloengine,abi_version=4,used_memory=0,memory_overhead=40 ``` --------- Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>	2025-10-30 16:18:01 +00:00
Diego Ciciani	e381182297	Add IPv6 availability check to skip tests when unavailable (#2674 ) Skip IPv6 tests automatically when IPv6 is not available. This fixes the problem that tests fail when IPv6 is not available on the system, which can worry users when they run `make test`. IPv6 availability is detected by opening a dummy server socket and trying to connect to it using a client socket over IPv6. Fixes #2643 --------- Signed-off-by: diego-ciciani01 <diego.ciciani@gmail.com> Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech> Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>	2025-10-30 11:20:56 +01:00
Sarthak Aggarwal	10281becaf	Adds a summary for tests (#2745 ) ``` Test Summary: 100 passed, 2 failed !!! WARNING The following tests failed: ... ``` --------- Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>	2025-10-29 13:36:37 -07:00
Ken	f3b2dee3b7	Add monotonic clock calibration handling if clock speed is not found (#2776 ) Currently, monotonic clock initialization relies on the model name field from /proc/cpuinfo to retrieve the clock speed. However, this is not always present. In case it is not present, measure the clock tick and use it instead. Before fix: ``` monotonic: x86 linux, unable to determine clock rate ``` After fix: ``` 21695:M 25 Oct 2025 20:16:23.168 * monotonic clock: X86 TSC @ 2649 ticks/us ``` Fixes #2774 --------- Signed-off-by: Ken Nam <otherscase@gmail.com> Signed-off-by: Ran Shidlansik <ranshid@amazon.com> Co-authored-by: Ran Shidlansik <ranshid@amazon.com>	2025-10-28 22:20:12 +02:00
Ritoban Dutta	909d082cd0	Reorder valkey.conf: move configs to correct sections (#2737 ) - Moved `server-cpulist`, `bio-cpulist`, `aof-rewrite-cpulist`, `bgsave-cpulist` configurations to ADVANCED CONFIG. - Moved `ignore-warnings` configuration to ADVANCED CONFIG. - Moved `availability-zone` configuration to GENERAL. These configs were incorrectly placed at the end of the file in the ACTIVE DEFRAGMENTATION section. Fixes #2736 --------- Signed-off-by: ritoban23 <ankudutt101@gmail.com>	2025-10-28 10:36:23 +01:00
Sarthak Aggarwal	2c92a6072d	Reverts rdb-key-save-delay value to fix dual channel replication test in macos (#2771 ) Resolves #2696 Set `rdb-key-save-delay` to 200 microseconds to reduce the overall RDB load time. Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>	2025-10-27 13:08:40 -07:00
Zhijun Liao	861d0794b7	Sentinel: Skip IS-PRIMARY-DOWN-BY-ADDR requests when primary not SDOWN (#2763 ) A super tiny change to optimize the function `sentinelAskPrimaryStateToOtherSentinels` to early return when the sentinel does not observe the primary as subjectively down. Signed-off-by: Zhijun <dszhijun@gmail.com>	2025-10-24 16:15:54 -04:00
Zhijun Liao	baf2d572f7	Ensure the server executable exists before running tests (#2762 ) Previously, running ./runtest without src/valkey-server would hang, now it throws an error. Signed-off-by: Zhijun <dszhijun@gmail.com>	2025-10-23 19:29:50 +08:00


13441 Commits All Branches Search
