For nodes.conf files that are corrupted (hand-edited) or recovered from a
program error, check for duplicate node IDs when loading nodes.conf. If a
duplicate is found, a panic is triggered to prevent the node from starting
up unexpectedly.
The node ID identifies every node across the whole cluster, so we do not
expect to find duplicate node IDs in nodes.conf.
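A minimal sketch of the check, with illustrative names rather than the actual cluster.c code: while parsing nodes.conf, look each parsed node ID up among those already loaded and panic on a duplicate.

```c
#include <string.h>

#define NODE_ID_LEN 40 /* cluster node IDs are 40 hex characters */

/* Return 1 if `id` already appears among the `n` IDs loaded so far.
 * A linear scan is shown for brevity; the real code can consult the
 * nodes hash table it is populating. */
int nodeIdSeen(char ids[][NODE_ID_LEN + 1], int n, const char *id) {
    for (int i = 0; i < n; i++)
        if (strncmp(ids[i], id, NODE_ID_LEN) == 0) return 1;
    return 0;
}
```

On a hit, the loader would panic instead of proceeding with an inconsistent configuration.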
Signed-off-by: Binbin <binloveplay1314@qq.com>
Currently, when parsing querybuf, we do not check for CRLF;
instead we assume the last two characters are CRLF,
as shown in the following example:
```
telnet 127.0.0.1 6379
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
*3
$3
set
$3
key
$5
value12
+OK
get key
$5
value
*3
$3
set
$3
key
$5
value12345
+OK
-ERR unknown command '345', with args beginning with:
```
This should actually be considered a protocol error. When a bug
occurs in the client-side implementation, we may execute incorrect
requests (writing incorrect data is the most serious of these).
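A hedged sketch of the stricter check (the function name and shape are illustrative; the real parser lives in networking.c): after reading a bulk string of declared length `len`, verify that the next two bytes really are CRLF instead of skipping them blindly.

```c
#include <stddef.h>

/* Return 1 only if the two bytes after the declared payload are "\r\n".
 * In the example above the client declares $5 but sends "value12", so
 * buf[5] is '1' and the parser should raise a protocol error. */
int bulkTerminatorValid(const char *buf, size_t len) {
    return buf[len] == '\r' && buf[len + 1] == '\n';
}
```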
---------
Signed-off-by: Binbin <binloveplay1314@qq.com>
- Require a 2/3 supermajority vote for all Governance Major Decisions.
- Update Technical Major Decision voting to prioritize simple majority, limiting the use of "+2" approval.
- Define remediation steps for when the 1/3 organization limit is exceeded.
---------
Signed-off-by: Ping Xie <pingxie@outlook.com>
Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
GitHub has deprecated older macOS runners, and macos-13 is no longer supported.
1. The latest version of cross-platform-actions/action allows
running on ubuntu-latest (a Linux runner) and does not strictly require macOS.
2. Previously, cross-platform-actions/action@v0.22.0 used runs-on:
macos-13. I checked the latest version of cross-platform-actions, and
the official examples now use runs-on: ubuntu. I think we can switch from macOS to Ubuntu.
---------
Signed-off-by: Vitah Lin <vitahlin@gmail.com>
This reverts commit 2da21d9def.
The implementation in #2366 made it possible to perform a partial
resynchronization after loading an AOF file, by storing the replication
offset in the AOF preamble and counting the bytes in the AOF command
stream in the same way as we count byte offset in a replication stream.
However, this approach isn't safe because some commands are replicated
but not stored in the AOF file. This includes the commands REPLCONF,
PING, PUBLISH and module-implemented commands where a module can control
to propagate a command to the replication stream only, to AOF only or to
both. This oversight led to data inconsistency, where the wrong
replication offset is used for partial resynchronization as explained in
issue #2904.
The revert caused small merge conflicts with 3c3a1966ec which are
solved.
Fixes #2904.
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
The CLUSTER SLOTS reply depends on whether the client is connected over
IPv6, but for a fake client there is no connection and when this command
is called from a module timer callback or other scenario where no real
client is involved, there is no connection to check IPv6 support on.
This fix handles the missing case by returning the reply for IPv4
connected clients.
Fixes #2912.
---------
Signed-off-by: Su Ko <rhtn1128@gmail.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Su Ko <rhtn1128@gmail.com>
Co-authored-by: KarthikSubbarao <karthikrs2021@gmail.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
In #2078, we did not report large replies when copy avoidance is allowed.
This results in replies larger than 16384 not being recorded in the
commandlog large-reply log. This 16384 threshold is controlled by the
hidden config min-string-size-avoid-copy-reply.
Signed-off-by: Binbin <binloveplay1314@qq.com>
We have the same settings for the hard limit, and we should apply them to the soft
limit as well. When the `repl-backlog-size` value is larger, all replication buffers
can be handled by the replication backlog, so there's no need to worry about the
client output buffer soft limit here. Furthermore, when `soft_seconds` is 0,
the soft limit behaves in some ways (mostly) the same as the hard limit.
Signed-off-by: Binbin <binloveplay1314@qq.com>
## Problem
IO thread shutdown can deadlock during server panic when the main thread
calls `pthread_cancel()` while the IO thread holds its mutex, preventing
the thread from observing the cancellation.
## Solution
Release the IO thread mutex before cancelling to ensure clean thread
termination.
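The fix can be sketched in miniature; `ioThreadMain` and `shutdownIOThread` below are illustrative stand-ins, not the actual Valkey functions:

```c
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t io_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the IO thread loop: takes and releases the shared mutex,
 * then sleeps (a cancellation point). */
static void *ioThreadMain(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&io_mutex);
        pthread_mutex_unlock(&io_mutex);
        usleep(1000); /* deferred cancellation can take effect here */
    }
    return NULL;
}

/* The fix in miniature: unlock before cancelling so the IO thread can
 * reach a cancellation point and terminate cleanly, instead of staying
 * blocked on the mutex forever. */
int shutdownIOThread(pthread_t tid) {
    pthread_mutex_unlock(&io_mutex);
    if (pthread_cancel(tid) != 0) return -1;
    return pthread_join(tid, NULL);
}
```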
## Testing
Reproducer:
```bash
./src/valkey-server --io-threads 2 --enable-debug-command yes
./src/valkey-cli debug panic
```
Before: Server hangs indefinitely
After: Server terminates cleanly
Signed-off-by: Ouri Half <ourih@amazon.com>
This adds the workflow improvements for the PR and Release benchmarks,
which run on `c8g.metal-48xl` for `ARM64` and `c7i.metal-48xl` for `X86`.
```
Cluster mode: disabled
TLS: disabled
io-threads: 1, 9
Pipelining: 1, 10
Clients: 1600
Benchmark Threads: 90
Data size: 16, 96
Commands: SET, GET
```
c8g.metal-48xl Spec: https://aws.amazon.com/ec2/instance-types/c8g/
c7i.metal-48xl Spec: https://aws.amazon.com/ec2/instance-types/c7i/
```
vCPU: 192
NUMA nodes: 2
Memory (GiB): 384
Network Bandwidth (Gbps): 50
```
PR benchmarking will be executed on **ARM64** machine as it has been
seen to be more consistent.
Additionally, it runs 5 iterations for each test and posts the average
and other statistical metrics like
- CI99%: 99% Confidence Interval - range where the true population mean
is likely to fall
- PI99%: 99% Prediction Interval - range where a single future
observation is likely to fall
- CV: Coefficient of Variation - relative variability (σ/μ × 100%)
_Note: Values with (n=X, σ=Y, CV=Z%, CI99%=±W%, PI99%=±V%) indicate
averages from X runs with standard deviation Y, coefficient of variation
Z%, 99% confidence interval margin of error ±W% of the mean, and 99%
prediction interval margin of error ±V% of the mean. CI bounds [A, B]
and PI bounds [C, D] show the actual interval ranges._
For comparing between versions, it adds a workflow which runs on both
**ARM64** and **X86** machine. It will also post the comparison between
the versions like this:
https://github.com/valkey-io/valkey/issues/2580#issuecomment-3399539615
---------
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
Signed-off-by: Roshan Khatri <117414976+roshkhatri@users.noreply.github.com>
This makes it safe to delete a hashtable while a safe iterator is
iterating it. This currently isn't possible, but the improvement is
required for fork-less replication (#1754), which is being actively
worked on.
We discussed these issues in #2611 which guards against a different but
related issue: calling hashtableNext again after it has already returned
false.
I implemented a singly linked list that hashtable uses to track its
current safe iterators. It is used to invalidate all associated safe
iterators when the hashtable is released. A singly linked list is
acceptable because the list length is always very small - typically zero
and no more than a handful.
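The shape of the tracking list can be sketched as follows; the names are simplified illustrations, not the real hashtable API:

```c
#include <stddef.h>

/* Each hashtable keeps a singly linked list of its live safe iterators;
 * releasing the table walks the list and invalidates each one. */
typedef struct safeIter {
    struct safeIter *next;
    int valid; /* cleared when the underlying table is released */
} safeIter;

typedef struct {
    safeIter *safe_iters; /* head of the list, usually empty or tiny */
} table;

void registerSafeIter(table *t, safeIter *it) {
    it->valid = 1;
    it->next = t->safe_iters;
    t->safe_iters = it;
}

/* Called when the table is released, so iteration can stop cleanly. */
void invalidateSafeIters(table *t) {
    for (safeIter *it = t->safe_iters; it; it = it->next) it->valid = 0;
    t->safe_iters = NULL;
}
```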
Also, renames the internal functions:
hashtableReinitIterator -> hashtableRetargetIterator
hashtableResetIterator -> hashtableCleanupIterator
---------
Signed-off-by: Rain Valentine <rsg000@gmail.com>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
This prevents crashes on the older nodes in mixed clusters where some
nodes are running 8.0 or older. Mixed clusters often exist temporarily
during rolling upgrades.
Fixes: #2341
Signed-off-by: Harkrishn Patro <harkrisp@amazon.com>
Fixes #2792
Replace strcmp with byte-by-byte comparison to avoid accidental
heap-buffer-overflow errors.
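The idea in miniature, as an illustrative helper rather than the exact patch: compare a bounded number of bytes and never rely on a NUL terminator, so a non-terminated heap buffer is never overread.

```c
#include <stddef.h>

/* Return 1 if the first `len` bytes of a and b are equal. Unlike strcmp,
 * this never reads past `len`, so it is safe on buffers that are not
 * NUL-terminated. */
int bytesEqual(const char *a, const char *b, size_t len) {
    for (size_t i = 0; i < len; i++)
        if (a[i] != b[i]) return 0;
    return 1;
}
```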
Signed-off-by: murad shahmammadli <shmurad@amazon.com>
Co-authored-by: murad shahmammadli <shmurad@amazon.com>
Fixes: https://github.com/valkey-io/valkey/issues/2859
Increased the warmup to 2 seconds so we can verify that it runs more
commands than the actual benchmark.
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
General cleanup on LRU/LFU code. Improve modularity and maintainability.
Specifically:
* Consolidates the mathematical logic for LRU/LFU into `lrulfu.c`, with
an API in `lrulfu.h`. Knowledge of the LRU/LFU implementation was
previously spread out across `db.c`, `evict.c`, `object.c`, `server.c`,
and `server.h`.
* Separates knowledge of the LRU from knowledge of the object containing
the LRU value. `lrulfu.c` knows about the LRU/LFU algorithms, without
knowing about the `robj`. `object.c` knows about the `robj` without
knowing about the details of the LRU/LFU algorithms.
* Eliminated `server.lruclock`, instead using `server.unixtime`. This
also eliminates the periodic need to call `mstime()` to maintain the lru
clock.
* Fixed a minor computation bug in the old `LFUTimeElapsed` function
(off by 1 after rollover).
* Eliminate specific IF checks for rollover, using defined behavior for
unsigned rollover instead.
* Fixed a bug in `debug.c` which would perform LFU modification on an
LRU value.
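The rollover point can be shown in isolation, with an illustrative field width (the real LRU clock is a bitfield inside the robj):

```c
#include <stdint.h>

/* Unsigned wrap-around makes explicit rollover checks unnecessary:
 * now - then is correct modulo 2^16 even after the clock has wrapped,
 * because unsigned overflow is defined behavior in C. */
uint16_t lruElapsed(uint16_t now, uint16_t then) {
    return (uint16_t)(now - then);
}
```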
---------
Signed-off-by: Jim Brunner <brunnerj@amazon.com>
Co-authored-by: Ran Shidlansik <ranshid@amazon.com>
Closes #2883
Support a new environment variable `VALKEY_PROG_SUFFIX` in the test
framework, which can be used for running tests if the binaries are
compiled with a program suffix. For example, if the binaries are
compiled using `make PROG_SUFFIX=-alt` to produce binaries named
valkey-server-alt, valkey-cli-alt, etc., run the tests against these
binaries using `VALKEY_PROG_SUFFIX=-alt ./runtest` or simply using `make
test`.
Now the test with the make variable `PROG_SUFFIX` works well.
```
% make PROG_SUFFIX="-alt"
...
...
CC trace/trace_aof.o
LINK valkey-server-alt
INSTALL valkey-sentinel-alt
CC valkey-cli.o
CC serverassert.o
CC cli_common.o
CC cli_commands.o
LINK valkey-cli-alt
CC valkey-benchmark.o
LINK valkey-benchmark-alt
INSTALL valkey-check-rdb-alt
INSTALL valkey-check-aof-alt
Hint: It's a good idea to run 'make test' ;)
%
% make test
cd src && /Library/Developer/CommandLineTools/usr/bin/make test
CC Makefile.dep
CC release.o
LINK valkey-server-alt
INSTALL valkey-check-aof-alt
INSTALL valkey-check-rdb-alt
LINK valkey-cli-alt
LINK valkey-benchmark-alt
Cleanup: may take some time... OK
Starting test server at port 21079
[ready]: 39435
Testing unit/pubsub
```
Signed-off-by: Zhijun <dszhijun@gmail.com>
Historically, Valkey’s TCL test suite expected all binaries
(src/valkey-server, src/valkey-cli, src/valkey-benchmark, etc.) to exist
under the src/ directory. This PR enables Valkey TCL tests to run
seamlessly after a CMake build — no manual symlinks or make build
required.
The test framework accepts a new environment variable `VALKEY_BIN_DIR`
to look for the binaries.
CMake will copy all TCL test entrypoints (runtest, runtest-cluster,
etc.) into the CMake build dir (e.g. `cmake-build-debug`) and insert
`VALKEY_BIN_DIR` into these. Now we can run all tests either with
./cmake-build-debug/runtest from the project root or with ./runtest
from the CMake dir.
A new CMake post-build target prints a friendly reminder after
successful builds, guiding developers on how to run tests with their
CMake binaries:
```
Hint: It is a good idea to run tests with your CMake-built binaries ;)
./cmake-build-debug/runtest
Build finished
```
A helper TCL script `tests/support/set_executable_path.tcl` is added to
support this change, which gets called by all test entrypoints:
`runtest`, `runtest-cluster`, `runtest-sentinel`.
---------
Signed-off-by: Zhijun <dszhijun@gmail.com>
Although we can infer this information from the replica logs,
I think it would still be useful to see it directly in the primary logs.
Now we can see the psync offset range when psync fails, and then
we can analyze and adjust the value of repl-backlog-size.
Signed-off-by: Binbin <binloveplay1314@qq.com>
Persist USE_FAST_FLOAT and PROG_SUFFIX to prevent a complete rebuild
the next time someone types make or make test without specifying the variables.
Fixes #2880
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
fedorarawhide CI reports these warnings:
```
networking.c: In function 'afterErrorReply':
networking.c:821:30: error: initialization discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
821 | char *spaceloc = memchr(s, ' ', len < 32 ? len : 32);
```
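The fix is to keep the const qualifier on the result; a minimal illustration of the corrected pattern:

```c
#include <string.h>
#include <stddef.h>

/* memchr over a const buffer yields a pointer into that buffer, so the
 * result must also be const-qualified to satisfy -Werror=discarded-qualifiers. */
const char *findSpace(const char *s, size_t len) {
    const char *spaceloc = memchr(s, ' ', len < 32 ? len : 32);
    return spaceloc;
}
```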
Signed-off-by: Binbin <binloveplay1314@qq.com>
In functions `clusterSendFailoverAuthIfNeeded` and
`clusterProcessPacket`, we iterate through **every slot bit**
sequentially in the form of `for (int j = 0; j < CLUSTER_SLOTS; j++)`,
performing 16384 checks even when only a few bits were set, and thus
causing unnecessary loop overhead.
This is particularly wasteful in function
`clusterSendFailoverAuthIfNeeded` where we need to ensure the sender's
claimed slots all have an up-to-date config epoch. Healthy senders
usually meet this condition, so we normally end up exhausting the full
loop of 16384 checks.
The proposed new implementation loads 64 bits (8 byte word) at a time
and skips empty words completely, therefore only performing 256 checks.
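The word-at-a-time idea can be sketched as follows; this is a simplified illustration, not the actual cluster.c patch:

```c
#include <stdint.h>
#include <string.h>

#define CLUSTER_SLOTS 16384

/* Load the 16384-bit slot bitmap as 256 64-bit words and skip zero
 * words entirely, instead of testing all 16384 bits one by one. */
int countClaimedSlots(const unsigned char *slots) {
    int count = 0;
    for (int w = 0; w < CLUSTER_SLOTS / 64; w++) {
        uint64_t word;
        memcpy(&word, slots + w * 8, 8); /* alignment-safe load */
        if (word == 0) continue;         /* empty word: one check, not 64 */
        while (word) {
            count++;
            word &= word - 1; /* clear the lowest set bit */
        }
    }
    return count;
}
```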
---------
Signed-off-by: Zhijun <dszhijun@gmail.com>
When we added the Hash Field Expiration feature in Valkey 9.0, some of
the new command docs included a complexity description of O(1) even
though they accept multiple arguments.
(see discussion in
https://github.com/valkey-io/valkey/pull/2851#discussion_r2535684985)
This PR does:
1. Align all the commands to the same description
2. Fix the complexity description of some commands (e.g. HSETEX and
HGETEX)
---------
Signed-off-by: Ran Shidlansik <ranshid@amazon.com>
Enhance debugging for cluster logs
[1] Add human node names in cluster tests so that we can easily
understand which nodes we are interacting with:
```
pong packet received from: 92a960ffd62f1bd04efeb260b30fe9ca6b9294ed (R4) from client: :0
node 92a960ffd62f1bd04efeb260b30fe9ca6b9294ed (R4) announces that it is a primary in shard c6d1152caee49a5e70cb4b77d1549386078be603
Reconfiguring node 92a960ffd62f1bd04efeb260b30fe9ca6b9294ed (R4) as primary for shard c6d1152caee49a5e70cb4b77d1549386078be603
Configuration change detected. Reconfiguring myself as a replica of node 92a960ffd62f1bd04efeb260b30fe9ca6b9294ed (R4) in shard c6d1152caee49a5e70cb4b77d1549386078be603
```
[2] Currently there are logs showing the address of incoming
connections:
```
Accepting cluster node connection from 127.0.0.1:59956
Accepting cluster node connection from 127.0.0.1:59957
Accepting cluster node connection from 127.0.0.1:59958
Accepting cluster node connection from 127.0.0.1:59959
```
but we have no idea which nodes these connections refer to. I added a
logging statement when the node is set to the inbound link connection.
```
Bound cluster node 92a960ffd62f1bd04efeb260b30fe9ca6b9294ed (R4) to connection of client 127.0.0.1:59956
```
[3] Add a debug log when processing a packet to show the packet type,
sender node name, and sender client port (this also has the benefit of
telling us whether this is an inbound or outbound link).
```
pong packet received from: 92a960ffd62f1bd04efeb260b30fe9ca6b9294ed (R4) from client: :0
ping packet received from: 92a960ffd62f1bd04efeb260b30fe9ca6b9294ed (R4) from client: 127.0.0.1:59973
fail packet received from: 92a960ffd62f1bd04efeb260b30fe9ca6b9294ed (R4) from client: 127.0.0.1:59973
auth-req packet received from: 92a960ffd62f1bd04efeb260b30fe9ca6b9294ed (R4) from client: 127.0.0.1:59973
```
---------
Signed-off-by: Zhijun <dszhijun@gmail.com>
Only enable `HAVE_ARM_NEON` on AArch64 because it supports vaddvq and
all needed compiler intrinsics.
Fixes the following error when building for machine `qemuarm` using the
Yocto Project and OpenEmbedded:
```
| bitops.c: In function 'popcountNEON':
| bitops.c:219:23: error: implicit declaration of function 'vaddvq_u16'; did you mean 'vaddq_u16'? [-Wimplicit-function-declaration]
| 219 | uint32_t t1 = vaddvq_u16(sc);
| | ^~~~~~~~~~
| | vaddq_u16
| bitops.c:225:14: error: implicit declaration of function 'vaddvq_u8'; did you mean 'vaddq_u8'? [-Wimplicit-function-declaration]
| 225 | t += vaddvq_u8(vcntq_u8(vld1q_u8(p)));
| | ^~~~~~~~~
| | vaddq_u8
```
More details are available in the following log:
https://errors.yoctoproject.org/Errors/Details/889836/
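The stricter gate amounts to requiring AArch64, roughly along these lines (a sketch, not the exact build-config change):

```c
/* The vaddvq_* horizontal-add intrinsics are AArch64-only, so gate on
 * __aarch64__ rather than just __ARM_NEON, which 32-bit ARM targets
 * like qemuarm also define. */
#if defined(__aarch64__) && defined(__ARM_NEON)
#define HAVE_ARM_NEON 1
#else
#define HAVE_ARM_NEON 0
#endif
```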
Signed-off-by: Leon Anavi <leon.anavi@konsulko.com>
## Problem
When executing `FLUSHALL ASYNC` on a **writable replica** that has
a large number of expired keys directly written to it, the main thread
gets blocked for an extended period while synchronously releasing the
`replicaKeysWithExpire` dictionary.
## Root Cause
`FLUSHALL ASYNC` is designed for asynchronous lazy freeing of core data
structures, but the release of `replicaKeysWithExpire` (a dictionary tracking
expired keys on replicas) still happens synchronously in the main thread.
This synchronous operation becomes a bottleneck when dealing with massive
key volumes, as it cannot be offloaded to the lazyfree background thread.
This PR addresses the issue by moving the release of `replicaKeysWithExpire`
to the lazyfree background thread, aligning it with the asynchronous design
of `FLUSHALL ASYNC` and eliminating main thread blocking.
## User scenarios
In some operations, people often need to do primary-replica switches.
One goal is to avoid noticeable impact on the business—like key loss
or reduced availability (e.g., write failures).
Here is the process: First, temporarily switch traffic to writable replicas.
Then we wait for the primary's pending replication data to be fully synced
(so primary and replicas are in sync) before finishing the switch. We don't
usually need to do the flush in this case, but it's an optimization that can
be done.
Signed-off-by: Scut-Corgis <512141203@qq.com>
Signed-off-by: jiegang0219 <512141203@qq.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
After introducing the dual channel replication in #60, we decided in #915
not to add a new configuration item to limit the replica's local replication
buffer, just use "client-output-buffer-limit replica hard" to limit it.
We need to document this behavior and mention that once the limit is reached,
all future data will accumulate on the primary side.
Signed-off-by: Binbin <binloveplay1314@qq.com>
In dual channel replication, when the rdb channel client finishes
the RDB transfer, it enters the REPLICA_STATE_RDB_TRANSMITTED
state. During this time, there is a brief window in which we are
not able to see the connection in INFO REPLICATION.
In the worst case, we might not see the connection for
DEFAULT_WAIT_BEFORE_RDB_CLIENT_FREE seconds. There is no harm in
listing this state; showing connected_slaves while not showing the
connection is bad when troubleshooting.
Note that this also affects the `valkey-cli --rdb` and `--functions-rdb`
options. Until the client in the `rdb_transmitted` state is released,
we will now see it in the info (see the example later).
Before, not showing the replica info
```
role:master
connected_slaves:1
```
After, for dual channel replication:
```
role:master
connected_slaves:1
slave0:ip=xxx,port=xxx,state=rdb_transmitted,offset=0,lag=0,type=rdb-channel
```
After, for valkey-cli --rdb-only and --functions-rdb:
```
role:master
connected_slaves:1
slave0:ip=xxx,port=xxx,state=rdb_transmitted,offset=0,lag=0,type=replica
```
Signed-off-by: Binbin <binloveplay1314@qq.com>
This commit adds script function flags to the module API, which allows
function scripts to specify the function flags programmatically.
When the scripting engine compiles the script code, it can extract the
flags and set them on the compiled function objects.
---------
Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>
The original test code only checks:
1. wait_for_cluster_size 4, which calls cluster_size_consistent for every node.
Inside that function, for each node, cluster_size_consistent queries cluster_known_nodes,
which is calculated as (unsigned long long)dictSize(server.cluster->nodes). However, when
a new node is added to the cluster, it is first created in the HANDSHAKE state, and
clusterAddNode adds it to the nodes hash table. Therefore, it is possible for the new
node to still be in HANDSHAKE status (processed asynchronously) even though it appears
that all nodes “know” there are 4 nodes in the cluster.
2. cluster_state for every node; but when a new node is added, server.cluster->state remains FAIL.
Some handshake processes may not have completed yet, which likely causes the flakiness.
To address this, we added a --cluster check to ensure that the config state is consistent.
Fixes #2693.
Signed-off-by: Hanxi Zhang <hanxizh@amazon.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Fix a small miss in the "Hash field TTL and active expiry propagates
correctly through chain replication" test in `hashexpire.tcl`.
The test did not wait for the initial sync of the chained replica, which made the test flaky.
Signed-off-by: Arad Zilberstein <aradz@amazon.com>
Addresses: https://github.com/valkey-io/valkey/issues/2588
## Overview
Previously we call `emptyData()` during a fullSync before validating the
RDB version is compatible.
This change adds an rdb flag that allows us to flush the database from
within `rdbLoadRioWithLoadingCtx`. This provides the option to only
flush the data if the rdb has a valid version and signature. In the case
where we have an invalid version or signature, we don't call emptyData,
so if a full sync fails for that reason a replica can still serve stale
data instead of clients experiencing cache misses.
## Changes
- Added a new flag `RDBFLAGS_EMPTY_DATA` that signals to flush the
database after rdb validation
- Added logic to call `emptyData` in `rdbLoadRioWithLoadingCtx` in
`rdb.c`
- Added logic to not clear data if the RDB validation fails in
`replication.c` using new return type `RDB_INCOMPATIBLE`
- Modified the signature of `rdbLoadRioWithLoadingCtx` to return RDB
success codes and updated all calling sites.
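A hedged sketch of the resulting control flow; the flag's bit value and the helper function are illustrative, and only the names `RDBFLAGS_EMPTY_DATA` and `RDB_INCOMPATIBLE` come from the description above:

```c
#define RDBFLAGS_EMPTY_DATA (1 << 0) /* illustrative bit value */

typedef enum { RDB_OK = 0, RDB_INCOMPATIBLE } rdb_status;

/* Validate first; only flush the existing data once the version and
 * signature check out, so a failed full sync keeps the stale data. */
rdb_status rdbLoadSketch(int version_ok, int flags, int *flushed) {
    *flushed = 0;
    if (!version_ok) return RDB_INCOMPATIBLE; /* keep serving stale data */
    if (flags & RDBFLAGS_EMPTY_DATA) *flushed = 1; /* emptyData() runs here */
    return RDB_OK;
}
```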
## Testing
Added a tcl test that uses the debug command `reload nosave` to load
from an RDB that has a future version number. This triggers the same
code path that full syncs use, and verifies that we don't flush
the data until after the validation is complete.
A test already exists that checks that the data is flushed:
https://github.com/valkey-io/valkey/blob/unstable/tests/integration/replication.tcl#L1504
---------
Signed-off-by: Venkat Pamulapati <pamuvenk@amazon.com>
Signed-off-by: Venkat Pamulapati <33398322+ChiliPaneer@users.noreply.github.com>
Co-authored-by: Venkat Pamulapati <pamuvenk@amazon.com>
Co-authored-by: Harkrishn Patro <bunty.hari@gmail.com>
Test the SCAN consistency by alternating SCAN
calls to primary and replica.
We cannot rely on the exact order of the elements and the returned
cursor number.
---------
Signed-off-by: yzc-yzc <96833212+yzc-yzc@users.noreply.github.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
By default, when the number of elements in a zset exceeds 128, the
underlying data structure adopts a skiplist. We can reduce memory usage
by embedding elements into the skiplist nodes. Change the `zskiplistNode`
memory layout as follows:
```
Before
+-------------+
+-----> | element-sds |
| +-------------+
|
+------------------+-------+------------------+---------+-----+---------+
| element--pointer | score | backward-pointer | level-0 | ... | level-N |
+------------------+-------+------------------+---------+-----+---------+
After
+-------+------------------+---------+-----+---------+-------------+
+ score | backward-pointer | level-0 | ... | level-N | element-sds |
+-------+------------------+---------+-----+---------+-------------+
```
Before the embedded SDS string, we include one byte representing
the size of the SDS header, i.e. the offset into the SDS representation
where the actual string starts.
The memory saving is therefore one pointer minus one byte = 7 bytes per
element, regardless of other factors such as element size or number of
elements.
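The new layout can be sketched as a struct; field names are simplified from the real `zskiplistNode`:

```c
#include <stddef.h>

/* Illustrative shape of the embedded layout: no element pointer; the
 * SDS header-size byte and the embedded SDS string are appended in
 * memory after the last level[] entry. */
struct zslNodeEmbedded {
    double score;
    struct zslNodeEmbedded *backward;
    struct {
        struct zslNodeEmbedded *forward;
        unsigned long span;
    } level[]; /* flexible array member; element bytes follow level[N] */
};

/* Per-element saving: one dropped 8-byte pointer minus one added
 * header-size byte = 7 bytes on 64-bit platforms. */
int perElementSaving(void) { return (int)sizeof(void *) - 1; }
```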
### Benchmark step
I generated the test data using the following lua script && cli command.
And check memory usage using the `info` command.
**lua script**
```
local start_idx = tonumber(ARGV[1])
local end_idx = tonumber(ARGV[2])
local elem_count = tonumber(ARGV[3])
for i = start_idx, end_idx do
local key = "zset:" .. string.format("%012d", i)
local members = {}
for j = 0, elem_count - 1 do
table.insert(members, j)
table.insert(members, "member:" .. j)
end
redis.call("ZADD", key, unpack(members))
end
return "OK: Created " .. (end_idx - start_idx + 1) .. " zsets"
```
**valkey-cli command**
`valkey-cli EVAL "$(cat create_zsets.lua)" 0 0 100000
${ZSET_ELEMENT_NUM}`
### Benchmark result
| number of elements in a zset | memory usage before optimization | memory usage after optimization | change |
|-------|-------|-------|-------|
| 129 | 1047MB | 943MB | -9.9% |
| 256 | 2010MB | 1803MB | -10.3% |
| 512 | 3904MB | 3483MB | -10.8% |
---------
Signed-off-by: chzhoo <czawyx@163.com>
Co-authored-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
We need to verify the total duration was at least 2 seconds; elapsed time
can be quite variable, so we don't check an upper bound.
Fixes https://github.com/valkey-io/valkey/issues/2843
Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
After #2829, valgrind reported a test failure; it seems that the time is
not enough to generate a COB limit under valgrind.
Signed-off-by: Binbin <binloveplay1314@qq.com>
PSYNC_FULLRESYNC_DUAL_CHANNEL is also a full sync, and as the comment says,
we need to allow it. We have not yet identified the exact edge case
that leads to this line, but during a failover there should be no difference
between different sync strategies.
Signed-off-by: Binbin <binloveplay1314@qq.com>
Related test failures:
https://github.com/valkey-io/valkey/actions/runs/19282092345/job/55135193394
https://github.com/valkey-io/valkey/actions/runs/19200556305/job/54887767594
> *** [err]: scan family consistency with configured hash seed in
tests/integration/scan-family-consistency.tcl
> Expected '5 {k:1 k:25 z k:11 k:18 k:27 k:45 k:7 k:12 k:19 k:29 k:40
k:41 k:43}' to be equal to '5 {k:1 k:25 k:11 z k:18 k:27 k:45 k:7 k:12
k:19 k:29 k:40 k:41 k:43}' (context: type eval line 26 cmd {assert_equal
$primary_cursor_next $replica_cursor_next} proc ::start_server)
The reason is that the RDB part of the primary-replica synchronization
affects the resize policy of the hashtable.
See
b835463a73/src/server.c (L807-L818)
Signed-off-by: yzc-yzc <96833212+yzc-yzc@users.noreply.github.com>
Consider this scenario:
1. Replica starts loading the RDB using the rdb connection
2. Replica finishes loading the RDB before the replica main connection has
initiated the PSYNC request
3. Replica stops replicating after receiving replicaof no one
4. Primary can't know that the replica main connection will never ask for
PSYNC, so it keeps the reference to the replica's replication buffer block
5. Primary has a shutdown-timeout configured and is required to wait for the rdb
connection to close before it can shut down.
The current 60-second wait time (DEFAULT_WAIT_BEFORE_RDB_CLIENT_FREE) is excessive
and leads to prolonged resource retention in edge cases. Reducing this timeout to
5 seconds would provide adequate time for legitimate PSYNC requests while mitigating
the issue described above.
Signed-off-by: Binbin <binloveplay1314@qq.com>
This commit fixes the cluster slot stats for scripts executed by
scripting engines when the scripts access cross-slot keys.
This was not a bug in the Lua scripting engine; rather, `VM_Call` was
missing a call to invalidate the script caller's client slot to prevent
the accumulation of stats.
Signed-off-by: Ricardo Dias <ricardo.dias@percona.com>
It's handy to be able to automatically do a warmup and/or test by
duration rather than request count. 🙂
I changed the real-time output a bit - not sure if that's wanted or not.
(Like, would it break people's weird scripts? It'll break my weird
scripts, but I know the price of writing weird fragile scripts.)
```
Prepended "Warming up " when in warmup phase:
Warming up SET: rps=69211.2 (overall: 69747.5) avg_msec=0.425 (overall: 0.393) 3.8 seconds
^^^^^^^^^^
Appended running request counter when based on -n:
SET: rps=70892.0 (overall: 69878.1) avg_msec=0.385 (overall: 0.398) 612482 requests
^^^^^^^^^^^^^^^
Appended running second counter when in warmup or based on --duration:
SET: rps=61508.0 (overall: 61764.2) avg_msec=0.430 (overall: 0.426) 4.8 seconds
^^^^^^^^^^^
```
To be clear, the report at the end remains unchanged.
---------
Signed-off-by: Rain Valentine <rsg000@gmail.com>
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
This PR fixes the freebsd daily job that has been failing consistently
for the last few days with the error "pkg: No packages available to install
matching 'lang/tclx' have been found in the repositories".
The package name is corrected from `lang/tclx` to `lang/tclX`. The
lowercase version worked previously but appears to have stopped working
in an update of freebsd's pkg tool to 2.4.x.
Example of failed job:
https://github.com/valkey-io/valkey/actions/runs/19282092345/job/55135193499
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
As we can see, we expected to get hexpired, but we got hexpire instead;
this means that the expiration time expired during execution.
```
*** [err]: HGETEX EXAT keyspace notifications for active expiry in tests/unit/hashexpire.tcl
Expected 'pmessage __keyevent@* __keyevent@9__:hexpired myhash' to match 'pmessage __keyevent@* __keyevent@*:hexpire myhash'
```
We should remove the EXAT and PXAT from these fixtures. And we indeed
have dedicated tests that verify that we get 'expired' when EX,PX are
set to 0 or EXAT,PXAT are in the past.
Signed-off-by: Binbin <binloveplay1314@qq.com>