/r/redis
All about the Redis key-value store
Redis is a persistent data structure server operating on the key/value model, where values can be hashes, lists, sets, or sorted sets.
Hi all,
I'm running Immich in Docker on a VPS with external block storage. It has four containers: server, Postgres, Redis, and machine learning.
A week or so ago, I noticed that the server was not accepting uploads or logins, and on top of that the web portal does not resolve.
Investigation found all containers are 'healthy' but the server container has this error in the logs.
ReplyError: NOAUTH Authentication required. at parseError (/usr/src/app/node_modules/redis-parser/lib/parser.js:179:12) at parseType (/usr/src/app/node_modules/redis-parser/lib/parser.js:302:14) { command: { name: 'info', args: [] } }
I can see it's an authentication error with Redis, but I'm not sure how to fix it.
Any ideas would be greatly appreciated.
Thanks S
Hi y'all!
I'm working on an IoT solution in which we want to improve reliability and speed, and thought that maybe Redis was the kind of DB that might fit our case.
So, for context:
We have a bunch (roughly 1,500 to 2,000) of IoT devices, which are fully featured embedded Linux devices. Each one has about 6 GB of RAM and 64 GB of disk space, with a decent CPU and GPU.
Right now there are some Docker containers on each device making requests to a cloud backend, but some things are cached in a local DB for faster access. That DB is Mongo, with a synchronization service that's soon to be deprecated. We need this approach to make the solution more reliable, since it lets us offer an offline experience on the same device in case of connection loss.
So I was considering moving to Redis to replace that internal DB, since it seems to be far less memory hungry and it's intended for distributed usage, so it has the means to synchronize against a master. That master in our case could be on-premises or cloud based.
Thank you all for reading and shedding some light into this matter!
I'm building an e-commerce app and want to implement a lightning-fast, scalable product search feature. I'm working with MongoDB as the database, and each product document has fields like `productId`, `title`, `description`, `price`, `images`, `inventory_quantity`, and more (sample document below). For search, I'd primarily focus on the `title`, and potentially the `description` if it doesn't compromise speed too much.
Here is a simple document:
The goal is to make the search feature ultrafast and highly relevant, handling high volumes and returning accurate results in milliseconds. Here are some key requirements:
[...] `title`, and ideally `description` if it doesn't slow things down significantly.
[...] `title` and `description`) to facilitate faster searches, as I've heard this is a technique often used in search systems.
Questions:
Any experiences, insights, or suggestions (technical details especially welcome!) are greatly appreciated. Thank you!
I'm new to Redis and wondering if it would be a good fit for something I'm working on.
I have a form on a client-facing site that's collecting data (maybe a dozen fields) from users (maybe 1000 or so). Our internal system can query that data through a REST API for display, but each API call is pretty slow (a few seconds).
I was thinking about caching the data after a call to the API and then having any new form submissions trigger the cache to clear.
Is this a common use case? And is that a reasonable amount of data to store?
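The plan described above (cache the API response, clear it on every new submission) is usually called cache-aside with explicit invalidation, and a thousand records of a dozen fields is a small payload for Redis. Here is a minimal sketch of that flow. `DictCache` is a hypothetical in-memory stand-in so the example runs without a server; the key name `form:submissions`, the one-hour TTL, and the `fetch_from_api` callable are all assumptions. A redis-py client offers `get`, `set(..., ex=...)`, and `delete` with the same shape.

```python
import json
import time
from typing import Any, Callable, Dict, Optional, Tuple


class DictCache:
    """In-memory stand-in for the few Redis calls used below (get, set with ex, delete)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # entry has expired
            return None
        return value

    def set(self, key: str, value: str, ex: Optional[int] = None) -> None:
        expires_at = time.monotonic() + ex if ex is not None else None
        self._data[key] = (value, expires_at)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


CACHE_KEY = "form:submissions"  # hypothetical key name


def get_submissions(cache, fetch_from_api: Callable[[], Any], ttl_seconds: int = 3600) -> Any:
    """Return cached data if present; otherwise call the slow API and cache the result."""
    cached = cache.get(CACHE_KEY)
    if cached is not None:
        return json.loads(cached)
    data = fetch_from_api()
    # The TTL is a safety net in case an invalidation is ever missed.
    cache.set(CACHE_KEY, json.dumps(data), ex=ttl_seconds)
    return data


def on_form_submitted(cache) -> None:
    """Drop the cached copy so the next read goes back to the API."""
    cache.delete(CACHE_KEY)
```

With this shape, only the first read after a submission pays the multi-second API cost; every other read is served from the cache.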
Hi all,
I am trying to upgrade Redis from 6.2 on Rocky Linux 8 to 7.2 on Rocky Linux 9. I managed to do almost everything, but the new slaves are in a disconnected state and I can't figure out why.
So this is how I did it:
I thought that should do it, but when I tried to fail over I got (error) NOGOODSLAVE No suitable replica to promote
After some digging through statuses, I found that the issue is 10) "slave,disconnected" when I run `redis-cli -p 26379 sentinel replicas test-cluster`.
Here are some outputs:
[root@redis4 ~]# redis-cli -p 26379 sentinel replicas test-cluster
1) 1) "name"
2) "10.100.200.106:6379"
3) "ip"
4) "10.100.200.106"
5) "port"
6) "6379"
7) "runid"
8) "57bb455a3e7dcb13396696b9e96eaa6463fdf7e2"
9) "flags"
10) "slave,disconnected"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "956"
19) "last-ping-reply"
20) "956"
21) "down-after-milliseconds"
22) "5000"
23) "info-refresh"
24) "4080"
25) "role-reported"
26) "slave"
27) "role-reported-time"
28) "4877433"
29) "master-link-down-time"
30) "0"
31) "master-link-status"
32) "ok"
33) "master-host"
34) "10.100.200.104"
35) "master-port"
36) "6379"
37) "slave-priority"
38) "100"
39) "slave-repl-offset"
40) "2115110"
41) "replica-announced"
42) "1"
2) 1) "name"
2) "10.100.200.105:6379"
3) "ip"
4) "10.100.200.105"
5) "port"
6) "6379"
7) "runid"
8) "5ba882d9d6e44615e9be544e6c5d469d13e9af2c"
9) "flags"
10) "slave,disconnected"
11) "link-pending-commands"
12) "0"
13) "link-refcount"
14) "1"
15) "last-ping-sent"
16) "0"
17) "last-ok-ping-reply"
18) "956"
19) "last-ping-reply"
20) "956"
21) "down-after-milliseconds"
22) "5000"
23) "info-refresh"
24) "4080"
25) "role-reported"
26) "slave"
27) "role-reported-time"
28) "4877433"
29) "master-link-down-time"
30) "0"
31) "master-link-status"
32) "ok"
33) "master-host"
34) "10.100.200.104"
35) "master-port"
36) "6379"
37) "slave-priority"
38) "100"
39) "slave-repl-offset"
40) "2115110"
41) "replica-announced"
42) "1"
Sentinel log on the slave:
251699:X 24 Oct 2024 17:16:35.623 * User requested shutdown...
251699:X 24 Oct 2024 17:16:35.623 # Sentinel is now ready to exit, bye bye...
252065:X 24 Oct 2024 17:16:35.639 * Supervised by systemd. Please make sure you set appropriate values for TimeoutStartSec and TimeoutStopSec in your service unit.
252065:X 24 Oct 2024 17:16:35.639 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
252065:X 24 Oct 2024 17:16:35.639 * Redis version=7.2.6, bits=64, commit=00000000, modified=0, pid=252065, just started
252065:X 24 Oct 2024 17:16:35.639 * Configuration loaded
252065:X 24 Oct 2024 17:16:35.639 * monotonic clock: POSIX clock_gettime
252065:X 24 Oct 2024 17:16:35.639 * Running mode=sentinel, port=26379.
252065:X 24 Oct 2024 17:16:35.639 * Sentinel ID is ca842661e783b16daffecb56638ef2f1036826fa
252065:X 24 Oct 2024 17:16:35.639 # +monitor master test-cluster 10.100.200.104 6379 quorum 2
252065:signal-handler (1729785210) Received SIGTERM scheduling shutdown...
252065:X 24 Oct 2024 17:53:30.528 * User requested shutdown...
252065:X 24 Oct 2024 17:53:30.528 # Sentinel is now ready to exit, bye bye...
252697:X 24 Oct 2024 17:53:30.541 * Supervised by systemd. Please make sure you set appropriate values for TimeoutStartSec and TimeoutStopSec in your service unit.
252697:X 24 Oct 2024 17:53:30.541 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
252697:X 24 Oct 2024 17:53:30.541 * Redis version=7.2.6, bits=64, commit=00000000, modified=0, pid=252697, just started
252697:X 24 Oct 2024 17:53:30.541 * Configuration loaded
252697:X 24 Oct 2024 17:53:30.541 * monotonic clock: POSIX clock_gettime
252697:X 24 Oct 2024 17:53:30.541 * Running mode=sentinel, port=26379.
252697:X 24 Oct 2024 17:53:30.541 * Sentinel ID is ca842661e783b16daffecb56638ef2f1036826fa
252697:X 24 Oct 2024 17:53:30.541 # +monitor master test-cluster 10.100.200.104 6379 quorum 2
Redis log:
Oct 24 18:08:48 redis5 redis[246101]: User requested shutdown...
Oct 24 18:08:48 redis5 redis[246101]: Saving the final RDB snapshot before exiting.
Oct 24 18:08:48 redis5 redis[246101]: DB saved on disk
Oct 24 18:08:48 redis5 redis[246101]: Removing the pid file.
Oct 24 18:08:48 redis5 redis[246101]: Redis is now ready to exit, bye bye...
Oct 24 18:08:48 redis5 redis[252962]: monotonic clock: POSIX clock_gettime
Oct 24 18:08:48 redis5 redis[252962]: Running mode=standalone, port=6379.
Oct 24 18:08:48 redis5 redis[252962]: Server initialized
Oct 24 18:08:48 redis5 redis[252962]: Loading RDB produced by version 7.2.6
Oct 24 18:08:48 redis5 redis[252962]: RDB age 0 seconds
Oct 24 18:08:48 redis5 redis[252962]: RDB memory usage when created 1.71 Mb
Oct 24 18:08:48 redis5 redis[252962]: Done loading RDB, keys loaded: 0, keys expired: 0.
Oct 24 18:08:48 redis5 redis[252962]: DB loaded from disk: 0.000 seconds
Oct 24 18:08:48 redis5 redis[252962]: Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
Oct 24 18:08:48 redis5 redis[252962]: Ready to accept connections tcp
Oct 24 18:08:48 redis5 redis[252962]: Connecting to MASTER 10.100.200.104:6379
Oct 24 18:08:48 redis5 redis[252962]: MASTER <-> REPLICA sync started
Oct 24 18:08:48 redis5 redis[252962]: Non blocking connect for SYNC fired the event.
Oct 24 18:08:48 redis5 redis[252962]: Master replied to PING, replication can continue...
Oct 24 18:08:48 redis5 redis[252962]: Trying a partial resynchronization (request db5a47a36aadccb0c928fc632f5232c0fc07051b:2151335).
Oct 24 18:08:48 redis5 redis[252962]: Successful partial resynchronization with master.
Oct 24 18:08:48 redis5 redis[252962]: MASTER <-> REPLICA sync: Master accepted a Partial Resynchronization.
The firewall is off and SELinux is not running. I have no idea why the slaves are disconnected. Does anyone have a clue?
Hi all,
Which Redis clients are you using for dev teams?
I'm looking for a Redis client that allows us to control access and roles for dev team members.
Thanks.
Did marketing ask me to post this? Of course. But that doesn't mean it's not worth checking out!
Redis Released: Worldwide is next month. It's virtual, it's free, and it's packed with talks by industry leaders from places like Dell, Viacom, NVIDIA, AWS, and more.
Edit: Here's the link.
Hi all, I have 7x Redis with Sentinel running version 5.0.4, with some hacks in the entrypoint to make it work more or less without problems on a Kubernetes cluster. These Redis instances store their database on File Storage from Oracle Cloud (NFS).
So, I tried to upgrade to version 7.4.1 using the Helm chart from Bitnami, and it went well.
The problem is that the old Redis database lives on File Storage from Oracle Cloud (NFS), where it has worked as expected for a year or two. With the new deployment from Bitnami, I pointed the Helm chart at the mounted NFS volume; it recognized the old 5.0.4 database and converted it for version 7.4.1. All fine, but after a while under load the Redis container starts restarting and entering failover. The logs show errors on the "fsync" operation and MISCONF errors.
So, after some reading on the internet, I tried mounting a disk volume instead, and voilà, it works fine.
The problem is the cost: it needs 3 disks per Redis cluster, and if I scale it, it will require more disks for each pod. The minimum disk size I can create on Oracle Cloud is 50 GB, so I need 150 GB of disks for each cluster without scaling, which is not viable for us.
Each of my Redis instances uses around 1 to 5 GB of space; I don't need 150 GB that sits 99% free all the time.
What am I missing here? What am I doing wrong?
Thank you!
func hset(ctx context.Context, c *client, key, field string, object Revisioner) (newObj Revisioner, err error) {
	txf := func(tx *redis.Tx) error {
		// Get the current value or some state of the key
		current, err := tx.HGet(ctx, key, field).Result()
		if err != nil && err != redis.Nil {
			return fmt.Errorf("hget: %w", err)
		}
		// Compare revisions for optimistic locking
		ok, err := object.RevisionCompare([]byte(current))
		if err != nil {
			return fmt.Errorf("revision compare: %w", err)
		}
		if !ok {
			return ErrModified
		}
		// Create a new object with a new revision
		newObj = object.WitNewRev()
		data, err := json.Marshal(newObj)
		if err != nil {
			return fmt.Errorf("marshalling: %w", err)
		}
		// Execute the HSET command within the transaction
		_, err = tx.TxPipelined(ctx, func(pipe redis.Pipeliner) error {
			pipe.HSet(ctx, key, field, string(data))
			return nil
		})
		return err
	}
	// Execute the transaction with the Watch method
	err = c.rc.Watch(ctx, txf, key)
	if err == redis.TxFailedErr {
		return nil, fmt.Errorf("transaction error: %w", err)
	} else if err != nil {
		return nil, ErrModified
	}
	return newObj, nil
}
I was experimenting with optimistic locks and wrote this for HSET. Under a heavy load of events trying to update the same key, I observed the transaction failing; not too often, but for my use case it ideally should not happen at all. What is wrong here? Also, is there anywhere I can see what caused the transaction to fail? The VM I am running this on has enough memory, by the way.
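For context on the failure described above: with WATCH-based optimistic locking, an aborted EXEC is the expected outcome whenever another client touches the watched key between WATCH and EXEC. That is what go-redis reports as `redis.TxFailedErr` (redis-py raises `redis.WatchError` for the same condition). WATCH works on whole keys, so a write to any field of the hash aborts the transaction, and Redis does not say which client caused it. The usual remedy is to retry the whole read-compare-write cycle. Here is a minimal, client-agnostic sketch of such a retry loop; `ConflictError`, `max_attempts`, and `base_delay` are hypothetical names and defaults, not part of any Redis client.

```python
import random
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


class ConflictError(Exception):
    """Stand-in for the client's 'watched key changed' error (hypothetical name)."""


def run_with_retries(
    attempt: Callable[[], T],
    conflict_error: Type[Exception] = ConflictError,
    max_attempts: int = 5,
    base_delay: float = 0.01,
) -> T:
    """Call attempt() until it succeeds or max_attempts conflicts have occurred.

    attempt must redo the full cycle (read, compare revision, write) each time,
    because the value it read on a previous try is stale after a conflict.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for n in range(1, max_attempts + 1):
        try:
            return attempt()
        except conflict_error:
            if n == max_attempts:
                raise  # give up and surface the conflict to the caller
            # Jittered exponential backoff so competing writers spread out.
            time.sleep(random.uniform(0, base_delay * 2 ** (n - 1)))
    raise AssertionError("unreachable")
```

Errors other than the conflict type propagate immediately, which keeps real failures (network, marshalling) distinguishable from ordinary contention.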
Hello everyone,
I'm planning to deploy Redis across two k8s Tanzu clusters located at different sites (Site 1 and Site 2). The goal is to have a shared Redis setup where data written in one site is automatically replicated to the other. This ensures both sites are kept in sync (e.g., writes in Site 1 replicate to Site 2, and vice versa).
If anyone has a sample YAML configuration for such a setup, I would greatly appreciate it, as well as any recommendations for the deployment, as I am mostly a beginner when it comes to Redis.
Please note that Redis Enterprise isn't an option for this environment, and I’m working in an air-gapped setup.
Thanks!
Hi everyone, I need some guidance on using Redis Gears in cluster mode to capture keyspace notifications. My aim is to add acknowledgement for keyspace events. I am a student developing applications with Redis. To test Redis Gears in a local cluster, I tried to set up a cluster and load Redis Gears, but failed.
I need some guidance on resources for setting up a local Redis cluster with Redis Gears loaded, using a Python client, if possible through Docker Compose. Please point me to resources for reference, and to any better ways of doing what I am trying to achieve.
Thanks in advance. Also, I love Redis.
My site is hosted on Kinsta, and they ask $100 a month for access to Redis.
Because I have a Microsoft founders startup hub sponsorship freebie for a year, I connected Azure Cache for Redis to my site on Kinsta, and it slowed the site right down to a crawl. I spoke to them and they said that because DB requests have to travel externally and then return data, there will be latency issues, whereas they put their licensed Redis on my app server internally.
But my question is: doesn't Redis stand for remote server? Should the remoteness be an issue?
Any advice on how to find a solution?
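On the naming question: Redis is usually expanded as Remote Dictionary Server, but "remote" only means it is reached over a network connection, and every command still costs one round trip. A page that issues many cache lookups in sequence multiplies that round-trip time. Here is a back-of-the-envelope sketch. The 100 lookups per page and both round-trip figures are made-up illustrative numbers, not measurements of Kinsta or Azure.

```python
def added_latency_ms(round_trips: int, rtt_ms: float) -> float:
    """Total time spent waiting on the network for sequential cache calls."""
    return round_trips * rtt_ms


# Hypothetical figures: 100 sequential lookups per page render.
same_host = added_latency_ms(100, 0.5)       # cache next to the app server
cross_provider = added_latency_ms(100, 20.0)  # cache in another provider's data center

print(f"same host:      {same_host:.0f} ms per page")
print(f"cross provider: {cross_provider:.0f} ms per page")
```

Under these assumptions the same page goes from 50 ms of cache waiting to 2,000 ms, which would match the "slowed to a crawl" symptom. Measuring the actual round-trip time from the app server to the cache is the way to confirm it.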
According to the diagram below, in the read-through caching strategy the cache itself reads the data directly from the database. How can this be done in practice? Does "cache" in this case mean a middle application or a specific cache system like Redis? Can this be done using Redis Gears?
Thank you in advance.
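One way to read "the cache itself reads from the database" is that the application only ever talks to a cache component, and that component owns the loader that runs on a miss. A plain Redis server does not query another database on its own, so in practice this component is often a library or thin service sitting in front of Redis. Here is a minimal sketch of the pattern with an in-memory store standing in for Redis. `ReadThroughCache` and its `loader` parameter are hypothetical names, and the sketch says nothing about whether Redis Gears can do this.

```python
from typing import Callable, Dict, Generic, Hashable, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


class ReadThroughCache(Generic[K, V]):
    """Cache layer that loads missing entries itself, so callers never touch the database."""

    def __init__(self, loader: Callable[[K], V]) -> None:
        self._loader = loader          # e.g. a function that runs a database query
        self._store: Dict[K, V] = {}   # stand-in for the Redis keyspace
        self.misses = 0

    def get(self, key: K) -> V:
        if key not in self._store:
            # Miss: the cache layer, not the caller, fetches from the backing store.
            self.misses += 1
            self._store[key] = self._loader(key)
        return self._store[key]


# Usage: the application only calls cache.get(...).
database = {"user:1": "Ada", "user:2": "Linus"}
cache = ReadThroughCache(loader=lambda key: database[key])
print(cache.get("user:1"))  # loaded from the "database" on first access
print(cache.get("user:1"))  # served from the cache
```

The contrast with cache-aside is only where the loading logic lives: in cache-aside the application checks the cache and queries the database itself, while here that logic is hidden behind `get`.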
Redis 8's first milestone release is out. If you want to try it, it's available on Docker. Look for the 8.0-M01 tag, not the latest one.
There's even a blog post talking about what's new. The tl;dr is that the JSON, search, probabilistic data structures, and timeseries data structures that were once just a part of Redis Stack are now baked-in with Redis 8.
Hello!
If I start Redis on my Debian VPS I get this error:
root@BerlinRP:~# sudo systemctl status redis
● redis-server.service - Advanced key-value store
Loaded: loaded (/lib/systemd/system/redis-server.service; enabled; vendor preset: enable>
Active: failed (Result: exit-code) since Sun 2024-09-29 18:41:27 CEST; 8min ago
Docs: http://redis.io/documentation,
man:redis-server(1)
Process: 252876 ExecStart=/usr/bin/redis-server /etc/redis/redis.conf --supervised system>
Main PID: 252876 (code=exited, status=226/NAMESPACE)
Sep 29 18:41:27 BerlinRP systemd[1]: redis-server.service: Main process exited, code=exited, >
Sep 29 18:41:27 BerlinRP systemd[1]: redis-server.service: Failed with result 'exit-code'.
Sep 29 18:41:27 BerlinRP systemd[1]: Failed to start Advanced key-value store.
Sep 29 18:41:27 BerlinRP systemd[1]: redis-server.service: Scheduled restart job, restart cou>
Sep 29 18:41:27 BerlinRP systemd[1]: Stopped Advanced key-value store.
Sep 29 18:41:27 BerlinRP systemd[1]: redis-server.service: Start request repeated too quickly.
Sep 29 18:41:27 BerlinRP systemd[1]: redis-server.service: Failed with result 'exit-code'.
Sep 29 18:41:27 BerlinRP systemd[1]: Failed to start Advanced key-value store.
lines 1-16/16 (END)
Can anyone help me?
Hi guys, today I added 2 new nodes to the cluster and resharded. The cluster works, but I found an issue in Grafana: as you can see, my master node on port 7007 has slots [0-1364] [5461-6826] [10923-12287], but Grafana only shows 0-1364. When I run the cluster nodes command in Grafana, the output looks normal. How can I solve this problem? Thanks!
It seems I am now getting emails from Redis Cloud every 10 days threatening to delete my free DB for being unused. It is supposed to be once a month, not every other week. It seems like they are trying to force users into buying paid subscriptions they don't yet need. Seems rather sneaky, if you ask me.
I'm not sure if I can do what I am trying to do. I have file metadata stored as Redis hashes. I am trying to search (using redisearch) and group by a particular field so all the items that have the same value for that field should be grouped together. If I use `aggregate` and `groupby` with `reduce`, it will give me a summary of the groups:
`ft.aggregate idx:files '*' groupby 1 @size reduce count 0 as nb_of_items limit 0 1000`
but that's not what I want. Is this going to have to be multiple steps handled client-side?
EDIT:
Adding some clarification. Here is what a typical hash looks like:
Field | Value |
---|---|
path | /mnt/user/downloads/New Text Document.txt |
nlink | 1 |
ino | 652459000385795943 |
size | 0 |
atimeMs | 1724706393280 |
mtimeMs | 1724706393284 |
ctimeMs | 1724760002387 |
birthtimeMs | 0 |
Running the above query, I get this:
I'm wanting something similar to this:
Reddit kept screwing up the formatting so I ended up taking images of the text. Sorry.
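If the grouping does end up client-side, it is a single pass over the search results. Here is a minimal sketch, assuming each result has already been turned into a dict of hash fields (the sample rows below are made up, modelled on the hash shown above). It may also be worth checking whether the `TOLIST` reducer of `FT.AGGREGATE` (for example `reduce tolist 1 @path as paths`) gets close enough on the server side; that variant is untested here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def group_by_field(docs: Iterable[Mapping[str, str]], field: str) -> Dict[str, List[Mapping[str, str]]]:
    """Bucket documents by the value of one field, keeping the full documents."""
    groups: Dict[str, List[Mapping[str, str]]] = defaultdict(list)
    for doc in docs:
        groups[doc[field]].append(doc)
    return dict(groups)


# Made-up sample rows in the shape of the file-metadata hashes.
files = [
    {"path": "/mnt/user/downloads/a.txt", "size": "0"},
    {"path": "/mnt/user/downloads/b.txt", "size": "1024"},
    {"path": "/mnt/user/downloads/c.txt", "size": "0"},
]

for size, items in group_by_field(files, "size").items():
    print(size, [item["path"] for item in items])
```

This keeps every document in its group, whereas `groupby` with `reduce count` only returns one summary row per group.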
❯ sudo dnf install redis
Updating and loading repositories:
Repositories loaded.
Package Arch Version Repository Size
Installing:
valkey-compat-redis noarch 7.2.6-2.fc41 fedora 1.4 KiB
Installing dependencies:
valkey x86_64 7.2.6-2.fc41 fedora 5.3 MiB
Transaction Summary:
Installing: 2 packages
Total size of inbound packages is 2 MiB. Need to download 0 B.
After this operation, 5 MiB extra will be used (install 5 MiB, remove 0 B).
Is this ok [Y/n]:
[1/1] valkey-compat-redis-0:7.2.6-2.fc41.noarch 100% | 0.0 B/s | 0.0 B | 00m00s
>>> Already downloaded
[1/2] valkey-0:7.2.6-2.fc41.x86_64 100% | 0.0 B/s | 0.0 B | 00m00s
>>> Already downloaded
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[2/2] Total 100% | 0.0 B/s | 0.0 B | 00m00s
Running transaction
[1/4] Verify package files 100% | 333.0 B/s | 2.0 B | 00m00s
[2/4] Prepare transaction 100% | 7.0 B/s | 2.0 B | 00m00s
[3/4] Installing valkey-0:7.2.6-2.fc41.x86_64 100% | 93.6 MiB/s | 5.3 MiB | 00m00s
[4/4] Installing valkey-compat-redis-0:7.2.6-2.fc41.noarch 100% [==================] | 629.2 KiB/s | 2.5 KiB | -00m00s
>>> Running trigger-install scriptlet: glibc-common-0:2.40-3.fc41.x86_64
warning: posix.fork(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.spawn() or rpm.execute() instead
warning: posix.wait(): .fork(), .exec(), .wait() and .redirect2null() are deprecated, use rpm.spawn() or rpm.execute() instead
[4/4] Installing valkey-compat-redis-0:7.2.6-2.fc41.noarch 100% | 5.2 KiB/s | 2.5 KiB | 00m00s
Complete!
❯ sudo systemctl enable redis
Failed to enable unit: Unit redis.service does not exist
I tried installing Redis on Fedora Linux, but for some reason it says that redis.service doesn't exist.
Any troubleshooting tips?
.....
I am using a redis-py client to query a Redis Stack server with a user-provided `query_str`, with the intent of building a user-facing text search engine. I would like to seek advice regarding the following areas:
1. How to protect against query injection? I understand that Redis is not susceptible to query injection in its protocol, but as I am implementing this search client in Python, using a directly interpolated string as the `query` argument of `FT.SEARCH` will definitely cause issues if the user input contains reserved characters of the query syntax. Therefore, is passing the user query as `PARAMS` or manually filtering out the reserved characters a better approach?
2. Parsing the user query into words/tokens. I understand that RediSearch does tokenization by itself. However, suppose that I pass the entire user query e.g. "the quick brown fox" as a parameter, it would be an exact phrase search as opposed to searching for "the" AND "quick" AND "brown" AND "fox". Such is what would happen in the implementation below:
from redis import Redis
from redis.commands.search.query import Query

client = Redis.from_url("redis://localhost:6379")

def search(query_str: str):
    params = {"query_str": query_str}
    query = Query("@text:$query_str").dialect(2).scorer("BM25")
    return client.ft("idx:test").search(query, params)
Therefore, I wonder what would be the best approach for tokenizing the user query, using preferably Python, so that it would be consistent with the result of RediSearch's tokenization rules.
3. Support for both English and Chinese. The documents stored in the database are a mix of English and Chinese. You may assume that each document is either English or Chinese, which would hold true for most cases. However, it would be better if there were a way to support mixed English and Chinese within a single document. The documents are not labelled with their languages, though. Additionally, the user query could also be English, Chinese, or mixed.
The language needs to be specified because many European languages such as English need stemming, e.g. to recognize that "jumped" is "jump" + "ed". As for Chinese, RediSearch has special support for its tokenization since it does not use spaces as word separators, e.g. a phrase like "一个单词" would be "一 个 单词" if Chinese used spaces to separate words. However, these language-specific RediSearch features require explicitly specifying the `LANGUAGE` parameter both in indexing and in search. Therefore, should I create two indices and detect the language automatically somehow?
4. Support of Google-like search syntax. It would be great if the user-provided query could support Google-like syntax, which would then be translated to the relevant `FT.SEARCH` operators. I would prefer to have this implemented in Python if possible.
This is a partial crosspost of this Stack Overflow question.
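For points 1 and 2 above, one option is to split the user input on whitespace, backslash-escape query-syntax punctuation inside each token, and join the tokens with spaces so that RediSearch intersects them (AND) instead of treating the input as one phrase. Here is a minimal sketch of that idea. The character set in `_RESERVED` is an assumption based on the punctuation RediSearch documents as separators and operators. The sketch makes no attempt to reproduce RediSearch's own tokenizer, stemming, or stopword handling. Whether an escaped character matches also depends on how the text was indexed.

```python
import re

# Assumed set of punctuation with special meaning in the query syntax; adjust as needed.
_RESERVED = re.compile(r"([,.<>{}\[\]\"':;!@#$%^&*()\-+=~|/\\])")


def escape_token(token: str) -> str:
    """Backslash-escape every reserved character in a single token."""
    return _RESERVED.sub(r"\\\1", token)


def build_query(user_input: str, field: str = "text") -> str:
    """Turn free text into '@field:(tok1 tok2 ...)', i.e. an AND of the escaped tokens."""
    tokens = [escape_token(t) for t in user_input.split()]
    if not tokens:
        # Empty input: match everything. Callers may prefer to reject it instead.
        return "*"
    return f"@{field}:({' '.join(tokens)})"


print(build_query("the quick brown fox"))
print(build_query("foo-bar @admin"))
```

The resulting string could be passed to `Query(...)` in place of `"@text:$query_str"`. Since the escaping removes the operators' special meaning, a Google-like syntax (point 4) would need its own parser in front of this step.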
I'm currently conducting a survey to collect insights into user expectations regarding comparing various data formats. Your expertise in the field would be incredibly valuable to this research.
The survey should take no more than 10 minutes to complete. You can access it here: https://forms.gle/K9AR6gbyjCNCk4FL6
I would greatly appreciate your response!
Has anyone used redis stack with redisjson / redistimeseries for actual data storage? I store all our data as json and think Postgres is probably not the right tool.. so does anyone have experience in production setup with redis json ?
Redis Version: v7.0.12
Hello.
I have deployed a Redis Cluster in my Kubernetes cluster using `ot-helm/redis-operator` with the following values:
redisCluster:
  redisSecret:
    secretName: redis-password
    secretKey: REDIS_PASSWORD
  leader:
    replicas: 3
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: test
                  operator: In
                  values:
                    - "true"
  follower:
    replicas: 3
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: test
                  operator: In
                  values:
                    - "true"
externalService:
  enabled: true
  serviceType: LoadBalancer
  port: 6379
redisExporter:
  enabled: true
storageSpec:
  volumeClaimTemplate:
    spec:
      resources:
        requests:
          storage: 10Gi
  nodeConfVolumeClaimTemplate:
    spec:
      resources:
        requests:
          storage: 1Gi
After adding a couple of keys to the cluster, I stop the host machine (EC2 instance) where the Redis Cluster is deployed, and start it again. Upon the restart of the EC2 instance, and the Redis Cluster, the couple of keys that I have added before the restart disappear.
I have both persistence methods enabled (RDB & AOF), and this is my configuration (default) for Redis Cluster regarding persistency:
config get dir # /data
config get dbfilename # dump.rdb
config get appendonly # yes
config get appendfilename # appendonly.aof
I have noticed that during/after the addition of the keys/data in Redis, `/data/dump.rdb` and `/data/appendonlydir/appendonly.aof.1.incr.aof` (within my main Redis Cluster leader) increase in size, but when I restart the EC2 instance, `/data/dump.rdb` goes back to 0 bytes, while `/data/appendonlydir/appendonly.aof.1.incr.aof` stays at the same size as before the restart.
I can confirm this with this screenshot from my Grafana dashboard while monitoring the persistent volume that was attached to main leader of the Redis Cluster. From what I understood, the volume contains both AOF, and RDB data until few seconds after the restart of Redis Cluster, where RDB data is deleted.
This is the Prometheus metric I am using in case anyone is wondering:
sum(kubelet_volume_stats_used_bytes{namespace="test", persistentvolumeclaim="redis-cluster-leader-redis-cluster-leader-0"}/(1024*1024)) by (persistentvolumeclaim)
So, Redis Cluster is actually backing up the data using RDB, and AOF, but as soon as it is restarted (after the EC2 restart), it loses RDB data, and AOF is not enough to retrieve the keys/data for some reason.
Here are the logs of Redis Cluster when it is restarted:
ACL_MODE is not true, skipping ACL file modification
Starting redis service in cluster mode.....
12:C 17 Sep 2024 00:49:39.351 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
12:C 17 Sep 2024 00:49:39.351 # Redis version=7.0.12, bits=64, commit=00000000, modified=0, pid=12, just started
12:C 17 Sep 2024 00:49:39.351 # Configuration loaded
12:M 17 Sep 2024 00:49:39.352 * monotonic clock: POSIX clock_gettime
12:M 17 Sep 2024 00:49:39.353 * Node configuration loaded, I'm ef200bc9befd1c4fb0f6e5acbb1432002a7c2822
12:M 17 Sep 2024 00:49:39.353 * Running mode=cluster, port=6379.
12:M 17 Sep 2024 00:49:39.353 # Server initialized
12:M 17 Sep 2024 00:49:39.355 * Reading RDB base file on AOF loading...
12:M 17 Sep 2024 00:49:39.355 * Loading RDB produced by version 7.0.12
12:M 17 Sep 2024 00:49:39.355 * RDB age 2469 seconds
12:M 17 Sep 2024 00:49:39.355 * RDB memory usage when created 1.51 Mb
12:M 17 Sep 2024 00:49:39.355 * RDB is base AOF
12:M 17 Sep 2024 00:49:39.355 * Done loading RDB, keys loaded: 0, keys expired: 0.
12:M 17 Sep 2024 00:49:39.355 * DB loaded from base file appendonly.aof.1.base.rdb: 0.001 seconds
12:M 17 Sep 2024 00:49:39.598 * DB loaded from incr file appendonly.aof.1.incr.aof: 0.243 seconds
12:M 17 Sep 2024 00:49:39.598 * DB loaded from append only file: 0.244 seconds
12:M 17 Sep 2024 00:49:39.598 * Opening AOF incr file appendonly.aof.1.incr.aof on server start
12:M 17 Sep 2024 00:49:39.599 * Ready to accept connections
12:M 17 Sep 2024 00:49:41.611 # Cluster state changed: ok
12:M 17 Sep 2024 00:49:46.592 # Cluster state changed: fail
12:M 17 Sep 2024 00:50:02.258 * DB saved on disk
12:M 17 Sep 2024 00:50:21.376 # Cluster state changed: ok
12:M 17 Sep 2024 00:51:26.284 * Replica 192.168.58.43:6379 asks for synchronization
12:M 17 Sep 2024 00:51:26.284 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '995d7ac6eedc09d95c4fc184519686e9dc8f9b41', my replication IDs are '654e768d51433cc24667323f8f884c66e8e55566' and '0000000000000000000000000000000000000000')
12:M 17 Sep 2024 00:51:26.284 * Replication backlog created, my new replication IDs are 'de979d9aa433bf37f413a64aff751ed677794b00' and '0000000000000000000000000000000000000000'
12:M 17 Sep 2024 00:51:26.284 * Delay next BGSAVE for diskless SYNC
12:M 17 Sep 2024 00:51:31.195 * Starting BGSAVE for SYNC with target: replicas sockets
12:M 17 Sep 2024 00:51:31.195 * Background RDB transfer started by pid 218
218:C 17 Sep 2024 00:51:31.196 * Fork CoW for RDB: current 0 MB, peak 0 MB, average 0 MB
12:M 17 Sep 2024 00:51:31.196 # Diskless rdb transfer, done reading from pipe, 1 replicas still up.
12:M 17 Sep 2024 00:51:31.202 * Background RDB transfer terminated with success
12:M 17 Sep 2024 00:51:31.202 * Streamed RDB transfer with replica 192.168.58.43:6379 succeeded (socket). Waiting for REPLCONF ACK from slave to enable streaming
12:M 17 Sep 2024 00:51:31.203 * Synchronization with replica 192.168.58.43:6379 succeeded
Here is the output of the `INFO PERSISTENCE` redis-cli command, after the addition of some data:
# Persistence
loading:0
async_loading:0
current_cow_peak:0
current_cow_size:0
current_cow_size_age:0
current_fork_perc:0.00
current_save_keys_processed:0
current_save_keys_total:0
rdb_changes_since_last_save:0
rdb_bgsave_in_progress:0
rdb_last_save_time:1726552373
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
rdb_current_bgsave_time_sec:-1
rdb_saves:5
rdb_last_cow_size:1093632
rdb_last_load_keys_expired:0
rdb_last_load_keys_loaded:0
aof_enabled:1
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_rewrites:0
aof_rewrites_consecutive_failures:0
aof_last_write_status:ok
aof_last_cow_size:0
module_fork_in_progress:0
module_fork_last_cow_size:0
aof_current_size:37092089
aof_base_size:89
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
aof_delayed_fsync:0
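When comparing output like the above before and after a restart, it can help to parse it into a dict and look only at the few fields that matter for this question. Here is a small sketch that works on the raw text; the sample is copied from the values above, and the field selection is just one reasonable choice. redis-py's `info("persistence")` returns a similar mapping directly.

```python
from typing import Dict


def parse_info(text: str) -> Dict[str, str]:
    """Parse 'key:value' lines from INFO output, skipping blanks and '# Section' headers."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


sample = """# Persistence
loading:0
rdb_last_bgsave_status:ok
rdb_last_load_keys_loaded:0
aof_enabled:1
aof_last_write_status:ok
aof_current_size:37092089
aof_base_size:89
"""

info = parse_info(sample)
for field in ("aof_enabled", "aof_base_size", "aof_current_size", "rdb_last_load_keys_loaded"):
    print(f"{field} = {info[field]}")
```

Capturing this once before the EC2 stop and once after the restart makes it easier to see which of the sizes and load counters actually changed.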
In case anyone is wondering, the persistent volume is attached correctly to the Redis Cluster at the `/data` mount path. Here is a snippet of the YAML definition of the main Redis Cluster leader (this is automatically generated via Helm & Redis Operator):
apiVersion: v1
kind: Pod
metadata:
  name: redis-cluster-leader-0
  namespace: test
[...]
spec:
  containers:
    [...]
    volumeMounts:
    - mountPath: /node-conf
      name: node-conf
    - mountPath: /data
      name: redis-cluster-leader
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-7ds8c
      readOnly: true
  [...]
  volumes:
  - name: node-conf
    persistentVolumeClaim:
      claimName: node-conf-redis-cluster-leader-0
  - name: redis-cluster-leader
    persistentVolumeClaim:
      claimName: redis-cluster-leader-redis-cluster-leader-0
[...]
I have already spent a couple of days on this issue, and I kind of looked everywhere, but in vain. I would appreciate any kind of help guys. I will also be available in case any additional information is needed. Thank you very much.
We have been running redis in master/replica mode for a while now for disaster recovery. Each instance of our product is running in a different datacenter and each one has redis running in a single pod. When the master goes down, we swap the roles and the replica becomes the master.
Now we want to upgrade both instances to have multiple redis instances so that we can survive a single pod (or worker node) issue without causing a master/replica role switch.
Is this possible? Do we need redis enterprise?