We've been using JuiceFS in production for a few months now and I'm a big fan. I've felt for a while that block-level filesystems do not adapt at all well to being implemented across a network (my personal experience being with AWS EBS and OpenEBS Mayastor). So the fact that JuiceFS interfaces at the POSIX layer felt intuitively better to me.

I also like that it can keep a local read cache, rather than having to hit object storage for every read. This is because it can perform a freshness check against the (relatively fast) metadata store to determine whether its cached data is still valid before serving the request from cache.
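
Conceptually, the read path is something like the sketch below. This is a hypothetical illustration of the idea, not JuiceFS's actual code; all the names are made up.

    # Conceptual sketch: a local cache validated against the fast metadata store.
    # "meta_version" is whatever version marker the metadata store reports for the
    # block; "fetch_from_object_store" is the slow S3/MinIO GET. All names made up.
    local_cache = {}  # block_id -> (version, data), e.g. kept on local SSD

    def read_block(block_id, meta_version, fetch_from_object_store):
        cached = local_cache.get(block_id)
        if cached and cached[0] == meta_version:    # freshness check passed
            return cached[1]                        # fast path: no object-store hit
        data = fetch_from_object_store(block_id)    # slow path: hit object storage
        local_cache[block_id] = (meta_version, data)
        return data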

We back it with a 3-node (Redis-compatible) HA Valkey cluster and in-cluster MinIO object storage, all on bare-metal Kubernetes. We can saturate a 25GbE NIC with (IIRC) 16+ concurrent users.
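
For context, the throughput test was roughly of the following shape: a pool of workers streaming large files onto the mount in parallel. This is a hypothetical reconstruction rather than our actual benchmark; the mount path and sizes are made up.

    # Rough shape of a saturation test: 16 workers writing large files to the
    # JuiceFS mount concurrently. Path, chunk size and file size are illustrative.
    import os
    from concurrent.futures import ThreadPoolExecutor

    MOUNT = "/mnt/jfs"                   # assumed JuiceFS mount point
    WORKERS = 16
    CHUNK = b"\0" * (4 * 1024 * 1024)    # 4 MiB per write
    WRITES_PER_WORKER = 2 * 1024         # ~8 GiB per worker

    def write_file(i):
        path = os.path.join(MOUNT, f"bench-{i}.dat")
        with open(path, "wb") as f:
            for _ in range(WRITES_PER_WORKER):
                f.write(CHUNK)
        return path

    with ThreadPoolExecutor(max_workers=WORKERS) as pool:
        list(pool.map(write_file, range(WORKERS)))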

It is also one of the few Kubernetes storage providers that offer read-write-many (RWX) access, which can be rather helpful in some situations.
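
In Kubernetes terms that means a ReadWriteMany PVC, which many pods can mount read-write at the same time. A minimal sketch using the official Python client; the StorageClass name "juicefs-sc" and the size are assumptions for illustration.

    # Sketch: request an RWX volume backed by a JuiceFS StorageClass.
    # "juicefs-sc" is an assumed StorageClass name, not a given.
    from kubernetes import client, config

    config.load_kube_config()

    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": "shared-data"},
        "spec": {
            "accessModes": ["ReadWriteMany"],            # RWX
            "storageClassName": "juicefs-sc",            # assumed name
            "resources": {"requests": {"storage": "100Gi"}},
        },
    }

    client.CoreV1Api().create_namespaced_persistent_volume_claim(
        namespace="default", body=pvc
    )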

In an early test we ran it against MinIO with zero redundancy, which is not recommended in any case. There we did see some file corruption creep in: the affected files became unreadable in JuiceFS, but the system as a whole kept working.

Another reason I think JuiceFS works well is its custom block-based storage format. It is disconcerting at first, because you cannot see your files in object storage, just a lot of chunks. But it does buy some real performance benefits, especially for partial file reads and updates.
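
The practical upshot is that a ranged read only has to touch the chunks covering the requested byte range, rather than re-downloading a whole object, while from the application's point of view it's just ordinary POSIX I/O. Illustrative only; the path and offsets below are made up.

    # Illustrative: a partial read against the mount is plain POSIX seek/read,
    # and only the chunks covering this range need to be fetched or cached.
    path = "/mnt/jfs/datasets/big-file.bin"   # made-up path on the JuiceFS mount

    with open(path, "rb") as f:
        f.seek(50 * 1024 * 1024 * 1024)       # jump 50 GiB into the file
        sample = f.read(4 * 1024 * 1024)      # read just a 4 MiB slice

    print(len(sample))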

Another test we're running is a small-to-medium-sized Prometheus instance persisted to JuiceFS. It hasn't shown any issues so far.

And, if you've made it this far: check us out if you want a hand installing and operating this kind of infra: https://lithus.eu . We deploy to bare-metal Hetzner.



> There we did see some file corruption creep in

Did you figure out what caused the corruption? Was MinIO losing blocks, or was JuiceFS corrupted even though MinIO was consistent?


It was definitely MinIO-related; I probably should have made that clearer. We noticed that with zero fault tolerance, MinIO objects would randomly become corrupted, which MinIO would present as "you're making too many requests, please slow down". We were certainly not making too many requests.


What is your plan after MinIO enters maintenance mode?


We're looking at alternatives; I've made some previous comments on that front. Sadly MinIO was the only option with sufficient performance for this particular situation. Thankfully we're not using any MinIO-specific features, so at least the migration path away is clear.
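
Concretely, everything goes through the generic S3 API with an overridden endpoint, so migrating should (in principle) mostly mean pointing that endpoint at a different S3-compatible store. A hedged sketch; the endpoint, bucket and credentials are placeholders.

    # Sketch: only generic S3 calls, so the endpoint can point at MinIO today
    # and at another S3-compatible backend later. All values are placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="http://minio.storage.svc:9000",   # swap this to migrate
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )

    print(s3.list_objects_v2(Bucket="juicefs-data", MaxKeys=10)["KeyCount"])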


Ceph. The answer is always Ceph.



