cross-posted from: https://beehaw.org/post/282116
We’ve posted a number of times about our growing storage problem. We’re currently on the cusp of using 80% of the 25 GB available in the current tier of the hosting service this instance runs on, which has caused the server to crash several times in recent days.
We’ve been monitoring this and reporting on it occasionally, including in support requests and comments on the main Lemmy instance. Of particular note, images seem to be the main culprit behind our storage use.
The last time a discussion around pict-rs came up, the following comment stuck out to me as a potential solution:
Storage requirements depend entirely on the amount of images that users upload. In case of slrpnk.net, there are currently 1.6 GB of pictrs data. You can also use s3 storage, or something like sshfs to mount remote storage.
Is there anyone around who is technically proficient enough to help guide us through potential solutions using “something like sshfs” to mount remote storage? As it currently exists, our only feasible option seems to be upgrading from $6/month to $12/month to double our current storage capacity (25 GB -> 50 GB), which seems like an undesirable solution.
- nutomic ( @nutomic@lemmy.ml ) 7•2 years ago
If you search for sshfs you can find lots of different guides, like the one below. Basically, you need a normal SSH login on another server, and you use that with the sshfs command to mount a remote folder onto the pictrs folder on your Lemmy server.
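As a concrete sketch of that idea (the hostname, user, and paths below are made up, not anything from this thread), a persistent sshfs mount can be declared in /etc/fstab so it survives reboots:

```
# /etc/fstab entry: mount a remote pictrs volume over SSH
storage@backup.example.com:/srv/pictrs  /var/lib/lemmy/pictrs  fuse.sshfs  defaults,_netdev,allow_other,reconnect,ServerAliveInterval=15,IdentityFile=/root/.ssh/id_ed25519  0  0
```

After installing the sshfs package, `mount /var/lib/lemmy/pictrs` should bring it up; `_netdev` tells the system to wait for the network before mounting, and `reconnect` re-establishes the session if the SSH connection drops.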
If this setup is slow, you can set up caching in nginx for image files on your server’s fast SSD. That way only a fixed amount of local storage is used to hold frequently loaded images; only images that are older or viewed less frequently will be slower to load, since they need to be fetched from the remote server.
https://docs.nginx.com/nginx/admin-guide/content-cache/content-caching/
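A minimal sketch of the caching idea from the linked docs (server names, paths, and sizes are illustrative assumptions, not a tested config):

```nginx
# Cache up to 10 GB of image responses on the local SSD;
# evict entries not requested for 7 days.
proxy_cache_path /var/cache/nginx/pictrs levels=1:2 keys_zone=pictrs_cache:10m
                 max_size=10g inactive=7d use_temp_path=off;

server {
    listen 80;
    server_name images.example.com;

    location /pictrs/ {
        proxy_cache pictrs_cache;
        proxy_cache_valid 200 7d;         # keep successful responses for a week
        proxy_cache_use_stale error timeout updating;
        proxy_pass http://127.0.0.1:8080; # pict-rs (or Lemmy) upstream
    }
}
```

With something like this, a cache hit is served straight from the SSD and only misses touch the slow sshfs-backed storage.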
- poVoq ( @poVoq@slrpnk.net ) 4•2 years ago
Has pict-rs already implemented those changes to reduce image size? My guess would be that it may be sufficient to just prune older large images with a script.
Edit: it looks like pict-rs 4.0 is still in beta, but it should probably fix this issue to some extent, so only older images would need to be pruned or down-scaled.
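For reference, a prune pass could be as simple as a `find` invocation. The path and thresholds below are assumptions, so always dry-run first:

```shell
# Sketch of a prune pass; PICTRS_DIR and the thresholds are assumptions.
# NOTE: pict-rs tracks files in its own database, so deleting behind its
# back may leave dangling references; treat this as a last resort.
PICTRS_DIR="${PICTRS_DIR:-/var/lib/lemmy/pictrs/files}"

if [ -d "$PICTRS_DIR" ]; then
    # Dry run: list files older than 180 days AND larger than 1 MiB.
    find "$PICTRS_DIR" -type f -mtime +180 -size +1M -print
    # Once the list looks right, re-run with -delete appended:
    # find "$PICTRS_DIR" -type f -mtime +180 -size +1M -delete
fi
```

Adjust `-mtime` and `-size` to taste; run the `-print` form until you are happy with the candidate list.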
- wintermute ( @wintermute@feddit.de ) 3•2 years ago
I saved a few GB by capping syslog files, setting
SystemMaxUse=500M
SystemMaxFileSize=50M
in /etc/systemd/journald.conf
I’ve already implemented this as well as capping the Docker log files.
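For anyone replicating the Docker log cap: on a standard install this usually goes in /etc/docker/daemon.json (the sizes here are illustrative), and it only applies to containers created after the daemon is restarted:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```

This caps each container at three 10 MB log files, rotated automatically.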
- poVoq ( @poVoq@slrpnk.net ) 3•2 years ago
Besides my practical comment below: this is not something I can code myself, but I have been thinking that if pict-rs implemented some sort of image deduplication system, it could be interesting for some Lemmy instances to form a data-storage collective and run a combined S3 backend via Garage, for example:
https://garagehq.deuxfleurs.fr/documentation/connect/apps/#lemmy
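To make the idea concrete, pict-rs 4.0 moves storage selection into its config file. The key names below are a rough sketch from memory, not verified against the beta, so check the pict-rs docs before using them; but the shape is roughly:

```toml
# pict-rs.toml (sketch; key names may differ in the final 4.0 release)
[store]
type = "object_storage"
endpoint = "http://garage.internal:3900"  # Garage's S3 API port
bucket_name = "pict-rs"
region = "garage"
access_key = "GK..."           # placeholder credentials
secret_access_key = "..."
use_path_style = true          # Garage and Minio generally need path-style URLs
```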
I am willing to contribute storage (I have several TB), but I am somewhat bandwidth limited, so I need to be a bit careful with hosting too many images to not impact the other services that I run on the same connection.
- Gaywallet (they/it) ( @Gaywallet@beehaw.org ) 3•2 years ago
I am willing to contribute storage (I have several TB), but I am somewhat bandwidth limited, so I need to be a bit careful with hosting too many images to not impact the other services that I run on the same connection.
How would you accomplish this? I have plenty of bandwidth and plenty of storage I can carve out as a possible solution (hell, even buying a Raspberry Pi and an old hard drive wouldn’t be all that expensive, and it could be a fun project), but I really don’t even have an idea of how to connect this to the Lemmy instance.
- poVoq ( @poVoq@slrpnk.net ) 2•2 years ago
See the link above. You can configure pict-rs 4.0 (beta, unreleased) to use S3-compatible storage. The Garage project is an S3-compatible store aimed specifically at distributed self-hosters, but some latency caveats aside, something like Minio would probably also work.
S3 storage lets you redirect users directly to a storage location (with a public IP) instead of the main server fetching images from storage and serving them to the user itself. It works kind of like a CDN.
- Gaywallet (they/it) ( @Gaywallet@beehaw.org ) 1•2 years ago
Is there any way to do this while avoiding Amazon S3 itself? I don’t want a surprise bill from Amazon because we exceeded some threshold on their free tier (nor do I want to have to make a new free-tier account every 12 months).
- sexy_peach ( @sexy_peach@feddit.de ) 2•2 years ago
S3 is only the API; every S3-compatible storage should work, afaik.
- Gaywallet (they/it) ( @Gaywallet@beehaw.org ) 2•2 years ago
Okay, so I need something that can make sense of S3 calls to storage. I feel like we’re getting closer; I’m just still way out of my own technological depth.
- poVoq ( @poVoq@slrpnk.net ) 3•2 years ago
pict-rs, which Lemmy uses for image storage, is able to make S3-compatible storage API calls in the upcoming 4.0 version.
There are also many self-hosted options you can install on your own server to provide an S3-compatible storage API. Probably the best-known open-source software for that is Minio; however, it is aimed more at data centers with fast, low-latency connections, or at local-network-only use.
The above-mentioned Garage software is unique in that it is specifically designed to work in the less-than-ideal networking conditions typical of self-hosted servers.
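As a sketch of what a self-hosted Minio could look like (official image, but the paths and credentials are placeholders, and a single node like this is for testing rather than production):

```yaml
# docker-compose.yml sketch for a single-node Minio
services:
  minio:
    image: quay.io/minio/minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: admin
      MINIO_ROOT_PASSWORD: change-me
    ports:
      - "9000:9000"   # S3 API
      - "9001:9001"   # web console
    volumes:
      - /srv/minio:/data
```

Any S3 client (and, per the comments above, pict-rs 4.0) would then be pointed at port 9000.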
- sexy_peach ( @sexy_peach@feddit.de ) 2•2 years ago
Backblaze and Wasabi have S3-compatible storage, I think.