/r/DataHoarder
This is a sub that aims to bring data hoarders together to share their passion with like-minded people.
Who are we?
We are digital librarians. Among us are represented the various reasons to keep data -- legal requirements, competitive requirements, uncertainty of permanence of cloud services, distaste for transmitting your data externally (e.g. government or corporate espionage), cultural and familial archivists, internet collapse preppers, and people who do it themselves so they're sure it's done right. Everyone has their reasons for curating the data they have decided to keep (either forever or For A Damn Long Time™). Along the way we have sought out like-minded individuals to exchange strategies, war stories, and cautionary tales of failures.
We are one. We are legion. And we're trying really hard not to forget.
-- /u/5-4-3-2-1-bang from this thread
Links!!
Free Post Friday
On Fridays we'll allow posts that don't normally fit in the usual data-hoarding theme, including posts that would usually be removed by rule 4: “No memes or 'look at this [thing]'”
Just make sure to tag the post with the flair [Free-Post Friday!] and give a little background info/context.
With the way things are going, I wouldn't be surprised if Internet Archive became a target for censorship. Does anyone know if there are backups hosted in other countries or plans to move their data?
In a 2016 blog post, they mentioned that they were planning to host a copy of the archive in Canada and that they have partial copies hosted in Egypt and the Netherlands. Is that still relevant information?
Bit of a niche case here, but I have an Asus ROG STRIX X870E-E GAMING WIFI mobo in a system that has a lot of HDDs, and the board only has four SATA ports.
I'm trying to avoid using my 2nd PCIe slot so that the primary doesn't drop to x8.
Is there an M.2 to mini SAS (x2) option out there that's reliable?
I found this but I don't know if it would work. Plus I don't know how to tell if my M.2 slots are SATA or NVMe slots... that's important, right?
Any opinions or suggestions are welcome!
Thank you.
https://brickshelf.com/ is shutting down March 1st.
I’m not well versed in scraping, but it would be sad to see so many Lego albums deleted, and there are lots of custom instructions on there too.
First of all, it sounds like a lot of you are doing really important work in light of what's going on.
This is a really simple task for you guys, but it's currently way beyond my skill set.
I ran a website that used a template from a company called Zenfolio. I still have it, but just want to download all of the blog entries, ideally with pics.
I haven't been sure who to ask until I saw this sub mentioned a lot today. An ELI10 format would be very much appreciated.
Hey,
I currently run 2x12TB and want to add more storage.
My main options are:
Buy 2x16TB and make two mirrors, for a total of 28TB
OR
Buy 2x12TB and make a raidz1 for a total of 36TB
Obviously the second option is not only cheaper but also provides more storage.
The problem is that the second option locks me further into 12TB drives, while the first makes it easier to expand with 16TB drives in the future.
Is it still worth going with 12TB drives, or will prices of higher-capacity drives drop quickly enough that I should start a 16TB array now?
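For anyone checking the numbers in the two options above, here is a minimal sketch of the capacity arithmetic. It assumes the simplest model: a mirror gives you the capacity of its smallest drive, and a raidz vdev gives you (drives minus parity) times the smallest drive. The function names are made up for illustration, and the results are raw figures that ignore filesystem overhead and the TB vs TiB difference.

```python
def mirror_usable_tb(drive_sizes_tb):
    """Usable capacity of one mirror vdev: limited by its smallest drive."""
    return min(drive_sizes_tb)


def raidz_usable_tb(drive_sizes_tb, parity=1):
    """Approximate usable capacity of one raidz vdev.

    Every drive is treated as the size of the smallest member, and
    `parity` drives' worth of space goes to redundancy.
    """
    if len(drive_sizes_tb) <= parity:
        raise ValueError("need more drives than the parity level")
    return (len(drive_sizes_tb) - parity) * min(drive_sizes_tb)


# Option 1: keep the existing 2x12TB mirror and add a second 2x16TB mirror.
option_mirrors = mirror_usable_tb([12, 12]) + mirror_usable_tb([16, 16])

# Option 2: put all four 12TB drives into a single raidz1.
option_raidz1 = raidz_usable_tb([12, 12, 12, 12], parity=1)

print(f"two mirrors:   {option_mirrors} TB")  # 28 TB
print(f"4x12TB raidz1: {option_raidz1} TB")   # 36 TB
```

The same helpers can be reused to compare a later upgrade, for example passing `[16, 16, 16, 16]` to `raidz_usable_tb` to see what a 16TB raidz1 would give.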
Never thought I'd have to think this, much less say it, but to all those of you who save humanity's data, I salute you.
You all are heroes in a super weird world.
Growing up, I always heard that storage was really expensive and that higher-capacity mechanical drives were more likely to give out under heavy use. How are storage prices and failure rates in the current era? I'm still using about 4TB between two drives.
I had this idea tonight of recording live TV news channels (digitally) on a 2-3 day loop. Could that be done with a PC or an SBC? I have a Raspberry Pi 5. Or would IPTV be the way to go?
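One way to get the "loop" part is to have whatever does the recording write short segment files into a folder, then periodically delete anything older than the window. Below is a rough sketch of only that cleanup step. It assumes a recorder (not shown) is already writing files into one directory; the function name and the directory path in the usage comment are hypothetical.

```python
import time
from pathlib import Path


def prune_old_recordings(directory, max_age_days=3, now=None):
    """Delete files in `directory` last modified more than `max_age_days` ago.

    Returns the list of paths that were removed. Subdirectories are skipped.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = []
    for path in Path(directory).iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed


# Example usage (hypothetical path), e.g. run hourly from a scheduler:
# prune_old_recordings("/mnt/recordings/news", max_age_days=3)
```

Because it only looks at modification times, this should work the same on a PC or a Raspberry Pi, whatever tool produces the recordings.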
I've been downloading TikToks for the past few weeks and my archive has grown to over 400k videos and almost 2TB. I've still got another 300ish creators to download so I've got a bit of time, but what would be the best practices for ensuring that I never lose my archive? I know that RAID exists and that RAID 6 has dual parity so you can lose up to 2 drives without losing any data, but that's about as much as I know. I have a bit of money but I'd like to do it fairly cheaply if at all possible, without adding too much labor or too many failure points.
Also, I've only ever used Windows machines regularly so Linux will be a challenge, but I'm up to it if that's the best choice.
Y’all probably feel so justified right now… it’s like being a survivalist/doomsday prepper and the zombie apocalypse just happens.
Appreciate y’all
(And of course this is ignoring the genuine fear, insecurity, and worries people are experiencing)
Hi fellow hoarders, I noticed the detailed data downloads from the Census Bureau (the FTP site) are down right now. Is this a coincidence or just routine maintenance?
https://www2.census.gov/geo/tiger/TIGER2024/
I would like to save all of this, as I use it for a lot of personal and professional work. And it's just cool.
The last 1080p story I saved was on January 6, and all 35 stories I've ripped since then are 720p. Very disappointing, as I would have screen recorded if I had known. Has Instagram blocked apps from ripping stories at max bitrate?
What apps or websites are you guys using?
Hello,
I recently found 2 Samsung SSDs (980 and 990) that I lost 6 months ago. Apparently, someone from the household was "cleaning" and accidentally left these SSDs outside. Oops ...
I live in the PNW. It's humid here and has been freezing for a week, with temps dropping to 25F. These are/were brand new SSDs, nothing stored on them.
Do you think the drives survived? Would this type of memory "survive" freezing temperatures? How can I double-check the integrity of these drives?
Thank you
Is there any group organizing an effort to create a shadow instance of "vital sites and information"? I would be willing to bet that many of us have at least some spare space and the ability to host things like cdc.screwfascists.com or whatever to make sure that things are continued. Maybe this could be the beginning of a trusted decentralized register of scientific and historical data. Not to step on Wikipedia's toes.
Given the news, I'm planning on putting my TubeArchivist instance to use for good. I don't think these are in the EOT archives, but if they are, feel free to ignore me.
So far I'm collecting:
I'm sure there's more, but the first two are my highest priority right now; I've had a handful of videos removed already.
Hello All,
I was going to use the BRFSS Places for a project, but the site is down.
Does anyone have the BRFSS places 500 data by census tract for the latest year?
I would really appreciate it.
I've got a QNAP 873A 8-bay NAS. Four Samsung QVO 870 8TB drives in a 32TB RAID0 array. Four 16TB EXOS X16s in a RAID6 configuration. HBS3 is set to real-time sync data from the RAID0 array to the RAID6. The SSD array is also backed up to Backblaze B2 nightly. The NAS has 64GB of RAM and a QUADRO T1000 8GB GPU, stock fans have been replaced with Noctua NF-P12's, all ethernet ports have been fitted with surge protection.
I mean, is that good enough, already? How much more can I do lol
I just wanted to ask if there's a way to help your efforts to save and archive public data from Trump's actions.
I got an Unraid setup at home and I want to do something to help you all out, because knowledge is so damn important.
Is there a simple Docker container I could set up? Can I lend a hand somehow?
I hope this is the right sub...
Thanks in advance xxo
I'm seeking a refurb 14-16 TB internal SATA drive for backup and cold (not connected) storage. So it's not critical that it works perfectly every day for the next 5 years, if you get what I'm saying.
It appears all the deals have dried up after Xmas. How long must I wait until these types of deals reappear?
Is there any benefit to using the yt-dlp command line as opposed to just JDownloader?
I need more storage but I’m limited on my budget. I need people’s opinion.
Two 8TB drives in a mirror, or one 20TB drive now with another added later to mirror it?
Is it safe-ish to run one drive for, say, 6-8 months before I can get another?
Hey everyone! We're excited to announce the release of SOSSE v1.12.0, the latest version of our open-source web archiving software, crawler, and search engine.
For those unfamiliar, SOSSE (Selenium Open Source Search Engine) lets you:
📖 Full docs: https://sosse.readthedocs.io/
🐙 GitHub: https://github.com/biolds/sosse
🦊 GitLab: https://gitlab.com/biolds1/sosse
💬 Join us on Discord: https://discord.gg/Vt9cMf7BGK
We're running a short survey to help prioritize new features and gauge interest in professional support. If you've used SOSSE or are interested, please take a moment to fill it out:
➡️ https://framaforms.org/202502-sosse-survey-1738309561
Your feedback is invaluable! Let us know what you think about v1.12.0! 🚀