Sourceware currently has a live backup system (running in the same rack as the main machine) that could take over from the main server if it goes down. But we don't have a full offsite backup system (even though most project data is mirrored in various places). We should make a plan and budget for a full offsite system backup, including how much data changes per hour/day and needs to be stored offsite. We should also write up procedures for how to (re)create a live system from that backup in case of emergency.
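To put a rough number on how much data changes per hour or day, one option is to sum the sizes of files modified within a time window. The following is a minimal sketch, not an existing sourceware tool: the function name `changed_bytes` and its parameters are made up for illustration. It counts whole file sizes, so it gives an upper bound on what an incremental transfer would need to send, and it does not notice deleted files.

```python
import os
import stat
import time


def changed_bytes(root, window_seconds, now=None):
    """Sum the sizes of regular files under root modified within the window.

    Hypothetical helper: an upper-bound estimate of data churn, since a
    file that changed by one byte still counts with its full size.
    """
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_mtime >= cutoff:
                total += st.st_size
    return total
```

Calling `changed_bytes(path, 3600)` and `changed_bytes(path, 86400)` on a data directory would give hourly and daily estimates to feed into the storage budget.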
mark at klomp dot org via Overseers <overseers@sourceware.org> writes:

> https://sourceware.org/bugzilla/show_bug.cgi?id=29644
>
> Bug ID: 29644
> Summary: Offsite backup/system
> Product: sourceware
> Version: unspecified
> Status: NEW
> Severity: normal
> Priority: P2
> Component: Infrastructure
> Assignee: overseers at sourceware dot org
> Reporter: mark at klomp dot org
> Target Milestone: ---
>
> Sourceware currently has a life backup system (running in the same rack as the
> main machine) that could take over the main server if it goes down. But we
> don't have a full offsite backup system (even though most project data is
> mirrored in various places).
>
> We should make a plan and budget for having a full system backup offsite. How
> much data changes per hour/day which needs to be stored offsite. And write up
> procedures for how to (re)create a life system from that in case of emergency.

At the FSF, we do nightly backups to a physical machine we run with all free software, including the BIOS, and monthly backups to tape that are manually stored offline in a separate location for disaster recovery. I bet we have plenty of storage room to back up sourceware's data too (how much are we talking?).

I presume it would make sense to allow the sourceware servers to write and read backups, but not delete them without a human being involved.

Note: I think the right term is "live system" instead of "life system".
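The "write and read, but not delete" idea can be partly illustrated on the writing side. This is only a sketch under assumptions: `store_backup` is a hypothetical helper, not something the FSF or sourceware runs. It refuses to overwrite an existing backup file. Actually preventing deletion would have to be enforced on the backup host itself (for example through permissions), which this code does not do.

```python
import os


def store_backup(root, name, data):
    """Write data to root/name, refusing to overwrite an existing backup.

    Hypothetical helper for illustration only.
    """
    os.makedirs(root, exist_ok=True)
    path = os.path.join(root, name)
    # Mode "xb" is exclusive creation: open() raises FileExistsError
    # if the file already exists, so earlier backups are never replaced.
    with open(path, "xb") as f:
        f.write(data)
    return path
```

A second call with the same name raises `FileExistsError` and leaves the first backup untouched.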
Due to an rsyncd setup in effect on sourceware, it is possible to read-only rsync the entire corpus of sourceware data to an off-site backup volume. In fact I have such a thing going as we speak, and the <2TB of live data updates within an hour or so. (It's comforting that the system does not store confidential data, so user privacy is not seriously at risk.) I'll experiment with a local VM restore, report back, and write up an SOP on the sourceware wiki, at which point I think we're done.
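For reference, a read-only pull from an rsync daemon could be scripted roughly as follows. This is a sketch, not the actual setup described above: the helper names are made up, and the daemon host and module name are not given in this thread, so the caller has to supply them. It deliberately omits `--delete`, so files removed on the source stay in the backup copy.

```python
import subprocess


def rsync_pull_command(host, module, dest, dry_run=False):
    """Build an rsync argv that pulls a daemon module into dest.

    Hypothetical helper; host and module are placeholders supplied by
    the caller, not values taken from the real sourceware configuration.
    """
    src = f"rsync://{host}/{module}/"
    cmd = ["rsync", "--archive", "--hard-links", "--numeric-ids"]
    if dry_run:
        cmd.append("--dry-run")  # show what would be transferred, copy nothing
    cmd += [src, dest]
    return cmd


def run_pull(host, module, dest):
    """Run the pull and raise CalledProcessError if rsync fails."""
    subprocess.run(rsync_pull_command(host, module, dest), check=True)
```

Running it first with `dry_run=True` is a cheap way to see how much would be transferred before committing storage to it.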
Another option (or maybe a second backup) might be the OSUOSL Backups service:
https://osuosl.org/services/hosting/details/#backups

> Backups
>
> This service is to be used for disaster recovery rather than data recovery,
> meaning we keep backups for a limited period of time (usually long enough to
> provide a couple of full data sets that can be used to rebuild a server as
> opposed to recovering files from long ago). We currently utilize rdiff-backup
> for file storage backups and a variety of other tools for database backups.
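The "couple of full data sets" retention described there can be sketched generically. This is not how OSUOSL or rdiff-backup implements it: `snapshots_to_prune` is a hypothetical helper, and it assumes snapshots are named by ISO date so that sorting the names sorts them by age. It only returns the list of candidates, leaving the actual removal to a human.

```python
def snapshots_to_prune(snapshot_names, keep=2):
    """Return snapshot names older than the newest `keep` ones.

    Hypothetical helper; assumes names such as "2022-10-03" that sort
    chronologically when sorted as strings.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    ordered = sorted(snapshot_names)
    # Everything except the last `keep` entries is a pruning candidate.
    return ordered[:-keep]
```

With three dated snapshots and `keep=2`, only the oldest one is reported as a candidate for removal.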