Bug 29644

Summary: Offsite backup/system
Product: sourceware Reporter: Mark Wielaard <mark>
Component: Infrastructure Assignee: overseers mailing list <overseers>
Status: NEW ---    
Severity: normal CC: fche
Priority: P2    
Version: unspecified   
Target Milestone: ---   
Host: Target:
Build: Last reconfirmed:

Description Mark Wielaard 2022-10-02 21:39:34 UTC
Sourceware currently has a life backup system (running in the same rack as the main machine) that could take over the main server if it goes down. But we don't have a full offsite backup system (even though most project data is mirrored in various places).

We should make a plan and budget for having a full system backup offsite. How much data changes per hour/day which needs to be stored offsite. And write up procedures for how to (re)create a life system from that in case of emergency.
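
One rough way to approach the "how much data changes per hour/day" question is to sum the sizes of files whose modification time falls inside a given window. The sketch below is only an illustration: the function name `changed_bytes` and the idea of using mtime as the change signal are assumptions, not anything sourceware actually runs. It counts whole files, so it overestimates what a delta-based tool would transfer, and it cannot see deletions.

```python
import os
import stat
import time


def changed_bytes(root, window_seconds, now=None):
    """Sum the sizes of regular files under root modified within the window.

    Hypothetical helper: uses mtime as a proxy for "changed", which is an
    assumption and only gives an upper-bound style estimate.
    """
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat so symlinks are not followed
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_mtime >= cutoff:
                total += st.st_size
    return total
```

Calling it with `window_seconds=3600` and again with `window_seconds=86400` on the data tree would give an hourly and a daily figure to budget against.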
Comment 1 iank 2022-10-03 04:18:58 UTC
mark at klomp dot org via Overseers <overseers@sourceware.org> writes:

> https://sourceware.org/bugzilla/show_bug.cgi?id=29644
>
>             Bug ID: 29644
>            Summary: Offsite backup/system
>            Product: sourceware
>            Version: unspecified
>             Status: NEW
>           Severity: normal
>           Priority: P2
>          Component: Infrastructure
>           Assignee: overseers at sourceware dot org
>           Reporter: mark at klomp dot org
>   Target Milestone: ---
>
> Sourceware currently has a life backup system (running in the same rack as the
> main machine) that could take over the main server if it goes down. But we
> don't have a full offsite backup system (even though most project data is
> mirrored in various places).
>
> We should make a plan and budget for having a full system backup offsite. How
> much data changes per hour/day which needs to be stored offsite. And write up
> procedures for how to (re)create a life system from that in case of emergency.

At the FSF, we do nightly backups to a physical machine we run with all
free software including the BIOS, and monthly backups to tape that are
manually stored offline in a separate location for disaster recovery. I
bet we have plenty of storage room to back up sourceware's data too (how
much are we talking?). I presume it would make sense to allow the
sourceware servers to write and read backups, but not delete them
without a human being involved.

Note: I think the right term is "live system" instead of "life system".
Comment 2 Frank Ch. Eigler 2022-10-03 16:37:13 UTC
Due to an rsyncd setup in effect on sourceware, it is possible to read-only rsync the entire corpus of sourceware data to an off-site backup volume.  In fact I have such a thing going as we speak, and <2TB of live data updates within an hour or so.  (It's comforting that the system does not store confidential data, so user privacy is not seriously at risk.)

I'll experiment with a local VM restore and report & write up a SOP on the sourceware-wiki, at which point I think we're done.
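
For readers unfamiliar with pulling from an rsync daemon, here is a minimal sketch of how such a read-only pull command could be assembled. The host name, the module name, and the destination path used below are placeholders; the thread does not say which rsyncd modules sourceware exports, and this is not the actual command in use. `--delete` is left out on purpose so that files removed on the source are not removed from the backup automatically.

```python
def rsync_pull_command(host, module, dest, dry_run=True):
    """Build an rsync argv that pulls one rsyncd module into dest.

    host, module and dest are hypothetical values supplied by the caller.
    Only standard rsync options are used: --archive, --numeric-ids, --dry-run.
    """
    cmd = ["rsync", "--archive", "--numeric-ids"]
    if dry_run:
        # Show what would be transferred without writing anything.
        cmd.append("--dry-run")
    # rsync:// URL syntax addresses a module served by an rsync daemon.
    cmd.append(f"rsync://{host}/{module}/")
    # Trailing slash so contents land directly inside dest.
    cmd.append(dest.rstrip("/") + "/")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once the dry run output looks right, for example `rsync_pull_command("rsync.example.org", "corpus", "/srv/offsite")` with made-up names.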
Comment 3 Mark Wielaard 2022-10-06 17:40:45 UTC
Another option (or maybe a second backup) might be the OSUOSL Backups service:
https://osuosl.org/services/hosting/details/#backups

  Backups

  This service is to be used for disaster recovery rather than data recovery,
  meaning we keep backups for a limited period of time (usually long enough to
  provide a couple of full data sets that can be used to rebuild a server as
  opposed to recovering files from long ago). We currently utilize rdiff-backup
  for file storage backups and a variety of other tools for database backups.