Bug 29618

Summary: Archive projects at Software Heritage
Product: sourceware
Reporter: Mark Wielaard <mark>
Component: Infrastructure
Assignee: overseers mailing list <overseers>
Status: RESOLVED FIXED
Severity: normal
CC: fche, pabs3
Priority: P2
Version: unspecified
Target Milestone: ---
Host:
Target:
Build:
Last reconfirmed:

Description Mark Wielaard 2022-09-27 13:24:14 UTC
Besides 25 active projects, sourceware also contains the history of another 40 projects that have either stopped being developed or have moved on completely to other hosting services. We should make sure to properly archive at least the source code repositories at https://www.softwareheritage.org/

We should also set up an automatic capture of all active code repositories:
https://archive.softwareheritage.org/add-forge/request/create/

This will also help with our secure supply chain story because some upstreams check that SHA hashes can (also) be found in the Software Heritage archive.
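
As a rough illustration of the hash check mentioned above: Software Heritage content identifiers (SWHIDs of the form swh:1:cnt:<hash>) use the same SHA-1 that git computes for a blob. The sketch below derives that identifier for a local file. The function name swhid_for_file and the temporary example file are made up for illustration, and it assumes sha1sum is available.

```shell
#!/bin/sh
# Compute the git-style blob SHA-1 of a file and print it as a SWHID.
# A git blob hash is sha1("blob <size>\0" followed by the file bytes).
swhid_for_file() {
    file=$1
    # wc may pad its output with spaces on some systems; strip them.
    size=$(wc -c < "$file" | tr -d ' ')
    hash=$( { printf 'blob %s\0' "$size"; cat "$file"; } | sha1sum | cut -d' ' -f1 )
    printf 'swh:1:cnt:%s\n' "$hash"
}

# Example: a temporary file (hypothetical content) holding "hello" and a newline.
tmp=$(mktemp)
printf 'hello\n' > "$tmp"
swhid_for_file "$tmp"
rm -f "$tmp"
```

The printed hash should match what git hash-object reports for the same file. Whether a given identifier is actually present in the archive still has to be checked against the archive itself.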
Comment 1 Paul Wise 2023-05-16 09:43:20 UTC
I have submitted the Sourceware cgit instance to the Software Heritage add forge interface; they will contact sourcemaster@sourceware.org for approval at some point.

https://sourceware.org/cgit
Comment 2 Paul Wise 2023-05-16 09:44:29 UTC
I have also submitted the cgit instance for a once-off upload to archive.org via ArchiveTeam's Codearchiver project:

https://wiki.archiveteam.org/index.php/Codearchiver
Comment 3 Mark Wielaard 2023-06-04 17:52:10 UTC
(In reply to Paul Wise from comment #1)
> I have submitted the Sourceware cgit instance to the Software Heritage add
> forge interface, they will contact sourcemaster@sourceware.org for approval
> at some point.
> 
> https://sourceware.org/cgit

Thanks, the request for approval was acknowledged.

There are also:

https://cygwin.com/cgit
https://gcc.gnu.org/cgit
https://git.dwarfstd.org/

Which might be good to have archived too.

Then there are older subversion and cvs repos.
I don't believe we have a public list of those.
Comment 4 Mark Wielaard 2023-06-04 18:03:30 UTC
There seem to be only three subversion repos:

svn checkout svn://sourceware.org/svn/prelink/
svn checkout svn://sourceware.org/svn/kawa/
svn checkout svn://gcc.gnu.org/svn/gcc

The gcc one has been converted to git already so isn't essential.
Comment 5 Mark Wielaard 2023-06-04 18:15:17 UTC
cvs is a bit trickier, but I believe this is the list of commands to check out the public repos and modules:

cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/autobook co autobook
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/autoconf co autoconf
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/automake co automake
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cluster co cluster
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cluster co conga
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cluster co felix
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cluster co ftp
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co crypt
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co csih
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co cvsmaint
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co cygrunsrv
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co cygupdate
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co cygutils
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co cygwin-doc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co editrights
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co genini
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co libgetopt++
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co login
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co mt
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co packaging
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co rebase
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co resedit
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co robots
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co run
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co setup
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co shutdown
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co up2date
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co upload
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co windows-default-manifest
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co wrappers
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/dm co device-mapper
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/dm co dmraid
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/dm co multipath-tools
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/dm co people
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/docbook-tools co docbook-tools
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co autotools
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co branding
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co bugzilla
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co changelog
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co cheatsheets
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co mylyn
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co org.eclipse.team.bugs
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co org.sourceware.update
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co pydev
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co rpm
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/ecos co ecos
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/ecos co ecos-opt
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/elix co elix
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-common
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-core
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-gtk
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-gui
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-imports
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-sys
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-top
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-utrace
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co benchmarks
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co boehm-gc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co gcc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co old-gcc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co repository
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co testrun
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co wwwdocs
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gettext co gettext
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/glibc co libc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/glibc co linuxthreads
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/glibc co ports
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gsl co gsl
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/guile co guile
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/guile co guile-modules
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/installshell co installshell
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/inti co inti
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/ip-over-scsi co ipscsi
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/java co libgcj
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/kawa co kawa
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/kawa co software
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/libaio co ftp
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/libaio co libaio
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/libffi co libffi
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/libstdc++ co libstdc++
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/lvm co ftp
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/lvm co LVM
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/lvm2 co ftp
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/lvm2 co LVM2
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/lvm2 co system-config-lvm
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co builder
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co jacks
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co mauve
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co rmic-tests
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co serialization
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co verify
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co wonka
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mingw co mingw
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mingw co w32api
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/patchutils co ftp
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/patchutils co patchutils
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/piranha co code
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/procps co procps
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/pthreads-win32 co bossom
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/pthreads-win32 co pthreads
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhdb co src
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co ant-for-jhbuild
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co ecj-for-jhbuild
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co eclipse-gcj
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co fakejdk
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co gcj-jit
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co gcjwebplugin-test
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co java-gcj-compat
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co rhug
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rpm2html co rpm2html
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rpm2html co rpmfind
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/sourcenav co src
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/src co morpho
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/src co newlib
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/src co sockets
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/src co src
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/src co win32-x11
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co archpaper
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co doc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co froggy
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co kprobes_test
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co patches
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co src
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co tests
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co utracer
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/xconq co xconq
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/XOpenWin co dlls

These are all dormant and have either been converted to git or moved somewhere else.
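
The checkout list above can also be driven from a small table of "root module" pairs. This is only a sketch: the DRY_RUN variable and the per-root directory layout are assumptions added here, not part of the original commands. Each root gets its own directory because several roots share module names (for example ftp and src), which would otherwise collide in one working directory.

```shell
#!/bin/sh
# Print (dry run) or run an anonymous checkout for each "root module" pair.
# Assumption: DRY_RUN defaults to 1, so nothing touches the network by default.
DRY_RUN=${DRY_RUN:-1}

cvs_checkout() {
    root=$1
    module=$2
    cvsroot=":pserver:anoncvs@sourceware.org:/cvs/$root"
    if [ "$DRY_RUN" = 1 ]; then
        echo "(cd $root && cvs -z9 -d $cvsroot co $module)"
    else
        # Check out inside a per-root directory; keep cvs away from our stdin.
        mkdir -p "$root" && (cd "$root" && cvs -z9 -d "$cvsroot" co "$module" </dev/null)
    fi
}

# A few pairs taken from the list above; extend with the remaining lines.
while read -r root module; do
    if [ -n "$root" ]; then
        cvs_checkout "$root" "$module"
    fi
done <<'EOF'
autobook autobook
cluster conga
lvm2 LVM2
EOF
```

Running it with DRY_RUN=0 would perform the checkouts for real.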
Comment 6 Paul Wise 2023-06-05 00:45:15 UTC
I have submitted the three cgit instances to the Software Heritage forge registration. I have also submitted the following to the Software Heritage "save code now" interface: all the git repos for the GCC/DWARF cgit instances and all the SVN repos. I'm in the process of submitting the many git repos for the Cygwin cgit instance. I tried submitting the CVS repos and got an error, so I asked on the #swh IRC channel on Libera and am waiting for a response. All of the SVN and CVS repos will require SWH approval before they get archived.

I also pointed the ArchiveTeam Codearchiver folks at the cgit/svn/cvs lists above.
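
For many origins, the submissions could in principle be scripted rather than entered one by one. The sketch below only prints curl commands and contacts nothing. The endpoint path is an assumption based on the Software Heritage web API's origin save endpoint, and should be checked against the API documentation (including any authentication or rate limits) before anything is run. The function name save_request_cmd is made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print one curl command per origin for a "save code now" request.
# Assumption: the endpoint path below matches the Software Heritage web API;
# verify it in the API documentation before running any printed command.
API=https://archive.softwareheritage.org/api/1/origin/save

save_request_cmd() {
    visit_type=$1   # for example git or svn
    origin_url=$2
    echo "curl -X POST $API/$visit_type/url/$origin_url/"
}

# Origin taken from the SVN list earlier in this bug.
save_request_cmd svn svn://gcc.gnu.org/svn/gcc
```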
Comment 7 jsm-csl@polyomino.org.uk 2023-06-05 18:41:21 UTC
The best way to archive a CVS repository is probably via rsync, see 
https://gcc.gnu.org/rsync.html for anonymous rsync instructions; I don't 
think CVS pserver protocol is well-designed for replicating complete 
repository contents.  (SVN repositories are also available for rsync, 
though svnrdump should also work for those.)
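
A minimal sketch of the svnrdump route for the three SVN repos listed earlier. It only prints the commands; the .svndump output file names are an arbitrary choice made here.

```shell
#!/bin/sh
# Print the svnrdump command that would dump one remote repository to a file.
svn_dump_cmd() {
    url=$1
    # basename ignores a trailing slash, so ".../prelink/" yields "prelink".
    name=$(basename "$url")
    echo "svnrdump dump $url > $name.svndump"
}

for url in \
    svn://sourceware.org/svn/prelink/ \
    svn://sourceware.org/svn/kawa/ \
    svn://gcc.gnu.org/svn/gcc
do
    svn_dump_cmd "$url"
done
```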
Comment 8 Mark Wielaard 2023-06-05 20:02:51 UTC
Yes, indeed rsync would be a better and simpler way to mirror the cvs repos.
The cvs code repos available via rsync are:

rsync://sourceware.org/autobook-cvs
rsync://sourceware.org/autoconf-cvs
rsync://sourceware.org/automake-cvs
rsync://sourceware.org/binutils-cvs
rsync://sourceware.org/bzip2-cvs
rsync://sourceware.org/catapult-cvs
rsync://sourceware.org/cgen-cvs
rsync://sourceware.org/cluster-cvs
rsync://sourceware.org/cygwin-cvs
rsync://sourceware.org/dm-cvs
rsync://sourceware.org/dominion-cvs
rsync://sourceware.org/eclipse-cvs
rsync://sourceware.org/ecos-cvs
rsync://sourceware.org/elix-cvs
rsync://sourceware.org/frysk-cvs
rsync://sourceware.org/gcc-cvs
rsync://sourceware.org/gdb-cvs
rsync://sourceware.org/gettext-cvs
rsync://sourceware.org/glibc-cvs
rsync://sourceware.org/gsl-cvs
rsync://sourceware.org/guile-cvs
rsync://sourceware.org/ha-cvs
rsync://sourceware.org/insight-cvs
rsync://sourceware.org/inti-cvs
rsync://sourceware.org/java-cvs
rsync://sourceware.org/jffs2-cvs
rsync://sourceware.org/kawa-cvs
rsync://sourceware.org/libabigail-cvs
rsync://sourceware.org/libaio-cvs
rsync://sourceware.org/libffi-cvs
rsync://sourceware.org/libstdc++-cvs
rsync://sourceware.org/lvm2-cvs
rsync://sourceware.org/lvm-cvs
rsync://sourceware.org/mauve-cvs
rsync://sourceware.org/mingw-cvs
rsync://sourceware.org/netresolve-cvs
rsync://sourceware.org/newlib-cvs
rsync://sourceware.org/patchutils-cvs
rsync://sourceware.org/piranha-cvs
rsync://sourceware.org/procps-cvs
rsync://sourceware.org/psim-cvs
rsync://sourceware.org/rda-cvs
rsync://sourceware.org/rhdb-cvs
rsync://sourceware.org/rhl-cvs
rsync://sourceware.org/rhug-cvs
rsync://sourceware.org/rpm2html-cvs
rsync://sourceware.org/sharutils-cvs
rsync://sourceware.org/sid-cvs
rsync://sourceware.org/sourcenav-cvs
rsync://sourceware.org/src-cvs
rsync://sourceware.org/systemtap-cvs
rsync://sourceware.org/testcvs-cvs
rsync://sourceware.org/win32-x11-cvs
rsync://sourceware.org/xconq-cvs
rsync://sourceware.org/XOpenWin-cvs
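
Mirroring these modules locally could look roughly like the sketch below. It prints the rsync commands rather than running them. The DEST directory name is hypothetical, and the -az flags (archive mode plus compression) are one reasonable choice, not something prescribed here.

```shell
#!/bin/sh
# Print an rsync command that would mirror one module into a local directory.
# Assumption: DEST is a hypothetical destination directory chosen for this sketch.
DEST=${DEST:-cvs-mirror}

rsync_cmd() {
    module=$1
    echo "rsync -az rsync://sourceware.org/$module/ $DEST/$module/"
}

# A few modules from the list above; extend with the remaining names.
for module in autobook-cvs gcc-cvs XOpenWin-cvs; do
    rsync_cmd "$module"
done
```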
Comment 9 Paul Wise 2023-06-06 01:03:28 UTC
Yeah, the SWH folks said on IRC that the CVS archiving expects an rsync URL.

The current status is now:

Regular forge listing and repo archiving enabled for all four cgit instances.

All the git repositories on sourceware.org/cgit that I submitted already got archived. I noticed that I missed the cygwin-apps ones there though; I am submitting them now.

All the git repositories on gcc.gnu.org/cgit and git.dwarfstd.org that I submitted got archived.

The submissions for cygwin.com/cgit are still ongoing.

The submissions for sourceware.org SVN repos: all accepted, kawa/prelink finished, gcc ongoing.

The submissions for sourceware CVS repos are still ongoing.
Comment 10 Paul Wise 2023-06-10 01:12:46 UTC
The current status of the forges:

* Accepted: https://sourceware.org/cgit
* First origin loaded: https://cygwin.com/cgit https://gcc.gnu.org/cgit https://git.dwarfstd.org/

I've asked on the #swh IRC channel about the process from Accepted to First origin loaded.
Comment 11 Paul Wise 2023-06-10 01:40:06 UTC
The status of individual repos:

* https://sourceware.org/git/*: all requested, accepted and succeeded
* https://cygwin.com/git/*: all requested, accepted and scheduled/succeeded
* https://gcc.gnu.org/git/*: all requested, accepted and succeeded
* https://git.dwarfstd.org/*: all requested, accepted and succeeded
* svn://sourceware.org/svn/{prelink,kawa}/: all requested, accepted and succeeded
* svn://gcc.gnu.org/svn/gcc: requested, accepted, running (but a previous run from last month failed)
* rsync://sourceware.org/*: all requested, accepted/rejected (see below), and succeeded/failed (see below)

These are the failures for the CVS repos:

rsync://sourceware.org/testcvs-cvs/gcc
rsync://sourceware.org/systemtap-cvs/private
rsync://sourceware.org/src-cvs/src
rsync://sourceware.org/src-cvs/newlib
rsync://sourceware.org/src-cvs/htdocs
rsync://sourceware.org/gcc-cvs/testrun
rsync://sourceware.org/gcc-cvs/repository
rsync://sourceware.org/libaio-cvs/htdocs
rsync://sourceware.org/gcc-cvs/gcc
rsync://sourceware.org/procps-cvs/procps
rsync://sourceware.org/frysk-cvs/frysk
rsync://sourceware.org/lvm2-cvs/system-config-lvm
rsync://sourceware.org/lvm2-cvs/htdocs
rsync://sourceware.org/lvm2-cvs/ftp
rsync://sourceware.org/procps-cvs/htdocs.saf
rsync://sourceware.org/lvm2-cvs/LVM2
rsync://sourceware.org/netresolve-cvs/htdocs
rsync://sourceware.org/eclipse-cvs/htdocs
rsync://sourceware.org/dominion-cvs/htdocs
rsync://sourceware.org/dm-cvs/people

These are the rejections for the CVS repos:

rsync://sourceware.org/XOpenWin-cvs
rsync://sourceware.org/xconq-cvs
rsync://sourceware.org/win32-x11-cvs
rsync://sourceware.org/testcvs-cvs
rsync://sourceware.org/systemtap-cvs
rsync://sourceware.org/src-cvs
rsync://sourceware.org/sourcenav-cvs
rsync://sourceware.org/sid-cvs
rsync://sourceware.org/sharutils-cvs
rsync://sourceware.org/rpm2html-cvs
rsync://sourceware.org/rhug-cvs
rsync://sourceware.org/rhl-cvs 
rsync://sourceware.org/rhdb-cvs
rsync://sourceware.org/rda-cvs
rsync://sourceware.org/psim-cvs
rsync://sourceware.org/procps-cvs
rsync://sourceware.org/piranha-cvs
rsync://sourceware.org/patchutils-cvs
rsync://sourceware.org/newlib-cvs
rsync://sourceware.org/netresolve-cvs
rsync://sourceware.org/mingw-cvs
rsync://sourceware.org/mauve-cvs
rsync://sourceware.org/lvm-cvs
rsync://sourceware.org/lvm2-cvs
rsync://sourceware.org/libstdc++-cvs
rsync://sourceware.org/libffi-cvs
rsync://sourceware.org/libaio-cvs
rsync://sourceware.org/libabigail-cvs
rsync://sourceware.org/kawa-cvs
rsync://sourceware.org/jffs2-cvs
rsync://sourceware.org/java-cvs
rsync://sourceware.org/inti-cvs
rsync://sourceware.org/insight-cvs
rsync://sourceware.org/ha-cvs
rsync://sourceware.org/guile-cvs
rsync://sourceware.org/gsl-cvs
rsync://sourceware.org/glibc-cvs
rsync://sourceware.org/gettext-cvs
rsync://sourceware.org/gdb-cvs
rsync://sourceware.org/gcc-cvs
rsync://sourceware.org/frysk-cvs
rsync://sourceware.org/elix-cvs
rsync://sourceware.org/ecos-cvs
rsync://sourceware.org/eclipse-cvs
rsync://sourceware.org/dominion-cvs
rsync://sourceware.org/dm-cvs
rsync://sourceware.org/cygwin-cvs
rsync://sourceware.org/cluster-cvs
rsync://sourceware.org/cgen-cvs
rsync://sourceware.org/catapult-cvs
rsync://sourceware.org/bzip2-cvs
rsync://sourceware.org/binutils-cvs
rsync://sourceware.org/automake-cvs
rsync://sourceware.org/autoconf-cvs
rsync://sourceware.org/autobook-cvs
Comment 12 Paul Wise 2023-06-10 01:41:43 UTC
I think what happened with the CVS rsync://sourceware.org/* repos was that the top-level URLs were rejected, but the individual CVS repos within those URLs were accepted and succeeded. Where the paths within those URLs were not CVS repos, they failed.
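
To make the distinction concrete, the two kinds of URL can be told apart by path depth. The sketch below is only an illustration of that split, not how Software Heritage decides anything. The function name classify_url is made up here.

```shell
#!/bin/sh
# Classify an rsync URL: a URL naming only an rsync module is "top-level";
# one with a further path component below the module is "nested".
classify_url() {
    path=${1#rsync://sourceware.org/}   # drop the scheme and host
    path=${path%/}                      # ignore one trailing slash
    case $path in
        */*) echo "nested" ;;
        *)   echo "top-level" ;;
    esac
}

for url in \
    rsync://sourceware.org/lvm2-cvs \
    rsync://sourceware.org/lvm2-cvs/LVM2 \
    rsync://sourceware.org/lvm2-cvs/htdocs
do
    printf '%s %s\n' "$(classify_url "$url")" "$url"
done
```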
Comment 13 Paul Wise 2023-06-10 01:41:56 UTC
PS: the status comments above were derived from these pages:

https://archive.softwareheritage.org/add-forge/request/list/
https://archive.softwareheritage.org/save/list/
Comment 14 Paul Wise 2023-06-10 01:54:40 UTC
I would like to suggest that you also register a Sourceware account at archive.org, tar up the rsync repos and upload them via the archive.org upload page. Some of the data may be private (systemtap-cvs/private, for example), so maybe filter/check before uploading.

https://archive.org/account/signup
https://archive.org/upload/

There is also an API and a command-line tool called ia for uploading and related tasks; it is available in the Debian internetarchive package and on GitHub:

https://archive.org/developers/internetarchive/
https://github.com/jjjake/internetarchive
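
A rough sketch of the tar-and-upload step using the ia tool, printing the commands only. The item identifier is hypothetical and the mediatype value is a guess at a sensible setting. ia needs credentials set up first (ia configure). The exclude pattern covers only the one private directory named above, so a manual check of the contents is still needed.

```shell
#!/bin/sh
# Print the commands that would pack one mirrored module and upload it.
# Assumption: ITEM is a hypothetical archive.org item identifier.
ITEM=${ITEM:-sourceware-cvs-example}

archive_cmds() {
    module=$1
    # Leave out the module's "private" directory before packing.
    echo "tar -czf $module.tar.gz --exclude=$module/private $module"
    echo "ia upload $ITEM $module.tar.gz --metadata=mediatype:software"
}

archive_cmds systemtap-cvs
```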
Comment 15 Paul Wise 2023-06-10 01:58:19 UTC
I'm not sure what to do about the CVS failures. Most of them are probably not CVS repos, though some do look like they may be. Perhaps you could check and, if any of them are, follow up with Software Heritage about getting access to the error and figuring out a fix.

I definitely suggest also uploading all the CVS/SVN repos to archive.org, so there is a copy in case anyone ever wants to investigate the failures further.
Comment 16 Mark Wielaard 2023-06-23 19:45:48 UTC
Paul, thanks so much for getting all the sources into the Software Heritage Archive:
https://archive.softwareheritage.org/browse/search/?q=sourceware.org&with_content=true

Several of our other sites (cygwin.com, gcc.gnu.org, dwarfstd.org) are now also listed under Regular crawling - Git.

And it looks like the cvs archives were all imported through the rsync links.

I think the Software Heritage Archive part is done now. They should also pick up any new repos from now on.

Should we open another bug for archive.org?
Comment 17 Paul Wise 2023-06-24 01:41:22 UTC
The svn://gcc.gnu.org/svn/gcc SWH import failed again. Some of the SWH imports of the CVS repos failed too. I agree that it isn't worth trying to get these fixed.

A new bug for the archive.org upload sounds good, please add me to CC. I can also check if the ArchiveTeam Codearchiver folks did anything and report back in the new bug.
Comment 18 Frank Ch. Eigler 2023-07-24 19:12:42 UTC
Software Heritage uploads done, thanks a lot!