Summary: | Archive projects at Software Heritage | ||
---|---|---|---|
Product: | sourceware | Reporter: | Mark Wielaard <mark> |
Component: | Infrastructure | Assignee: | overseers mailing list <overseers> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | fche, pabs3 |
Priority: | P2 | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: |
Description
Mark Wielaard
2022-09-27 13:24:14 UTC
I have submitted the Sourceware cgit instance to the Software Heritage add forge interface, they will contact sourcemaster@sourceware.org for approval at some point.

https://sourceware.org/cgit

I have also submitted the cgit instance for a once-off upload to archive.org via ArchiveTeam's Codearchiver project:

https://wiki.archiveteam.org/index.php/Codearchiver

(In reply to Paul Wise from comment #1)
> I have submitted the Sourceware cgit instance to the Software Heritage add
> forge interface, they will contact sourcemaster@sourceware.org for approval
> at some point.
>
> https://sourceware.org/cgit

Thanks, the request for approval was acknowledged.

There are also:
https://cygwin.com/cgit
https://gcc.gnu.org/cgit
https://git.dwarfstd.org/
Which might be good to have archived too.

Then there are older subversion and cvs repos. I don't believe we have a public list of those.

There seem to be only three subversion repos:
svn checkout svn://sourceware.org/svn/prelink/
svn checkout svn://sourceware.org/svn/kawa/
svn checkout svn://gcc.gnu.org/svn/gcc

The gcc one has been converted to git already so isn't essential.
cvs is a bit trickier, but I believe this is the list to check out the public repos and modules:

cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/autobook co autobook
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/autoconf co autoconf
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/automake co automake
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cluster co cluster
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cluster co conga
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cluster co felix
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cluster co ftp
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co crypt
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co csih
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co cvsmaint
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co cygrunsrv
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co cygupdate
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co cygutils
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co cygwin-doc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co editrights
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co genini
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co libgetopt++
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co login
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co mt
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co packaging
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co rebase
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co resedit
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co robots
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co run
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co setup
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co shutdown
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co up2date
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co upload
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co windows-default-manifest
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/cygwin-apps co wrappers
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/dm co device-mapper
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/dm co dmraid
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/dm co multipath-tools
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/dm co people
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/docbook-tools co docbook-tools
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co autotools
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co branding
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co bugzilla
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co changelog
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co cheatsheets
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co mylyn
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co org.eclipse.team.bugs
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co org.sourceware.update
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co pydev
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/eclipse co rpm
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/ecos co ecos
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/ecos co ecos-opt
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/elix co elix
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-common
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-core
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-gtk
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-gui
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-imports
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-sys
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-top
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/frysk co frysk-utrace
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co benchmarks
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co boehm-gc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co gcc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co old-gcc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co repository
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co testrun
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gcc co wwwdocs
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gettext co gettext
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/glibc co libc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/glibc co linuxthreads
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/glibc co ports
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/gsl co gsl
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/guile co guile
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/guile co guile-modules
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/installshell co installshell
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/inti co inti
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/ip-over-scsi co ipscsi
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/java co libgcj
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/kawa co kawa
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/kawa co software
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/libaio co ftp
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/libaio co libaio
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/libffi co libffi
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/libstdc++ co libstdc++
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/lvm co ftp
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/lvm co LVM
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/lvm2 co ftp
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/lvm2 co LVM2
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/lvm2 co system-config-lvm
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co builder
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co jacks
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co mauve
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co rmic-tests
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co serialization
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co verify
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mauve co wonka
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mingw co mingw
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/mingw co w32api
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/patchutils co ftp
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/patchutils co patchutils
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/piranha co code
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/procps co procps
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/pthreads-win32 co bossom
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/pthreads-win32 co pthreads
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhdb co src
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co ant-for-jhbuild
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co ecj-for-jhbuild
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co eclipse-gcj
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co fakejdk
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co gcj-jit
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co gcjwebplugin-test
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co java-gcj-compat
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rhug co rhug
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rpm2html co rpm2html
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/rpm2html co rpmfind
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/sourcenav co src
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/src co morpho
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/src co newlib
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/src co sockets
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/src co src
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/src co win32-x11
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co archpaper
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co doc
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co froggy
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co kprobes_test
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co patches
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co src
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co tests
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/systemtap co utracer
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/xconq co xconq
cvs -z9 -d :pserver:anoncvs@sourceware.org:/cvs/XOpenWin co dlls

These are all dormant and either converted to git or moved somewhere else.

I have submitted the three cgit instances to the Software Heritage forge registration.

I submitted to the Software Heritage save code now interface: all the git repos for the GCC/DWARF cgit instances, all the SVN repos and I'm in the process of submitting the many git repos for the Cygwin cgit instance.

I tried submitting the CVS repos and got an error, so I asked on the #swh IRC channel on Libera and am waiting for a response.

All of the SVN and CVS repos will require SWH approval before they get archived.

I also pointed the ArchiveTeam Codearchiver folks at the cgit/svn/cvs lists above.

The best way to archive a CVS repository is probably via rsync, see https://gcc.gnu.org/rsync.html for anonymous rsync instructions; I don't think the CVS pserver protocol is well-designed for replicating complete repository contents. (SVN repositories are also available for rsync, though svnrdump should also work for those.)

Yes, indeed rsync would be a better and simpler way to mirror the cvs repos.
The cvs code repos as rsync are:

rsync://sourceware.org/autobook-cvs
rsync://sourceware.org/autoconf-cvs
rsync://sourceware.org/automake-cvs
rsync://sourceware.org/binutils-cvs
rsync://sourceware.org/bzip2-cvs
rsync://sourceware.org/catapult-cvs
rsync://sourceware.org/cgen-cvs
rsync://sourceware.org/cluster-cvs
rsync://sourceware.org/cygwin-cvs
rsync://sourceware.org/dm-cvs
rsync://sourceware.org/dominion-cvs
rsync://sourceware.org/eclipse-cvs
rsync://sourceware.org/ecos-cvs
rsync://sourceware.org/elix-cvs
rsync://sourceware.org/frysk-cvs
rsync://sourceware.org/gcc-cvs
rsync://sourceware.org/gdb-cvs
rsync://sourceware.org/gettext-cvs
rsync://sourceware.org/glibc-cvs
rsync://sourceware.org/gsl-cvs
rsync://sourceware.org/guile-cvs
rsync://sourceware.org/ha-cvs
rsync://sourceware.org/insight-cvs
rsync://sourceware.org/inti-cvs
rsync://sourceware.org/java-cvs
rsync://sourceware.org/jffs2-cvs
rsync://sourceware.org/kawa-cvs
rsync://sourceware.org/libabigail-cvs
rsync://sourceware.org/libaio-cvs
rsync://sourceware.org/libffi-cvs
rsync://sourceware.org/libstdc++-cvs
rsync://sourceware.org/lvm2-cvs
rsync://sourceware.org/lvm-cvs
rsync://sourceware.org/mauve-cvs
rsync://sourceware.org/mingw-cvs
rsync://sourceware.org/netresolve-cvs
rsync://sourceware.org/newlib-cvs
rsync://sourceware.org/patchutils-cvs
rsync://sourceware.org/piranha-cvs
rsync://sourceware.org/procps-cvs
rsync://sourceware.org/psim-cvs
rsync://sourceware.org/rda-cvs
rsync://sourceware.org/rhdb-cvs
rsync://sourceware.org/rhl-cvs
rsync://sourceware.org/rhug-cvs
rsync://sourceware.org/rpm2html-cvs
rsync://sourceware.org/sharutils-cvs
rsync://sourceware.org/sid-cvs
rsync://sourceware.org/sourcenav-cvs
rsync://sourceware.org/src-cvs
rsync://sourceware.org/systemtap-cvs
rsync://sourceware.org/testcvs-cvs
rsync://sourceware.org/win32-x11-cvs
rsync://sourceware.org/xconq-cvs
rsync://sourceware.org/XOpenWin-cvs

Yeah, the SWH folks said on IRC that the CVS archiving expects an rsync URL.
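A minimal sketch of mirroring the rsync modules listed above, assuming anonymous rsync access works as described at https://gcc.gnu.org/rsync.html. The `MIRROR_ROOT` variable and the function names are hypothetical, and the rsync flags are one reasonable choice, not something prescribed in this bug:

```sh
#!/bin/sh
# Sketch: mirror Sourceware CVS repositories over anonymous rsync.
# MIRROR_ROOT and all function names are hypothetical, for illustration only.

MIRROR_ROOT="${MIRROR_ROOT:-./sourceware-cvs-mirror}"

# Print the rsync URL for a module name such as "kawa-cvs".
module_url() {
    printf 'rsync://sourceware.org/%s/\n' "$1"
}

# Mirror one module into $MIRROR_ROOT/<module>/.
# -a preserves times and permissions, -z compresses the transfer.
mirror_module() {
    mkdir -p "$MIRROR_ROOT/$1" &&
        rsync -az "$(module_url "$1")" "$MIRROR_ROOT/$1/"
}

# Mirror every module given as an argument. A failure is reported on
# stderr but does not stop the loop; the return status is non-zero if
# any module failed.
mirror_all() {
    status=0
    for module in "$@"; do
        if ! mirror_module "$module"; then
            echo "failed: $module" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `mirror_all autobook-cvs kawa-cvs xconq-cvs` would fetch three of the modules; the full list can be passed the same way.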
The current status is now:

Regular forge listing and repo archiving enabled for all four cgit instances.

All the git repositories on sourceware.org/cgit that I submitted already got archived. I noticed that I missed the cygwin-apps ones there though, submitting them now.

All the git repositories on gcc.gnu.org/cgit and git.dwarfstd.org that I submitted got archived.

The submissions for cygwin.com/cgit are still ongoing.

The submissions for sourceware.org SVN repos: all accepted, kawa/prelink finished, gcc ongoing.

The submissions for sourceware CVS repos are still ongoing.

The current status of the forges:

* Accepted:
https://sourceware.org/cgit
* First origin loaded:
https://cygwin.com/cgit
https://gcc.gnu.org/cgit
https://git.dwarfstd.org/

I've asked on the #swh IRC channel about the process from Accepted to First origin loaded.

The status of individual repos:

* https://sourceware.org/git/*: all requested, accepted and succeeded
* https://cygwin.com/git/*: all requested, accepted and scheduled/succeeded
* https://gcc.gnu.org/git/*: all requested, accepted and succeeded
* https://git.dwarfstd.org/*: all requested, accepted and succeeded
* svn://sourceware.org/svn/{prelink,kawa}/: all requested, accepted and succeeded
* svn://gcc.gnu.org/svn/gcc: requested, accepted, running (but a previous run from last month failed)
* rsync://sourceware.org/*: all requested, accepted/rejected (see below), and succeeded/failed (see below)

These are the failures for the CVS repos:

rsync://sourceware.org/testcvs-cvs/gcc
rsync://sourceware.org/systemtap-cvs/private
rsync://sourceware.org/src-cvs/src
rsync://sourceware.org/src-cvs/newlib
rsync://sourceware.org/src-cvs/htdocs
rsync://sourceware.org/gcc-cvs/testrun
rsync://sourceware.org/gcc-cvs/repository
rsync://sourceware.org/libaio-cvs/htdocs
rsync://sourceware.org/gcc-cvs/gcc
rsync://sourceware.org/procps-cvs/procps
rsync://sourceware.org/frysk-cvs/frysk
rsync://sourceware.org/lvm2-cvs/system-config-lvm
rsync://sourceware.org/lvm2-cvs/htdocs
rsync://sourceware.org/lvm2-cvs/ftp
rsync://sourceware.org/procps-cvs/htdocs.saf
rsync://sourceware.org/lvm2-cvs/LVM2
rsync://sourceware.org/netresolve-cvs/htdocs
rsync://sourceware.org/eclipse-cvs/htdocs
rsync://sourceware.org/dominion-cvs/htdocs
rsync://sourceware.org/dm-cvs/people

These are the rejections for the CVS repos:

rsync://sourceware.org/XOpenWin-cvs
rsync://sourceware.org/xconq-cvs
rsync://sourceware.org/win32-x11-cvs
rsync://sourceware.org/testcvs-cvs
rsync://sourceware.org/systemtap-cvs
rsync://sourceware.org/src-cvs
rsync://sourceware.org/sourcenav-cvs
rsync://sourceware.org/sid-cvs
rsync://sourceware.org/sharutils-cvs
rsync://sourceware.org/rpm2html-cvs
rsync://sourceware.org/rhug-cvs
rsync://sourceware.org/rhl-cvs
rsync://sourceware.org/rhdb-cvs
rsync://sourceware.org/rda-cvs
rsync://sourceware.org/psim-cvs
rsync://sourceware.org/procps-cvs
rsync://sourceware.org/piranha-cvs
rsync://sourceware.org/patchutils-cvs
rsync://sourceware.org/newlib-cvs
rsync://sourceware.org/netresolve-cvs
rsync://sourceware.org/mingw-cvs
rsync://sourceware.org/mauve-cvs
rsync://sourceware.org/lvm-cvs
rsync://sourceware.org/lvm2-cvs
rsync://sourceware.org/libstdc++-cvs
rsync://sourceware.org/libffi-cvs
rsync://sourceware.org/libaio-cvs
rsync://sourceware.org/libabigail-cvs
rsync://sourceware.org/kawa-cvs
rsync://sourceware.org/jffs2-cvs
rsync://sourceware.org/java-cvs
rsync://sourceware.org/inti-cvs
rsync://sourceware.org/insight-cvs
rsync://sourceware.org/ha-cvs
rsync://sourceware.org/guile-cvs
rsync://sourceware.org/gsl-cvs
rsync://sourceware.org/glibc-cvs
rsync://sourceware.org/gettext-cvs
rsync://sourceware.org/gdb-cvs
rsync://sourceware.org/gcc-cvs
rsync://sourceware.org/frysk-cvs
rsync://sourceware.org/elix-cvs
rsync://sourceware.org/ecos-cvs
rsync://sourceware.org/eclipse-cvs
rsync://sourceware.org/dominion-cvs
rsync://sourceware.org/dm-cvs
rsync://sourceware.org/cygwin-cvs
rsync://sourceware.org/cluster-cvs
rsync://sourceware.org/cgen-cvs
rsync://sourceware.org/catapult-cvs
rsync://sourceware.org/bzip2-cvs
rsync://sourceware.org/binutils-cvs
rsync://sourceware.org/automake-cvs
rsync://sourceware.org/autoconf-cvs
rsync://sourceware.org/autobook-cvs

I think what happened with the CVS rsync://sourceware.org/* repos was that the top-level URLs were rejected but the individual CVS repos within those URLs were accepted and succeeded; if there were non-CVS repos then they failed.

PS: the status comments above were derived from these pages:

https://archive.softwareheritage.org/add-forge/request/list/
https://archive.softwareheritage.org/save/list/

I would like to suggest that you also register a Sourceware account at archive.org, tar up the rsync repos and upload them to the archive.org upload page. Some of the data may be private (systemtap-cvs/private, for example), so maybe filter/check before uploading.

https://archive.org/account/signup
https://archive.org/upload/

There is also an API and a command-line tool for it called ia for uploading/etc, it is available in the Debian internetarchive package and on GitHub:

https://archive.org/developers/internetarchive/
https://github.com/jjjake/internetarchive

I'm not sure what to do about the CVS failures, most of them are probably not CVS repos, some do look like maybe CVS repos. Perhaps you could check and if any of them are, follow up with Software Heritage about getting access to the error and figuring out a fix. I definitely suggest also uploading all the CVS/SVN repos to archive.org, so there is a copy in case anyone ever wants to investigate the failures further.

Paul, thanks so much for getting all the sources into the Software Heritage Archive
https://archive.softwareheritage.org/browse/search/?q=sourceware.org&with_content=true

Various of our other sites, cygwin.com, gcc.gnu.org, dwarfstd.org are now also listed under Regular crawling - Git. And it looks like the cvs archives were all imported through the rsync links.
I think the Software Heritage Archive part is done now. They should also pick up any new repos from now on. Should we open another bug for archive.org?

The svn://gcc.gnu.org/svn/gcc SWH import failed again. Some of the CVS repos SWH imports failed too. I agree that it isn't worth trying to get these fixed. A new bug for the archive.org upload sounds good, please add me to CC. I can also check if the ArchiveTeam Codearchiver folks did anything and report back in the new bug.

software heritage uploads done, thanks a lot!
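For the archive.org upload suggested earlier in this thread, a rough sketch of the tar-and-upload step using the `ia` command-line tool might look like the following. It assumes the repos were first mirrored locally under a hypothetical `MIRROR_ROOT` directory and that `ia configure` has already been run with an archive.org account; the item identifier scheme and the metadata values are made up for illustration, and the `private` exclusion is only a precaution, not a substitute for checking the contents by hand:

```sh
#!/bin/sh
# Sketch: tar up one mirrored CVS module and upload it with `ia`.
# MIRROR_ROOT, the identifier scheme and the function names are hypothetical.

MIRROR_ROOT="${MIRROR_ROOT:-./sourceware-cvs-mirror}"

# Print a (hypothetical) archive.org item identifier for a module.
item_id() {
    printf 'sourceware-%s\n' "$1"
}

# Create <module>.tar.gz in the current directory from the local mirror,
# leaving out a top-level "private" directory if the module has one.
make_tarball() {
    tar -czf "$1.tar.gz" -C "$MIRROR_ROOT" --exclude="$1/private" "$1"
}

# Build the tarball, then upload it as a single archive.org item.
upload_module() {
    make_tarball "$1" &&
        ia upload "$(item_id "$1")" "$1.tar.gz" \
            --metadata="title:Sourceware $1 repository snapshot" \
            --metadata="mediatype:software"
}
```

Calling `upload_module systemtap-cvs` would then produce `systemtap-cvs.tar.gz` without the `private` directory and upload it under the made-up identifier `sourceware-systemtap-cvs`.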