Core Toolchain Infrastructure - Services for glibc

Carlos O'Donell carlos@redhat.com
Thu Jul 13 21:58:14 GMT 2023


The Core Toolchain Infrastructure (renamed) project continues to
advance the goal of creating a long-term sustainable set of
secure and state-of-the-art services for the GNU Toolchain.

Some of the major goals include:

 * Isolating all services in VMs or containers to increase service
   security and reduce service resource interference.

 * Allowing volunteers to focus efforts outside of core infrastructure
   maintenance.

 * Preparing for additional software supply chain requirements from
   distributors.

As part of this effort the Technical Advisory Committee (TAC) has
evaluated the services used by glibc and how they could be migrated
and serviced by the CTI project.

The CTI TAC recommendation is to use Linux Foundation IT services
for core infrastructure. The LF IT team already supports many of
the same services for the Linux kernel and at scale. The migration
would involve moving services from Sourceware.org to LF IT servers.
We continue to be thankful and appreciative of the time spent by
Sourceware.org volunteers in support of the current services.

Appended to this message is the list of proposed services that
LF IT can provide and an enumeration of the current services.

The CTI TAC is looking for feedback from the community on the listed
services which would be migrated to LF IT services and overall
project service requirements. Please feel free to reply to this
thread or to email me or any CTI TAC member privately.

After the glibc release early in August the CTI project will be
taking the next step of asking the glibc stewards (GNU Maintainers)
to review the proposal and make a decision to move forward with the
migration.

Cheers,
Carlos O’Donell
CTI TAC Member

The following is the suggested list of services to be migrated (with notes):

* Mailing lists
 * Support public-inbox for mailing list archives.
 * Use of public-inbox means archives can be cloned and copied.
 * Use of LF IT Subspace mailing list services (mlmmj, postfix).

* bug database
 * Consider starting fresh in new Bugzilla 5.0.4+ instance and freeze old product.
 * glibc component in sourceware instance marked "Not open for new bugs."
 * No easy way to clone this but we can discuss options.
 * Isolate bugzilla from other services.

* git
 * Migrate to gitolite
 * Community manages access via gitolite keys.
 * Minimize all access to sources and isolate from other processes.
 * Minimal server side web hooks where required.
 * Isolate git service from other services.
 * Stop supporting svn/cvs and provide tarball dumps.

* wiki
 * Migrate to git-based documentation with existing content copied over.
 * Suggest rst/Sphinx or similar, in line with existing discussions for GCC docs.
 * Sphinx with themes can provide a lot of flexibility for display.
 * Isolate wiki service from other services.

* patch management
 * Continue patchwork usage and maintenance of isolated instance
 * Required for community driven pre-commit CI
 * LF IT hosting patchwork instance with community hosting bots.
 * Isolate patchwork from other services.

* Website
 * Provide a simple static site.
 * Isolate web hosting from other services.

* Meeting
 * Already migrated away from proprietary solutions.
 * Continue to use LF IT BBB instance for glibc meetings including weekly patch review.
 * Isolate BBB from other services.

The current list of glibc services was put together as part of the
CTI TAC glibc service enumeration:

* mailing lists
  * Mailman 2 mailing lists: https://sourceware.org/mailman/listinfo/*
    * libc-announce
    * libc-alpha
    * libc-stable
    * libc-help
    * libc-locales
    * libc-testresults
    * glibc-cvs
    * glibc-bugs
    * glibc-bugs-regex (limited bugs just for regex).
    * Closed legacy mailing lists:
      * libc-ports: https://sourceware.org/mailman/listinfo/libc-ports
      * libc-hacker: https://sourceware.org/mailman/listinfo/libc-hacker
    * Mailing lists accept non-html email only.
    * Run through SpamAssassin
    * Run through ClamAV
  * Pipermail archives:
    * https://sourceware.org/pipermail/*
    * e.g. https://sourceware.org/pipermail/libc-alpha/
  * public-inbox archives:
    * https://inbox.sourceware.org/*
    * e.g. https://inbox.sourceware.org/libc-alpha/
  * MHonArc archives (/legacy-ml/) - no longer updated
    * The /legacy-ml/ URLs and /ml/ redirects to them need to keep working.
    * E.g. https://sourceware.org/legacy-ml/libc-alpha/2020-01/
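The requirement that /ml/ URLs keep redirecting to /legacy-ml/ can be
sketched as a simple path rewrite. This is only an illustration of the
mapping, assuming the redirect is a plain prefix swap; the function name
is hypothetical and the real redirect is configured in the web server.

```python
from typing import Optional

OLD_PREFIX = "/ml/"
LEGACY_PREFIX = "/legacy-ml/"


def legacy_redirect(path: str) -> Optional[str]:
    """Return the /legacy-ml/ target for an old /ml/ path, else None.

    Assumption: the redirect only swaps the leading prefix and keeps
    the rest of the path (list name, month, message) unchanged.
    """
    if path.startswith(OLD_PREFIX):
        return LEGACY_PREFIX + path[len(OLD_PREFIX):]
    return None
```

A migrated web service could use a check like this to confirm that old
links such as /ml/libc-alpha/2020-01/ still resolve after the move.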

* bugzilla 5.0.4+
  * Uses backend SQL database of MariaDB 10.3
  * Must be able to send email to glibc-bugs mailing list.
    * Don't know how email is routed to this list.
  * Must also send email to the glibc-bugs-regex mailing list.
    * Don't know how email is routed to this list.
  * Must be able to send email to all users on the bug.
  * Must be able to receive email when someone responds to a glibc-bugs
    email e.g. sourceware-bugzilla@sourceware.org.
  * Custom Administration->Groups settings for User RegExp.
    * canconfirm: Allow certain domains to always be able to confirm bugs.
    * editbugs: Likewise but for editbugs.
  * Must have REST API enabled to allow RM to generate release list
    of fixed bugs using the glibc/scripts/list-fixed-bugs.py script
    e.g. https://sourceware.org/bugzilla/rest.cgi/
    * Implies that non-logged-in users can list and view all bugs
      that were fixed for the release.
  * Must have account creation disabled due to spamming.
  * Must have someone with Bugzilla admin access to:
    * Add new users to bugzilla.
    * Add new Product components, versions, and milestones.
    * Add new keywords.
    * Remove users.
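To make the REST API requirement concrete, here is a minimal sketch of
an unauthenticated query for the bugs fixed in one release. It is not
the list-fixed-bugs.py script itself; the exact query fields that script
uses are an assumption here, and the function names are hypothetical.
The search parameters shown are ordinary Bugzilla bug search fields.

```python
import json
from urllib.parse import urlencode

# Base URL taken from the enumeration above.
BUGZILLA_REST = "https://sourceware.org/bugzilla/rest.cgi"


def fixed_bugs_url(product: str, milestone: str) -> str:
    """Build a bug search URL for bugs fixed in a given milestone."""
    query = urlencode({
        "product": product,
        "target_milestone": milestone,
        "resolution": "FIXED",
        # Ask only for the fields needed for a release list.
        "include_fields": "id,summary",
    })
    return f"{BUGZILLA_REST}/bug?{query}"


def parse_fixed_bugs(body: str) -> list:
    """Turn a Bugzilla search response into sorted (id, summary) pairs."""
    bugs = json.loads(body)["bugs"]
    return sorted((bug["id"], bug["summary"]) for bug in bugs)
```

Because no credentials are sent, a query like this only works if
non-logged-in users can list and view the fixed bugs, which is the
implication noted above.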

* git 2.31
  * Allows per-user access to commit to the glibc repo.
  * Allows per-user access to commit to the legacy glibc-ports repo.
  * Uses group access to control repository access.
  * Must be able to send email to glibc-cvs mailing list with one
    email for each commit made by a developer to any branch of the repository.
  * AdaCore hooks need more thorough audit for required services.
    * Must be able to send email to bugzilla to update bugs.
      * Done by AdaCore hook 'file-commit-cmd'
      * Configured to use email-to-bugzilla-filtered command.
        * Uses connection to SQL database to determine if bug exists.
  * Currently uses shared AdaCore hooks configured via origin/meta/config 
    * Active hooks:
      * post-receive
        * AdaCore post_receive
        * /git/glibc.git/hooks-bin/post-receive
          * Triggers irkerhook.py (see notes below).
          * Does not work today, likely due to requirement to register OFTC user.
      * post-update
        * Standard git-update-server-info.
      * pre-receive
        * AdaCore pre_receive
    * AdaCore config:
      * No max line lengths.
      * Allow UTF-8 in commit messages.
      * 5MiB max email size.
      * Max 500 commit messages for larger commit series sent to glibc-cvs.
      * Reject merge commits to master and release branches.
      * Allow rebasing only private branches (non master and non release).
      * Run minimal style checker, nominally for whitespace issue rejection.
        * Run extra commit checking to catch a wrong author email address.
          * /git/glibc.git/hooks-bin/commit_checker
            * From email format checker. No special requirements.
        * /git/glibc.git/hooks-bin/style_checker
          * Style checker. No special requirements.
      * Send email to bugzilla if a commit mentions a bug.
        * /git/glibc.git/hooks-bin/email-to-bugzilla-filtered
          * Uses /sourceware/infra/bin/email-to-bugzilla
          * Must be able to connect to bugzilla SQL database.
          * Does not appear to work today. We don't get emails for commits with bugs.
      * Send IRC message to per-project configured IRC channel.
        * Involves irkerhook.py and git config information for project.
        * Hook must be able to connect to external IRC networks to post IRC notices.

* wiki
  * Uses MoinMoin 1.9.10
  * Must have account creation disabled due to spamming.
    * Uses EditorGroup permissions to allow any community member to add a new
      community member to the wiki e.g. human vetting another human.
  * Must be able to send notification emails.
  * Cron run to purge users not in EditorGroup to prevent wiki slowdown.

* patch management.
  * Uses patchwork v3.1.1.post18-g11cf1f3
  * Must be able to receive email (as part of collecting patch data)
  * Must be able to send emails as part of account verification.
  * Uses django for administration
  * Must allow authenticated REST API access for patchwork.
    * Currently rate limited.
    * Used by SLI tools (Carlos O'Donell)
      * Run manually on developer systems.
    * Auto-close on commit patchwork bot (Siddhesh Poyarekar)
      * Run on sourceware.org via cron.
  * Used for weekly patch management meetings.
  * git-pw integration used to access patchwork directly using REST API and API token.

* Red Hat Bluejeans remote meeting system.
  * Must allow remote video and audio for participants around the world.
  * Supports the weekly glibc patch review meetings.
  * Meetings must operate without host needing to be present so community can host.
    * Delegating host is difficult in bluejeans.
  * Managed by Bluejeans/Verizon.

* pre-commit CI system.
  https://gitlab.com/djdelorie/glibc-cicd
  * Run inside a VM.
  * Uses networkless containers for further build isolation.
  * Highest risk system because it runs mailing list posted patches.
  * Event curation system (curator):
    * Must have network access to patchwork REST API.
    * Must have access to SQL database for storing state.
      * Currently using MariaDB.
    * Must allow runners to access curator REST API URL.
    * One curator currently hosted by DJ Delorie.
  * Event running system (runner + trybots):
    * Must have network access to curator REST API.
    * Must have local network access to rabbitmq queue (job delegation)
    * trybots must have local network access to rabbitmq.
      * Must have network access to patchwork REST API to post results.
      * Must have network access to container registries to pull modern containers.
      * Must have network git access to pull updated glibc git repo.
    * Generally the runner and trybots are on one site together.
      * Avoid passing rabbitmq traffic beyond the local network.
      * Eventual emailing of results to the mailing list will happen via another bot
        that is distinct from this system to avoid the runners needing anything but
        restricted network access.
    * One runner hosted by DJ Delorie
    * One i686 trybot hosted by DJ Delorie
    * One "patch applies" trybot hosted by DJ Delorie
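The split between curator, runner, and trybots described above can be
sketched in a few lines. This is a toy model, not the glibc-cicd code:
an in-memory set stands in for the SQL state, queue.Queue stands in for
rabbitmq, and all class and method names are hypothetical.

```python
import queue
from dataclasses import dataclass


@dataclass
class Job:
    series_id: int
    trybot: str


class Curator:
    """Tracks which patch series have already been turned into events."""

    def __init__(self):
        self.seen = set()  # stand-in for the SQL database state

    def new_series(self, series_ids):
        """Return only the series not seen before, in arrival order."""
        fresh = []
        for series_id in series_ids:
            if series_id not in self.seen:
                self.seen.add(series_id)
                fresh.append(series_id)
        return fresh


class Runner:
    """Delegates each event to every trybot through a local job queue."""

    def __init__(self, trybots):
        self.jobs = queue.Queue()  # stand-in for the rabbitmq queue
        self.trybots = trybots     # name -> callable(series_id) -> result

    def accept_event(self, series_id):
        for name in self.trybots:
            self.jobs.put(Job(series_id, name))

    def drain(self):
        """Run all queued jobs and collect (series, trybot, result)."""
        results = []
        while not self.jobs.empty():
            job = self.jobs.get()
            result = self.trybots[job.trybot](job.series_id)
            results.append((job.series_id, job.trybot, result))
        return results
```

In the real system the results would be posted back to the patchwork
REST API by the trybots, and only the curator and patchwork traffic
would cross the site boundary.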

* Website (sourceware.org)
  * CVS hosted website.
  * Static redirect to gnu.org website.

* Website (gnu.org)
  https://www.gnu.org/software/libc/
  * CVS hosted website uploads along with manual.
    * Manuals are generated with scripts in the CVS repo and generated files committed.
  * All static content.
  * Website automatically updated after CVS commits.
  * Managed by the GNU Project/FSF.

* Release tarballs (ftp upload of gpg-signed release tarballs)
  https://ftp.gnu.org/gnu/libc/
  * Use gnupload script to gpg sign uploaded tarballs.
   * Uses ncftpput to place files into /incoming directories.
   * Network ftp access required.
  * Managed by the GNU Project/FSF.

* Translation project services
  https://translationproject.org/html/welcome.html
  * https network access to TP servers to fetch uploaded translation files.
  * Managed by the Translation Project.


