Mark Wielaard
Wed Aug 17 21:18:35 GMT 2022

Hi Frank,

On Wed, Aug 17, 2022 at 09:24:56AM -0400, Frank Ch. Eigler wrote:
> > I don't think we should be using v1 archives, those are deprecated
> > upstream and they strongly recommend using v2 archives which are much
> > more scalable.
> Given that v1 is the default of public-inbox-init, they can't be that bad.

Looks like it is just for backward compatibility. They actively warn
against using it for new installations and strongly recommend using
-V2. See also the public-inbox-init, public-inbox-v1-format and
public-inbox-v2-format man pages.

I don't expect support for v1 will disappear, but new projects around
public-inbox, like lei, only support v2. So it is better to simply use
the v2 format from the start.

> > Reimporting the lists as v2 archives using the import_from_mbox
> > script should be much more efficient and can be done in a couple of
> > hours instead of days.
> That speed is nice, but I suspect that's not a v1/v2 representation
> efficiency issue but something else.

The v2 format allows parallel imports, so it defaults to using multiple
threads. Also, the import_from_mbox script streams the import of
messages using just one perl process per mbox instead of starting a
new perl process per message.
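
To illustrate the difference (this is only a Python sketch of the idea,
not the actual import_from_mbox script, which is perl): a single
process can open the mbox once and hand every message to an importer
callback. The stream_mbox name and the handle callback are made up for
this example.

```python
import mailbox


def stream_mbox(path, handle):
    """Feed every message in the mbox at `path` to `handle`.

    Everything happens in this one process; no per-message process
    is started. Returns the number of messages handled.
    """
    box = mailbox.mbox(path, create=False)
    count = 0
    try:
        for msg in box:  # messages are read one at a time
            handle(msg)  # hypothetical importer callback
            count += 1
    finally:
        box.close()
    return count
```

The per-message alternative would pay the interpreter startup cost once
for every mail in the archive, which adds up quickly for large lists.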

> Yes, understood that the extra indexing can do extra searches.  My
> question was about utility/need for this.

The use seems obvious to me for anybody using the web-based archives
to generate tailored message/mbox results. Date-ranged searches in
particular seem pretty much mandatory, since otherwise you essentially
have to keep clicking next, next, next. The same goes for finding
specific messages by author or subject. One specific use case for
public-inbox is to not have to be subscribed to a list to read it or
to have a local copy to search through it (even if it makes mirroring
a mailinglist easy, but not everybody has the space or network to do

> For elfutils-devel, note
> that the full xapian indexes are about 10x the size of the
> git-compressed email archive, whereas in the case of the systemtap
> import, it's only about 0.2x, so there is a serious cost/benefit
> question.

That is a concern and much bigger than I anticipated. So we should
probably only enable full indexing for active discussion and patch
lists and keep it at basic for autogenerated lists like -cvs or
old/inactive lists.
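
As a rough sketch of that policy (assuming the per-inbox indexlevel
setting described in public-inbox-config, with the values basic, medium
and full; the index_level helper and the "a -cvs suffix means
autogenerated" rule are only illustrative assumptions):

```python
def index_level(list_name, active=True):
    """Pick a public-inbox indexlevel value for one list.

    Assumption for this sketch: lists whose name ends in "-cvs" are
    autogenerated, and the caller knows whether a list is still active.
    """
    autogenerated = list_name.endswith("-cvs")
    if active and not autogenerated:
        return "full"   # active discussion and patch lists
    return "basic"      # autogenerated or old/inactive lists
```

With this, elfutils-devel would get full indexing while autobook-cvs
would stay at basic.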

> > That would be great. But I would need some time reading up on
> > postfix/mailman configs. Do you have an example of where/how this hack
> > would be done?
> postfix delivers mailing list traffic via /etc/mailman/aliases,
> e.g.:
> autobook-cvs:             "|/usr/local/mailman/mailman post autobook-cvs"
> autobook-cvs-bounces:     "|/usr/local/mailman/mailman bounces autobook-cvs"
> autobook-cvs-confirm:     "|/usr/local/mailman/mailman confirm autobook-cvs"
> autobook-cvs-join:        "|/usr/local/mailman/mailman join autobook-cvs"
> I would use a script to generate a new config file from that, so that the
> primary mailing list incoming aliases are forked:
> autobook-cvs:             autobook-cvs-mailman, autobook-cvs-inbox
> autobook-cvs-mailman:     "|/usr/local/mailman/mailman post autobook-cvs"
> autobook-cvs-inbox:       "|env SOMETHING /usr/bin/public-inbox-mda SOMETHING"
> autobook-cvs-bounces:     "|/usr/local/mailman/mailman bounces autobook-cvs"
> autobook-cvs-confirm:     "|/usr/local/mailman/mailman confirm autobook-cvs"
> autobook-cvs-join:        "|/usr/local/mailman/mailman join autobook-cvs"
> and then switch postfix to this alias file instead.

OK that could work and should be easy to generate combining
/etc/mailman/aliases with the lists in
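
A rough, untested-in-production sketch of such a generator, following
the forked layout from your example. The fork_aliases function and the
inbox_lists argument (the set of lists that should also be delivered to
public-inbox) are hypothetical names, and the SOMETHING placeholders are
kept exactly as in your example since the real arguments still have to
be decided:

```python
import re

# Matches the primary "post" alias of a list, e.g.
#   autobook-cvs:   "|/usr/local/mailman/mailman post autobook-cvs"
POST_ALIAS = re.compile(
    r'^(?P<name>[^\s:#]+):\s*(?P<cmd>"\|\S+ post (?P<list>[^\s"]+)")\s*$'
)


def _entry(name, target):
    """Format one alias line with the key padded for readability."""
    key = name + ":"
    return f"{key:<25} {target}"


def fork_aliases(lines, inbox_lists):
    """Return new alias lines where each post alias is forked.

    Only lists named in `inbox_lists` are forked; every other line
    (bounces, confirm, join, comments, ...) is passed through unchanged.
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        m = POST_ALIAS.match(line)
        if m and m.group("list") in inbox_lists:
            name = m.group("name")
            out.append(_entry(name, f"{name}-mailman, {name}-inbox"))
            out.append(_entry(f"{name}-mailman", m.group("cmd")))
            # Placeholder command, copied from the example above.
            out.append(_entry(
                f"{name}-inbox",
                '"|env SOMETHING /usr/bin/public-inbox-mda SOMETHING"',
            ))
        else:
            out.append(line)
    return out
```

Reading /etc/mailman/aliases, passing its lines through fork_aliases
and writing the result to a separate file would leave the mailman
generated file itself untouched.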

So this is before mailman sees the message, so we do need to do a
spam-check. And I think postfix sets ORIGINAL_RECIPIENT already, we
just need to make sure it is one of the addresses for a list in the

But what generates /etc/mailman/aliases itself?  Can we hook into that
to trigger generation of this aliases-inbox file? Otherwise if we add
a new mailman list it won't work. And do we need to update/regenerate
/etc/aliases.db and/or /etc/mailman/aliases.db ?



More information about the Overseers mailing list