This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: question about NSS [SUCCESS=merge] semantics


On 05/01/2017 02:58 PM, DJ Delorie wrote:
> If two services both have the same user in a given group, and the
> merge feature needs to merge them, do we de-duplicate the user or not?
> Do we also dedup the groups when enumerating?
> 

We do not deduplicate. That was dropped from the plan because it introduced
unacceptably poor performance to address an edge case. I also deemed it
reasonable to drop on the grounds that we already have the same problem if
someone were to duplicate entries directly in /etc/passwd.


The performance hit would have come both from having to process the membership
lists (which may have hundreds or thousands of members) and from managing the
memory for the buffer. I chose to match the way that glibc handles initgroups
lookups (which also does no deduplication and just tacks the results on at the
end).


> The wiki proposing the feature says to dedup, but neither the man page
> nor the patch submission mentions it.
> 
> I.e.
> 
> group	test1 [SUCCESS=merge] test2
> 
> with...
> 
> test1:  mygroup::4:alpha,beta,gamma
> 
> test2:  mygroup::4:alpha,delta
> 
> There are two interesting cases (well, three):
> 
> getgrgid(4) -> alpha,beta,gamma,delta
> -OR-
> getgrgid(4) -> alpha,beta,gamma,alpha,delta

You should expect to get back:
mygroup::4:alpha,beta,gamma,alpha,delta
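
To make the by-ID/by-name case concrete, here is a small Python sketch of the
append-without-dedup behavior, using the test1/test2 entries from the question.
It is only an illustration: the Group type and the helper names
(parse_group_line, merge_groups, format_group) are made up for this example and
are not glibc code.

```python
from typing import List, NamedTuple


class Group(NamedTuple):
    name: str
    passwd: str
    gid: int
    members: List[str]


def parse_group_line(line: str) -> Group:
    """Parse one /etc/group style line: name:passwd:gid:member,member,..."""
    name, passwd, gid, members = line.strip().split(":")
    return Group(name, passwd, int(gid), [m for m in members.split(",") if m])


def merge_groups(first: Group, second: Group) -> Group:
    """Keep the first result's name/passwd/gid and append the second
    result's members verbatim. No deduplication is attempted."""
    return first._replace(members=first.members + second.members)


def format_group(group: Group) -> str:
    return f"{group.name}:{group.passwd}:{group.gid}:{','.join(group.members)}"


test1 = parse_group_line("mygroup::4:alpha,beta,gamma")
test2 = parse_group_line("mygroup::4:alpha,delta")
merged = merge_groups(test1, test2)
print(format_group(merged))  # mygroup::4:alpha,beta,gamma,alpha,delta
```

Note that "alpha" appears twice in the merged list, once from each service.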


> 
> and also:
> 
> getgrent() returns which?
> 
> mygroup:4:alpha,beta,gamma,delta

No

> -OR-
> mygroup:4:alpha,beta,gamma,alpha,delta

No


> -OR-
> mygroup:4:alpha,beta,gamma,delta
> mygroup:4:alpha,delta
> 

Yes (which is exactly the way enumerated group lookups have always worked if
there were multiple copies of the group name/ID in one or more data sources).

For example, if you take [SUCCESS=merge] out of the equation, it will behave in
exactly the same way as it does here: it will call the reentrant function for
each source and iterate through them one at a time, returning them in order as
it goes. You can reproduce this even just with nss_files: if you have
```
mygroup::4:alpha,beta,gamma
mygroup::4:alpha,delta
```
in /etc/group, `getent group |grep mygroup` will return two lines, one with each
result. The same would be true if you had one or the other of them in nss_nis or
nss_ldap etc.
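
As a rough model of that enumeration behavior (a sketch only, with the two
sources represented as plain Python lists rather than real NSS modules, and
enumerate_groups being a hypothetical name):

```python
def enumerate_groups(sources):
    """Walk each source in order and yield its entries one at a time.
    Entries are passed through untouched: nothing is merged or dropped."""
    for entries in sources:
        for line in entries:
            yield line


# Hypothetical stand-ins for two configured services.
source_a = ["root::0:", "mygroup::4:alpha,beta,gamma"]
source_b = ["mygroup::4:alpha,delta"]

matches = [line for line in enumerate_groups([source_a, source_b])
           if line.startswith("mygroup:")]
for line in matches:
    print(line)
# mygroup::4:alpha,beta,gamma
# mygroup::4:alpha,delta
```

The result is the same whether both lines come from one source or one line
comes from each.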

Deduplicating this information would be significantly memory-intensive and
poorly-performing, because we would have to first loop through all possible
results (which have no upper bound on the number of entries and may require
accessing multiple remote data sources which might be very slow), then iterate
through each one looking for commonalities, then rewrite the buffers containing
each of the membership lists.
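
For comparison, a sketch of what a deduplicating enumeration would have to do.
This is what glibc does not do; the function name and the choice to key entries
on the GID alone are assumptions made for the illustration.

```python
def enumerate_deduplicated(sources):
    """Hypothetical deduplicating enumeration (not implemented in glibc)."""
    merged = {}  # gid -> [name, passwd, members]; holds every group at once
    # Step 1: drain every source completely before anything can be returned.
    for entries in sources:
        for line in entries:
            name, passwd, gid, members = line.split(":")
            record = merged.setdefault(gid, [name, passwd, []])
            # Step 2: compare each incoming member against those already seen.
            for member in filter(None, members.split(",")):
                if member not in record[2]:
                    record[2].append(member)
    # Step 3: rebuild each entry from the combined membership list.
    for gid, (name, passwd, members) in merged.items():
        yield f"{name}:{passwd}:{gid}:{','.join(members)}"


result = list(enumerate_deduplicated([
    ["mygroup::4:alpha,beta,gamma"],
    ["mygroup::4:alpha,delta"],
]))
print(result)  # ['mygroup::4:alpha,beta,gamma,delta']
```

Unlike the streaming behavior, nothing can be returned until every source has
been read to the end, and all entries have to be held in memory at once.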

This is just one in a long list of reasons why the IDM teams (FreeIPA, Samba and
SSSD) all strongly recommend disabling enumeration for remote sources (and set
it that way by default).

> (i.e. two parts: dedup users, and/or dedup groups)
> 



