RFC: ABI support for special memory area
H.J. Lu
hjl.tools@gmail.com
Sun Jan 1 00:00:00 GMT 2017
On Thu, Mar 9, 2017 at 7:23 AM, Suprateeka R Hegde
<hegdesmailbox@gmail.com> wrote:
> H.J,
>
> I think we are a full 180 degrees out of phase in our discussion this time
> somehow :-)
>
> As I have already asked, I want to know what that ONE-FIXED-FORM of
> __gnu_mbind_setup called by ld.so is.
>
> The code you provided seems to be Intel's implementation of libmbind. I
> am interested in what it looks like in ld.so, because that is what we want
> to document in the ABI support. We do not want implementation-specific
> details in GNU-gABI.
>
> So inside ld.so, would it be what I showed in my earlier mail or would it be
> something else?
>
> In my opinion, we have to bring that out in the ABI support proposal.
> Without the actual signature/prototype, __gnu_mbind_setup sounds more like a
> guideline and less like an ABI spec/standard. And in actual code (in ld.so),
> it may eventually look quite different for each vendor/implementation.
>
> So, either keep it as a guideline or make it generic. IMHO, we cannot keep
> the following (original text) as generic:
>
> ---
>>
>> Run-time support
>>
>> int __gnu_mbind_setup (unsigned int type, void *addr, size_t length);
>
> ---
>
> --
> Supra
>
>
>
> On 07-Mar-2017 04:05 AM, H.J. Lu wrote:
>>
>> On Mon, Mar 6, 2017 at 5:25 AM, Suprateeka R Hegde
>> <hegdesmailbox@gmail.com> wrote:
>>>
>>> On 04-Mar-2017 07:37 AM, Carlos O'Donell wrote:
>>>>
>>>>
>>>> On 03/03/2017 11:00 AM, H.J. Lu wrote:
>>>>>
>>>>>
>>>>> __gnu_mbind_setup is called from ld.so. Since there is only one ld.so,
>>>>> it needs to know what to pass to __gnu_mbind_setup. Not all arguments
>>>>> have to be used by all implementations nor all memory types.
>>>>
>>>>
>>>>
>>>> I think what Supra is suggesting is a pointer-to-implementation
>>>> interface, which would allow ld.so to pass completely different
>>>> arguments to the library depending on what kind of memory is being
>>>> defined by the sh_info value. It avoids needing to encode all the types
>>>> in the API, and just uses a pointer to an incomplete type.
>>>
>>>
>>>
>>> That's absolutely right.
>>>
>>> However, I am not suggesting one is better than the other. I just want to
>>> get clarity on what the code looks like for different implementations.
>>>
>>> On 03-Mar-2017 09:30 PM, H.J. Lu wrote:
>>>>
>>>>
>>>> __gnu_mbind_setup is called from ld.so. Since there is only one ld.so,
>>>> it needs to know what to pass to __gnu_mbind_setup.
>>>
>>>
>>>
>>> So I want to know what that ONE-FIXED-FORM of __gnu_mbind_setup called
>>> by ld.so is.
>>>
>>>> Not all arguments
>>>> have to be used by all implementations nor all memory types.
>>>
>>>
>>>
>>> I think I am still not getting this. Really sorry for that. Would it be
>>> possible for you to write some short pseudocode that shows what this
>>> design looks like for different implementations?
>>>
>>
>> For my usage, I only need the memory type, the address, and the size:
>>
>> #define _GNU_SOURCE
>> #include <unistd.h>
>> #include <errno.h>
>> #include <stdint.h>
>> #include <cpuid.h>
>> #include <numa.h>
>> #include <numaif.h>
>> #include <mbind.h>
>>
>> #ifdef LIBMBIND_DEBUG
>> #include <stdio.h>
>> #endif
>>
>> /* High-Bandwidth Memory node mask. */
>> static struct bitmask *hbw_node_mask;
>>
>> /* Initialize High-Bandwidth Memory node mask. This must be called before
>>    __gnu_mbind_setup. */
>> static void
>> __attribute__ ((used, constructor))
>> init_node_mask (void)
>> {
>>   if (__get_cpuid_max (0, 0) == 0)
>>     return;
>>
>>   /* Check if vendor is Intel. */
>>   uint32_t eax, ebx, ecx, edx;
>>   __cpuid (0, eax, ebx, ecx, edx);
>>   if (!(ebx == 0x756e6547 && ecx == 0x6c65746e && edx == 0x49656e69))
>>     return;
>>
>>   /* Get family and model. */
>>   uint32_t model;
>>   uint32_t family;
>>   __cpuid (1, eax, ebx, ecx, edx);
>>   family = (eax >> 8) & 0x0f;
>>   if (family != 0x6)
>>     return;
>>   model = (eax >> 4) & 0x0f;
>>   model += (eax >> 12) & 0xf0;
>>
>>   /* Check for KNL and KNM. */
>>   switch (model)
>>     {
>>     default:
>>       return;
>>
>>     case 0x57: /* Knights Landing. */
>>     case 0x85: /* Knights Mill. */
>>       break;
>>     }
>>
>>   /* Check if NUMA configuration is supported. */
>>   int nodes_num = numa_num_configured_nodes ();
>>   if (nodes_num < 2)
>>     return;
>>
>>   /* Get MCDRAM NUMA nodes. */
>>   struct bitmask *node_mask = numa_allocate_nodemask ();
>>   struct bitmask *node_cpu = numa_allocate_cpumask ();
>>
>>   int i;
>>   for (i = 0; i < nodes_num; i++)
>>     {
>>       numa_node_to_cpus (i, node_cpu);
>>       /* NUMA node without CPU is MCDRAM node. */
>>       if (numa_bitmask_weight (node_cpu) == 0)
>>         numa_bitmask_setbit (node_mask, i);
>>     }
>>
>>   if (numa_bitmask_weight (node_mask) != 0)
>>     {
>>       /* On Knights Landing and Knights Mill, MCDRAM is High-Bandwidth
>>          Memory. */
>>       hbw_node_mask = node_mask;
>>     }
>>   else
>>     numa_bitmask_free (node_mask);
>>   numa_bitmask_free (node_cpu);
>> }
>>
>> /* Support all different memory types. */
>>
>> static int
>> mbind_setup (unsigned int type, void *addr, size_t length,
>>              unsigned int mode, unsigned int flags)
>> {
>>   /* Reported when the type is known but no matching memory exists. */
>>   int err = ENXIO;
>>
>>   switch (type)
>>     {
>>     default:
>> #ifdef LIBMBIND_DEBUG
>>       printf ("Unsupported mbind type %u: from %p of size %zu\n",
>>               type, addr, length);
>> #endif
>>       return EINVAL;
>>
>>     case GNU_MBIND_HBW:
>>       if (hbw_node_mask)
>>         err = mbind (addr, length, mode, hbw_node_mask->maskp,
>>                      hbw_node_mask->size, flags);
>>       break;
>>     }
>>
>>   /* mbind returns -1 and sets errno on failure. */
>>   if (err < 0)
>>     err = errno;
>>
>> #ifdef LIBMBIND_DEBUG
>>   printf ("Mbind type %u: from %p of size %zu\n", type, addr, length);
>> #endif
>>
>>   return err;
>> }
>>
>> int
>> __gnu_mbind_setup (unsigned int type, void *addr, size_t length)
>> {
>>   return mbind_setup (type, addr, length, MPOL_BIND, MPOL_MF_MOVE);
>> }
>>
>> If other memory types need additional information, it can be
>> passed to __gnu_mbind_setup. We just need to know what
>> information is needed.
>>
>>
>
Here is my glibc prototype.
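
The patch is attached rather than inlined. Purely as an illustration, and not as an excerpt from the patch, a single fixed-form call site could be sketched as below. Everything prefixed with ex_ or EX_ is hypothetical: the region descriptor, the numeric value of the memory type, and the policy of stopping at the first error and of doing nothing when no setup function is available are all assumptions. Only the three-argument signature comes from the proposal text.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical memory type value; the real constant would be defined
   by the ABI, not here.  */
#define EX_MBIND_HBW 1u

/* Hypothetical descriptor for one special-memory region that a loader
   has already located and mapped.  */
struct ex_mbind_region
{
  unsigned int type;
  void *addr;
  size_t length;
};

/* The one fixed form quoted in the proposal:
   int __gnu_mbind_setup (unsigned int type, void *addr, size_t length);  */
typedef int (*ex_mbind_setup_fn) (unsigned int type, void *addr,
                                  size_t length);

/* Call SETUP once per region, always with the same three arguments.
   Assumption: a missing SETUP means there is nothing to do, and the
   first nonzero return value stops the walk.  */
int
ex_apply_mbind (ex_mbind_setup_fn setup,
                const struct ex_mbind_region *regions, size_t count)
{
  if (setup == NULL)
    return 0;

  for (size_t i = 0; i < count; i++)
    {
      int err = setup (regions[i].type, regions[i].addr,
                       regions[i].length);
      if (err != 0)
        return err;
    }
  return 0;
}

/* Stand-in for a library's __gnu_mbind_setup: it counts calls and
   rejects every type other than EX_MBIND_HBW.  */
unsigned int ex_stub_calls;

int
ex_stub_setup (unsigned int type, void *addr, size_t length)
{
  (void) addr;
  (void) length;
  ex_stub_calls++;
  return type == EX_MBIND_HBW ? 0 : EINVAL;
}
```

In this shape the caller never needs to know which arguments a given library actually uses. It hands over the type, address and length, and the library ignores whatever it does not need.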
--
H.J.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: glibc-mbind.patch
Type: text/x-patch
Size: 8431 bytes
Desc: not available
URL: <http://sourceware.org/pipermail/gnu-gabi/attachments/20170101/e27876b4/attachment.bin>
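
For reference, the two interface shapes debated in the quoted thread can be put side by side in a small sketch. This is not code from glibc, libmbind, or the attached patch. The names ex_setup_fixed, ex_setup_opaque, struct ex_hbw_args and the value of EX_MBIND_HBW are made up for illustration, and the bodies only validate the type instead of binding any memory.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical memory type value, for illustration only.  */
#define EX_MBIND_HBW 1u

/* Fixed form: every memory type receives the same three arguments,
   whether or not it needs all of them.  */
int
ex_setup_fixed (unsigned int type, void *addr, size_t length)
{
  (void) addr;
  (void) length;
  return type == EX_MBIND_HBW ? 0 : EINVAL;
}

/* Pointer-to-implementation form: the caller passes one pointer whose
   layout is selected by TYPE.  This is the hypothetical layout for
   high-bandwidth memory.  */
struct ex_hbw_args
{
  void *addr;
  size_t length;
};

int
ex_setup_opaque (unsigned int type, const void *args)
{
  if (args == NULL)
    return EINVAL;

  switch (type)
    {
    case EX_MBIND_HBW:
      {
        const struct ex_hbw_args *hbw = args;
        return ex_setup_fixed (type, hbw->addr, hbw->length);
      }

    default:
      /* Unknown type: the layout behind ARGS cannot be interpreted.  */
      return EINVAL;
    }
}

/* Usage: the same request expressed through both forms.  */
int
ex_demo (void)
{
  static char region[64];
  struct ex_hbw_args args = { region, sizeof region };

  assert (ex_setup_fixed (EX_MBIND_HBW, region, sizeof region) == 0);
  assert (ex_setup_opaque (EX_MBIND_HBW, &args) == 0);

  /* A type the library does not know is rejected.  */
  return ex_setup_opaque (0u, &args);
}
```

The trade-off is the one raised in the thread: the fixed form gives the caller a single prototype to document, while the opaque form moves the per-type argument list out of the prototype and into per-type structures that both sides must agree on.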