It would be nice to have global variables that are explicitly percpu. You can effectively do this already with a map and a cpu() index, but we could make it completely lockless if it were a native feature. I'm imagining this as an extension to the "global" keyword -- either as some sort of attribute, or perhaps a new "cpu_global" keyword. The implementation could be simply a percpu_globals struct, or maybe even stuffed into our percpu context struct. We may also want {begin,end,error}.percpu probes for initializing / reporting such globals.
PR5108 is similar to this request, but I'm looking to be more generic.
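For reference, the map-and-cpu() workaround mentioned above might look roughly like the sketch below. The variable name "ticks" and the choice of timer.profile as the probe point are only illustrative, not part of the request. The array is an ordinary global, so every update still goes through the usual global-variable locking, which is the cost a native percpu global would avoid.

```systemtap
# Sketch of the existing workaround: one array slot per CPU.
# "ticks" is an illustrative name, not an existing tapset variable.
global ticks

# timer.profile fires periodically on each online CPU.
probe timer.profile {
  ticks[cpu()]++        # index by the current CPU number
}

# Stop after a few seconds so the end probe runs.
probe timer.s(5) {
  exit()
}

# Report each CPU's counter separately.
probe end {
  foreach (c in ticks)
    printf("cpu %d: %d ticks\n", c, ticks[c])
}
```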
People have made do with the per-cpu aggregates (<<< and the @ops such as @count and @sum).
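A minimal sketch of that aggregate approach, assuming the syscall.read tapset probe and its "count" variable are available; the name "reads" is illustrative. As far as I understand the runtime, samples added with <<< are accumulated per CPU and only merged when an @op reads them, so the hot path avoids contention, but the script cannot see the individual per-CPU values.

```systemtap
# "reads" is an illustrative name for a statistics aggregate.
global reads

probe syscall.read {
  reads <<< count       # add one sample: requested byte count
}

probe timer.s(5) {
  exit()
}

probe end {
  # Guard against an empty aggregate before extracting values.
  if (@count(reads))
    printf("reads=%d bytes=%d max=%d\n",
           @count(reads), @sum(reads), @max(reads))
}
```

This covers counters and simple statistics, but not arbitrary per-CPU state such as strings or flags, which is where an explicit percpu global would still help.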