sourceware.org Git - systemtap.git/commit

runtime: fix panics when polling on the control channel while unloading

When the stapio pselect() runs while the given stap module is unloading,
there's a use-after-free opportunity in do_select(). This occurs because
the control channel's poll function, _stp_ctl_poll_cmd(), passes a
pointer to a global variable along to do_select(), which can then
dereference the pointer after the stap module is unloaded.

Normally, this wouldn't be a problem because do_select() uses get_file()
and fput(), which respectively grab and release references to the module
owner specified in `file->f_op->owner`. However, procfs doesn't provide
any interface to pass in a module owner, and instead all procfs files
use an internal `struct file_operations` declared in fs/proc/inode.c.
As a result, we cannot bolster procfs files with module reference
count protection through any normal means, so we must inject a module
owner the hard way.

A module owner is now patched into the control channel's file ops when
the file is opened by making a copy of the existing file ops and then
setting the module owner inside the copy, which then replaces the old
`file->f_op` pointer. This neatly fixes the race because procfs *does*
guarantee that none of the procfs callback functions are still running
after an entry is removed, and because _stp_ctl_poll_cmd() cannot be
reached without first passing through _stp_ctl_open_cmd().
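
The copy-and-patch step described above can be modeled in plain userspace C.
This is only a sketch: the struct definitions are simplified stand-ins for the
kernel types, `ctl_open()`, `ctl_release()` and `this_module` are hypothetical
names, and the reference counting is an assumption of the model rather than
the kernel's actual bookkeeping. It is not the code from this commit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified userspace stand-ins for the kernel types (not the real ones). */
struct module { int refcount; };

struct file;
struct file_operations {
    struct module *owner;              /* NULL in the shared procfs ops */
    int (*open)(struct file *f);
    unsigned (*poll)(struct file *f);
};

struct file {
    const struct file_operations *f_op;
};

static struct module this_module;      /* hypothetical stand-in for THIS_MODULE */

/*
 * Hypothetical open handler: clone the shared ops table, set the owner in
 * the clone only, and point the file at the clone. The shared table is
 * never modified, so other files that use it are unaffected.
 */
static int ctl_open(struct file *f)
{
    struct file_operations *copy;

    assert(f != NULL && f->f_op != NULL);
    copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return -1;
    memcpy(copy, f->f_op, sizeof(*copy));
    copy->owner = &this_module;
    this_module.refcount++;            /* model: one reference per open file */
    f->f_op = copy;
    return 0;
}

/* Hypothetical release handler: drop the reference and free the clone. */
static void ctl_release(struct file *f)
{
    struct file_operations *copy = (struct file_operations *)f->f_op;

    copy->owner->refcount--;
    free(copy);
    f->f_op = NULL;
}
```

In this model the module cannot be considered unused while any opened file
still points at a patched copy, which is the property the fix relies on.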

Since delete_module() can now return EWOULDBLOCK, we must make staprun
aware that it's not a fatal error and that the module deletion should
be retried. EWOULDBLOCK simply indicates that a pselect() on the control
channel has yet to finish, so it will go away after a brief wait.
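
The retry behavior described above might look roughly like the following
sketch. The helper `delete_with_retry()`, its parameters, and the
`fake_delete()` stub are hypothetical names chosen for illustration; the
actual staprun change may be structured differently. The deletion call is
passed in as a function pointer so the loop can be exercised without
unloading a real module.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical callback type: returns 0 on success, or -1 with errno set. */
typedef int (*delete_fn)(const char *name);

/*
 * Retry the deletion while it fails with EWOULDBLOCK, sleeping briefly
 * between attempts. Any other error is treated as fatal and returned
 * immediately. Gives up after max_tries attempts.
 */
static int delete_with_retry(delete_fn del, const char *name,
                             int max_tries, long delay_ns)
{
    struct timespec ts;
    int i;

    assert(del != NULL && max_tries > 0);
    ts.tv_sec = delay_ns / 1000000000L;
    ts.tv_nsec = delay_ns % 1000000000L;

    for (i = 0; i < max_tries; i++) {
        if (i > 0)
            nanosleep(&ts, NULL);      /* brief wait before retrying */
        if (del(name) == 0)
            return 0;
        if (errno != EWOULDBLOCK)
            return -1;                 /* fatal: do not retry */
    }
    errno = EWOULDBLOCK;               /* still busy after all attempts */
    return -1;
}

/* Demonstration stub: reports EWOULDBLOCK until fake_busy reaches zero. */
static int fake_busy = 2;
static int fake_calls = 0;

static int fake_delete(const char *name)
{
    (void)name;
    fake_calls++;
    if (fake_busy > 0) {
        fake_busy--;
        errno = EWOULDBLOCK;
        return -1;
    }
    return 0;
}
```

With the stub as initialized, `delete_with_retry(fake_delete, "stap_demo", 5, 1000)`
makes two attempts that report EWOULDBLOCK and succeeds on the third.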

This fixes the following panic:
BUG: unable to handle kernel paging request at ffffffffc0914030
PGD 79820c067 P4D 79820c067 PUD 79820e067 PMD 3f9ee6067 PTE 0
Oops: 0002 [#1] SMP PTI
CPU: 6 PID: 1636475 Comm: stapio Kdump: loaded Tainted: G           OE     4.19.91-22.2.al7.x86_64 #1
RIP: 0010:_raw_spin_lock_irqsave+0x1e/0x40
RSP: 0018:ffffb9fb0e45f980 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffb9fb0e45faf0 RDI: ffffffffc0914030
RBP: ffffffffc0914030 R08: 0000000000000001 R09: ffff973fa8924000
R10: 0000000000000104 R11: 0000000000000041 R12: 0000000000000000
R13: ffffb9fb0e45fab0 R14: 000000000000000f R15: 000000000000000f
FS:  00007effdcf53740(0000) GS:ffff97409fb80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffc0914030 CR3: 0000000522d42003 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
remove_wait_queue+0x14/0x60
poll_freewait+0x37/0xa0
do_select+0x650/0x740
? compat_poll_select_copy_remaining+0x110/0x110
? kvm_sched_clock_read+0xd/0x20
? sched_clock+0x5/0x10
? sched_clock_cpu+0xc/0xa0
? select_idle_sibling+0x28/0x400
? account_entity_enqueue+0x9c/0xd0
? enqueue_entity+0x71f/0xc80
? __switch_to_asm+0x35/0x70
? enqueue_task_fair+0xd2/0x9b0
? remove_entity_load_avg+0x27/0x70
? check_preempt_curr+0x6b/0x90
? ttwu_do_wakeup+0x19/0x150
? try_to_wake_up+0x219/0x580
core_sys_select+0x1e2/0x320
? audit_filter_inodes+0x1f/0xf0
? audit_filter_syscall.constprop.11+0x8c/0xd0
? __audit_syscall_exit+0x1fd/0x290
? kvm_clock_get_cycles+0xd/0x10
? ktime_get_ts64+0x46/0xf0
__se_sys_pselect6+0xf6/0x1b0
do_syscall_64+0x5b/0x1b0
entry_SYSCALL_64_after_hwframe+0x44/0xa9

author	Sultan Alsawaf <sultan@openresty.com>
	Wed, 25 Aug 2021 02:27:43 +0000 (19:27 -0700)
committer	Sultan Alsawaf <sultan@openresty.com>
	Fri, 27 Aug 2021 20:50:02 +0000 (13:50 -0700)
commit	166a95089af2f7c911afd85c53e6bdaeba95b950
tree	3343868f4bf95c51c891aaa31ea4f964e8325420
parent	e6a1b008b822ed211b8f9c15fda565f8d51e512d

runtime/transport/control.c
staprun/staprun.c