Mark Geisert [Mon, 23 Dec 2019 06:45:54 +0000 (22:45 -0800)]
Cygwin: Add missing Linux #define of CPU_SETSIZE
Though our implementation of cpu sets doesn't need it, software from
Linux environments expects this definition to be present. It's
documented on the Linux CPU_SET(3) man page but was left out due to
oversight.
Ken Brown [Sat, 21 Dec 2019 22:53:52 +0000 (17:53 -0500)]
Cygwin: FIFO: use FILE_PIPE_REJECT_REMOTE_CLIENTS flag
Add that flag to the pipe type argument when creating the Windows
named pipe. And add a definition of that flag to ntdll.h (copied from
/usr/include/w32api/ddk/ntifs.h).
Takashi Yano [Thu, 19 Dec 2019 11:03:30 +0000 (20:03 +0900)]
Cygwin: pty: Fix ESC[?3h and ESC[?3l handling again.
- Even with commit fe512b2b12a2cea8393d14f038dc3914b1bf3f60, pty
still has a problem in ESC[?3h and ESC[?3l handling if invalid
sequence such as ESC[?$ is sent. This patch fixes the issue.
Takashi Yano [Wed, 18 Dec 2019 16:07:33 +0000 (01:07 +0900)]
Cygwin: pty: Fix a bug regarding ESC[?3h and ESC[?3l handling.
- Midnight commander (mc) does not work after the commit 1626569222066ee601f6c41b29efcc95202674b7 as reported in
https://www.cygwin.com/ml/cygwin/2019-12/msg00173.html.
This patch fixes the issue.
On certain error conditions there is a code snippet that checks
whether the last component of the path has a trailing dot or space or
a leading space. Skip this check if the last component is empty,
i.e., if the path ends with a backslash. This avoids an assertion
failure if the trailing backslash is the only backslash in the path,
as is the case for a DOS drive 'X:\'.
Keith Packard [Fri, 29 Nov 2019 19:23:20 +0000 (11:23 -0800)]
libm: switch sf_log1p from double error routines to float
sf_log1p was using __math_divzero and __math_invalid, which
drag in a pile of double-precision code. Switch to using the
single-precision variants. This also required making those
available in __OBSOLETE_MATH mode.
newlib wide char conversion functions were updated to
Unicode 11 on 2019-01-12
update standard symbol __STDC_ISO_10646__ to
Unicode 11 release date 2018-06-05 for Cygwin
Bastien Bouclet [Sat, 9 Nov 2019 16:28:04 +0000 (17:28 +0100)]
newlib: fix fseek optimization with SEEK_CUR
The call to fflush was invalidating the read buffer, preventing relative
seeks to positions that would have been inside the read buffer from
being optimized. The call to srefill would then re-read mostly the same
data that was initially in the read buffer.
Takashi Yano [Wed, 13 Nov 2019 10:49:29 +0000 (19:49 +0900)]
Cygwin: pty: Trigger redraw screen if ESC[?3h or ESC[?3l is sent.
- Pseudo console clears console screen buffer if ESC[?3h or ESC[?3l
is sent. However, xterm/vt100 does not clear screen. This cause
mismatch between real screen and console screen buffer. Therefore,
this patch triggers redraw screen in that situation so that the
synchronization is done on the next execution of native app.
This solves the problem reported in:
https://www.cygwin.com/ml/cygwin-patches/2019-q4/msg00116.html
Takashi Yano [Tue, 12 Nov 2019 18:04:59 +0000 (03:04 +0900)]
Cygwin: console: Revise the code checking if the console is legacy.
- Accessing shared_console_info before initializing causes access
violation in checking if the console is legacy mode. This patch
fixes this issue. This solves the problem reported in:
https://www.cygwin.com/ml/cygwin-patches/2019-q4/msg00099.html
Takashi Yano [Tue, 12 Nov 2019 13:00:23 +0000 (22:00 +0900)]
Cygwin: pty: Use redraw screen instead of clear screen.
- Previously, pty cleared screen at startup for synchronization
between the real screen and console screen buffer for pseudo
console. With this patch, instead of clearing screen, the screen
is redrawn when the first native program is executed after pty
is created. In other words, synchronization is deferred until
the native app is executed. Moreover, this realizes excluding
$TERM dependent code.
s[0:3] contain a descriptor used to set up the initial value of the
stack, but only the lower 48 bits of s[0:1] are currently used.
The reent marker is currently set in s3, but by stashing it in the
upper 16 bits of s[0:1] instead, s3 can be freed up for other purposes.
Cygwin: fix quoting when starting invisible console process
fhandler_console::create_invisible_console_workaround() does not use the
lpApplicationName parameter and neglects to quote its command name on
lpCommandLine in the call to CreateProcessW.
Given CreateProcessW's brain-dead method to evaluate the application
path given on the command line, this opens up a security problem if
Cygwin is installed into a path with spaces in it.
Fix this by using the lpApplicationName parameter and quoting of the
application path in the lpCommandLine parameter (used as argv[0] in
the called console helper.
For extended paranoia, make the argument string array big enough to
fit full 64 bit pointer values into it. Handles usually only use
the lower 32 bit, but better safe than sorry.
The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.
Our kernel currently defines two-argument versions of timespecadd and
timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions. Solaris also defines a three-argument version, but
only in its kernel. This revision changes our definition to match the
common three-argument version.
Bump _FreeBSD_version due to the breaking KPI change.
Cygwin: fix process parent/child relationship after execve
Commit 5a0f2c00aa "Cygwin: fork/exec: fix child process permissions"
removed the PROCESS_DUP_HANDLE handle permission of the parent process
handle in the child to avoid a security problem.
It turned out that this broke the following scenario: If a process forks
and then the parent execs, the child loses the ability to register the
parent's death. To wit, after the parent died the child process does
not set its own PPID to 1 anymore.
The current exec mechanism copies required handle values (handles to
keep contact to the child processes) into the child_info for the
about-to-be-exec'ed process. The exec'ed process is supposed to
duplicate these handles. This fails, given that we don't allow the
exec'ed process PROCESS_DUP_HANDLE access to the exec'ing process since
commit 5a0f2c00aa.
The fix is to avoid the DuplicateHandle calls in the exec'ed process.
This patch sets the affected handles to "inheritable" in the exec'ing
process at exec time. The exec'ed process just copies the handle values
and resets handle inheritance to "non-inheritable". The exec'ing
process doesn't have to reset handle inheritance, it exits after setting
up the exec'ed process anyway.
Testcase: $ ssh-agent /bin/sleep 3
ssh-agent forks and the parent exec's sleep. After sleep exits, `ps'
should show ssh-agent to have PPID 1, and eventually ssh-agent exits.
Target libraries are considered to be built for GCC's "host", not GCC's
"target". The "host" variable must be set by configure scripts using
"config-ml.in" to determine multilib support, otherwise disabled
multilibs (specified as a configure argument with --disable-<multilib>)
will still be built for the subdirectories those configure scripts
reside in.
Dimitar Dimitrov [Sun, 11 Mar 2018 20:15:05 +0000 (22:15 +0200)]
PRU: Align libmath to PRU ABI
The TI proprietary toolchain uses nonstandard names for some math
library functions. In order to achieve ABI compatibility between
GNU and TI toolchains, add support for the TI function names.
Ken Brown [Wed, 9 Oct 2019 20:06:02 +0000 (20:06 +0000)]
Cygwin: spawnvp, spawnvpe: fail if executable is not in $PATH
Call find_exec with the FE_NNF flag to enforce a NULL return when the
executable isn't found in $PATH. Convert NULL to "". This aligns
spawnvp and spawnvpe with execvp and execvpe.
Ken Brown [Fri, 27 Sep 2019 18:00:52 +0000 (14:00 -0400)]
Cygwin: mkdir and rmdir: treat drive names specially
If the directory name has the form 'x:' followed by one or more
slashes or backslashes, and if there's at least one backslash, assume
that the user is referring to 'x:\', the root directory of drive x,
and don't strip the backslash.
Previously all trailing slashes and backslashes were stripped, and the
name was treated as a relative file name containing a literal colon.
Brian Inglis [Mon, 7 Oct 2019 16:23:04 +0000 (10:23 -0600)]
fhandler_proc.cc(format_proc_cpuinfo): use feature test print macro
Add feature test print macro that makes feature, bit, and flag text
comparison and checking easier. Handle as common former Intel only
feature flags also supported on AMD. Change order and some flag names
to agree with current Linux.
- change sys/reent.h to replace _REENT_CHECK_DEBUG with
_REENT_CHECK_VERIFY which when set asserts that any memory
allocated is non-NULL and calls __assert_func directly
- add new --enable-newlib-reent-check-verify configure option
- add support for configure.host to specify default for
newlib_reent_check_verify
- add _REENT_CHECK_VERIFY macro support to acconfig.h and newlib.hin
Jeff Johnston [Fri, 4 Oct 2019 21:01:03 +0000 (17:01 -0400)]
Prevent NULL ptr accesses due to Balloc out of memory
- add new eBalloc macro to mprec.h which calls Balloc and
aborts if Balloc fails due to out of memory
- change mprec.c functions that use Balloc without checking to use eBalloc instead
- fix dtoa.c to use eBalloc
Takashi Yano [Thu, 3 Oct 2019 10:43:37 +0000 (19:43 +0900)]
Cygwin: Fix signal handling issue introduced by PTY related change.
- After commit 41864091014b63b0cb72ae98281fa53349b6ef77, there is a
regression in signal handling reported in
https://www.cygwin.com/ml/cygwin/2019-10/msg00010.html. This patch
fixes the issue.
If the source path starts with the Win32 long path prefix '\\?\' or
the NT object directory prefix '\??\', require the prefix to be
followed by 'UNC\' or '<drive letter>:\'. Otherwise return EINVAL.
This fixes the assertion failure in symlink_info::check that was
reported here:
Cygwin: pty: Fix PTY so that cygwin setup shows help with -h option.
- After commit 169d65a5774acc76ce3f3feeedcbae7405aa9b57, cygwin
setup fails to show help message when -h option is specified, as
reported in https://cygwin.com/ml/cygwin/2019-09/msg00248.html.
This patch fixes the problem.
The ioctl(2) is intended to provide more details about the cause of
the down for the link.
Eventually we might define a comprehensive list of codes for the
situations. But interface also allows the driver to provide free-form
null-terminated ASCII string to provide arbitrary non-formalized
information. Sample implementation exists for mlx5(4), where the
string is fetched from firmware controlling the port.
jhb [Tue, 27 Aug 2019 00:01:56 +0000 (00:01 +0000)]
Add kernel-side support for in-kernel TLS.
KTLS adds support for in-kernel framing and encryption of Transport
Layer Security (1.0-1.2) data on TCP sockets. KTLS only supports
offload of TLS for transmitted data. Key negotation must still be
performed in userland. Once completed, transmit session keys for a
connection are provided to the kernel via a new TCP_TXTLS_ENABLE
socket option. All subsequent data transmitted on the socket is
placed into TLS frames and encrypted using the supplied keys.
Any data written to a KTLS-enabled socket via write(2), aio_write(2),
or sendfile(2) is assumed to be application data and is encoded in TLS
frames with an application data type. Individual records can be sent
with a custom type (e.g. handshake messages) via sendmsg(2) with a new
control message (TLS_SET_RECORD_TYPE) specifying the record type.
At present, rekeying is not supported though the in-kernel framework
should support rekeying.
KTLS makes use of the recently added unmapped mbufs to store TLS
frames in the socket buffer. Each TLS frame is described by a single
ext_pgs mbuf. The ext_pgs structure contains the header of the TLS
record (and trailer for encrypted records) as well as references to
the associated TLS session.
KTLS supports two primary methods of encrypting TLS frames: software
TLS and ifnet TLS.
Software TLS marks mbufs holding socket data as not ready via
M_NOTREADY similar to sendfile(2) when TLS framing information is
added to an unmapped mbuf in ktls_frame(). ktls_enqueue() is then
called to schedule TLS frames for encryption. In the case of
sendfile_iodone() calls ktls_enqueue() instead of pru_ready() leaving
the mbufs marked M_NOTREADY until encryption is completed. For other
writes (vn_sendfile when pages are available, write(2), etc.), the
PRUS_NOTREADY is set when invoking pru_send() along with invoking
ktls_enqueue().
A pool of worker threads (the "KTLS" kernel process) encrypts TLS
frames queued via ktls_enqueue(). Each TLS frame is temporarily
mapped using the direct map and passed to a software encryption
backend to perform the actual encryption.
(Note: The use of PHYS_TO_DMAP could be replaced with sf_bufs if
someone wished to make this work on architectures without a direct
map.)
KTLS supports pluggable software encryption backends. Internally,
Netflix uses proprietary pure-software backends. This commit includes
a simple backend in a new ktls_ocf.ko module that uses the kernel's
OpenCrypto framework to provide AES-GCM encryption of TLS frames. As
a result, software TLS is now a bit of a misnomer as it can make use
of hardware crypto accelerators.
Once software encryption has finished, the TLS frame mbufs are marked
ready via pru_ready(). At this point, the encrypted data appears as
regular payload to the TCP stack stored in unmapped mbufs.
ifnet TLS permits a NIC to offload the TLS encryption and TCP
segmentation. In this mode, a new send tag type (IF_SND_TAG_TYPE_TLS)
is allocated on the interface a socket is routed over and associated
with a TLS session. TLS records for a TLS session using ifnet TLS are
not marked M_NOTREADY but are passed down the stack unencrypted. The
ip_output_send() and ip6_output_send() helper functions that apply
send tags to outbound IP packets verify that the send tag of the TLS
record matches the outbound interface. If so, the packet is tagged
with the TLS send tag and sent to the interface. The NIC device
driver must recognize packets with the TLS send tag and schedule them
for TLS encryption and TCP segmentation. If the the outbound
interface does not match the interface in the TLS send tag, the packet
is dropped. In addition, a task is scheduled to refresh the TLS send
tag for the TLS session. If a new TLS send tag cannot be allocated,
the connection is dropped. If a new TLS send tag is allocated,
however, subsequent packets will be tagged with the correct TLS send
tag. (This latter case has been tested by configuring both ports of a
Chelsio T6 in a lagg and failing over from one port to another. As
the connections migrated to the new port, new TLS send tags were
allocated for the new port and connections resumed without being
dropped.)
ifnet TLS can be enabled and disabled on supported network interfaces
via new '[-]txtls[46]' options to ifconfig(8). ifnet TLS is supported
across both vlan devices and lagg interfaces using failover, lacp with
flowid enabled, or lacp with flowid enabled.
Applications may request the current KTLS mode of a connection via a
new TCP_TXTLS_MODE socket option. They can also use this socket
option to toggle between software and ifnet TLS modes.
In addition, a testing tool is available in tools/tools/switch_tls.
This is modeled on tcpdrop and uses similar syntax. However, instead
of dropping connections, -s is used to force KTLS connections to
switch to software TLS and -i is used to switch to ifnet TLS.
Various sysctls and counters are available under the kern.ipc.tls
sysctl node. The kern.ipc.tls.enable node must be set to true to
enable KTLS (it is off by default). The use of unmapped mbufs must
also be enabled via kern.ipc.mb_use_ext_pgs to enable KTLS.
KTLS is enabled via the KERN_TLS kernel option.
This patch is the culmination of years of work by several folks
including Scott Long and Randall Stewart for the original design and
implementation; Drew Gallatin for several optimizations including the
use of ext_pgs mbufs, the M_NOTREADY mechanism for TLS records
awaiting software encryption, and pluggable software crypto backends;
and John Baldwin for modifications to support hardware TLS offload.
thj [Thu, 8 Aug 2019 11:43:09 +0000 (11:43 +0000)]
Rename IPPROTO 33 from SEP to DCCP
IPPROTO 33 is DCCP in the IANA Registry:
https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml
IPPROTO_SEP was added about 20 years ago in r33804. The entries were added
straight from RFC1700, without regard to whether they were used.
The reference in RFC1700 for SEP is '[JC120] <mystery contact>', this is an
indication that the protocol number was probably in use in a private network.
As RFC1700 is no longer the authoritative list of internet numbers and that
IANA assinged 33 to DCCP in RFC4340, change the header to the actual
authoritative source.
being used at NF as well as sets in some of the groundwork for
committing BBR. The hpts system is updated as well as some other needed
utilities for the entrance of BBR. This is actually part 1 of 3 more
needed commits which will finally complete with BBRv1 being added as a
new tcp stack.
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D20834
jhb [Sat, 29 Jun 2019 00:48:33 +0000 (00:48 +0000)]
Add an external mbuf buffer type that holds
multiple unmapped pages.
Unmapped mbufs allow sendfile to carry multiple pages of data in a
single mbuf, without mapping those pages. It is a requirement for
Netflix's in-kernel TLS, and provides a 5-10% CPU savings on heavy web
serving workloads when used by sendfile, due to effectively
compressing socket buffers by an order of magnitude, and hence
reducing cache misses.
For this new external mbuf buffer type (EXT_PGS), the ext_buf pointer
now points to a struct mbuf_ext_pgs structure instead of a data
buffer. This structure contains an array of physical addresses (this
reduces cache misses compared to an earlier version that stored an
array of vm_page_t pointers). It also stores additional fields needed
for in-kernel TLS such as the TLS header and trailer data that are
currently unused. To more easily detect these mbufs, the M_NOMAP flag
is set in m_flags in addition to M_EXT.
Various functions like m_copydata() have been updated to safely access
packet contents (using uiomove_fromphys()), to make things like BPF
safe.
NIC drivers advertise support for unmapped mbufs on transmit via a new
IFCAP_NOMAP capability. This capability can be toggled via the new
'nomap' and '-nomap' ifconfig(8) commands. For NIC drivers that only
transmit packet contents via DMA and use bus_dma, adding the
capability to if_capabilities and if_capenable should be all that is
required.
If a NIC does not support unmapped mbufs, they are converted to a
chain of mapped mbufs (using sf_bufs to provide the mapping) in
ip_output or ip6_output. If an unmapped mbuf requires software
checksums, it is also converted to a chain of mapped mbufs before
computing the checksum.
hselasky [Tue, 25 Jun 2019 11:54:41 +0000 (11:54 +0000)]
Convert all IPv4 and IPv6 multicast memberships
into using a STAILQ instead of a linear array.
The multicast memberships for the inpcb structure are protected by a
non-sleepable lock, INP_WLOCK(), which needs to be dropped when
calling the underlying possibly sleeping if_ioctl() method. When using
a linear array to keep track of multicast memberships, the computed
memory location of the multicast filter may suddenly change, due to
concurrent insertion or removal of elements in the linear array. This
in turn leads to various invalid memory access issues and kernel
panics.
To avoid this problem, put all multicast memberships on a STAILQ based
list. Then the memory location of the IPv4 and IPv6 multicast filters
become fixed during their lifetime and use after free and memory leak
issues are easier to track, for example by: vmstat -m | grep multi
All list manipulation has been factored into inline functions
including some macros, to easily allow for a future hash-list
implementation, if needed.
brooks [Thu, 20 Jun 2019 18:24:16 +0000 (18:24 +0000)]
Extend mmap/mprotect API to specify the max page
protections.
A new macro PROT_MAX() alters a protection value so it can be OR'd with
a regular protection value to specify the maximum permissions. If
present, these flags specify the maximum permissions.
While these flags are non-portable, they can be used in portable code
with simple ifdefs to expand PROT_MAX() to 0.
This change allows (e.g.) a region that must be writable during run-time
linking or JIT code generation to be made permanently read+execute after
writes are complete. This complements W^X protections allowing more
precise control by the programmer.
This change alters mprotect argument checking and returns an error when
unhandled protection flags are set. This differs from POSIX (in that
POSIX only specifies an error), but is the documented behavior on Linux
and more closely matches historical mmap behavior.
In addition to explicit setting of the maximum permissions, an
experimental sysctl vm.imply_prot_max causes mmap to assume that the
initial permissions requested should be the maximum when the sysctl is
set to 1. PROT_NONE mappings are excluded from this for compatibility
with rtld and other consumers that use such mappings to reserve
address space before mapping contents into part of the reservation. A
final version this is expected to provide per-binary and per-process
opt-in/out options and this sysctl will go away in its current form.
As such it is undocumented.
shurd [Wed, 12 Jun 2019 18:07:04 +0000 (18:07 +0000)]
Some devices take undesired actions when RTS and
DTR are asserted. Some development boards for example will reset on DTR,
and some radio interfaces will transmit on RTS.
This patch allows "stty -f /dev/ttyu9.init -rtsdtr" to prevent
RTS and DTR from being asserted on open(), allowing these devices
to be used without problems.
pfg [Sun, 23 Dec 2018 18:15:48 +0000 (18:15 +0000)]
gai_strerror() - Update string error messages according to RFC 3493.
Error messages in gai_strerror(3) vary largely among OSs.
For new software we largely replaced the obsoleted EAI_NONAME and
with EAI_NODATA but we never updated the corresponding message to better
match the intended use. We also have references to ai_flags and ai_family
which are not very descriptive for non-developer end users.
Bring new new error messages based on informational RFC 3493, which has
obsoleted RFC 2553, and make them consistent among the header adn
manpage.
Ken Brown [Sun, 22 Sep 2019 15:33:34 +0000 (11:33 -0400)]
Cygwin: rmdir: fail if last component is a symlink, as on Linux
If the last component of the directory name is a symlink followed by a
slash, rmdir now fails, following Linux but not POSIX, even if the
symlink resolves to an existing empty directory.
Ken Brown [Sat, 21 Sep 2019 17:09:09 +0000 (13:09 -0400)]
Cygwin: remove old cruft from path_conv::check
Prior to commit b0717aae, path_conv::check had the following code:
if (strncmp (path, "\\\\.\\", 4))
{
/* Windows ignores trailing dots and spaces in the last path
component, and ignores exactly one trailing dot in inner
path components. */
char *tail = NULL;
[...]
if (!tail || tail == path)
/* nothing */;
else if (tail[-1] != '\\')
{
*tail = '\0';
[...]
}
Commit b0717aae0 intended to disable this code, but it inadvertently
disabled only part of it. In particular, the declaration of the local
tail variable was in the disabled code, but the following remained:
if (!tail || tail == path)
/* nothing */;
else if (tail[-1] != '\\')
{
*tail = '\0';
[...]
}
[A later commit removed the disabled code.]
The tail variable here points into a string different from path,
causing that string to be truncated under some circumstances. See
- After commit d4045fdbef60d8e7e0d11dfe38b048ea2cb8708b, the TTY
displayed by ps command is incorrect if the process is non-cygwin
process. This patch fixes this issue.