Using the Streaming SIMD Extensions on Linux

This page provides patches that extend the Linux kernel to support the Pentium III Streaming SIMD Extensions.

What are the Streaming SIMD Extensions?

Intel's Pentium III processor contains some new instructions and registers, known collectively as the Streaming SIMD Extensions (SSE), which increase the Pentium III's floating-point performance. With the proper attention from the programmer, the Streaming SIMD Extensions allow the Pentium III to handle floating-point-intensive applications like video and audio handling, 3-D modeling, and physical simulations much more quickly than its predecessors.

Why do I need to patch my kernel in order to use the Streaming SIMD Extensions?

In a multi-tasking environment, the Streaming SIMD Extensions require support from the operating system: the SIMD registers must be handled properly by the operating system's context switching code. When the system switches control from one process to another, the old process's SIMD registers must be saved away, and the saved values of the new process's SIMD registers must be loaded into the processor. The Pentium III processor prohibits programs from using the Streaming SIMD Extensions unless the operating system tells the processor at system startup time that it is aware of the SIMD registers, and will manage them properly.

The patch available on this page is based on the Pentium III patch provided by Doug Ledford, available from his Linux Kernel Patch Page.

What version of the Linux kernel do these patches apply to?

There are two versions of the kernel patch: one for Linux 2.2.5, and one for Linux 2.2.12. (These are the versions distributed with Red Hat Linux 6.0 and 6.1.) When I wrote the patches, the basic SSE support had not yet appeared in the experimental 2.3 kernel series, so I was unable to produce a patch against a more recent kernel version. However, as soon as the SSE support does appear, I will produce a new set of patches for the debugging support.

How to use the patches

There are four basic steps to applying the patches. The sections below describe the procedure in full detail.

Build a kernel from the unmodified sources, and test it.

It is essential to verify that you can build a kernel straight from the sources, install it, and boot it up successfully, before you apply any patches whatsoever. Now is also a good time to select the right set of drivers and modules, etc.

This is your best opportunity to distinguish between problems due to the patch and problems due to the kernel configuration. I had a hard time getting everything going at first, and needed help from someone with more experience. The last thing you want to worry about here is more random external influences, like patches from strangers.

Red Hat Linux includes instructions for building a custom kernel in the Red Hat reference guide (the book you got with your distribution). In the Red Hat 6.1 distribution, look at section 2.8, ``Building a Custom Kernel''. If you are unpacking your sources from an RPM, make sure to execute the RPM's %prep stage, to apply whatever patches are needed for your distribution.

Make sure that you are actually running the kernel you compiled. Put a new call to printk somewhere safe, like the calibrate_delay function in init/main.c, and verify that your kernel prints the message when it boots up. (You might find the dmesg command helpful here.)
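The check described above can be scripted. This is only a minimal sketch: the function name log_has_marker and the marker text in the usage line are made up for illustration, and the marker must be whatever string you actually passed to your own printk call.

```shell
# Report whether a kernel log, read from standard input, contains a
# marker string.  $1 is the literal text to look for (not a regex).
log_has_marker() {
    grep -qF -- "$1"
}
```

For example, `dmesg | log_has_marker 'my-marker: custom build' && echo "running the new kernel"` prints the confirmation only if the marker shows up in the boot messages.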

Make sure that you have recompiled and installed your kernel modules. If the boot process hangs after it says ``Finding module dependencies'', one possibility is that you're trying to run your new kernel with your old modules.
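One quick sanity check is to confirm that a modules directory exists for the kernel release you are booting. The sketch below assumes the usual /lib/modules/<release> layout; the function name modules_installed and its optional second argument are illustrative only.

```shell
# Succeed if a modules directory exists for the given kernel release.
# $1 = kernel release string, as printed by `uname -r`
# $2 = filesystem root to look under (optional; defaults to the real root)
modules_installed() {
    release=$1
    root=${2:-}
    [ -d "$root/lib/modules/$release" ]
}
```

After booting the new kernel, `modules_installed "$(uname -r)" || echo "no modules for this kernel"` reports a missing directory. Note that this only checks that the directory exists, not that its contents are fresh: if your old and new kernels share a release string, old modules would still pass.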

If you ask me for help, and I find out that you haven't gotten a straight kernel build working before getting in touch with me, I'm going to be annoyed. There are lots of people out there who have more experience building kernels than I do, so you should find a friend who knows this stuff and get their help.

Apply the kernel patches.

First of all, use your web browser to download the kernel patch to a file somewhere convenient. There are two separate patches: one for version 2.2.5 of the Linux kernel:
http://sourceware.cygnus.com/gdb/papers/linux/jimb.linux-2.2.5-sse-ptrace-2.patch
and one for version 2.2.12 of the kernel:
http://sourceware.cygnus.com/gdb/papers/linux/jimb.linux-2.2.12-sse-ptrace-2.patch
In these instructions, I assume you're using the 2.2.12 patch, but the procedure is the same for the 2.2.5 patch; only the patch's filename is different.

Now, cd to the top directory of your kernel source tree --- the one that contains the files README.kernel-sources, REPORTING-BUGS, and so on.

  $ cd /usr/src/linux
  $ 
  

Use the patch command to apply the patch file to the sources, as shown here:

  $ patch -p0 < ~/incoming/jimb.linux-2.2.12-sse-ptrace-2.patch 
  patching file `Documentation/Configure.help'
  patching file `arch/i386/Makefile'
  patching file `arch/i386/config.in'
  patching file `arch/i386/kernel/head.S'
  patching file `arch/i386/kernel/i386_ksyms.c'
  patching file `arch/i386/kernel/process.c'
  patching file `arch/i386/kernel/ptrace.c'
  patching file `arch/i386/kernel/setup.c'
  patching file `arch/i386/kernel/signal.c'
  patching file `arch/i386/kernel/smp.c'
  patching file `arch/i386/kernel/traps.c'
  patching file `arch/i386/lib/Makefile'
  patching file `arch/i386/lib/simd.c'
  patching file `arch/i386/lib/usercopy.c'
  patching file `arch/i386/mm/init.c'
  patching file `fs/binfmt_elf.c'
  patching file `include/asm-i386/bugs.h'
  patching file `include/asm-i386/i387.h'
  patching file `include/asm-i386/processor.h'
  patching file `include/asm-i386/ptrace.h'
  patching file `include/asm-i386/string.h'
  patching file `include/asm-i386/uaccess.h'
  patching file `include/linux/elf.h'
  $ 
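Before running patch for real, it can help to see which files a patch names and whether they exist relative to the current directory; a long list of missing files usually means you are in the wrong directory or need a different -p level. The sketch below assumes the patch is in unified diff format (file names on lines starting with "+++ ") and that file names contain no spaces. The function names are made up for this example.

```shell
# Print the target file names listed in a unified diff.
# $1 = patch file
patch_targets() {
    awk 'index($0, "+++ ") == 1 { print $2 }' "$1"
}

# Print the targets that do not exist under a source tree.
# $1 = patch file, $2 = top directory of the source tree
missing_patch_targets() {
    patch_targets "$1" | while read -r f; do
        [ -e "$2/$f" ] || printf '%s\n' "$f"
    done
}
```

Files that the patch creates from scratch are expected to show up as missing. If you have GNU patch, its --dry-run option is a more thorough check, since it tries every hunk without modifying anything.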
  

Configure the kernel to enable the SSE support.

There are several different ways to configure a Linux kernel; these are generic instructions, which should work for any of the configuration methods.

In the category ``Processor type and features'':

Leave the rest of the configuration the way it was for your previous working build. (You do have a previous working build, don't you?)

Build your patched kernel.

Use the same process here that you used for your previous working build.

Install your kernel.

As before, make sure you have built and installed fresh modules, to go along with your kernel. Trying to use old modules with a new kernel doesn't work too well. If your kernel hangs while booting, after saying ``Finding module dependencies'', then you may not have installed your new modules the way you expected.

Make sure to leave a kernel you know is usable in your /etc/lilo.conf file, so if things blow up, you have something to fall back to. Once you've got it working, however, you can make your new kernel the default; see the lilo.conf(5) manual page for more information.
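To confirm that both the known-good kernel and the patched one are listed, you can print the boot labels from the LILO configuration. This is a rough sketch that assumes each image entry uses a plain "label=name" line; the function name lilo_labels is invented for this example.

```shell
# Print the boot labels defined in a LILO configuration file.
# $1 = path to the configuration file, normally /etc/lilo.conf
lilo_labels() {
    sed -n 's/^[[:space:]]*label[[:space:]]*=[[:space:]]*//p' "$1"
}
```

Running `lilo_labels /etc/lilo.conf` should show at least two names: one for the kernel you trust and one for the patched kernel. Remember that LILO only picks up changes to the file after you run /sbin/lilo again as root.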

Reboot, using the new kernel.

When the lilo: prompt appears during the boot process, type the name you gave your patched kernel in /etc/lilo.conf.

Verify that your kernel supports the Streaming SIMD Extensions

The file /proc/cpuinfo describes the processor as the running kernel sees it. If the Streaming SIMD Extensions support is enabled, the line labeled flags: should contain the words fxsr and xmm, usually at the end. For example:

  $ cat /proc/cpuinfo
  processor       : 0
  vendor_id       : GenuineIntel
  cpu family      : 6
  model           : 7
  model name      : Pentium III (Katmai)
  stepping        : 2
  cpu MHz         : 497.440714
  cache size      : 512 KB
  fdiv_bug        : no
  hlt_bug         : no
  sep_bug         : no
  f00f_bug        : no
  coma_bug        : no
  fpu             : yes
  fpu_exception   : yes
  cpuid level     : 2
  wp              : yes
  flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr xmm
  bogomips        : 496.44

  $ 
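The same check can be done without reading the flags line by eye. The following is a sketch, not part of the patch: the function name cpu_has_flags is made up, and it matches whole words so that, for instance, pse does not match pse36.

```shell
# Succeed if every named flag appears on the first "flags" line of a
# cpuinfo-style file.
# $1 = file to read (normally /proc/cpuinfo); remaining arguments = flags
cpu_has_flags() {
    file=$1
    shift
    flags=$(sed -n 's/^flags[[:space:]]*:[[:space:]]*//p' "$file" | head -n 1)
    for want in "$@"; do
        case " $flags " in
            *" $want "*) ;;        # present; check the next flag
            *) return 1 ;;         # missing
        esac
    done
    return 0
}
```

With the patched kernel described here, `cpu_has_flags /proc/cpuinfo fxsr xmm && echo "SSE support enabled"` should print the message. Later mainline kernels report the second capability as sse rather than xmm, so adjust the flag names if you are on a different kernel.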
  

Patch your header files.

Among other things, the kernel patch extends some system calls to give the debugger access to processes' SIMD registers. However, we need to extend the user-side descriptions of these system calls to include the new functionality.

So, use your web browser to download the header file patch to somewhere convenient. In these instructions, I assume you're using the same filename I did: jimb.glibc-sse-ptrace.patch.

The URL for the patch is http://sourceware.cygnus.com/gdb/papers/linux/jimb.glibc-sse-ptrace.patch.

Now, cd to /usr/include/sys --- the directory containing ptrace.h and user.h.

These files are not usually world-writable, so, as root, use the patch command to apply the patch file to the headers, as shown here:

  $ cd /usr/include/sys
  $ su
  Password: 
  [root@zenia sys]# patch -p0 < ~jimb/incoming/jimb.glibc-sse-ptrace.patch 
  patching file `ptrace.h'
  patching file `user.h'
  [root@zenia sys]# exit
  $ 
  

You don't need to rebuild your kernel after applying this patch. It's only interesting to debuggers.
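Since this step modifies system headers in place, you may want to keep copies of the two files before patching them. The sketch below is one way to do that; the function name backup_headers and the .orig suffix are choices made for this example, not something the patch requires.

```shell
# Copy ptrace.h and user.h in a directory to ptrace.h.orig and
# user.h.orig, leaving any backup that already exists untouched.
# $1 = directory containing the headers, normally /usr/include/sys
backup_headers() {
    dir=$1
    for h in ptrace.h user.h; do
        [ -f "$dir/$h" ] || return 1
        [ -e "$dir/$h.orig" ] || cp -p "$dir/$h" "$dir/$h.orig" || return 1
    done
}
```

Run `backup_headers /usr/include/sys` as root before applying the header patch. Because existing backups are never overwritten, running it again after patching does not clobber the pristine copies.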

Make sure you have the right sources for GDB.

We added support for the Streaming SIMD Extensions to GDB rather recently. To verify that your GDB sources contain this support, look in gdb/ChangeLog. There should be an entry no earlier than November 1999 containing the text:

	Add support for SSE registers in core files.
	* corelow.c (get_core_register_section): New function.
  

If the file contains this entry, then your GDB sources should support the Streaming SIMD Extensions.
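Rather than paging through the ChangeLog, you can search it for the entry text. This is a small sketch; the function name changelog_has_entry is invented here, and the search is for a literal string, not a pattern.

```shell
# Succeed if a ChangeLog file contains the given literal text.
# $1 = path to the ChangeLog, $2 = text to look for
changelog_has_entry() {
    grep -qF -- "$2" "$1"
}
```

From the top of the GDB source tree, `changelog_has_entry gdb/ChangeLog 'Add support for SSE registers in core files.' && echo "SSE support present"` performs the check described above. It does not verify the date of the entry.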

Configure and build GDB.

GDB's configuration process checks your system's header files to see whether your system has been extended to support the Streaming SIMD Extensions. So, you must configure GDB after patching your header files. Follow the instructions in the README file at the top of the source tree, or the instructions you received from Cygnus.

Verify that your GDB has been built properly.

Start up the GDB you just built, and try to print the value of the SIMD register $xmm0. Since you're not actually debugging a program, GDB doesn't have any value to display for that register, but it should at least recognize that it is a register name:

  $ /umbra/jimb/build/gdb/gdb -nw 
  GNU gdb 4.18-PentiumIII-991112
  Copyright 1998 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.  This version of GDB is supported
  for customers of Cygnus Solutions.  Type "show warranty" for details.
  This GDB was configured as "i686-pc-linux-gnu".
  (gdb) print $xmm0
  No registers.
  (gdb) print $foo
  $1 = void
  (gdb) 
  

As you can see, this GDB says that an undefined convenience variable is void, but recognizes that $xmm0 is the name of a register.
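The distinction shown in the session above can also be checked from a script. The sketch below only classifies GDB's reply in the no-program case; the function name classify_gdb_reply and the three result words are made up for this example.

```shell
# Classify GDB's reply to "print $xmm0", read from standard input,
# when no program is being debugged.
classify_gdb_reply() {
    reply=$(cat)
    case $reply in
        *"No registers."*) echo register ;;              # known register name
        *"= void"*)        echo convenience-variable ;;  # name not recognized
        *)                 echo unknown ;;
    esac
}
```

Assuming your gdb accepts the -batch and -x options, you could put the line `print $xmm0` in a command file (say, a hypothetical xmm0.gdb) and run `gdb -nw -batch -x xmm0.gdb 2>&1 | classify_gdb_reply`. A result of register corresponds to the successful case shown above.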

Of course, the final test is to try to debug a program that uses the Streaming SIMD Extensions, either at the source level using the Intel intrinsics interface, or at the assembly language level.

If you have any trouble with the patch or with these directions, please send mail to me, Jim Blandy <jimb@cygnus.com>.

Good luck!

