PATCH: Optimize memcmp for ia32

Fri Feb 6 10:08:00 GMT 2004

On Wed, Feb 04, 2004 at 04:11:26PM -0800, H. J. Lu wrote:

> +L(find_diff):
> +	cmpb	%cl, %al
> +	jne	L(set)
> +	cmpw	%cx, %ax

	cmpb	%ch, %ah	Using cmpb here instead of cmpw saves 1 byte
				of code and gains another 1% in performance.

> +	jne	L(set)
> +	shrl	$16,%eax
> +	shrl	$16,%ecx
> +	cmpb	%cl, %al
> +	jne	L(set)
> +	/* We get there only if we already know there is a
> +	   difference.  */
> +	cmpl	%ecx, %eax
> +L(set):

I'm sorry for the interruption.  :)  But 16-bit operations carry a big
performance penalty in 32-bit mode.  Each one also needs the 0x66
operand-size prefix, which is the extra byte that cmpb avoids.
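
To see why the substitution is safe, here is a rough C model of the
two variants.  It is only a sketch: the function names are made up,
the arguments stand in for %eax and %ecx, and since the code after
L(set) is not quoted, returning -1 or 1 from an unsigned comparison is
an assumption.  The point is that once the low bytes are known to be
equal, comparing only the second byte decides the same way as
comparing the whole low 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Sign of an unsigned comparison: -1, 0 or 1. */
int sign_cmp(uint32_t x, uint32_t y)
{
    return (x > y) - (x < y);
}

/* Hypothetical model of the quoted sequence, using the 16-bit compare. */
int find_diff_cmpw(uint32_t eax, uint32_t ecx)
{
    assert(eax != ecx);  /* we only get here if there is a difference */
    if ((eax & 0xff) != (ecx & 0xff))          /* cmpb %cl, %al */
        return sign_cmp(eax & 0xff, ecx & 0xff);
    if ((eax & 0xffff) != (ecx & 0xffff))      /* cmpw %cx, %ax */
        return sign_cmp(eax & 0xffff, ecx & 0xffff);
    eax >>= 16;                                /* shrl $16, %eax */
    ecx >>= 16;                                /* shrl $16, %ecx */
    if ((eax & 0xff) != (ecx & 0xff))          /* cmpb %cl, %al */
        return sign_cmp(eax & 0xff, ecx & 0xff);
    return sign_cmp(eax, ecx);                 /* cmpl %ecx, %eax */
}

/* Same model with the suggested 8-bit compare of the second byte. */
int find_diff_cmpb(uint32_t eax, uint32_t ecx)
{
    assert(eax != ecx);
    if ((eax & 0xff) != (ecx & 0xff))          /* cmpb %cl, %al */
        return sign_cmp(eax & 0xff, ecx & 0xff);
    /* The low bytes are equal here, so only the second byte matters. */
    if (((eax >> 8) & 0xff) != ((ecx >> 8) & 0xff))   /* cmpb %ch, %ah */
        return sign_cmp((eax >> 8) & 0xff, (ecx >> 8) & 0xff);
    eax >>= 16;                                /* shrl $16, %eax */
    ecx >>= 16;                                /* shrl $16, %ecx */
    if ((eax & 0xff) != (ecx & 0xff))          /* cmpb %cl, %al */
        return sign_cmp(eax & 0xff, ecx & 0xff);
    return sign_cmp(eax, ecx);                 /* cmpl %ecx, %eax */
}
```

Both functions should agree for every pair of distinct words.  The
model says nothing about speed or code size; those depend on the
actual instruction encodings and the CPU.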