This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH][SH] CFI directives patch

From: Chung-Lin Tang <chunglin dot tang at gmail dot com>
To: Thomas Schwinge <thomas at codesourcery dot com>
Cc: Kaz Kojima <kkojima at rr dot iij4u dot or dot jp>, libc-alpha at sourceware dot org
Date: Fri, 11 May 2012 17:15:52 +0800
Subject: Re: [PATCH][SH] CFI directives patch
References: <4FA37F28.6040004@codesourcery.com> <87obpxaccw.fsf@schwinge.name>

Hi Thomas,

> I don't really feel strongly about this, but would it make sense to move
> the cfi_remember_state instructions to the place just before the control
> flow (potentially) branches?  It does not make any functional difference,
> but might make things more clear conceptually.  Or, instead push it even
> further forward, right to the place where the last CFI update is done,
> like I had done in
> <http://thread.gmane.org/gmane.comp.lib.glibc.ports/514/focus=518>, for
> example?

IMHO, I just thought it more clear to mark the epilogue-in-the-middle
like a push/pop. In that sense, the cfi_remember_state should be kept
right before where the later thrown away CFI state starts...

>> --- a/nptl/sysdeps/unix/sysv/linux/sh/sysdep-cancel.h
>> +++ b/nptl/sysdeps/unix/sysv/linux/sh/sysdep-cancel.h
>> @@ -71,6 +71,7 @@
>>      CDISABLE; \
>>      mov.l @r15+,r0; \
>>      cfi_adjust_cfa_offset (-4); \
>> +    cfi_restore (r0); \
>>      lds.l @r15+,pr; \
>>      cfi_adjust_cfa_offset (-4); \
>>      cfi_restore (pr); \
>> @@ -97,10 +98,10 @@
>>  # define SAVE_ARGS_6	SAVE_ARGS_5
>>  
>>  # define LOAD_ARGS_0	/* Nothing.  */
>> -# define LOAD_ARGS_1	LOAD_ARGS_0; mov.l @(0,r15),r4
>> -# define LOAD_ARGS_2	LOAD_ARGS_1; mov.l @(4,r15),r5
>> -# define LOAD_ARGS_3	LOAD_ARGS_2; mov.l @(8,r15),r6
>> -# define LOAD_ARGS_4	LOAD_ARGS_3; mov.l @(12,r15),r7
>> +# define LOAD_ARGS_1	LOAD_ARGS_0; mov.l @(0,r15),r4; cfi_restore (r4)
>> +# define LOAD_ARGS_2	LOAD_ARGS_1; mov.l @(4,r15),r5; cfi_restore (r5)
>> +# define LOAD_ARGS_3	LOAD_ARGS_2; mov.l @(8,r15),r6; cfi_restore (r6)
>> +# define LOAD_ARGS_4	LOAD_ARGS_3; mov.l @(12,r15),r7; cfi_restore (r7)
>>  # define LOAD_ARGS_5	LOAD_ARGS_4
>>  # define LOAD_ARGS_6	LOAD_ARGS_5
> 
> Why do we need to care about these caller-save registers here?  You are
> right that the current code is assymetric in that, for example,
> Âcfi_rel_offset (r0, 0)Â is used but not Âcfi_restore (r0)Â -- but what
> is the reason for doing the former in the first place (and likewise for
> the others occurences/registers)?  And if it's not needed to preserve
> these registers' content, should rather the Âcfi_rel_offsetÂs be removed?

I think it has something to do with unwinding across cancellable system
calls, similar to what other architectures have (I see MIPS with similar
code).

I cannot seem to google up a really definitive explanation on why this
point of caller-saves has to be restored though.

>> diff --git a/sysdeps/unix/sh/sysdep.S b/sysdeps/unix/sh/sysdep.S
>> index e816575..939eb9a 100644
>> --- a/sysdeps/unix/sh/sysdep.S
>> +++ b/sysdeps/unix/sh/sysdep.S
>> @@ -41,13 +41,15 @@ skip:
>>  	sts.l	pr, @-r15
>>  	cfi_adjust_cfa_offset (4)
>>  	cfi_rel_offset (pr, 0)
>> +	cfi_adjust_cfa_offset (4)
>>  	jsr	@r1
>>  	 mov.l	r0, @-r15
>> -	cfi_adjust_cfa_offset (4)
> 
> Hmm?

I think this was to fix the case of a frame-insn in a call delay slot
kind of problem. Imagine unwinding from inside the @r1 function (here
it's errno); it will get the stack position one word off.

But reviewing it, I think it's better to move the frame-insn (and CFI)
above the call and just fill nop.

>> diff --git a/sysdeps/unix/sysv/linux/sh/sysdep.h b/sysdeps/unix/sysv/linux/sh/sysdep.h
>> index 5215a84..f5e7524 100644
>> --- a/sysdeps/unix/sysv/linux/sh/sysdep.h
>> +++ b/sysdeps/unix/sysv/linux/sh/sysdep.h
>> @@ -120,14 +120,13 @@
>>  #  endif
>>  #  define SYSCALL_ERROR_HANDLER \
>>  	neg r0,r1; \
>> -	mov r12,r2; \
>> -	mov.l 0f,r12; \
>> +	mov.l 0f,r2; \
>>  	mova 0f,r0; \
>> -	add r0,r12; \
>> +	add r0,r2; \
>>  	mov.l 1f,r0; \
>>  	stc gbr, r4; \
>> -	mov.l @(r0,r12),r0; \
>> -	mov r2,r12; \
>> +	add r0,r2; \
>> +	mov.l @r2,r0; \
>>  	add r4,r0; \
>>  	mov.l r1,@r0; \
>>  	bra .Lpseudo_end; \
>> @@ -139,13 +138,12 @@
>>  /* Store (-r0) into errno through the GOT.  */
>>  #  define SYSCALL_ERROR_HANDLER						      \
>>  	neg r0,r1; \
>> -	mov r12,r2; \
>> -	mov.l 0f,r12; \
>> +	mov.l 0f,r2; \
>>  	mova 0f,r0; \
>> -	add r0,r12; \
>> +	add r0,r2; \
>>  	mov.l 1f,r0; \
>> -	mov.l @(r0,r12),r0; \
>> -	mov r2,r12; \
>> +	add r0,r2; \
>> +	mov.l @r2,r0; \
>>  	mov.l r1,@r0; \
>>  	bra .Lpseudo_end; \
>>  	 mov _IMM1,r0; \
> 
> Hmm.  In my understanding, r12 is the standard register to use for
> accessing the GOT.  I think the linker is only able to optimize this
> exact sequences.  Also, isn't there another case for RTLD_PRIVATE_ERRNO?
> What about using Âcfi_register (r12, r2)Â, and later Âcfi_restore (r12)Â.

I think I simply thought that the modified sequence could avoid
clobbering r12, while also looks like to be a little bit faster; I
overlooked the point on linker optimizations.

I'll send some revised parts later.

Thanks,
Chung-Lin

Follow-Ups:
- Re: [PATCH][SH] CFI directives patch
  - From: Thomas Schwinge
- Re: [PATCH][SH] CFI directives patch
  - From: Thomas Schwinge

References:
- [PATCH][SH] CFI directives patch
  - From: Chung-Lin Tang
- Re: [PATCH][SH] CFI directives patch
  - From: Thomas Schwinge

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]