[PATCH v4] Make chained function calls in expressions work

Mon Nov 3 14:43:00 GMT 2014

Siva Chandra wrote:

>This is a follow up to the thread here:
>https://sourceware.org/ml/gdb-patches/2014-10/msg00000.html
>
>I have made all the suggested changes which now eliminates the need
>for two patches in this set. The single patch is attached.

Thanks!  Now that you've convinced me that we need to store temporaries,
there's still a number of issues about how to actually implement this ...

> struct value *
> evaluate_expression (struct expression *exp)
> {
>+  int i, pc = 0;
>+  struct value *res, *val;
>+  struct cleanup *cleanups;
>+  value_vec *vec = exp->on_stack_temporaries_vec;
>+
>+  cleanups = make_cleanup (VEC_cleanup (value_ptr),
>+			   &exp->on_stack_temporaries_vec);
>+  res = evaluate_subexp (NULL_TYPE, exp, &pc, EVAL_NORMAL);
>+  /* If the result is on the expression stack, fetch it and mark it as
>+     not_lval.  */
>+  for (i = 0; VEC_iterate (value_ptr, vec, 1, val); i++)
>+    {
>+      if (res == val)
>+	{
>+	  if (value_lazy (res))
>+	    value_fetch_lazy (res);
>+	  VALUE_LVAL (res) = not_lval;
>+	  break;
>+	}
>+    }
>+  do_cleanups (cleanups);

There's two issues I see with that:

The minor issue is that modifying a value's lval status by simply assigning
to VALUE_LVAL has been deprecated for a long time.  While this is another one
of those partial transformations in GDB code, it would still be better to
avoid making it worse; if we do need to modify a value's lval, it ought to
be done via a defined interface implemented in value.c (the implementation
can simply assign to ->lval direcly).  See also below ...

The more significant problem is that this code is unsafe as is, in the sense
that it may happen that a temporary is allocated on the stack, but the
cleanup above never runs for it.  The problem is that all the subroutines
of evaluate_subexp will pass an expression to call_function_by_hand, but
only if evaluate_subexp was called via evaluate_expression will the cleanup
be performed.

Unfortunately, there are several code paths where evaluate_subexp or one of
its subroutines is called directly.  For example, there's print_object_command
in objc-lang.c, there's fetch_subexp_value (used from watchpoint code), and
there is even stap_evaluate_probe_argument calling evaluate_subexp_standard
directly (I may have missed some further calls).

This is critical not just to make the sure the values are made non_lval
(on in the future, have destructors called), but also to make sure the
exp->on_stack_temporaries_vec field is cleared -- if we were to evalutate
that expression a second time, we'd have invalid values in that vector:
they might have already been deleted and we access invalid memory, or even
if not, the skip_current_expression_stack may come up with completely wrong
SP values (say, from a different thread's stack).

The question is how to the fix that.  One way would be to enfore a rule that
expression evaluation must go through one (or a set of) particular routines
and fix all callers that currently violate that rule.  (For example, one
could imagine doing the cleanup in evaluate_subexp whenever called with
pc == 0, and changing stap_evaluate_probe_argument to call evaluate_subexp
instead of evaluate_subexp_standard.)  There is a certain risk of future
changes re-introducing violations of that rule if it cannot be enforced
by the compiler ...

Another way might be to continue allowing calls into any subroutine of
expression evaluation, but set things up so that call_function_by_hand
will only allow creation of stack temporaries *if* the call came through
evaluate_expression.  This would require evaluate_expression to take
some action to enable temporary creation, which could e.g. take the
form of setting a flag in the expression struct, allocating some other
structure to hold temporary data and install it into a pointer in the
expression struct, or even storing the information elsewhere (like in
the thread struct).

>+/* Add value V to the expression stack of expression EXP.  */
>+
>+void
>+add_value_to_expression_stack (struct expression *exp, struct value *v)
>+{
>+  gdb_assert (exp != NULL);
>+  VEC_safe_push (value_ptr, exp->on_stack_temporaries_vec, v);
>+}
>+
>+/* Return an address after skipping over the current values on the expression
>+   stack of EXP.  SP is the current stack frame pointer.  Non-zero DOWNWARD
>+   indicates that the stack grows downwards/backwards.  */
> 
>+CORE_ADDR
>+skip_current_expression_stack (struct expression *exp, CORE_ADDR sp,
>+			       int downward)
>+{
>+  CORE_ADDR addr = sp;
>+
>+  gdb_assert (exp != NULL);
>+  if (!VEC_empty (value_ptr, exp->on_stack_temporaries_vec))
>+    {
>+      struct value *v = VEC_last (value_ptr, exp->on_stack_temporaries_vec);
>+      CORE_ADDR val_addr = value_address (v);
>+
>+      if (downward)
>+	{
>+	  gdb_assert (sp >= val_addr);
>+	  addr = val_addr;
>+	}
>+      else
>+	{
>+	  struct type *type;
>+
>+	  gdb_assert (sp <= val_addr);
>+	  type = value_type (v);
>+	  addr = val_addr + TYPE_LENGTH (type);
>+	}
>+    }
>+
>+  return addr;
> }

Maybe rename these to call out they are about on-stack temporaries
("expression stack" seems a bit vague).  In any case, depending on the
particular solution to the problem discussed above, there'll need to be
changes here as well.

>+/* Return true is T is a class or a union.  False otherwise.  */
>+
>+int
>+class_or_union_p (const struct type *t)
>+{
>+  return (TYPE_CODE (t) == TYPE_CODE_STRUCT
>+	  || TYPE_CODE (t) == TYPE_CODE_UNION);
>+}

I understand we need to do this for classes with member functions (so that
f().g() will work) -- do we really need it for classes without member
functions (or plain C structs)?

>+/* Write contents of V at ADDR.  Also, set lval_type of V to lval_memory.
>+   It is as error if V's lval_type is anything other than not_lval.  */
>+
>+void
>+write_value_to_memory (struct value *v, CORE_ADDR addr)
>+{
>+  gdb_assert (VALUE_LVAL (v) == not_lval);
>+
>+  write_memory (addr, value_contents_raw (v), TYPE_LENGTH (value_type (v)));
>+  VALUE_LVAL (v) = lval_memory;
>+  v->location.address = addr;
>+  v->lazy = 0;
>+}

I'd prefer a name more in line with the other value.c routines.  Maybe
something like value_force_lval.  (Could be paired with a value_force_non_lval
to implement the routine to be used from evaluate_expression, see above.)

Should assign to v->lval instead of using VALUE_LVAL on the left-hand side
of an assignment.

I'm not sure setting v->lazy to 0 is the right thing to do here.  (Also,
it must already be 0 here, otherwise accessing value_contents_raw would
have been an error.)  The watchpoint code will assume all nonlazy lval_memory
values, even intermediate results, need watching -- but a stack temporary
never should be watched, it'll be gone as soon as execution continues.

>@@ -532,6 +563,16 @@ call_function_by_hand (struct value *function, int nargs, struct value **args)
>   {
>     CORE_ADDR old_sp = get_frame_sp (frame);
> 
>+    /* Skip over the stack mirrors that might have been generated during the
>+       evaluation of the current expression.  */
>+    if (exp != NULL)
>+      {
>+	if (gdbarch_inner_than (gdbarch, 1, 2))
>+	  old_sp = skip_current_expression_stack (exp, old_sp, 1);
>+	else
>+	  old_sp = skip_current_expression_stack (exp, old_sp, 0);
>+      }
>+
>     if (gdbarch_frame_align_p (gdbarch))
>       {
> 	sp = gdbarch_frame_align (gdbarch, old_sp);

Skipping over existing temporaries *here* may cause the red zone to be added
a second time on platforms that use it.  This will not necessarily cause any
problems, but still seems wasteful.  Might be better to skip over temporaries
*after* the red zone is added, but then need to enforce alignment afterwards
again.

>@@ -1059,13 +1112,8 @@ When the function is done executing, GDB will silently stop."),
>        At this stage, leave the RETBUF alone.  */
>     restore_infcall_control_state (inf_status);
> 
>-    /* Figure out the value returned by the function.  */
>-    retval = allocate_value (values_type);
>-
>     if (hidden_first_param_p)
>-      read_value_memory (retval, 0, 1, struct_addr,
>-			 value_contents_raw (retval),
>-			 TYPE_LENGTH (values_type));
>+      retval = get_return_value_from_memory (values_type, struct_addr, exp);
>     else if (TYPE_CODE (target_values_type) != TYPE_CODE_VOID)
>       {
> 	/* If the function returns void, don't bother fetching the
>@@ -1076,16 +1124,30 @@ When the function is done executing, GDB will silently stop."),
> 	  case RETURN_VALUE_REGISTER_CONVENTION:
> 	  case RETURN_VALUE_ABI_RETURNS_ADDRESS:
> 	  case RETURN_VALUE_ABI_PRESERVES_ADDRESS:
>+	    retval = allocate_value (values_type);
> 	    gdbarch_return_value (gdbarch, function, values_type,
> 				  retbuf, value_contents_raw (retval), NULL);
>+	    if (exp != NULL && class_or_union_p (values_type))
>+	      {
>+		/* Values of class type returned in registers are copied onto
>+		   the stack and their lval_type set to lval_memory.  This is
>+		   required because further evaluation of the expression
>+		   could potentially invoke methods on the return value
>+		   requiring GDB to evaluate the "this" pointer.  To evaluate
>+		   the this pointer, GDB needs the memory address of the
>+		   value.  */
>+		write_value_to_memory (retval, struct_addr);
>+		add_value_to_expression_stack (exp, retval);
>+	      }
> 	    break;
> 	  case RETURN_VALUE_STRUCT_CONVENTION:
>-	    read_value_memory (retval, 0, 1, struct_addr,
>-			       value_contents_raw (retval),
>-			       TYPE_LENGTH (values_type));
>+	    retval = get_return_value_from_memory (values_type, struct_addr,
>+						   exp);
> 	    break;
> 	  }
>       }
>+    else
>+      retval = allocate_value (values_type);
> 
>     do_cleanups (retbuf_cleanup);

For RETURN_VALUE_ABI_RETURNS_ADDRESS and RETURN_VALUE_ABI_PRESERVES_ADDRESS,
we in fact have already allocated the return value on the stack (i.e.
struct_return will be true), so there is no need to write anything back
to the stack.  This didn't matter with current code, since for those two
scenarios, gdbarch_return_value will in effect to the same as the
read_value_memory call in the RETURN_VALUE_STRUCT_CONVENTION case, but
with your patch, it seems better to avoid the extra store.

Might be simplest to remove the switch (gdbarch_return_value) and just do

  if (TYPE_CODE (values_type) == TYPE_CODE_VOID)
    retval = allocate_value (values_type);
  else if (struct_return || hidden_first_param_p)
    retval = get_return_value_from_memory (values_type, struct_addr, exp);
  else
    {
      retval = allocate_value (values_type);
      gdbarch_return_value (gdbarch, function, values_type,
			    retbuf, value_contents_raw (retval), NULL);
      ...
    }

> 	  if (binop_user_defined_p (op, arg1, arg2))
>-	    res_val = value_x_binop (arg1, arg2, op, OP_NULL, EVAL_NORMAL);
>+	    {
>+	      res_val = value_x_binop (arg1, arg2, op, OP_NULL,
>+				       EVAL_NORMAL, NULL);
>+	    }

We don't need to add the braces here.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  Ulrich.Weigand@de.ibm.com