PATCH RFA: New option: -fno-toplevel-order

Ian Lance Taylor ian@airs.com
Mon Jan 9 04:48:00 GMT 2006


This patch introduces a new option -fno-toplevel-reorder.  This option
is only meaningful when in unit-at-a-time mode.  It causes gcc to
output toplevel functions, variables, and asm statements in the same
order in which they appear in the source file.  The implementation is
straightforward, and is loosely based on an old patch from Daniel
Jacobowitz.

My goal here is to separate the issue of eliminating no-unit-at-a-time
mode from concerns about usability.  The -fno-toplevel-reorder option
should work for the cases where people currently use
-fno-unit-at-a-time.  It won't be identical--in particular, gcc may
omit static functions which have been inlined everywhere--but it
should be possible for people to use new versions of gcc with
-fno-toplevel-reorder without major surgery to existing code.

We may still want to keep -fno-unit-at-a-time anyhow because it uses
less memory.  I don't have an opinion on that, although I note that
the Java frontend currently disables unit-at-a-time mode.  I just want
to make that discussion separate from the discussion of effects on
users other than compilation speed.

I tested this patch on i686-pc-linux-gnu without Ada.

1) I bootstrapped/compared the patched compiler.
2) I bootstrapped/compared patched compiler with BOOT_CFLAGS="-g -O2
   -fno-toplevel-reorder", which compiles the whole compiler with
   -fno-toplevel-reorder.
3) I ran the testsuite the usual way.
4) I ran the testsuite with --target_board=unix/-fno-toplevel-reorder,
   which runs all the tests with -fno-toplevel-reorder.

In step 4 a few tests failed as expected.  For example,
gcc.dg/varpool-1.c failed, because it tests that an unreferenced
static variable is eliminated, which is not true when
-fno-toplevel-reorder is used.

There were two tests which failed with -fno-toplevel-reorder, both C++
tests: g++.old-deja/g++.brendan/union1.C and
g++.old-deja/g++.other/anon2.C.  In both cases there was an assembler
error about an attempt to declare the same symbol more than once.  For
some reason two different decls get entered on the cgraph_varpool list
with the same assembler name.  I don't know why, though it is
evidently something the C++ frontend does with static anonymous
unions.  Without -fno-toplevel-reorder, one of the decls is discarded
as unreferenced.  This seems like a low priority issue which I propose
to handle in a followup.

This patch touches the C and C++ frontends, so I would need approval
for that.  In any case I would like to hear comments on it.

Ian


./ChangeLog:
2006-01-08  Ian Lance Taylor  <ian@airs.com>

	* common.opt (ftoplevel-reorder): New option.
	* cgraph.c (cgraph_asm_nodes): New global variable.
	(cgraph_asm_last_node): New static variable.
	(cgraph_order): New global variable.
	(cgraph_create_node): Set new order field.
	(cgraph_varpool_node): Likewise.
	(decide_is_variable_needed): Return true if not
	flag_toplevel_reorder.
	(cgraph_add_asm_node): New function.
	* cgraph.h (struct cgraph_node): Add order field.
	(struct cgraph_varpool_node): Add order field.
	(struct cgraph_asm_node): Define.
	(cgraph_asm_nodes, cgraph_order): Declare.
	(cgraph_add_asm_node): Declare.
	* cgraphunit.c (cgraph_varpool_assemble_decl): New static
	function.
	(cgraph_varpool_assemble_pending_decls): Call it.
	(cgraph_output_pending_asms): New static function.
	(cgraph_finalize_compilation_unit): Call it.
	(struct cgraph_order_sort): Define.
	(cgraph_output_in_order): New static function.
	(cgraph_optimize): Call cgraph_output_pending_asms.  Add code for
	!flag_toplevel_reorder case.
	* c-parser.c: Include "cgraph.h".
	(c_parser_asm_definition): Call cgraph_add_asm_node rather than
	assemble_asm.
	* Makefile.in (CRTSTUFF_CFLAGS): Use -fno-toplevel-reorder rather
	than -fno-unit-at-a-time.
	* doc/invoke.texi (Option Summary): Mention
	-fno-toplevel-reorder.
	(Optimize Options): Document -fno-toplevel-reorder.  Mention it in
	-funit-at-a-time documentation.

cp/ChangeLog:
2006-01-07  Ian Lance Taylor  <ian@airs.com>

	* parser.c: Include "cgraph.h".
	(cp_parser_asm_definition): Call cgraph_add_asm_node rather than
	assemble_asm.


Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 109407)
+++ doc/invoke.texi	(working copy)
@@ -319,7 +319,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-function-cse  -fno-guess-branch-probability @gol
 -fno-inline  -fno-math-errno  -fno-peephole  -fno-peephole2 @gol
 -funsafe-math-optimizations  -funsafe-loop-optimizations  -ffinite-math-only @gol
--fno-trapping-math  -fno-zero-initialized-in-bss @gol
+-fno-toplevel-reorder -fno-trapping-math  -fno-zero-initialized-in-bss @gol
 -fomit-frame-pointer  -foptimize-register-move @gol
 -foptimize-sibling-calls  -fprefetch-loop-arrays @gol
 -fprofile-generate -fprofile-use @gol
@@ -5335,14 +5335,16 @@ Enabled at levels @option{-O2}, @option{
 Parse the whole compilation unit before starting to produce code.
 This allows some extra optimizations to take place but consumes
 more memory (in general).  There are some compatibility issues
-with @emph{unit-at-at-time} mode:
+with @emph{unit-at-a-time} mode:
 @itemize @bullet
 @item
 enabling @emph{unit-at-a-time} mode may change the order
 in which functions, variables, and top-level @code{asm} statements
 are emitted, and will likely break code relying on some particular
 ordering.  The majority of such top-level @code{asm} statements,
-though, can be replaced by @code{section} attributes.
+though, can be replaced by @code{section} attributes.  The
+@option{fno-toplevel-reorder} option may be used to keep the ordering
+used in the input file, at the cost of some optimizations.
 
 @item
 @emph{unit-at-a-time} mode removes unreferenced static variables
@@ -5364,6 +5366,14 @@ but this scheme may not be supported by 
 
 Enabled at levels @option{-O2}, @option{-O3}.
 
+@item -fno-toplevel-reorder
+Do not reorder top-level functions, variables, and @code{asm}
+statements.  Output them in the same order that they appear in the
+input file.  When this option is used, unreferenced static variables
+will not be removed.  This option is intended to support existing code
+which relies on a particular ordering.  For new code, it is better to
+use attributes.
+
 @item -fweb
 @opindex fweb
 Constructs webs as commonly used for register allocation purposes and assign
Index: cgraph.c
===================================================================
--- cgraph.c	(revision 109407)
+++ cgraph.c	(working copy)
@@ -131,13 +131,23 @@ static GTY((param_is (struct cgraph_varp
 /* Queue of cgraph nodes scheduled to be lowered and output.  */
 struct cgraph_varpool_node *cgraph_varpool_nodes_queue, *cgraph_varpool_first_unanalyzed_node;
 
-
 /* The linked list of cgraph varpool nodes.  */
 static GTY(()) struct cgraph_varpool_node *cgraph_varpool_nodes;
 
 /* End of the varpool queue.  Needs to be QTYed to work with PCH.  */
 static GTY(()) struct cgraph_varpool_node *cgraph_varpool_last_needed_node;
 
+/* Linked list of cgraph asm nodes.  */
+struct cgraph_asm_node *cgraph_asm_nodes;
+
+/* Last node in cgraph_asm_nodes.  */
+static GTY(()) struct cgraph_asm_node *cgraph_asm_last_node;
+
+/* The order index of the next cgraph node to be created.  This is
+   used so that we can sort the cgraph nodes in order by when we saw
+   them, to support -fno-toplevel-reorder.  */
+int cgraph_order;
+
 static hashval_t hash_node (const void *);
 static int eq_node (const void *, const void *);
 
@@ -169,6 +179,7 @@ cgraph_create_node (void)
   node = GGC_CNEW (struct cgraph_node);
   node->next = cgraph_nodes;
   node->uid = cgraph_max_uid++;
+  node->order = cgraph_order++;
   if (cgraph_nodes)
     cgraph_nodes->previous = node;
   node->previous = NULL;
@@ -732,6 +743,7 @@ cgraph_varpool_node (tree decl)
     return *slot;
   node = GGC_CNEW (struct cgraph_varpool_node);
   node->decl = decl;
+  node->order = cgraph_order++;
   node->next = cgraph_varpool_nodes;
   cgraph_varpool_nodes = node;
   *slot = node;
@@ -834,12 +846,11 @@ decide_is_variable_needed (struct cgraph
   if (TREE_PUBLIC (decl) && !DECL_COMDAT (decl) && !DECL_EXTERNAL (decl))
     return true;
 
-  if (flag_unit_at_a_time)
+  /* When not reordering top level variables, we have to assume that
+     we are going to keep everything.  */
+  if (flag_unit_at_a_time && flag_toplevel_reorder)
     return false;
 
-  /* If not doing unit at a time, then we'll only defer this function
-     if its marked for inlining.  Otherwise we want to emit it now.  */
-
   /* We want to emit COMDAT variables only when absolutely necessary.  */
   if (DECL_COMDAT (decl))
     return false;
@@ -876,6 +887,25 @@ cgraph_varpool_finalize_decl (tree decl)
     cgraph_varpool_assemble_pending_decls ();
 }
 
+/* Add a top-level asm statement to the list.  */
+
+struct cgraph_asm_node *
+cgraph_add_asm_node (tree asm_str)
+{
+  struct cgraph_asm_node *node;
+
+  node = GGC_CNEW (struct cgraph_asm_node);
+  node->asm_str = asm_str;
+  node->order = cgraph_order++;
+  node->next = NULL;
+  if (cgraph_asm_nodes == NULL)
+    cgraph_asm_nodes = node;
+  else
+    cgraph_asm_last_node->next = node;
+  cgraph_asm_last_node = node;
+  return node;
+}
+
 /* Return true when the DECL can possibly be inlined.  */
 bool
 cgraph_function_possibly_inlined_p (tree decl)
Index: cgraph.h
===================================================================
--- cgraph.h	(revision 109407)
+++ cgraph.h	(working copy)
@@ -144,6 +144,8 @@ struct cgraph_node GTY((chain_next ("%h.
   gcov_type count;
   /* Unique id of the node.  */
   int uid;
+  /* Ordering of all cgraph nodes.  */
+  int order;
   /* Set when function must be output - it is externally visible
      or its address is taken.  */
   bool needed;
@@ -192,6 +194,8 @@ struct cgraph_varpool_node GTY(())
   struct cgraph_varpool_node *next;
   /* Pointer to the next function in cgraph_varpool_nodes_queue.  */
   struct cgraph_varpool_node *next_needed;
+  /* Ordering of all cgraph nodes.  */
+  int order;
 
   /* Set when function must be output - it is externally visible
      or its address is taken.  */
@@ -212,6 +216,18 @@ struct cgraph_varpool_node GTY(())
   bool alias;
 };
 
+/* Every top level asm statement is put into a cgraph_asm_node.  */
+
+struct cgraph_asm_node GTY(())
+{
+  /* Next asm node.  */
+  struct cgraph_asm_node *next;
+  /* String for this asm node.  */
+  tree asm_str;
+  /* Ordering of all cgraph nodes.  */
+  int order;
+};
+
 extern GTY(()) struct cgraph_node *cgraph_nodes;
 extern GTY(()) int cgraph_n_nodes;
 extern GTY(()) int cgraph_max_uid;
@@ -221,6 +237,8 @@ extern GTY(()) struct cgraph_node *cgrap
 
 extern GTY(()) struct cgraph_varpool_node *cgraph_varpool_first_unanalyzed_node;
 extern GTY(()) struct cgraph_varpool_node *cgraph_varpool_nodes_queue;
+extern GTY(()) struct cgraph_asm_node *cgraph_asm_nodes;
+extern GTY(()) int cgraph_order;
 
 /* In cgraph.c  */
 void dump_cgraph (FILE *);
@@ -252,6 +270,8 @@ void cgraph_varpool_mark_needed_node (st
 void cgraph_varpool_finalize_decl (tree);
 void cgraph_redirect_edge_callee (struct cgraph_edge *, struct cgraph_node *);
 
+struct cgraph_asm_node *cgraph_add_asm_node (tree);
+
 bool cgraph_function_possibly_inlined_p (tree);
 void cgraph_unnest_node (struct cgraph_node *);
 void cgraph_varpool_enqueue_needed_node (struct cgraph_varpool_node *);
Index: cgraphunit.c
===================================================================
--- cgraphunit.c	(revision 109407)
+++ cgraphunit.c	(working copy)
@@ -806,6 +806,31 @@ verify_cgraph (void)
     verify_cgraph_node (node);
 }
 
+/* Output one variable, if necessary.  Return whether we output it.  */
+static bool
+cgraph_varpool_assemble_decl (struct cgraph_varpool_node *node)
+{
+  tree decl = node->decl;
+
+  if (!TREE_ASM_WRITTEN (decl) && !node->alias && !DECL_EXTERNAL (decl))
+    {
+      assemble_variable (decl, 0, 1, 0);
+      /* Local static variables are never seen by check_global_declarations
+	 so we need to output debug info by hand.  */
+      if (DECL_CONTEXT (decl) 
+	  && (TREE_CODE (DECL_CONTEXT (decl)) == BLOCK
+	      || TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)
+	  && errorcount == 0 && sorrycount == 0)
+	{
+	  timevar_push (TV_SYMOUT);
+	  (*debug_hooks->global_decl) (decl);
+	  timevar_pop (TV_SYMOUT);
+	}
+      return true;
+    }
+
+  return false;
+}
 
 /* Output all variables enqueued to be assembled.  */
 bool
@@ -823,31 +848,31 @@ cgraph_varpool_assemble_pending_decls (v
 
   while (cgraph_varpool_nodes_queue)
     {
-      tree decl = cgraph_varpool_nodes_queue->decl;
       struct cgraph_varpool_node *node = cgraph_varpool_nodes_queue;
 
       cgraph_varpool_nodes_queue = cgraph_varpool_nodes_queue->next_needed;
-      if (!TREE_ASM_WRITTEN (decl) && !node->alias && !DECL_EXTERNAL (decl))
-	{
-	  assemble_variable (decl, 0, 1, 0);
-	  /* Local static variables are never seen by check_global_declarations
-	     so we need to output debug info by hand.  */
-	  if (DECL_CONTEXT (decl) 
-	      && (TREE_CODE (DECL_CONTEXT (decl)) == BLOCK
-	          || TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)
-	      && errorcount == 0 && sorrycount == 0)
-	    {
-	      timevar_push (TV_SYMOUT);
-	      (*debug_hooks->global_decl) (decl);
-	      timevar_pop (TV_SYMOUT);
-	    }
-	  changed = true;
-	}
+      if (cgraph_varpool_assemble_decl (node))
+	changed = true;
       node->next_needed = NULL;
     }
   return changed;
 }
 
+/* Output all asm statements we have stored up to be output.  */
+
+static void
+cgraph_output_pending_asms (void)
+{
+  struct cgraph_asm_node *can;
+
+  if (errorcount || sorrycount)
+    return;
+
+  for (can = cgraph_asm_nodes; can; can = can->next)
+    assemble_asm (can->asm_str);
+  cgraph_asm_nodes = NULL;
+}
+
 /* Analyze the function scheduled to be output.  */
 static void
 cgraph_analyze_function (struct cgraph_node *node)
@@ -891,6 +916,7 @@ cgraph_finalize_compilation_unit (void)
 
   if (!flag_unit_at_a_time)
     {
+      cgraph_output_pending_asms ();
       cgraph_assemble_pending_functions ();
       return;
     }
@@ -1124,6 +1150,97 @@ cgraph_expand_all_functions (void)
   free (order);
 }
 
+/* This is used to sort the node types by the cgraph order number.  */
+
+struct cgraph_order_sort
+{
+  enum { ORDER_UNDEFINED = 0, ORDER_FUNCTION, ORDER_VAR, ORDER_ASM } kind;
+  union
+  {
+    struct cgraph_node *f;
+    struct cgraph_varpool_node *v;
+    struct cgraph_asm_node *a;
+  } u;
+};
+
+/* Output all functions, variables, and asm statements in the order
+   according to their order fields, which is the order in which they
+   appeared in the file.  This implements -fno-toplevel-reorder.  In
+   this mode we may output functions and variables which don't really
+   need to be output.  */
+
+static void
+cgraph_output_in_order (void)
+{
+  int max;
+  size_t size;
+  struct cgraph_order_sort *nodes;
+  int i;
+  struct cgraph_node *pf;
+  struct cgraph_varpool_node *pv;
+  struct cgraph_asm_node *pa;
+
+  max = cgraph_order;
+  size = max * sizeof (struct cgraph_order_sort);
+  nodes = (struct cgraph_order_sort *) alloca (size);
+  memset (nodes, 0, size);
+
+  cgraph_varpool_analyze_pending_decls ();
+
+  for (pf = cgraph_nodes; pf; pf = pf->next)
+    {
+      i = pf->order;
+      gcc_assert (nodes[i].kind == ORDER_UNDEFINED);
+      nodes[i].kind = ORDER_FUNCTION;
+      nodes[i].u.f = pf;
+    }
+
+  for (pv = cgraph_varpool_nodes_queue; pv; pv = pv->next_needed)
+    {
+      i = pv->order;
+      gcc_assert (nodes[i].kind == ORDER_UNDEFINED);
+      nodes[i].kind = ORDER_VAR;
+      nodes[i].u.v = pv;
+    }
+
+  for (pa = cgraph_asm_nodes; pa; pa = pa->next)
+    {
+      i = pa->order;
+      gcc_assert (nodes[i].kind == ORDER_UNDEFINED);
+      nodes[i].kind = ORDER_ASM;
+      nodes[i].u.a = pa;
+    }
+  cgraph_asm_nodes = NULL;
+
+  for (i = 0; i < max; ++i)
+    {
+      switch (nodes[i].kind)
+	{
+	case ORDER_FUNCTION:
+	  if (nodes[i].u.f->output)
+	    {
+	      nodes[i].u.f->output = 0;
+	      cgraph_expand_function (nodes[i].u.f);
+	    }
+	  break;
+
+	case ORDER_VAR:
+	  cgraph_varpool_assemble_decl (nodes[i].u.v);
+	  break;
+
+	case ORDER_ASM:
+	  assemble_asm (nodes[i].u.a->asm_str);
+	  break;
+
+	case ORDER_UNDEFINED:
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	}
+    }
+}
+
 /* Mark visibility of all functions.
    
    A local function is one whose calls can occur only in the current
@@ -1234,6 +1351,7 @@ cgraph_optimize (void)
 #endif
   if (!flag_unit_at_a_time)
     {
+      cgraph_output_pending_asms ();
       cgraph_varpool_assemble_pending_decls ();
       return;
     }
@@ -1273,12 +1391,20 @@ cgraph_optimize (void)
 #ifdef ENABLE_CHECKING
   verify_cgraph ();
 #endif
-  
+
   cgraph_mark_functions_to_output ();
-  cgraph_expand_all_functions ();
-  cgraph_varpool_remove_unreferenced_decls ();
 
-  cgraph_varpool_assemble_pending_decls ();
+  if (!flag_toplevel_reorder)
+    cgraph_output_in_order ();
+  else
+    {
+      cgraph_output_pending_asms ();
+
+      cgraph_expand_all_functions ();
+      cgraph_varpool_remove_unreferenced_decls ();
+
+      cgraph_varpool_assemble_pending_decls ();
+    }
 
   if (cgraph_dump_file)
     {
Index: cp/parser.c
===================================================================
--- cp/parser.c	(revision 109407)
+++ cp/parser.c	(working copy)
@@ -36,6 +36,7 @@
 #include "toplev.h"
 #include "output.h"
 #include "target.h"
+#include "cgraph.h"
 #include "c-common.h"
 
 
@@ -10740,7 +10741,7 @@ cp_parser_asm_definition (cp_parser* par
 	}
     }
   else
-    assemble_asm (string);
+    cgraph_add_asm_node (string);
 }
 
 /* Declarators [gram.dcl.decl] */
Index: common.opt
===================================================================
--- common.opt	(revision 109407)
+++ common.opt	(working copy)
@@ -864,6 +864,10 @@ ftls-model=
 Common Joined RejectNegative
 -ftls-model=[global-dynamic|local-dynamic|initial-exec|local-exec]	Set the default thread-local storage code generation model
 
+ftoplevel-reorder
+Common Report Var(flag_toplevel_reorder) Init(1)
+Reorder top level functions, variables, and asms
+
 ftracer
 Common Report Var(flag_tracer)
 Perform superblock formation via tail duplication
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 109407)
+++ Makefile.in	(working copy)
@@ -554,7 +554,7 @@ TARGET_LIBGCC2_CFLAGS =
 # Options to use when compiling crtbegin/end.
 CRTSTUFF_CFLAGS = -O2 $(GCC_CFLAGS) $(INCLUDES) $(MULTILIB_CFLAGS) -g0 \
   -finhibit-size-directive -fno-inline-functions -fno-exceptions \
-  -fno-zero-initialized-in-bss -fno-unit-at-a-time \
+  -fno-zero-initialized-in-bss -fno-toplevel-reorder \
   $(INHIBIT_LIBC_CFLAGS)
 
 # Additional sources to handle exceptions; overridden by targets as needed.
Index: c-parser.c
===================================================================
--- c-parser.c	(revision 109407)
+++ c-parser.c	(working copy)
@@ -56,6 +56,7 @@ Software Foundation, 51 Franklin Street,
 #include "c-common.h"
 #include "vec.h"
 #include "target.h"
+#include "cgraph.h"
 
 
 /* Miscellaneous data and functions needed for the parser.  */
@@ -1387,12 +1388,8 @@ static void
 c_parser_asm_definition (c_parser *parser)
 {
   tree asm_str = c_parser_simple_asm_expr (parser);
-  /* ??? This only works sensibly in the presence of
-     -fno-unit-at-a-time; file-scope asms really need to be passed to
-     cgraph which needs to preserve the order of functions and
-     file-scope asms.  */
   if (asm_str)
-    assemble_asm (asm_str);
+    cgraph_add_asm_node (asm_str);
   c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>");
 }
 



More information about the Gcc-patches mailing list