[PATCH v2] gdbsupport: add path_join function

Simon Marchi simon.marchi@polymtl.ca
Wed Apr 20 00:20:05 GMT 2022


From: Simon Marchi <simon.marchi@efficios.com>

New in v2:

  - Add some Windows tests.  I didn't add much because I don't know what
    kind of corner cases is important to test.  It would take someone
    who knows and cares about Windows to improve them, if need be.
  - Change the behavior to look more like concatenation of
    std::filesystem::path objects (from C++17).  Our implementation
    isn't as complete as the std::filesystem one though.  If we need
    something that is even closer to std::filesystem::path, perhaps we
    can make a local import of the libstdc++ implementation, like we did
    for string_view, rather than trying to re-implement it.
  - Replace the variadic function with a templated function.

In this review [1], Eli pointed out that we should be careful when
concatenating file names to avoid duplicated slashes.  On Windows, a
double slash at the beginning of a file path has a special meaning.  So
naively concatenating "/"  and "foo/bar" would give "//foo/bar", which
would not give the desired results.  We already have a few spots doing:

  if (first_path ends with a slash)
    path = first_path + second_path
  else
    path = first_path + slash + second_path

In general, I think it's nice to avoid superfluous slashes in file
paths, since they might end up visible to the user and look a bit
unprofessional.

Introduce the path_join function that can be used to join multiple path
components together (along with unit tests).

The behavior is based on what concatenating std::filesystem::path
objects (available in C++17) would do.  With this behavior, joining with
an absolute path on the right hand side, like this:

  path_join ("/foo", "/bar");

results in just the right hand side being kept.  In this example, the
result is "/bar".

I initially wanted to make it possible to join two absolute paths, to
support the use case of prepending a sysroot path to a target file path,
or the prepending the debug-file-directory to a target file path.  But
the code in solib_find_1 shows that it is more complex than this anyway
(for example, when the right hand side is a Windows path with a drive
letter).  So I don't think we need to support that case in path_join.

Change a few spots to use path_join to show how it can be used.  I
believe that all the spots I changed are guarded by some checks that
ensure the right hand side operand is not an absolute path.  So we
shouldn't fall into the case described just above.

Regression-tested on Ubuntu 18.04.  Built-tested on Windows, and I also
ran the new unit-test there.

[1] https://sourceware.org/pipermail/gdb-patches/2022-April/187559.html

Change-Id: I0df889f7e3f644e045f42ff429277b732eb6c752
---
 gdb/Makefile.in                     |  1 +
 gdb/buildsym.c                      |  5 +-
 gdb/dwarf2/line-header.c            |  3 +-
 gdb/dwarf2/read.c                   | 42 ++++++-------
 gdb/macrotab.c                      |  3 +-
 gdb/unittests/path-join-selftests.c | 92 +++++++++++++++++++++++++++++
 gdbsupport/gdb_string_view.h        |  2 +-
 gdbsupport/pathstuff.cc             | 46 ++++++++++-----
 gdbsupport/pathstuff.h              | 24 ++++++++
 9 files changed, 175 insertions(+), 43 deletions(-)
 create mode 100644 gdb/unittests/path-join-selftests.c

diff --git a/gdb/Makefile.in b/gdb/Makefile.in
index c8e140b55449..ec0e55dd803c 100644
--- a/gdb/Makefile.in
+++ b/gdb/Makefile.in
@@ -465,6 +465,7 @@ SELFTESTS_SRCS = \
 	unittests/optional-selftests.c \
 	unittests/parallel-for-selftests.c \
 	unittests/parse-connection-spec-selftests.c \
+	unittests/path-join-selftests.c \
 	unittests/ptid-selftests.c \
 	unittests/main-thread-selftests.c \
 	unittests/mkdir-recursive-selftests.c \
diff --git a/gdb/buildsym.c b/gdb/buildsym.c
index 628903d674f4..b0d794409876 100644
--- a/gdb/buildsym.c
+++ b/gdb/buildsym.c
@@ -20,6 +20,7 @@
 #include "buildsym-legacy.h"
 #include "bfd.h"
 #include "gdbsupport/gdb_obstack.h"
+#include "gdbsupport/pathstuff.h"
 #include "symtab.h"
 #include "symfile.h"
 #include "objfiles.h"
@@ -507,8 +508,8 @@ buildsym_compunit::start_subfile (const char *name)
 	  && !IS_ABSOLUTE_PATH (subfile->name)
 	  && !m_comp_dir.empty ())
 	{
-	  subfile_name_holder = string_printf ("%s/%s", m_comp_dir.c_str (),
-					       subfile->name.c_str ());
+	  subfile_name_holder = path_join (m_comp_dir.c_str (),
+					   subfile->name.c_str ());
 	  subfile_name = subfile_name_holder.c_str ();
 	}
       else
diff --git a/gdb/dwarf2/line-header.c b/gdb/dwarf2/line-header.c
index 77e7e553b21e..8f4eb4f124cf 100644
--- a/gdb/dwarf2/line-header.c
+++ b/gdb/dwarf2/line-header.c
@@ -24,6 +24,7 @@
 #include "dwarf2/read.h"
 #include "complaints.h"
 #include "filenames.h"
+#include "gdbsupport/pathstuff.h"
 
 void
 line_header::add_include_dir (const char *include_dir)
@@ -73,7 +74,7 @@ line_header::file_file_name (int file) const
 	{
 	  const char *dir = fe->include_dir (this);
 	  if (dir != NULL)
-	    return string_printf ("%s/%s", dir, fe->name);
+	    return path_join (dir, fe->name);
 	}
 
       return fe->name;
diff --git a/gdb/dwarf2/read.c b/gdb/dwarf2/read.c
index 6dcd446e5f46..1fd39e903dc0 100644
--- a/gdb/dwarf2/read.c
+++ b/gdb/dwarf2/read.c
@@ -1343,7 +1343,7 @@ static const char *compute_include_file_name
      (const struct line_header *lh,
       const file_entry &fe,
       const file_and_directory &cu_info,
-      gdb::unique_xmalloc_ptr<char> *name_holder);
+      std::string &name_holder);
 
 static htab_up allocate_signatured_type_table ();
 
@@ -2809,9 +2809,9 @@ dw2_get_file_names_reader (const struct die_reader_specs *reader,
     {
       for (const auto &entry : lh->file_names ())
 	{
-	  gdb::unique_xmalloc_ptr<char> name_holder;
+	  std::string name_holder;
 	  const char *include_name =
-	    compute_include_file_name (lh.get (), entry, fnd, &name_holder);
+	    compute_include_file_name (lh.get (), entry, fnd, name_holder);
 	  if (include_name != nullptr)
 	    {
 	      include_name = per_objfile->objfile->intern (include_name);
@@ -11166,14 +11166,13 @@ open_dwo_file (dwarf2_per_objfile *per_objfile,
 
   if (comp_dir != NULL)
     {
-      gdb::unique_xmalloc_ptr<char> path_to_try
-	(concat (comp_dir, SLASH_STRING, file_name, (char *) NULL));
+      std::string path_to_try = path_join (comp_dir, file_name);
 
       /* NOTE: If comp_dir is a relative path, this will also try the
 	 search path, which seems useful.  */
-      gdb_bfd_ref_ptr abfd (try_open_dwop_file (per_objfile, path_to_try.get (),
-						0 /*is_dwp*/,
-						1 /*search_cwd*/));
+      gdb_bfd_ref_ptr abfd (try_open_dwop_file
+	(per_objfile, path_to_try.c_str (), 0 /*is_dwp*/, 1 /*search_cwd*/));
+
       if (abfd != NULL)
 	return abfd;
     }
@@ -19751,14 +19750,14 @@ dwarf_decode_line_header (sect_offset sect_off, struct dwarf2_cu *cu)
 static const char *
 compute_include_file_name (const struct line_header *lh, const file_entry &fe,
 			   const file_and_directory &cu_info,
-			   gdb::unique_xmalloc_ptr<char> *name_holder)
+			   std::string &name_holder)
 {
   const char *include_name = fe.name;
   const char *include_name_to_compare = include_name;
 
   const char *dir_name = fe.include_dir (lh);
 
-  gdb::unique_xmalloc_ptr<char> hold_compare;
+  std::string hold_compare;
   if (!IS_ABSOLUTE_PATH (include_name)
       && (dir_name != nullptr || cu_info.get_comp_dir () != nullptr))
     {
@@ -19785,27 +19784,24 @@ compute_include_file_name (const struct line_header *lh, const file_entry &fe,
 
       if (dir_name != NULL)
 	{
-	  name_holder->reset (concat (dir_name, SLASH_STRING,
-				      include_name, (char *) NULL));
-	  include_name = name_holder->get ();
+	  name_holder = path_join (dir_name, include_name);
+	  include_name = name_holder.c_str ();
 	  include_name_to_compare = include_name;
 	}
       if (!IS_ABSOLUTE_PATH (include_name)
 	  && cu_info.get_comp_dir () != nullptr)
 	{
-	  hold_compare.reset (concat (cu_info.get_comp_dir (), SLASH_STRING,
-				      include_name, (char *) NULL));
-	  include_name_to_compare = hold_compare.get ();
+	  hold_compare = path_join (cu_info.get_comp_dir (), include_name);
+	  include_name_to_compare = hold_compare.c_str ();
 	}
     }
 
-  gdb::unique_xmalloc_ptr<char> copied_name;
+  std::string copied_name;
   const char *cu_filename = cu_info.get_name ();
   if (!IS_ABSOLUTE_PATH (cu_filename) && cu_info.get_comp_dir () != nullptr)
     {
-      copied_name.reset (concat (cu_info.get_comp_dir (), SLASH_STRING,
-				 cu_filename, (char *) NULL));
-      cu_filename = copied_name.get ();
+      copied_name = path_join (cu_info.get_comp_dir (), cu_filename);
+      cu_filename = copied_name.c_str ();
     }
 
   if (FILENAME_CMP (include_name_to_compare, cu_filename) == 0)
@@ -20540,7 +20536,7 @@ static void
 dwarf2_start_subfile (struct dwarf2_cu *cu, const char *filename,
 		      const char *dirname)
 {
-  gdb::unique_xmalloc_ptr<char> copy;
+  std::string copy;
 
   /* In order not to lose the line information directory,
      we concatenate it to the filename when it makes sense.
@@ -20551,8 +20547,8 @@ dwarf2_start_subfile (struct dwarf2_cu *cu, const char *filename,
 
   if (!IS_ABSOLUTE_PATH (filename) && dirname != NULL)
     {
-      copy.reset (concat (dirname, SLASH_STRING, filename, (char *) NULL));
-      filename = copy.get ();
+      copy = path_join (dirname, filename);
+      filename = copy.c_str ();
     }
 
   cu->get_builder ()->start_subfile (filename);
diff --git a/gdb/macrotab.c b/gdb/macrotab.c
index 7ca3f312d330..108e6f1bbaed 100644
--- a/gdb/macrotab.c
+++ b/gdb/macrotab.c
@@ -19,6 +19,7 @@
 
 #include "defs.h"
 #include "gdbsupport/gdb_obstack.h"
+#include "gdbsupport/pathstuff.h"
 #include "splay-tree.h"
 #include "filenames.h"
 #include "symtab.h"
@@ -1071,5 +1072,5 @@ macro_source_fullname (struct macro_source_file *file)
   if (comp_dir == NULL || IS_ABSOLUTE_PATH (file->filename))
     return file->filename;
 
-  return std::string (comp_dir) + SLASH_STRING + file->filename;
+  return path_join (comp_dir, file->filename);
 }
diff --git a/gdb/unittests/path-join-selftests.c b/gdb/unittests/path-join-selftests.c
new file mode 100644
index 000000000000..c43e78b7144e
--- /dev/null
+++ b/gdb/unittests/path-join-selftests.c
@@ -0,0 +1,92 @@
+/* Self tests for path_join for GDB, the GNU debugger.
+
+   Copyright (C) 2022 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include "defs.h"
+#include "gdbsupport/pathstuff.h"
+#include "gdbsupport/selftest.h"
+
+#if __cplusplus >= 201703L
+#include <filesystem>
+#endif
+
+namespace selftests {
+namespace path_join {
+
+template <typename ...Args>
+static void
+test_one (const char *expected, Args... paths)
+{
+  std::string actual = ::path_join (paths...);
+  SELF_CHECK (actual == expected);
+
+  /* If building with C++17 or later, verify that path_join gives the same
+     result as concatenating with std::filesystem::path.  */
+#if __cplusplus >= 201703L
+  std::filesystem::path std_path;
+
+  for (const char *p : {paths...})
+    std_path /= p;
+
+  SELF_CHECK (std_path == actual);
+#endif
+}
+
+/* Test path_join.  */
+
+static void
+test ()
+{
+  test_one ("/foo/bar", "/foo", "bar");
+  test_one ("/bar", "/foo", "/bar");
+  test_one ("/bar", "/", "bar");
+  test_one ("/bar", "/", "/bar");
+  test_one ("/bar", "foo/", "/bar");
+  test_one ("foo/bar/", "foo", "bar", "");
+  test_one ("/foo", "", "/foo");
+  test_one ("foo", "", "foo");
+  test_one ("foo/bar", "foo", "", "bar");
+  test_one ("foo/", "foo", "");
+  test_one ("foo/", "foo/", "");
+
+  test_one ("D:/foo/bar", "D:/foo", "bar");
+  test_one ("D:/foo/bar", "D:/foo/", "bar");
+
+#if defined(_WIN32)
+  /* The current implementation doesn't recognize backslashes as directory
+     separators on Unix-like systems, so only run these tests on Windows.  If
+     we ever switch our implementation to use std::filesystem::path, they
+     should work anywhere, in theory.  */
+  test_one ("D:\\foo/bar", "D:\\foo", "bar");
+  test_one ("D:\\foo\\bar", "D:\\foo\\", "bar");
+  test_one ("\\\\server\\dir\\file", "\\\\server\\dir\\", "file");
+  test_one ("\\\\server\\dir/file", "\\\\server\\dir", "file");
+#endif /* _WIN32 */
+}
+
+}
+}
+
+void _initialize_path_join_selftests ();
+void
+_initialize_path_join_selftests ()
+{
+  selftests::register_test ("path_join",
+			    selftests::path_join::test);
+}
+
diff --git a/gdbsupport/gdb_string_view.h b/gdbsupport/gdb_string_view.h
index f8150288ee57..4e26bd8ff2c1 100644
--- a/gdbsupport/gdb_string_view.h
+++ b/gdbsupport/gdb_string_view.h
@@ -34,7 +34,7 @@
 //
 
 
-#if __cplusplus >= 201703L
+#if __cplusplus >= 201703L && 0
 
 #include <string_view>
 
diff --git a/gdbsupport/pathstuff.cc b/gdbsupport/pathstuff.cc
index ac65651d4492..d80d54740ee4 100644
--- a/gdbsupport/pathstuff.cc
+++ b/gdbsupport/pathstuff.cc
@@ -119,10 +119,7 @@ gdb_realpath_keepfile (const char *filename)
      directory separator, avoid doubling it.  */
   gdb::unique_xmalloc_ptr<char> path_storage = gdb_realpath (dir_name);
   const char *real_path = path_storage.get ();
-  if (IS_DIR_SEPARATOR (real_path[strlen (real_path) - 1]))
-    return string_printf ("%s%s", real_path, base_name);
-  else
-    return string_printf ("%s/%s", real_path, base_name);
+  return path_join (real_path, base_name);
 }
 
 /* See gdbsupport/pathstuff.h.  */
@@ -138,12 +135,7 @@ gdb_abspath (const char *path)
   if (IS_ABSOLUTE_PATH (path) || current_directory == NULL)
     return path;
 
-  /* Beware the // my son, the Emacs barfs, the botch that catch...  */
-  return string_printf
-    ("%s%s%s", current_directory,
-     (IS_DIR_SEPARATOR (current_directory[strlen (current_directory) - 1])
-      ? "" : SLASH_STRING),
-     path);
+  return path_join (current_directory, path);
 }
 
 /* See gdbsupport/pathstuff.h.  */
@@ -198,6 +190,30 @@ child_path (const char *parent, const char *child)
 
 /* See gdbsupport/pathstuff.h.  */
 
+std::string
+path_join (gdb::array_view<const gdb::string_view> paths)
+{
+  std::string ret;
+
+  for (gdb::string_view path : paths)
+    {
+      if (!path.empty () && IS_ABSOLUTE_PATH (path))
+	{
+	  ret = to_string (path);
+	  continue;
+	}
+
+      if (!ret.empty () && !IS_DIR_SEPARATOR (ret.back ()))
+	  ret += '/';
+
+      ret.append (path.data (), path.length ());
+    }
+
+  return ret;
+}
+
+/* See gdbsupport/pathstuff.h.  */
+
 bool
 contains_dir_separator (const char *path)
 {
@@ -227,7 +243,7 @@ get_standard_cache_dir ()
     {
       /* Make sure the path is absolute and tilde-expanded.  */
       std::string abs = gdb_abspath (xdg_cache_home);
-      return string_printf ("%s/gdb", abs.c_str ());
+      return path_join (abs.c_str (), "gdb");
     }
 #endif
 
@@ -236,7 +252,7 @@ get_standard_cache_dir ()
     {
       /* Make sure the path is absolute and tilde-expanded.  */
       std::string abs = gdb_abspath (home);
-      return string_printf ("%s/" HOME_CACHE_DIR "/gdb", abs.c_str ());
+      return path_join (abs.c_str (), HOME_CACHE_DIR,  "gdb");
     }
 
 #ifdef WIN32
@@ -245,7 +261,7 @@ get_standard_cache_dir ()
     {
       /* Make sure the path is absolute and tilde-expanded.  */
       std::string abs = gdb_abspath (win_home);
-      return string_printf ("%s/gdb", abs.c_str ());
+      return path_join (abs.c_str (), "gdb", nullptr);
     }
 #endif
 
@@ -294,7 +310,7 @@ get_standard_config_dir ()
     {
       /* Make sure the path is absolute and tilde-expanded.  */
       std::string abs = gdb_abspath (xdg_config_home);
-      return string_printf ("%s/gdb", abs.c_str ());
+      return path_join (abs.c_str (), "gdb");
     }
 #endif
 
@@ -303,7 +319,7 @@ get_standard_config_dir ()
     {
       /* Make sure the path is absolute and tilde-expanded.  */
       std::string abs = gdb_abspath (home);
-      return string_printf ("%s/" HOME_CONFIG_DIR "/gdb", abs.c_str ());
+      return path_join (abs.c_str (), HOME_CONFIG_DIR, "gdb");
     }
 
   return {};
diff --git a/gdbsupport/pathstuff.h b/gdbsupport/pathstuff.h
index f6c51e9de887..2e5893466c67 100644
--- a/gdbsupport/pathstuff.h
+++ b/gdbsupport/pathstuff.h
@@ -21,6 +21,7 @@
 #define COMMON_PATHSTUFF_H
 
 #include "gdbsupport/byte-vector.h"
+#include "gdbsupport/array-view.h"
 
 #include <sys/types.h>
 #include <sys/stat.h>
@@ -60,6 +61,29 @@ extern std::string gdb_abspath (const char *path);
 
 extern const char *child_path (const char *parent, const char *child);
 
+/* Join elements in PATHS into a single path.
+
+   This function is intended to have the same behavior as concatenating
+   std::filesystem::path objects with std::filesystem::operator/ (which was
+   introduced in C++17).  */
+
+extern std::string path_join (gdb::array_view<const gdb::string_view> paths);
+
+/* Same as the above, but accept paths as distinct parameters.  */
+
+template<typename ...Args>
+std::string
+path_join (Args... paths)
+{
+  /* It doesn't make sense to join less than two paths.  */
+  gdb_static_assert (sizeof... (Args) >= 2);
+
+  std::array<gdb::string_view, sizeof... (Args)> views
+    { gdb::string_view (paths)... };
+
+  return path_join (gdb::array_view<const gdb::string_view> (views));
+}
+
 /* Return whether PATH contains a directory separator character.  */
 
 extern bool contains_dir_separator (const char *path);

base-commit: 6ea673e2d643b7b2283aa73d35b02dfc9aa7115f
-- 
2.35.2



More information about the Gdb-patches mailing list