Proposal: format GDB Python files with black
Luis Machado
luis.machado@linaro.org
Tue May 11 11:38:18 GMT 2021
On 5/10/21 11:55 PM, Simon Marchi via Gdb-patches wrote:
> Changing the subject, so this gets more visibility.
>
> On 2021-05-08 12:00 p.m., Tom Tromey wrote:
>>>>>>> "Simon" == Simon Marchi via Gdb-patches <gdb-patches@sourceware.org> writes:
>>
>> Simon> A few people I talked to about this and I have good experience
>> Simon> with the tool black to auto-format Python code.
>>
>> Inspired by this, I took another look at clang-format. If we could use
>> this, I think it would be a decent productivity improvement, because it
>> would mean contributors would no longer need to worry about the minutiae
>> of the GNU style. We could also stop reviewing this.
>
> I would love this.
>
I'm fine with using a tool to autoformat things. I wouldn't mind getting
used to a slightly different formatting scheme and I feel it would save
me quite a bit of time by not having to bother with little formatting
details.
>> My github has my current hacks, on the t/clang-format branch. This
>> branch just has .clang-format, I didn't reformat the sources. I'm using
>> clang-format 10.0.1.
>>
>> I managed to tweak the penalty values in the format to avoid the
>> bin-packing problem I was seeing previously. I haven't extensively
>> tested this (I usually only reformat gdb_bfd.c and look at it, since it
>> exercised several bugs before...), so I recommend that others play with
>> it a bit.
>
> I formatted the whole code base and checked a few files in Meld. In
> general, I would say that it does more good (fixing little mistakes,
> finding split lines that could actually be one line) than harm. I
> didn't see anything outrageously badly formatted (and if there is, we
> always have the option to disable formatting for a section of a file).
>
> There are a lot of changes that are neutral, where the before and after
> would be equally good.
>
> And of course, some changes where it does things not exactly like we
> would, like:
>
> gcc_type
> compile_cplus_instance::convert_reference_base
> (gcc_type base, enum gcc_cp_ref_qualifiers rquals)
>
> which becomes:
>
> gcc_type
> compile_cplus_instance::convert_reference_base (
> gcc_type base, enum gcc_cp_ref_qualifiers rquals)
>
> But really, I don't think these should be blockers. It would take
> someone some really bad faith to say that the above changes how they
> read the code.
>
> Is the following what you don't like? In gdbsupport/common-debug.h,
> before:
>
> debug_prefixed_printf (m_module, m_func, "%s: <%s debugging was not enabled on entry>",
> m_end_prefix, m_module);
>
> after:
>
> debug_prefixed_printf (
> m_module, m_func,
> "%s: <%s debugging was not enabled on entry>", m_end_prefix,
> m_module);
>
> Note: the current line is too long, my bad.
>
> I have only seen this when we make clang-format's life difficult by
> using long names or long string literals. If we don't like how it
> formats some piece of code, we always have the choice of tweaking it to
> make its like easier. It can be a good motivation to try to reduce
> indentation levels by splitting the code into functions, for example.
>
> I'm surprised you didn't use "UseTab: Always", that gives pretty much
> our scheme of tabs + spaces for indentation. By default it uses just
> spaces (I wouldn't mind using either, but since the goal is to get
> something close to our current style).
>
> I would use "IndentPPDirectives: AfterHash". Unlike what we use
> currently, it will indent with $IndentWidth columns, whereas we
> typically we indent these with a single column when we do it manually.
> But, I think I prefer 2 columns actually. You can try formatting
> gdbsupport/gdb_locale.h for a good example.
>
> Speaking of which, in order to be usable from gdbsupport and gdbserver,
> .clang-format would need to be in the top-level directory. Or maybe it
> could be in gdbsupport, and gdb/gdbserver symlink to it.
>
> We have a number of places that put the namespace brace on the same
> line:
>
> namespace foo {
>
> We could keep that by using "BreakBeforeBraces: Custom". But I don't
> think it's worth the trouble. It adds one extra line for the opening
> brace, but it's typically not in a region of the file that you care much
> about.
>
> I noticed this:
>
> btrace_block (CORE_ADDR begin, CORE_ADDR end) : begin (begin), end (end)
> {
> /* Nothing. */
> }
>
> The initializer list is put on the same line as the constructor's
> prototype. If there's a way to tell it that we don't want that, it
> would be closer to our style.
>
> For AlignEscapedNewlines, I'd probably choose Right or DontAlign. The
> reason being that if you modify the longest line of a macro (make that
> line longer or shorter), all the macros line will change as a result.
> So it will be harder to spot what changed when reading the hunk in the
> diff. I have a slight preference for Right, because it puts the
> backslashes out of sight, thus improving readability. It's the kind of
> thing that is quite annoying to maintain by hand, but it's not if a tool
> does it.
>
> I would suggest setting SplitEmptyFunction, SplitEmptyRecord and
> SplitEmptyNamespace to false. We already do it for empty methods.
>
> Enough for now.
>
>> I did see a few issues, though perhaps they are minor.
>>
>> 1. It removes all the Control-L page breaks from the source.
>> I know these often confuse newcomers, so maybe it's fine.
>> I feel that Someone out there will cheer this :)
>
> Who dat?
>
>>
>> 2. clang-format can't handle GNU-style goto label out-denting.
>> There's a clang-format bug on this topic
>> https://bugs.llvm.org/show_bug.cgi?id=24118
>
> Probably not a big deal for us, as we don't use goto much?
>
>>
>> 3. In gdb if we have a large 'if' condition that consists of expression
>> chained with "&&" or "||", we often put each sub-condition on its own
>> line. clang-format packs them instead, and I didn't see a way to
>> counteract this.
>
> I hate too long conditions anyway. If the condition spans too many
> lines or has deep parenthesis nesting, it would deserve to be refactored
> anyway. Like, put that code in a function and just have:
>
> if (my_new_function (...))
>
> With a descriptive name for the function, the code ends up more readable.
>
>> 4. Gettext invocations like _("text") have a space added, like _ ("text").
>> I didn't see a way to change this.
>
> A very little inconvenience IMO, compared to not having to think about
> formatting. We could live with this, and file a bug at the same time.
> There could be an option to add exceptions to SpaceBeforeParens.
>
>> Another possible problem is that, unlike with 'black', ensuring that
>> everyone has the same version is a pain.
>
> Indeed. It's perhaps easy for me to say, because I get to choose what
> Linux distro and version I work on (so I opt for something recent), but
> I would still lean towards just following whatever the current latest
> stable version is. There might be new options in the latest stable
> version we want to use, it would be nice not to have to wait years
> before we can use them. And I suppose there's a not too painful way to
> get it for all the major distros out there (and for Windows and macOS,
> there are binary releases). And you can always run it in Docker or
> something.
I suppose we could have scripts to automate this sort of task. Something
that checks for the latest stable release and downloads it? A helper to
make things easier and more consistent.
Can we do this server-side and take the burden off of the developers to
have to do this as an additional step before pushing changes?
>
> But we would probably encourage people to use git-clang-format, which
> only reformats the lines you touched. So, it wouldn't be a big problem
> for people to use different versions, as it wouldn't create spurious
> changes elsewhere in the file. The differences in formatting would be
> in lines you touch anyway. >
> Speaking of git-clang-format: from my experience, it is not super cool
> when the code base isn't already formatted with the target formatting
> style. It causes discrepancies in style, because you have a mix of new
> style (whatever lines were touched) and old style (whatever lines were
> not touched) which can be annoying and distracting. Or, it causes
> slightly bigger diffs than you would get otherwise. For example, I
> ran git-clang-format on this commit here:
>
> https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=10e578d7e00d
>
> ... to see what would have been the resulting patch had the author used
> git-clang-format. This would have been the patch:
>
> https://pastebin.com/pxrZtnDU
>
> If you check the changes in mi/mi-cmd-break.c, you can see that
> git-clang-format re-formatted the whole enum and array initializer.
> Which is nice in itself, the resulting formatting is nice. But, the
> patch would be a bit harder to read.
>
> The other option is to start by doing a big re-format and then tell
> people to use git-clang-format. This way, there might be some small
> incoherences here and there due to different versions, but I think
> they'll be insignificant.
>
> With the Python code, we did a massive re-format. I don't think it did
> / will cause a big inconvenience, because there aren't many patches to
> the Python code. But for the C++ code, it might be different, as there
> are more patches in everyone's pipeline that touch that code.
>
> Another question is: would we re-format the C/C++ code in the testsuite?
> It might require a few adjustments to test cases (like there was one for
> black), but I would be fine with that. My concern is more if there are
> some tests where the formatting of the code was specifically important.
> We could disable formatting for them, but the difficulty is to find
> them.
I think reformatting the test sources has the potential to break things
on testcases with hardcoded assumptions. I wouldn't touch them at first.
>
> Simon
>
More information about the Gdb-patches
mailing list