setup.hint documentation issues

While looking at replacing or improving the current upset script, I've come across some minor issues where setup.hint is under-specified or under-documented:

* The encoding of setup.hint is unspecified.

Historically, both ISO-8859-1 and UTF-8 have been used (e.g. libspiro used 'bézier' with an ISO-8859-1 e-acute, whereas calligra-l10n-nb uses 'Bokmål' with a UTF-8 a-ring; various other hints use UTF-8 punctuation marks).

I think currently UTF-8 displays correctly in the HTML package pages, but neither encoding displays correctly in setup.

I'd suggest that we specify UTF-8 and eventually fix setup to handle that.
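As a rough sketch of how a replacement tool could cope with the existing mix of encodings during a transition, the following tries UTF-8 first and falls back to ISO-8859-1. The function name decode_hint is made up for illustration, and this is not what upset or setup currently do.

```python
def decode_hint(raw: bytes) -> tuple[str, str]:
    """Decode setup.hint bytes, preferring UTF-8 (hypothetical helper).

    Returns the decoded text and the name of the encoding that worked.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a code point, so this cannot fail.
        # A stricter tool could report the file here instead of accepting it.
        return raw.decode("iso-8859-1"), "iso-8859-1"


# An ISO-8859-1 e-acute is the single byte 0xE9, which is invalid as UTF-8 here.
print(decode_hint(b'sdesc: "b\xe9zier"'))
# A UTF-8 a-ring is the two bytes 0xC3 0xA5.
print(decode_hint(b'sdesc: "Bokm\xc3\xa5l"'))
```

Reporting which encoding was used makes it easy to list the hints that would need converting if UTF-8 were made mandatory.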

* 'sdesc' text is mangled in setup.ini (but not the HTML package list)

In particular, it is forced to start with a capital letter, which is incorrect when the sdesc starts with a command name that is properly lower-case (e.g. "dash shell"). Also, any text up to and including the first colon is removed, presumably in an effort to stop people writing the package name again. This mangles Perl and Ruby module names in the description (e.g. "Ruby Net::HTTP persistent connection support", "Perl Math::Int64 distribution").

I'd suggest that this mangling is removed, and that any sdesc starting with "packagename:" is explicitly reported.
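To illustrate the problem and the suggested alternative, here is a small sketch. mangle_sdesc is only my approximation of the behaviour described above, not upset's actual code, and repeats_package_name is a hypothetical check for the explicit report.

```python
def mangle_sdesc(sdesc: str) -> str:
    """Approximate the mangling described above (not upset's real code)."""
    _, colon, rest = sdesc.partition(":")
    if colon:
        # Drop everything up to and including the first colon.
        sdesc = rest.lstrip()
    # Force the first character to upper case.
    return sdesc[:1].upper() + sdesc[1:]


def repeats_package_name(package: str, sdesc: str) -> bool:
    """Hypothetical check: does the sdesc start with 'packagename:'?"""
    return sdesc.lower().startswith(package.lower() + ":")


print(mangle_sdesc("dash shell"))
print(mangle_sdesc("Ruby Net::HTTP persistent connection support"))
print(repeats_package_name("dash", "dash: a small shell"))
```

With this approximation, a module name such as Net::HTTP loses everything before its first colon, while the explicit check only fires when the text before the colon really is the package name.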

* Handling of double-quoted text seems over-complicated

A multi-line double-quoted value is terminated only by a double-quote at the end of a line, and embedded double-quotes are silently transformed to single-quotes (e.g. proj had an sdesc of ""The PROJ Cartographic Projections Software (utilities)", where the erroneous nested double-quote was transformed to a single-quote).

There is no escaping of embedded double-quotes, and no way to represent one.

Additionally, spaces after the leading quote are magically removed.

Finally, genini requires that sdesc and ldesc are double-quoted, but upset does not.

I'd suggest that double-quoting of those keys is made mandatory, and embedded double-quotes are forbidden, as this permits simpler processing of the text by lexing it character by character.
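A minimal sketch of what such a character-by-character lexer might look like under the proposed rules (mandatory quoting, no embedded double-quotes, no escapes). The names HintError and lex_quoted are invented for this example. It treats any non-whitespace text after the closing quote on the same line as an embedded-quote error, which is my own assumption about how strict the check should be.

```python
class HintError(ValueError):
    """Raised when a value breaks the proposed quoting rules."""


def lex_quoted(text: str, pos: int = 0) -> tuple[str, int]:
    """Lex one double-quoted value starting at text[pos].

    Returns the value (which may span several lines) and the index just
    past the closing quote and any trailing whitespace on that line.
    """
    if pos >= len(text) or text[pos] != '"':
        raise HintError("value must start with a double-quote")
    chars = []
    i = pos + 1
    while i < len(text):
        c = text[i]
        if c == '"':
            # The first double-quote closes the value, so only whitespace
            # may follow it on the same line.
            j = i + 1
            while j < len(text) and text[j] != "\n":
                if not text[j].isspace():
                    raise HintError("embedded double-quote")
                j += 1
            return "".join(chars), j
        chars.append(c)
        i += 1
    raise HintError("unterminated double-quoted value")


value, end = lex_quoted('"A multi-line\nvalue"\n')
print(repr(value))
```

Under these rules the proj sdesc quoted above would be rejected with an error rather than silently rewritten.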

* It's not very clear what 'skip' represents

The description "The skip line indicates that that package should not appear in setup. It is intended for directories that exist in the hierarchy that should not be considered." is a bit vague to me.

It's not totally clear if it's intended for indicating directories which should be empty, source-only packages, or something else.

upset knows enough to omit packages which have no install tarfiles (i.e. are source-only) from setup.ini, irrespective of 'skip'.

However, the presence of 'skip' also causes the package to be omitted from the HTML package list.

I think cygport's behaviour has changed over time, but it currently marks source-only packages as 'skip'. However, there are several source-only packages (e.g. attica) that are missing 'skip'.
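If 'skip' is meant to mark source-only packages, a check along these lines could find the ones that are missing it. This is only a sketch: it assumes source tarfiles can be recognised by "-src.tar" in the filename, the function names are invented, and the filenames in the example are made up.

```python
def is_source_only(tarfiles: list[str]) -> bool:
    """True if there is at least one tarfile and all of them are source tarfiles.

    Assumes source tarfiles contain '-src.tar' in their name.
    """
    return bool(tarfiles) and all("-src.tar" in name for name in tarfiles)


def missing_skip(tarfiles: list[str], hint_keys: set[str]) -> bool:
    """True for a source-only package whose setup.hint lacks 'skip'."""
    return is_source_only(tarfiles) and "skip" not in hint_keys


# Hypothetical package directory containing only a source tarfile.
print(missing_skip(["example-1.0-1-src.tar.xz"], {"sdesc", "ldesc", "category"}))
# Hypothetical package with both an install and a source tarfile.
print(missing_skip(["example-1.0-1.tar.xz", "example-1.0-1-src.tar.xz"], {"sdesc"}))
```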

