Bug 13712 - Add Type Cast to Main and Exec Calls
Summary: Add Type Cast to Main and Exec Calls
Status: RESOLVED INVALID
Alias: None
Product: glibc
Classification: Unclassified
Component: libc
Version: unspecified
Importance: P1 enhancement
Target Milestone: ---
Assignee: Ulrich Drepper
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-02-20 03:29 UTC by oiaohm
Modified: 2014-06-26 15:21 UTC
0 users

See Also:
Host:
Target:
Build:
Last reconfirmed:
fweimer: security-


Attachments

Description oiaohm 2012-02-20 03:29:15 UTC
There is a common failure, and the solution will mean rebuilding applications.

cat * is simple enough, right?  Not really: what if there is a file named -n in that directory?  So why does cat not know that -n is a file?  Because bash did not tell it that it was a file.  There is currently no way for bash to tell cat this.  So cat presumes it is a command-line directive, and fails to do what the user wished.

Source of problem
int main ( int argc, char *argv[] )

Yes the C standard.  I know this will cause a lot of changes.
I propose providing
int securemain ( int argc, char *argv[],char *flags )


If flags is a NULL pointer, the call was made by the old exec functions.
If it is not NULL, there will be one char per argument giving its type.

So in the case of cat * in bash, cat could receive -n in argv, but since the user did not enter it, it would be flagged F for file.

Flags I would suggest:

E for environment.  That is something like /bin/cat, i.e. argv[0].
F for file: anything that comes from * and other file search wildcards.  Items known to really be filenames.
U for user entered, and Unknown.  For example, if the user types cat -n * the -n is U because the user entered it, unless the shell detects it as a directive and tags it correctly.
S for string.
N for number.
- for program directive.
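
As a minimal sketch of how a program might consult such a flags string, assuming the proposed (not existing) securemain interface: the function names is_directive and looks_like_option are hypothetical, and the treatment of U as "fall back to the old rule" is an assumption, since the proposal does not spell it out.

```c
#include <assert.h>
#include <stddef.h>

/* Old rule: anything starting with '-' (other than a lone "-")
 * is taken to be an option. */
static int looks_like_option(const char *arg)
{
    return arg[0] == '-' && arg[1] != '\0';
}

/* Decide whether argv[index] should be parsed as a program directive.
 * flags is the hypothetical per-argument type string from the proposal;
 * NULL means the program was started through the old exec functions. */
int is_directive(const char *arg, const char *flags, int index)
{
    assert(arg != NULL);

    if (flags == NULL)
        return looks_like_option(arg);   /* no type information at all */

    switch (flags[index]) {
    case '-':
        return 1;                        /* tagged as a directive */
    case 'U':
        return looks_like_option(arg);   /* unknown: assume the old rule */
    default:
        return 0;                        /* E, F, S, N are never directives */
    }
}
```

With flags "EF", is_directive("-n", flags, 1) is false, so a file named -n coming from a wildcard would be treated as a file; with flags NULL the same argument is still taken as an option.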

Now take something like a Python script calling exec on cat <some outside source var>.  Being able to say "hey, that is a file" stops a -n file name being mixed up with a - program directive.

S is for string and N is for number.  This does not mean the application no longer has to check these.  But it makes clear what they are, and if an argv entry does not match its declared type the application knows it was somehow fed bad data.  So if the application is expecting an F and it gets an S, there is a type error.

U is user entered, and also Unknown.  Bash could pick up auto-completions to files and flag those as files rather than User.  In the same way, a shell that auto-completes program directives might flag an argument as - because it was a program directive the user looked up.

An srun command could be added to work around any typing problems.  It is basically srun "flagstring" application commandline, allowing the user to override the typing set by the shell.

Also, if you are calling from a script, you can clearly say that where a var is used it is a file/string/number and must not be processed as a program directive.

The basic command line is open to something like SQL injection attacks, where input can pass a directive that causes something the programmer never wanted to happen, unless something like this is implemented.

Ideally, if you try srun on an application that is not a securemain executable, it fails and informs you of that.

I know this means that every application that needs to be secure needs to be altered.  I see no other way to add typing to the process that will work.

-- at the end to say past this point don't process arguments is only useful if the user types it and the application is not taking like. 

With cat, worse is -: with cat * where there is a file named - in the directory, when cat gets to - it stops solid, reading standard input and writing it back to the output.

Shells need to be able to clearly tell applications that this is a list of files, so that files don't get used as special features because someone created a file with the same name as a special feature.

There also need to be ways for scripts to lock types.  A script coder cannot be expected to know every switch of a command they are calling.  They can know which switches they want and where they placed them.  They can also know whether something should be treated as a file name, a string or a number.

This alteration allows for more solid scripting and makes the shell act more as the user expects: with cat *, a cat that has a securemain will not take - or -n filenames as directives, so it prints the contents of those files as the user expects.

The result is better predictability for shell users, and security for script writers: a script that calls out to cat is not going to get stuck due to some evil filename.

There is room for this type list to be expanded.  It would be nice to see typed vars in bash, but there is no point while the commands bash calls do not support type casting.

The int main wrapper remains simple for platforms that don't support secure main as well.
int main(int argc, char *argv[]) {
    /* No type information available: pass NULL flags. */
    return securemain(argc, argv, NULL);
}

Systems that cannot do securemain basically have a small simple wrapper that disables it.

Sorry I do not have the skills to write a patch to alter glibc to support this.  This has to be supported from the libc first.

I gave this a P1 due to the security improvements and the quality of command line experience improvements this opens up.
Comment 1 oiaohm 2012-02-20 23:38:51 UTC
I have come up with a worse/better example of the problem.

A file named -rf exists in a directory as the first file.  You run rm -i * in that directory, and everything bar -rf gets deleted without asking you a single question.

What rm saw was rm -i -rf <list of files>.  Yes, -rf overrides -i, so it deletes everything without telling the user, because that is what force tells it to do.  Worse, r tells it to delete every directory in the * list as well.  Something the person at the shell never told rm to do.

Really, this can of worms is kind of huge.  Any command * you run in a shell, where the command can take a directive, might not do what you intend.

How to fix this is the problem.

The reason this problem exists: "int main(int argc, char *argv[])" was designed before shells got wildcards, when users had to fill everything in.

If you were designing on the presumption that the argv data might be filled in by wildcards, you would need types, so that applications can tell wildcard results from user instructions.  It is not as if the -- solution will work everywhere:

gcc *.c -o program is a valid call.

Shells would either need a huge rule set that would most likely be wrong, or we provide some structure here.
Comment 2 Joseph Myers 2012-02-20 23:49:59 UTC
For better or for worse, this is simply how C and the Unix syscall interface work: programs receive an array of NUL-terminated strings in C, and such an array is what is passed to the execve syscall.  The third argument of main on Unix-like systems is already an array of environment variables and it is not possible to change that incompatibly.  If you wanted a more structured interface as an alternative then it can't be proposed through a bug for one component as this major change to C and Unix would need development in multiple components together; you'd need to prepare implementations for all the components involved, papers for WG14 (C) and the Austin Group (POSIX) explaining the design and get agreement from multiple stakeholders.

The standard technique for shell scripts to deal with this issue is either using "./*" or using the "--" argument to separate options from arguments.
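
For a program that builds an argument list itself, the "./*" technique amounts to prefixing "./" to any operand that starts with a hyphen before passing it on. A minimal sketch, where make_safe_operand is a hypothetical helper and not part of glibc:

```c
#include <assert.h>
#include <stdio.h>

/* Copy name into out, prefixing "./" when it starts with '-', so that
 * a command receiving it cannot mistake it for an option.
 * Returns 0 on success, -1 if out is too small for the result. */
int make_safe_operand(const char *name, char *out, size_t out_size)
{
    int written;

    assert(name != NULL && out != NULL);

    if (name[0] == '-')
        written = snprintf(out, out_size, "./%s", name);
    else
        written = snprintf(out, out_size, "%s", name);

    /* snprintf() reports the length it needed; reject truncated results. */
    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

A file named -rf becomes ./-rf, which names the same file in the current directory but no longer begins with a hyphen. The "--" technique is the alternative when the called command supports it.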
Comment 3 oiaohm 2012-02-21 02:26:28 UTC
Joseph Myers, would it be possible to add a fourth entry?

argc, argv, environment, flags, with a way to know whether the fourth is provided or not.  Calling the old exec functions gives no flags; calling a new exec function gives flags.  This would keep backwards compatibility, if it is possible.

"papers for WG14 (C) and the Austin Group (POSIX)"

Basically, I cannot write papers for these if I don't know what is possible.

Also, my English is not that great, and I am not that good at writing papers.

I am going to hit you over the head with a ten-ton book: POSIX.

POSIX.1-2008's “base definitions” document, section 4.7 (“Filename Portability”), specifically says “Portable filenames shall not have the <hyphen> character as the first character since this may cause problems when filenames are passed as command line arguments”

So why can glibc create files with a - at the start?  That is not to the POSIX standard.  The POSIX solution is no -* filenames at all, which solves the problem.  glibc is non-conforming, and so are the Linux filesystem drivers.

So if you wish to keep -* filenames, a proposal really has to be put forward to POSIX making the case that they can be used safely, and showing how to achieve it.

So the POSIX standard solution is not the "./*" or "--" options; this is why this is a problem.

For WG14 you need a working reference implementation to get anything accepted, which I do not have the skill to create.  Also, WG14 does not apply that much, because WG14 still gets to presume that C exists in worlds without shells that can do wildcard expansion.

"the design and get agreement from multiple stakeholders"
Glibc is one of the largest stakeholders in this problem.

Basically, if what I am putting forward is not workable with glibc, I might as well not put it forward.

I am bringing the same issue up with bash at the moment.
http://savannah.gnu.org/support/index.php?107960

Are there any other parties I should contact?

The issue I am referring to only happens because the implementations are breaching the POSIX standard.  The error does not exist on a 100 percent POSIX-conforming system that enforces portable filenames.

There is no point telling me to go to POSIX and get a ruling when there is already a ruling that you are not following, which is causing the issue.  This is a case of working out how we are going to fix this, so that something can be put to POSIX that lets you keep doing what you are doing.

Invalid would be true if what you were doing was POSIX conforming.
Comment 4 Joseph Myers 2012-02-21 03:14:10 UTC
There is a difference in POSIX between filenames, defined as "A name consisting of 1 to {NAME_MAX} bytes used to name a file. The characters composing the name may be selected from the set of all character values excluding the <slash> character and the null byte.", and portable filenames.  glibc and most filesystems on POSIX systems support all valid filenames, not just portable filenames.
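
The distinction can be made concrete with a check for the stricter category. This sketch assumes the POSIX portable filename character set (A-Z, a-z, 0-9, period, underscore, hyphen) plus the no-leading-hyphen rule quoted earlier in this bug; it ignores the length limit, and is_portable_filename is a hypothetical name, not a glibc function.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if name is a portable filename: non-empty, made only of
 * characters from the portable filename character set, and not
 * starting with a hyphen. Length limits are not checked here. */
int is_portable_filename(const char *name)
{
    static const char portable[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789._-";
    const char *p;

    assert(name != NULL);

    if (name[0] == '\0' || name[0] == '-')
        return 0;
    for (p = name; *p != '\0'; p++) {
        if (strchr(portable, *p) == NULL)
            return 0;
    }
    return 1;
}
```

A name such as -rf is a valid filename in the sense of the definition above (no slash, no null byte) yet fails this check, which is why it can exist on disk without being portable.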

http://austingroupbugs.net/view.php?id=251

is a proposal for disallowing newlines in filenames, as the most problematic characters - all other characters can be handled in shell scripts significantly more easily.  There has been a huge amount written on that subject on the Austin Group mailing list over the past couple of years, and anyone seriously interested in addressing such issues needs to digest the previous discussion and engage with the main forum where they are discussed, which is the Austin Group - you need to argue how a new C API should be part of the solution.

In any case, it is not possible to implement such a feature in glibc without the underlying kernel interfaces supplementing execve, so having a well-defined and accepted kernel interface would be required for a meaningful request to implement such a feature in glibc.

Bug trackers are simply not an appropriate place for designing features involving multiple components because they do not support discussion of the interactions between those components.  Thus, if you wish to discuss this further, please do not do it here; raise it on appropriate mailing lists that deal with those interactions.  In practice the Austin Group lists would be the best place; for changes specific to GNU/Linux, you could consider the linux-api list which deals with the kernel/userspace interface.
Comment 5 oiaohm 2012-02-22 05:21:43 UTC
Joseph Myers
"In any case, it is not possible to implement such a feature in glibc without
the underlying kernel interfaces supplementing execve, so having a well-defined
and accepted kernel interface would be required for a meaningful request to
implement such a feature in glibc."

This is what I needed to know.  There is no simple fix; it will require a kernel alteration to go down the path I am looking at.