
There are many people arguing specific semantics of the author's arguments, but I believe the core problem is that C and C++ both dramatically overuse “undefined” where “unspecified” would do.

The difference is huge. Signed integer overflow is (per spec) undefined behaviour, so an obvious bounds check is UB and so can be removed. If it were unspecified, the compiler would be required to be at least self-consistent. E.g. it couldn’t do two's complement arithmetic in one place, but then treat arithmetic as not being two's complement elsewhere (in the overflow checks). If the compiler emits code where MAX_INT+1 is MIN_INT, then the compiler can’t also pretend that that doesn’t happen.

Undefined should be reserved solely for things that cannot have a specified behavior (UaF, OoB memory, IO weirdness, etc).



Some compilers even make it hard to test for overflow:

if(a + 1 < a) printf("Overflow error!"); else a++;

gets converted to:

a++;

because the compiler assumes that a + 1 can never be smaller than a, since it doesn't have to consider signed overflow.


While that looks stupid in isolation (and I agree it's annoyingly hard to check for overflow, although gcc and clang have special builtins nowadays to do it), it turns out there are important reasons for that optimisation.

In general, knowing that 'a + 1' is 'one larger than a' allows for lots of optimisations: when writing to an array in order we can vectorise, do things in bigger chunks, all sorts of useful and important transformations. If every use of those had to check for overflow first, it would seriously affect performance.
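A minimal sketch of the kind of loop where that assumption pays off (the function name is mine, not from the thread):

```c
/* Sum every other element. Because signed overflow is undefined, the
   compiler may assume i never wraps, so it can prove the loop runs
   exactly (n + 1) / 2 times and unroll or vectorise it. Under wrapping
   semantics (e.g. gcc's -fwrapv), if n were near INT_MAX then i += 2
   could step past n and wrap around to negative, so the trip count
   would not be provable and those transformations would be blocked. */
long sum_every_other(const int *a, int n)
{
    long s = 0;
    for (int i = 0; i < n; i += 2)
        s += a[i];
    return s;
}
```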


And that is how many CVE entries end up being created, because performance above anything else is what matters.


Many CVE entries are created simply because the underlying software was written in C.


Written under the premises of "performance trumps all".

This is why getting rid of the underlying software written in C should be a concern, or at the very least, adopting hardware and development practices that tame C. After all, UNIX/POSIX clones won't get replaced overnight.

Butchers that care for their hands also make use of protective gloves when dealing with sharp knives.


In practice, a lot of software is written in C because it depends on interfaces that are defined in terms of their C APIs, without caring all that much about performance.


Plenty of safer languages offer seamless C FFI, no need to write software in C just because those interfaces are defined as C ones.

H2PAS was already a thing in MS-DOS days, just as one possible example.


Only feasible if you use libraries which maintain ABI compatibility scrupulously, or if you take a narrow view of portability.

There's a lot of code in the wild which maintains compatibility only at the C source level using preprocessor macros.

FFIs are a nice toy for one-offs or integrating with vendored dependencies. Most never get past that stage.


No C library is changing its ABI every couple of seconds, and many of those tools understand C header files, so it's quite feasible to fix broken bindings every now and then.


The problem isn't changes, it's accommodating multiple versions. Even figuring out where to find headers is not necessarily easy if you're not the local C compiler, for which the tooling exists only begrudgingly.


I don't think the compiler should have to check for overflows, but I also don't think it's right to assume there is no such thing as overflow.



