Would You Bet $100M on Your Pet Programming Language? (2007)

Ixiaus · on Dec 28, 2014

Caveat before reading my comment: this article is from 2007 and the scene 8 years later is pretty different but the spirit still holds, I think. The author also specifically placed "pet programming language" in the article's title which makes my reaction to it a bit unsound since, for example, Haskell is not a "pet programming language".

WhatsApp built a $19B product on top of Erlang. I'm building our startup's product on a mix of Haskell and Erlang; so far it's going well.

There's pessimism in this article that neglects to acknowledge adaptability. I agree to a degree with the thought that if I chose to build our product on top of Idris I would be spending more time building tooling and dealing with immature parts of the language than I would actually building the product.

This is not the case with Haskell and other innovative but mature languages / platforms. Once at a certain size we may need to train programmers or develop some tooling internally but I see that happening even with Java and C++.

If you can boot up the product with the language / platform effectively and quickly, it's good enough to tackle a $100M product and adaptability can take care of the long tail.

sheetjs · on Dec 28, 2014

It's funny you mention WhatsApp and Erlang, because the article actually addresses that point:

> You're more dependent on the decisions made by the language implementers than you think ... some odd case that didn't matter at all for the problem domain the language was created for

The language implementers (employees of Ericsson) made design decisions based on telecom constraints, exactly the same constraints faced by WhatsApp and other messaging systems. In the parlance of the article, WA is in Erlang's problem domain, so a strict interpretation of the author's words would have him agreeing with WhatsApp here.

MetaCosm · on Dec 28, 2014

Exactly, it is silly to consider Erlang a "pet language"; it is an industry-hardened language designed for a near cousin of WhatsApp's problem domain. 20+ years routing billions of calls a day under soft real-time, high reliability requirements... because you simply expect your phone to work, even phone calls from one end of the world to the other.

kedean · on Dec 28, 2014

I think you did kind of just prove his point though. 2007 was just about the time that Haskell started seeing its big boom in popularity and the unending series of 'how to monad' articles started popping up, so when this was written Haskell wasn't really a proven language. In other words, WhatsApp kind of DID bet their $100,000,000 on their pet language, and came out ahead as a result.

My takeaway from the article wasn't that less popular languages are bad, it's that you should definitely consider everything before using them for a really important project. If your decision is that it will be a good solution, great! As long as you understand the limitations that you will run into.

These days I'd say that Rust is where Haskell was in 2007 (ignoring that Haskell came about in the late eighties). It's gone through a really long incubation period, is about to get released into the wild, and is getting lots of press in the blogosphere. Now is the point where people will have to decide if it's worth banking everything on writing a new, real project in Rust, because right now it's still not much more than the pet language of some very vocal bloggers.

MetaCosm · on Dec 28, 2014

It is very annoying to me to hear Erlang referenced as a "pet language". It has been around as a language for 28 years, and it has been doing mission critical, soft realtime work of routing billions of phone calls a day for over 20. It wasn't some academic language with no industry chops.

Using it to dependably route billions of messages a day is not some crazy off the wall "adopting a pet language" -- it was picking a language that had already been proven (and explicitly designed) to work in a very similar problem domain. It was a smart, reasonable, sane choice that they then pushed to awesome extremes.

IndianAstronaut · on Dec 28, 2014

>I'm building our startup's product on a mix of Haskell and Erlang; so far it's going well.

Erlang is a very fascinating language. I am currently working on a small scale state and signaling system built in Java. After looking into Erlang, its way of dealing with similar problems is more elegant and much more scalable.

azth · on Dec 29, 2014

Out of curiosity, what do you think of Akka running on top of Java or Scala?

mathattack · on Dec 28, 2014

Massive jobs like this tend to require a lot of inter-team dependency. The best language is also one that everyone on the current team understands, and that everyone supporting it will understand. And that will grow in the direction of such a massive project.

My 2 cents... Companies have to selectively pick where they are on the bleeding edge. Extra large projects aren't the best spot to play with the bleeding edge of new programming languages. (This is why COBOL stuck on so long for billing systems)

The place for experimenting with new technologies is smaller, self-contained projects where the risk is lower and the benefit can be more clearly explained.

nbardy · on Dec 27, 2014

A hundred million dollars would change my decisions, but it also changes the nature of the problem I'll be working on. The contracts I get are on a much smaller scale which means the problem I'm solving is also of a different scale. The language I chose is based on the fact that my client wants their project done on time and on budget along with the ability to change direction when new requests come in.

If my client wants something as robust as a hundred million dollar budget, they will need a 100 million dollar budget. More often than not they just want something that works.

3pt14159 · on Dec 27, 2014

(Note: This article, while good, is from 2007 so perhaps its advice is slightly out of date.)

Personally, if I were going in completely blind I'd choose Python (with C as a backup). Numpy and Scipy are fast enough for most of the things that you need to do. It's the "it does pretty much everything" language. Is it great at very high performance games? No. But you can drop into C pretty easily (although I find Ruby + C to be easier) and most of the code you'd be writing isn't going to be core game logic.

Now as to why I'm not coding it now: Because I know what I'm building: Dynamic web apps. Ember (with EmberUI) + a pure Rails JSON backend (with xdomain) will get me there much faster than the needlessly verbose Python. It will also let me hire people that won't have to figure out my stack. It is simpler and less risky. I'd never write this in C.

robrenaud · on Dec 28, 2014

What projects on the scale/complexity of Photoshop are implemented in Python? Possibly large chunks of Youtube or Dropbox?

I love Python for small scripts, and even don't regret it too much for programs with mild complexity (~5000 lines of Python), but the lack of explicit typing makes figuring out what any given function/class does pretty hard. What are the arguments? What do they do? What does the return type do? Where is this thing initialized? Finding the answers to those kinds of questions is greatly aided by an IDE/source code browser that is anchored by types.
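For illustration only, and not something that existed when this thread was written: later Python 3 releases added optional type annotations, which answer some of these questions at the definition site. A minimal sketch, where `Invoice` and `total_cents` are made-up names and the rounding rule is an assumption:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Invoice:
    """Hypothetical record type; the fields document what callers must supply."""
    customer: str
    amounts_cents: List[int]


def total_cents(invoice: Invoice, tax_rate: float = 0.0) -> int:
    """Return the invoice total in cents, with tax applied to the subtotal."""
    subtotal = sum(invoice.amounts_cents)
    # round() returns an int here, matching the annotated return type.
    return round(subtotal * (1 + tax_rate))


if __name__ == "__main__":
    print(total_cents(Invoice("acme", [1000, 250])))
```

The annotations are not enforced at runtime; they only help if an IDE or a separate type checker reads them, which is the kind of tooling the comment is asking for.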

wting · on Dec 28, 2014

Yelp is a multi MLOC Python code base. AFAIK YouTube is slowly being ported from Python to Go.

There are plenty of other large codebases in dynamic languages:

  - Wikipedia and Facebook written in PHP
  - Twitter was originally written in RoR

To be honest, it's my personal opinion that good static languages (i.e. type inferenced) are better than dynamic languages. I believe it's situational compared to traditional static languages (e.g. Java, C++).

jacquesm · on Dec 28, 2014

Both Twitter and Facebook ran into some pretty hard limits on those platforms, which in the case of Twitter led them to abandon RoR and in the case of Facebook led them to create a compiler for the language (doable on that 100M budget but still quite an undertaking).

tangue · on Dec 28, 2014

"needlessly verbose Python"... For me Python and Ruby are right on par. Could you elaborate further ?

3pt14159 · on Dec 28, 2014

Part of Python's mantra is "Explicit is better than implicit"[1], which I fully and wholly disagree with. You end up with a file that has 10 import statements, and on larger codebases this just gets worse and worse. Furthermore, this leads to fewer DSLs and other helpers that Ruby is just better at.

In Ruby (or possibly just Rails?) say I want to get the start of the day 3 hours from now, when a sports game is supposed to end, say.

I go (Time.now + 3.hours).end_of_day whereas in Python I have to do "import datetime" (Or is it "import timedelta"? Let me Google) Then I need to do this: http://stackoverflow.com/questions/7985756/whats-the-most-el...

And it's like, how many times do I need to Google Datetime and timedelta. Let alone getting into the weeds with different time zones and daylight savings time changing its calendar day. I could show other examples: Why do I have to pass self into a method when that is the default thing I want to define in a class?

And you can call this all petty bickering, but when you sometimes work in machine learning and your model takes 45 minutes to train, only to be told by Python that you forgot to import the production configuration file, it's just such a drag. And I know I can type "import *" or include something in my __init__.py but then people will think I don't know what I'm doing. Also what is with "def __init__(self)"? Why isn't it def initialize? Like, the first thing we tell new programmers is that "__" means "warning, don't use me unless you really know what you are doing" then the next thing we say is "go ahead and change __init__ though, that is normal."

1. Type "import this" in a Python REPL to read the full thing.
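For reference, a minimal sketch of the Python side of that comparison using only the standard library. The helper name `end_of_day_after` is made up, and "end of day" is assumed to mean the last representable microsecond of the calendar day, which is only one possible reading:

```python
from datetime import datetime, time, timedelta


def end_of_day_after(now: datetime, hours: int) -> datetime:
    """Return the last microsecond of the day that `now + hours` falls on."""
    later = now + timedelta(hours=hours)
    # time.max is 23:59:59.999999; keep whatever tzinfo the input carried.
    return datetime.combine(later.date(), time.max, tzinfo=later.tzinfo)


if __name__ == "__main__":
    print(end_of_day_after(datetime.now(), 3))
```

Both `timedelta` and `time` live in the `datetime` module, so a single `from datetime import ...` line covers this case.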

lifeisstillgood · on Dec 28, 2014

But, and I am a Ruby neophyte, someone, not the language designer, wrote a library / DSL that somewhere has "end_of_day" defined.

Now I would never ever trust that this nice person got end of day right for my 100M project. Because they cannot. Is that end of day as in close of business or as in midnight? Is that end of day as in the time zone we start that process in? Or end of day as some arbitrary cut-off for a global company (frequently in my experience it's a New York day whereas every library starts using UTC)? Don't start on leap seconds, leap days, process allocators that can kick off your code on a machine in HK or NY.

Really this stuff needs to be written as a stdlib in the organisation using it. Datetime might be verbose but I can read that code and see what it does in any python program - "end of day" is something I need to go and find the underlying library for.
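A small sketch of why a single "end of day" cannot be right for everyone: the same instant has a different end of day depending on whose calendar you use. The fixed UTC offsets below are a simplifying assumption (no daylight saving rules), and `end_of_day` is a hypothetical helper, not any library's API:

```python
from datetime import datetime, time, timedelta, timezone

UTC = timezone.utc
# Assumption: fixed offsets stand in for real time zone rules.
NEW_YORK_WINTER = timezone(timedelta(hours=-5), "EST")
HONG_KONG = timezone(timedelta(hours=8), "HKT")


def end_of_day(instant: datetime, tz: timezone) -> datetime:
    """Last microsecond of the calendar day that `instant` falls on in `tz`."""
    local = instant.astimezone(tz)
    return datetime.combine(local.date(), time.max, tzinfo=tz)


if __name__ == "__main__":
    instant = datetime(2014, 12, 28, 2, 0, tzinfo=UTC)
    for name, tz in [("UTC", UTC), ("New York", NEW_YORK_WINTER), ("Hong Kong", HONG_KONG)]:
        # Convert each answer back to UTC so the three results are comparable.
        print(name, end_of_day(instant, tz).astimezone(UTC).isoformat())
```

The three results are different instants, and none of them is "close of business", so the choice still has to be made explicitly by whoever owns the requirement.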

jacquesm · on Dec 28, 2014

It's ok to add tests for code you didn't write.

lifeisstillgood · on Dec 29, 2014

I was just meaning that a single line of code with "end of day" in it would almost certainly not meet my needs.

It is worthwhile exploring the date-time-calendar libraries and adding tests and third party addons that meet my needs - yes. But I will almost certainly need to do work - out of the box rarely works for something so complicated as dates. And that extends to a hell of a lot of other stuff too.

Mostly I guess I am grumpy that "do what I mean" does not work very well.

majormajor · on Dec 28, 2014

The bit where you don't know for sure what's in Ruby and what's Rails-specific, argues against your own point for me. I can't help but feel that that sort of attitude is why the more work I take that involves maintaining Rails applications, the less I want to write a new application in Rails... I'd consider using just plain Rails, though my experiences with the code quality of even some aspects of that aren't super great, but most 3rd party Gems for Rails scare the crap out of me at this point in terms of future support burdens. And people throw so many of those into their gemfiles.

  2.1.0 :002 > Time.now + 3.hours
  NoMethodError: undefined method `hours' for 3:Fixnum

  2.1.0 :003 > Time.now.end_of_day
  NoMethodError: undefined method `end_of_day' for 2014-12-27 20:52:11 -0600:Time

EDIT: to elaborate on what I mean by that sort of attitude, I feel like Rails's reliance on "magic" (or things that appear to be such) makes it so easy to do things quickly that the cost of doing it right doesn't get paid. This interacts very badly with a common pitfall for users of open source components, where people just don't put in enough time vetting quality before integrating things into a project.

3pt14159 · on Dec 28, 2014

See, I actually knew that it was just part of Rails, but this is my whole problem with the Python community. They have this explicit is better than implicit attitude that infects their appreciation of what makes code good.

Code is good if it is legible. Doing things quickly is good. Nobody actually writes "end_of_day" and expects it to work outside of a Rails project without running it or testing it.

The Ruby community doesn't need to clutter up their codebases with senseless imports to make programming fun and useful. And people hand wave about how 3.hours is cluttering up codebases, but it really isn't. What clutters them is broken frameworks or abandoned protocol buffers. Rails (and Ember) have this amazing idea that code shouldn't be any more verbose than is necessary and Python just doesn't have that.

Navarr · on Dec 28, 2014

    <?php

    $threeHours = strtotime("+3 hours");  
    $startInUnixTime = mktime(0,0,0,date("n",$threeHours),date("j",$threeHours),date("Y",$threeHours)); // had to check php.net/mktime

apdinin · on Dec 27, 2014

I have trouble accepting the general premise of the question. Not the $100m part... even a $100k job or a $10k job... doesn't matter. It implies that the program being developed has a definitive "done" moment. But software (at least software in constant use) doesn't really get finished. So you build, and you expand, and you rebuild, and new things happen that change your approach and new tech gets developed and your software changes. I can (technically) start coding in Language X and ultimately port the entire codebase to Language Y (or take bits of Language A, B, and C). It's not so much about choosing the right language as reacting to the ever-changing world in which your software is being deployed.

chas · on Dec 28, 2014

I feel like the corollary to this is how quickly one could solve the problems in a pet programming language if $100M was on the line. If the compiler doesn't scale well, hire one of the authors for a bit as a consultant. C with Classes was once someone's pet programming language.

username223 · on Dec 28, 2014

... and if the baby isn't born in one month, hire nine women! A lot of the development of mature software can't be parallelized. Paying Simon Peyton-Jones twice as much, or even cloning him, won't make Haskell improve twice as fast.

nhaehnle · on Dec 28, 2014

Peyton-Jones is not the only person working on Haskell. In fact, if it were only Peyton-Jones working on Haskell, the language wouldn't be where it is today.

Yes, the speed of development probably almost always scales sublinearly with the number of people working on it, but beware that perception may also be distorted by the old adage that "the last 20% of the work takes 80% of the time". That is, when people are added to a project that has finished the first 80%, it may seem like progress scales terribly sublinearly with the people added, but what is observed may simply be the fact that the same amount of work leads to a smaller perceptible change once a project has sufficiently advanced.

hacknat · on Dec 28, 2014

A project this huge is likely to be a distributed system. I can't think of any single binary that could provide $100m in value to one customer. Maybe some really important financial database, but even then...distributed? Right? That being said, we're going to have to use way more than one language, right? Like most projects, right?

No. This question is a red herring. I think a semi-decent point is being attempted, but I think most Software Engineers ARE good at deciding when it's appropriate to use the right tool.

Quick example: I'm a fan of NodeJS, but I would never use it to try to solve a computation heavy problem, it's good at IO multiplexing, but very little should happen in between connections. My experience has been that most of the Node community is aware of this.
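To make the "very little should happen in between connections" point concrete, here is a sketch using Python's asyncio rather than Node itself; it is an analogous single-threaded event loop, not a model of Node's internals. All function names are made up, and the ticker stands in for I/O handlers waiting their turn:

```python
import asyncio


async def ticker(log, count):
    """Stands in for I/O handlers that need the loop regularly."""
    for i in range(count):
        log.append(f"tick {i}")
        await asyncio.sleep(0)  # hand control back to the event loop


async def blocking_work(log):
    """CPU-heavy work that never yields, so nothing else runs meanwhile."""
    total = sum(i * i for i in range(200_000))
    log.append("work done")
    return total


async def cooperative_work(log):
    """The same sum, split into chunks with a yield after each chunk."""
    total = 0
    for start in range(0, 200_000, 50_000):
        total += sum(i * i for i in range(start, start + 50_000))
        await asyncio.sleep(0)
    log.append("work done")
    return total


def event_order(work):
    """Run `work` next to the ticker and return the order of logged events."""
    async def main():
        log = []
        await asyncio.gather(work(log), ticker(log, 3))
        return log
    return asyncio.run(main())


if __name__ == "__main__":
    print(event_order(blocking_work))
    print(event_order(cooperative_work))
```

With the blocking version the ticks are only logged once the computation has finished; with the chunked version they get a turn between chunks. Chunking keeps the loop responsive but does not make the computation any faster, which is why heavy computation is usually moved off the loop entirely.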

imanaccount247 · on Dec 28, 2014

>I'm a fan of NodeJS, but I would never use it to try to solve a computation heavy problem, it's good at IO multiplexing

No it isn't. It is very bad at it. It just uses the most primitive event loop and foists all the complexity of that onto you as the developer using it.

Ixiaus · on Dec 28, 2014

You're being downvoted but you're right.

BuckRogers · on Dec 28, 2014

He is, and I'm seeing more of this on HN over time. Many people who use <technology x> see a bothersome comment, and while snide, he was accurate. People may use Node.js, drank the Kool-Aid, but in 5 years there's going to be an industry movement off of it and the messes being created today. I've used it, was not impressed by its technical merits, and wrote it off as yet more technological churn.

I'm not a fan of churn, and keep a keen eye out for true innovation. Which happens far less than people are convinced to believe. That's the biggest scam the tech world convinced everyone of, that innovation is rampant and fast moving, when in reality everything moves at glacial pace.

I'm a late adopter of technology, proud of it because it's generally the smart move for most of us. I test drive shiny things I can make time for, but bringing it into my stack doesn't happen by reading a few blogs. It not only requires significant technical merit, but someone has to maintain all this shit.

hacknat · on Dec 28, 2014

K. From my personal experience it has held up quite well.

SamReidHughes · on Dec 28, 2014

It's par for the course as far as event-loop I/O goes.

_almosnow · on Dec 28, 2014

Could you provide an example? I'm using node and am pretty satisfied with the performance I have.

edgyswingset · on Dec 27, 2014

The need for reliable and extensible libraries is a huge one, and that is largely why I am a huge fan of F#. Not that .NET is some silver bullet to solve all your problems, but one could use F# to develop a $100,000,000 system largely because it shares the same framework C# does.

jeremyjh · on Dec 27, 2014

I don't know why people always say that. How many full-stack web frameworks are there that use idiomatic F#? Haskell has at least three that were mature three years ago and still under active development. In general the .NET open-source ecosystem is weak; enterprise just waits for Microsoft to reveal the One True Way or spends thousands per seat on clunky tool-kits. That same enterprise will use hundreds of open-source Java libraries. Maybe it's changed in the last few years and certainly there is enough there to work with, but I don't think it can rival Python or the JVM and I don't think F# has any edge over Haskell at all when it comes to libraries and community. You get a large, solid core to start with which is more coherent than the de facto Haskell standards but as soon as you start looking for nice idiomatic test frameworks, distributed application frameworks, embedded databases, FRP libraries or many other examples Haskell really leaves it far behind.

edgyswingset · on Dec 28, 2014

Well, you're not tied into using strictly F#. The C# MVC framework, or really any web framework, will play nice. The majority of an application like the one the article suggests will sit on a server, anyways. Once you're in server-land, things like web frameworks matter less because .NET already takes care of what you need. F# already has well-supported testing frameworks which integrate into Visual Studio, distributed computing frameworks, database interactivity, graphics libraries, and a slew of others. You can find plenty of testimonials of using F# for large-scale and enterprise systems.

F# and C# are really just your tooling for .NET, which is sufficient for a massive-scale project. Whether or not .NET is appropriate for the project is a different story altogether - I wouldn't use it for anything that's close to the metal, for example.

pjmlp · on Dec 28, 2014

Which Haskell libraries interoperate with classical enterprise stacks, alongside existing code?

jeremyjh · on Dec 28, 2014

There are lots of libraries for building and consuming web services or using messaging products, which is the same as how any heterogeneous environment is integrated. Big companies integrate .NET, Python and Java like this all the time. There ARE also tools and libraries [1] for generating all the JNI code necessary to call directly into JVM libraries but I don't think it has seen much adoption. I'm not really arguing that Haskell is suitable for enterprise applications - it is not. For greenfield SAAS product companies though I think it's as viable as F#.

[1] http://hackage.haskell.org/package/java-bridge

codygman · on Dec 28, 2014

It depends on how you define interoperate. The way you phrase this it sounds like the only answer is the JVM. In which case:

https://hackage.haskell.org/package/java-bridge

However in the case where there is a lot of prior Java code, it might be easier to look at Scala or even Clojure.

to3m · on Dec 27, 2014

If the target will run it, .NET would probably prove pretty good. You've got a good selection of languages, and even the de facto standard of C# is actually pretty decent (well, I think so anyway - has all the things I liked about Java, and mostly fixes most of the things I didn't). And if you need to write bits in C/C++, that's not a big problem, because the FFI is very easy to use.

(MS's long-term plans for .NET always seem a bit of a mystery, though, and Mono never seemed to get much mindshare. Probably nothing that a bit of your $100,000,000 budget couldn't help with...)

elwell · on Dec 28, 2014

Well healthcare.gov went with Java [0], but that was about 20x this budget [1], so I think I'll have to go with Forth.

[0] - http://www.randalolson.com/2014/05/22/programming-language-b...

[1] - http://www.bloomberg.com/news/2014-09-24/obamacare-website-c...

percept · on Dec 27, 2014

I was going to comment on a different article from the same author, but decided this one was better:

http://prog21.dadgum.com/57.html

im3w1l · on Dec 28, 2014

>What would you do? And if a hundred million dollars changes your approach to getting things done in a quick and reliable fashion, then why isn't it your standard approach?

Ordinarily you try to strike a balance between "exploration and exploitation". For a small project where success isn't that hugely important, the learning that a less conservative choice offers can be more heavily weighted.

zzzcpan · on Dec 27, 2014

Here's one idea: why don't we compile every new language into pretty C code? This way we will be able to use every available C library with a compiler of our choice, plus any extra C code necessary for our real world application.

lomnakkus · on Dec 28, 2014

One potential objection I see (aside from the ones raised by others in this thread) is that C isn't actually that great as a compile target. Aliasing analysis is extremely hard in C (thus preventing obvious optimizations), you'd still have to implement your own GC on top of C, C doesn't do tail-call optimization so you'd have to do your own CPS transform anyway... at which point most of the advantages evaporate.

(If you're thinking that interfacing with C would be easier: No, the FFIs to C are usually as complex as they are for exactly the right reasons, namely that C's semantics don't match very well with $OTHER_LANGUAGE.)

... at which point you're practically implementing your own VM anyway, so, y'know...
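As a rough illustration of the extra machinery a missing tail-call guarantee forces on a compiler: one common workaround is a trampoline, where each "tail call" returns a thunk and a driver loop runs the thunks. The sketch below is in Python (which also does not eliminate tail calls) instead of generated C, and `trampoline`, `is_even` and `is_odd` are made-up names:

```python
def trampoline(function, *args):
    """Drive a chain of tail calls in a loop so the stack never grows."""
    result = function(*args)
    # Assumption of this sketch: real results are never callable,
    # so anything callable is a pending tail call.
    while callable(result):
        result = result()
    return result


def is_even(n):
    # Instead of calling is_odd directly, return a thunk for the driver loop.
    return True if n == 0 else (lambda: is_odd(n - 1))


def is_odd(n):
    return False if n == 0 else (lambda: is_even(n - 1))


if __name__ == "__main__":
    # Plain mutual recursion this deep would exceed Python's default recursion limit.
    print(trampoline(is_even, 100_000))
```

Every call now allocates a closure and goes through the driver loop, which is the kind of overhead and indirection the comment is pointing at.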

username223 · on Dec 28, 2014

> C isn't actually that great as a compile target.

That's not the point. C is a great common tongue, as its function call and name lookup semantics are simple, and it doesn't insist on being in control of memory, scheduling, etc. Perl and Python are pretty similar, and both talk to C, but calling between Perl and Python is enough of an unholy mess that it's usually easier to just print text over a pipe.
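A sketch of the "print text over a pipe" approach, under the assumption that the other side is simply a separate process that reads lines on stdin and writes a result on stdout. To keep the example self-contained the child is a second Python interpreter standing in for the other language's runtime, and `sum_in_other_runtime` is a made-up name:

```python
import subprocess
import sys

# Stand-in for a program written in the other language: sum one integer per line.
CHILD_PROGRAM = "import sys; print(sum(int(line) for line in sys.stdin))"


def sum_in_other_runtime(numbers):
    """Send numbers as text over a pipe and parse the textual reply."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_PROGRAM],
        input="\n".join(str(n) for n in numbers),
        capture_output=True,
        text=True,
        check=True,  # raise if the child exits with an error
    )
    return int(completed.stdout)


if __name__ == "__main__":
    print(sum_in_other_runtime([1, 2, 3]))
```

Neither runtime needs to know about the other's memory management or object model; the cost is serializing everything to text and paying for a process per call.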

lomnakkus · on Dec 28, 2014

> That's not the point.

Maybe not, but did you miss the bit in my post about FFI and GC? Have you actually tried to implement a non-trivial extension to a GC'ed language which needed to interface with C code?

EDIT: The point is not so much that it cannot be done -- the point is that it needs a human ("programmer") to do it. Which kind of defeats the point of "C-as-lingua-franca". Which, if you think about it C already kind of is, sadly. Any FFI currently in existence focuses on C already. What more do you want? ;)

username223 · on Dec 28, 2014

> did you miss the bit in my post about FFI and GC?

No. It's a huge pain, but much less of a pain than calling between two GC'd languages. (I haven't tried calling directly between e.g. Perl and Haskell, but I'd rather gouge my eyes out with a spoon. The handles alone would bury me.) If there were one obvious GC solution, then all GC'd languages would use it, and C would not be as useful. But we've had the JVM and countless other abstractions, yet none has managed to be a higher common denominator than C, so it looks like human labor for the foreseeable future.

lmm · on Dec 27, 2014

Because if you use C libraries you will inherit their problems (and also their interface is rarely idiomatic in the new language). Usually if you want to use a different language it's because you think that language has advantages over C. E.g. a large part of the point of using OCaml is to avoid the safety problems of C, but that only applies if your libraries aren't written in C. See https://github.com/mirleft/ocaml-tls#why-a-new-tls-implement...

(Almost all languages do offer some "FFI" to call C. But in languages that don't naturally fit, there will often be significant overhead to e.g. aligning memory management so that the same memory doesn't get freed twice, as might happen when a garbage-collected language calls a C library.)

zzzcpan · on Dec 28, 2014

That's why you would compile it into C instead of having hand-written C when safety is important, and for most of your program it probably is. But sometimes you absolutely need to be able to write assembly and use existing unsafe C libraries.

lmm · on Dec 28, 2014

If you absolutely need a C library, almost all languages give you a way to do that. But why hobble your compilation strategy for all programs just to support this occasional use case better? You may well want to target a runtime (JVM, .net, Javascript) for which no good C compiler exists. Even if you're solely interested in building native executables, compiling to C would mean throwing away a lot of information (e.g. knowing which values are immutable) that an optimizer could use. Look at Haskell; performance-optimized Haskell performs a lot better than machine-generated C ever would (indeed, it often comes close to the performance of hand-tuned C).

(If you're worried about duplicating effort in optimizers, do what many languages do and offer an LLVM frontend. LLVM optimizations are (mostly) language-independent, and LLVM bytecode, while imperfect, often allows a better representation of language semantics than C source would)

zzzcpan · on Dec 28, 2014

I don't think it is so rare, that you could call it occasional. That's the author's point and mine as well. Sometimes I need sqlite, other times libjpeg, giflib, libpng and some simple image processing with at least some low-level code. And it takes a lot of effort to make these things work with other languages on different platforms. Golang tried to make it easier and it still isn't.

Optimizers are also one thing I was worried about; LLVM is not the only compiler out there. And also there is friction. Most systems already have a C compiler installed; if you target it, there is no friction in setting up a build environment and so on.

lmm · on Dec 28, 2014

> Golang tried to make it easier and it still isn't.

How much of that is because Golang doesn't compile to C, and how much is because the languages just don't fit? E.g. you'd still need a way to mark whether Golang was supposed to garbage-collect a value that had been returned from a C call, or not.

> Most of the systems already have C compiler installed, if you target it, there is no friction in setting up building environment and so on.

Yes there is, because you still need to build the thing that translates to C. If anything there's more friction than having a compiler that directly builds native binaries like e.g. ghc, because you need two tools - your translate-to-C tool and your C compiler. "Most systems already have a C compiler" is only really true on *nix; it will be easier for a Mac or Windows user to run your system if you distribute e.g. java bytecode.

username223 · on Dec 28, 2014

> and also their interface is rarely idiomatic in the new language

This. If you just want C interop, SWIG[1] has been around forever, and is pretty good at generating bindings from a C/C++ header with minimal effort. But if you've ever tried it, you'll know that you still have to do most of the work by hand to create an idiomatic interface in your language of choice.

[1] http://www.swig.org/

chas · on Dec 28, 2014

I think zzzcpan is suggesting to compile from the new toy language to C and then use a C compiler to get native code. This doesn't preclude only using libraries written in the new language. CHICKEN is a Scheme compiler that takes this approach.[0]

The submitted essay addresses the problems that can result from this sort of compilation style with the question "Will native compilation of a big project result in so much code that there's a global slowdown (something actually true of mid-1990s Erlang to C translators)?"

[0]http://www.call-cc.org/

_wmd · on Dec 27, 2014

Portable C significantly restricts the machine primitives available, e.g. no explicit SIMD, no explicit control over struct/memory layout, no obviously efficient way to represent type information (say, for implementation of a moving GC), no ability to access machine registers (again GC), ...

zzzcpan · on Dec 27, 2014

You don't have to write extra code in a portable manner, that's the point. You write it how you need it in the real world. But you still benefit from refcounting, nice safe strings, higher-order functions, etc. in the main part of your program.

idlewan · on Dec 28, 2014

That's exactly what Nim does, and it works quite well.

The bindings are easy to create (there is an automated tool that does most of the work), they look like Nim function declarations and can add better type checks on FFI calls. You still have manual memory management and a pausable, optional GC.

desdiv · on Dec 28, 2014

None of the primitive data structures will match up. JVM languages, CLR languages, Rust, Go, et al all have array bound checking, so their arrays won't be compatible with C libraries without a wrapper.

kibwen · on Dec 28, 2014

Rust arrays are perfectly compatible with C arrays, they're both just contiguous segments of memory with elements at fixed offsets. You can pass Rust arrays to C functions trivially, and bounds checking has nothing to do with it.

I should also note that all correct C code does bounds checking somewhere. The difference with languages that are bounds-checked by default is that they pessimistically assume that you haven't checked the bound manually, and so can potentially do more bounds checking than is strictly necessary if their optimizers aren't capable of hoisting out the checks.
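The same separation can be shown outside Rust. In this Python sketch (an analogy, not a claim about Rust's or Go's implementation), a `ctypes` array is bounds-checked when indexed from Python, yet its storage is a plain contiguous block that C's `memmove` can operate on directly. The helper names are made up:

```python
import ctypes


def copy_with_memmove(values):
    """Copy a list of ints through C's memmove and return the copy."""
    array_type = ctypes.c_int * len(values)
    source = array_type(*values)
    destination = array_type()
    # The C function only sees two addresses and a byte count.
    ctypes.memmove(destination, source, ctypes.sizeof(source))
    return list(destination)


def index_is_rejected(values, index):
    """Report whether indexing from the Python side is refused."""
    array = (ctypes.c_int * len(values))(*values)
    try:
        array[index]
    except IndexError:
        return True
    return False


if __name__ == "__main__":
    print(copy_with_memmove([10, 20, 30, 40]))
    print(index_is_rejected([10, 20, 30, 40], 4))
```

The bounds check lives in the language's indexing operation, not in the memory layout, which is why it does not get in the way of handing the buffer to C.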

natefinch · on Dec 28, 2014

Ditto for Go arrays.

dreadfulgoat · on Dec 28, 2014

If what you want is to use the C library, shoehorning the library APIs into your language is guaranteed to be nastier than just writing in C.

BuckRogers · on Dec 28, 2014

If $100M were on the line you'd get a mess that met your requirements.

If I wanted good code, I'd take guys who worked on a project for free.

lwh · on Dec 28, 2014

The language is irrelevant at that price. Given an architect of sub-average skill they'll pick an acceptable one.

michaelochurch · on Dec 28, 2014

Yes. There's risk with every technical decision. I'd probably be inclined to use Haskell, for access to a top-notch community and a level of robustness (esp. in the face of refactoring) that, while it can be achieved in a dynamic language, is hard to hit without a static type system.

An upshot of Haskell is that it probably wouldn't take a $100 million budget, because you can do a lot with a small team. You're going to need far fewer people than on a Java project, and you won't need the layers of management that come with big teams. You'll also have smaller code and more of an ability to use the Unix philosophy (systems composed of multiple, usually small, programs). Those wins aren't additive, but multiplicative.

So, yes, I'd definitely trust Haskell (or any mature FP language) on a $100M project. I'd trust Java on one too-- if I could find someone to do all the work, and trusted that person-- but I'd rather use an FP language and really kick ass.

lmm · on Dec 27, 2014

I was nodding along until I realized that the author has a very different notion of esoteric from mine.

The last month of Fridays I've been tinkering with Idris, trying to get it to... well, do anything useful. I'd heard it was a dependently typed JVM language, which it is... kinda. Turns out you need to install a full Haskell toolchain to do anything with it. With the right esoteric option incantation, you can make it build an executable that turns out to be a shell script concatenated with a jar file. Nice. But you can't build a library with it, even within the library. You can't invoke it to build some classes to use later. You can't even call the build tool from anything remotely standard in JVM-land, and when I asked about making the compiler selfhosting the response was a kind of lukewarm "yeah, sometime". The current toolchain is in Haskell but there's not even an FFI from Idris into Haskell, so porting it would be... challenging. I tried to join the two together via their C FFI, but in between undocumented linker options and the fact that the Idris runtime is already linked to part of the Haskell runtime, I eventually gave up. If nothing else, it gave me a real appreciation for how much work went into making Scala a serious, commercially usable language, something I'd previously rather taken for granted.

But C, holy shit, C? You'd write a program in C? And expect it to work? I've seen programs go wrong in a lot of languages, but most of them can be eventually fixed. In C you get irreproducible voodoo random crashing that, sure, you can usually track down with a static analyzer, valgrind, debugging, intelligence and luck. But what if you couldn't? What if the program was just broken, and you couldn't fix it? With $100,000,000 on the line, there is no way on earth I would risk letting C (or any other unmanaged language) anywhere near my codebase. It might mean more work, and less features, if I were to use e.g. that pure-OCaml SSL library, or a JVM-native multimedia library. But I'd do it in a heartbeat, all the same. "The final say on overall data sizes" is such a tiny, trivial concern compared to using a language where failure is understandable, reliably diagnosable.

(In fact I'd say the scenarios where a measly factor of 2x memory consumption makes the difference between "working" and "not working" are just vanishingly narrow. If you need to scale horizontally in OCaml, you're going to need to scale horizontally in C a couple of weeks later. Particularly with $100,000,000 on the line, sod it, buy a bigger server with more RAM if using 32GB rather than 64GB really makes the all-important difference).

Would I use a "pet language" I tinker with, like Idris? No, but I wouldn't use that for any kind of serious commercial work. With $100,000,000 on the line, I'd use the same language I use for almost all my work: Scala. And I would stay the hell away from JNI, because I don't want C anywhere near this system, lest it bring it all crashing down.

dietrichepp · on Dec 28, 2014

You can write reliable programs in C. How about a successful $2.5 billion mission to Mars with 500,000 lines of C?

http://programmers.stackexchange.com/questions/159637/what-i...

There are a lot of tools available for the design of reliable systems in C. Valgrind is just the beginning. A lot of smart people have put an enormous amount of effort into static and dynamic checkers and theorem provers, as well as standards and guidelines for what you need to do in order to write reliable systems in C. These systems have hard realtime and hard memory requirements.

This is how it works at the extreme end of reliability requirements: you avoid Java and use C, because you'd have to validate the Java runtime anyway, much of which is written in C, and it's just plain easier to validate your C code under an otherwise crippling set of constraints.

And then, as ill as you speak of C, you turn around and plug your Java systems into C systems, such as Varnish, Nginx, Apache, PostgreSQL, or thousands of others that you use every day, not to mention the OS itself.

Yes, I agree that it is crazy to write your web app in C. But web apps are not the entire world.

lmm · on Dec 28, 2014

At the point where you're using a theorem prover you're not really writing C any more (I mean, do you count ATS as writing C?). It's a perfectly good way to produce reliable code, sure. But I'm pretty sure it's not what the article is advocating.

> And then, as ill as you speak of C, you turn around and plug your Java systems into C systems, such as Varnish, Nginx, Apache, PostgreSQL, or thousands of others that you use every day, not to mention the OS itself.

I do my best to minimize the C surface, and I worry about what I am exposing. E.g. I don't put Apache/Nginx/etc. anywhere in my stack (I either route directly to the JVM or use an Erlang load balancer), which means I wasn't running around patching them in the wake of Heartbleed. I'd use e.g. Riak over PostgreSQL wherever possible.

dietrichepp · on Dec 28, 2014

I'm just taking two lines from the article: "Reliability and proven tools are even more important than libraries"... in this case, there are a lot of proven tools for writing C programs, including the theorem provers. And "You're more dependent on the decisions made by the language implementers than you think"... there have been a lot of flaws in the JRE over time, most of them from the parts of the JRE that are written in C. If you had been writing in C, you could have avoided those flaws if you were in the top 2% most diligent and careful C programmers, because the C tools are rock solid, whereas Java tools haven't been (January 2013 is still pretty recent).

Problems in libraries, such as Heartbleed, can be mitigated in C using the same tools that you'd use in Java. Process isolation is great.

Of course, if you are writing your web app in C then I am going to replace you with someone else. But the web is not everything.

lmm · on Dec 28, 2014

The January 2013 flaws affected end users running the browser plugin, not systems written in Java. I suppose if you had a system that allowed users to submit their own bytecode and used Java's native facilities for handling this, then you'd be vulnerable. But nothing forces you to do that; someone who's capable of writing a safe bytecode verifier in C is certainly capable of writing one in Java.

The only even vaguely recent flaw I can remember in the JRE was, as you say, in the part written in C (I think in some image loading code). It takes a very perverse kind of logic to say: this system which is mostly not-C and a tiny bit C keeps having security flaws in the C part, therefore we should switch to systems which are written more in C. If the tooling for verifying C is really so good, why not verify the C parts of the JRE? Then they'd be proven once and for all, and lots of systems would benefit.

dietrichepp · on Dec 29, 2014

> If the tooling for verifying C is really so good, why not verify the C parts of the JRE? Then they'd be proven once and for all, and lots of systems would benefit.

I'm going to quote the article again...

> You're more dependent on the decisions made by the language implementers than you think.

When you use Java, you don't have the opportunity to second-guess the choices that produced the JRE. And I think you're not quite getting what I'm saying: I'm not saying that "we should switch to systems which are written more in C", I'm saying that writing systems in C protects you from mistakes in the JRE (which you have no control over) in exchange for exposing you to your own mistakes (which you can control). You can then spend a large amount of time and money developing and verifying your system. The goals and constraints of your project will determine whether this is a good trade-off. I'm certain that Java is preferable for writing the vast majority of web apps, but the web is not everything.

> If the tooling for verifying C is really so good, why not verify the C parts of the JRE?

First, I'm going to guess that an enormous amount of static and dynamic analysis has been done on the JRE. Bugs in it are rather rare these days, given its size and complexity.

However, verification tools are generally not suited to this particular task. Verification tools are better at verifying typical application code, and the JRE needs to do a lot of very unusual operations in order to work. In cases where you'd use verification, you'd also typically use a "safe" subset of C. Some of these subsets don't even permit dynamic memory allocation, or if so, only permit it at program startup.

So, it may actually be more straightforward to deliver a working Mars rover in C than it would be to deliver a verified JRE. Neither task is easy.

heinrich5991 · on Dec 28, 2014

What is such a theorem prover? I'm interested in it for trying to prove properties of my own programs...

spc476 · on Dec 28, 2014

Erlang is written in C, and the latest version is 456,000 lines of C. You might want to rethink your reliance on such unstable technology.

phkahler · on Dec 28, 2014

>> But C, holy shit, C? You'd write a program in C?

Most of the world's software rides on the back of C. Pick a language, the runtime is probably written in C. Any high performance libraries are either in C or C++ (Qt anyone?) and for numerics you may still find <gasp> Fortran. If performance is a requirement or you're running on a microcontroller, you're going to have some C or you'll fail to get the best performance. That said, stuff like string processing sucks in C - some say that's why C++ was invented.

pjmlp · on Dec 28, 2014

That is a historical accident of UNIX's spread through the industry.

I am old enough to remember C wasn't even an option unless the customer was willing to shell out money for UNIX systems.

lmm · on Dec 28, 2014

> Most of the world's software rides on the back of C. Pick a language, the runtime is probably written in C. Any high performance libraries are either in C or C++ (Qt anyone?) and for numerics you may still find <gasp> Fortran.

Maybe you can't eliminate C entirely. But you can avoid using anything in C that doesn't already have millions of users. Even then, I'd still be more worried about those small slivers of heavily tested C than about the rest of the code put together - working on my $100,000,000 project in Scala, the single biggest thing I'd be worried about would be hitting a bug in the JVM itself. The JVM is mostly Java these days, but there's some C/C++ in there and a disproportionate number of critical JVM bugs happen in the C/C++ code (who'd've thought?). And debugging those problems is Not Fun.

C is a hard requirement much less often than people seem to think. I've coded for microcontrollers in C, but I've also done so in Java. Better slow and reliable than fast and buggy.

gpfault · on Dec 28, 2014

By that logic, you should probably stop programming altogether, because the runtimes of all managed languages are written in C. What if it all comes crashing down?!

lmm · on Dec 28, 2014

I do worry about it. I probably wouldn't gamble $100,000,000 on any program, if I had that money to start with, because I could never be 100% confident in a system being bug-free. But I can minimize the surface area of C that I expose, and use better tools where possible (which is often). Given that I still do need to write programs (even C is more reliable than doing things by hand), what other option is there?

comment38296 · on Dec 28, 2014

is the quality of HN commenters going down, perhaps?

1.) It's about strategy. Re-read Sun Tzu, Art of Strategy (and avoiding war). The title is a mistranslation cleverly designed to fool non-native LANGUAGE cunning linguists. - English joke - cunn ing lin gui st - end joke.

2.) Those who are armed well with the RIGHT language do not need to fight or program well. That's why Haskell or OCaml is at the 2nd stage.

3.) But it AINT COQ or provable...LANGUAGE.

4.) what's the second point U missed? Yes, the weak spot is "inter-team dependency" - ref: mathattack post. Yes, the ??? is true and necessary.

BUT THE CONDITION is NOT SUFFICIENT for AI or artificial intelligence.

8.) So what is the closest to AI? Seems to me Prolog, but darn... most of the obscure conferences are in Japanese... language barrier.

9.) We conclude the "lesson of the day" by quoting nbardy: "client wants their project done on time and on budget along with the ability to change direction"

So, THE CLIENT wants it FREE, Yesterday (get the time machine working) and High quality and CHANGEABLE - agile.

Yes, of course, as long as you pay up front, I THE VENDOR guarantee ALL FOUR ASPECTS in the 'game.'

No wonder, the track record of software is so BAD compared to engineering, building of the Pyramids in Egypt, farming using bio-dynamic techniques, other human endeavor - spacecraft.

addendum: fun note: just simply download basic OPEN SOURCE that includes libressl, openssl, gettext GNU, etc.

SOME OF IT DONT WORK, at least on basic systems.