Not doing a URL hunt as I'm at work, but as mentioned in the comment you're repl...

klmr · on June 4, 2018

> there was a large study showing the opposite on HN a while back.

Yes, sorry: I didn’t mean to ignore that part of your answer but the research I’ve linked to essentially supersedes everything else in that area. Prior studies were riddled with inadequacies in controlling for confounders and outright methodological errors. I’m assuming you’re referring to [1] and while that was an impressive study in its own right, their classification of bug sources is indirect and, ultimately, simply didn’t allow any assertions regarding the prevalence of type errors.

[1] https://paperpile.com/app/p/54ee3fdd-a6ef-0767-add5-55b60419...

didibus · on June 4, 2018

You'll have to justify this: "Prior studies were riddled with inadequacies in controlling for confounders and outright methodological errors."

Your recent study is pay walled, so I can only go by the abstract, but it doesn't sound any more rigorous to me.

I also find it's comparing very different things. It says bugs in release, but it sounds like they consider commits to be releases, which would be very unfortunate.

It's also specifically targeting JavaScript, which the prior study already showed to be one of the poorest performers in terms of defects (i.e., it tends to have more). For example, it would be interesting to know how many bugs were due to weak typing.

I'm also curious: adding the typed annotations should be a blind activity to be rigorous. Was it? Or did they know there was a bug, and what the bug was, when they added the type info?

I also feel they're doing a one-way comparison: having to add types can introduce other forms of bugs, but since this study is targeted, I find it to suffer a little from confirmation bias. That a type checker detects type errors is a bit of a no-brainer conclusion. I understand they mean this in the sense that 15% of bugs were type errors. But that doesn't mean TypeScript code would have 15% fewer bugs, because they did it one way only. It could introduce other bugs in the process.

Would have been interesting too to analyse how long it took for the bug fix commit to be made after the introduction of the bug. So you would know the real cost of them. Which is important, since annotating adds dev time, but so does bug fixing. Would be nice to have time data.

Finally, the prior study demonstrates the same outcome: static tends to have a lower defect rate than dynamic, and functional lower than imperative. The difference is in the outliers. Clojure and Ruby were both outliers that outdid almost all static languages.

So actually, it would be great to reapply your study to them. Clojure has an optional type checker already so it could be a good one. It was also the strongest outlier.

klmr · on June 5, 2018

> You'll have to justify this: "Prior studies were riddled with inadequacies in controlling for confounders and outright methodological errors."

That topic has filled whole blog posts. But in a nutshell, no previous study was able to directly compare the effect of typing disciplines, due to confounders that they couldn’t control for in their experimental setup. For instance, virtually all previous studies compared different programming languages, which obviously differ by more than just their typing discipline. Many studies had very small sample sizes because they used human test subjects to write sample programs. These test subjects were almost exclusively students without real-world experience or, in some cases, without any prior experience in the respective programming languages. Thus, many studies tested, from the outset, the beginner-friendliness of languages rather than everyday use. Fair enough, but not the same as type checking benefits. Furthermore, most of these studies, for the same reason, were restricted to testing on small, artificial, academic toy programs rather than real-world applications.

The few studies that looked at big real-world data sets (the biggest and most rigorous being linked above) didn’t even look at typing discipline as an individual factor — again because they couldn’t regress it out as a single factor.

> Your recent study is pay walled

The link I posted has the full text PDF (that’s the whole reason for me to share Paperpile links rather than the original or DOI). See also an academic discussion [1] of the manuscript.

> It's also specifically targeting JavaScript

Which is explained in the paper: no other constellation allows comparing the effect of added static type checks as directly as JavaScript/TypeScript/Flow. That’s because (a) these languages differ only in their type checker and are gradually typed, i.e. types can be added to just a subset of the program, which enabled the study methodology; and (b) there’s an extensive database of real-world code to analyse.
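To make "types can be added to just a subset of the program" concrete, here is a minimal sketch in TypeScript. It is not taken from the paper; the function names `parsePort` and `describePort` are made up for illustration, and explicit `any` stands in for code that is left as plain, unchecked JavaScript.

```typescript
// Unannotated part of the program: `any` opts this function out of checking,
// so it behaves exactly like the original JavaScript would.
function parsePort(raw: any): any {
  return parseInt(raw, 10);
}

// Annotated part of the program: only this function has been given types.
function describePort(port: number): string {
  return `port ${port}`;
}

// Typed and untyped code still compose; the checker only constrains
// the parts that carry annotations.
const portMessage = describePort(parsePort("8080"));
console.log(portMessage);
```

The point is only that annotation can proceed function by function, which is what lets a study annotate the code around a bug rather than a whole project.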

> For example, it would be interesting to know how many bugs were due to weak typing.

Fair enough. I’d expect the effect to be (much) less pronounced in a language with stronger type guarantees. Then again, the 15% number is absolutely an underestimate to begin with (see the paper and the blog post [1]).

> adding the typed annotations should be a blind activity to be rigorous

That’s not at all obvious, and from experience I disagree. Thinking about types is automatic neither in static nor in dynamic languages. Once you’ve figured out the correct type, yes, it’s a blind activity … but that’s a pretty empty statement.

> having to add types can introduce other forms of bugs

Again, this claim is far from obvious, beyond the trivial “if I make this type an `int` even though it should be a `string` then that’s a bug”. Fair enough, but adding the type annotation and performing a type check merely reveals this bug. The bug itself was present in the programmer’s flawed assumptions about the invariants in the code. Your assertion is equivalent to saying “compiling or interpreting the code [rather than writing it on a piece of paper and never touching it] can introduce bugs”. — The bugs are already in the code, we just didn’t detect them.
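A small sketch of what I mean, in TypeScript. The function `addFee` and the form value are hypothetical, not from the study. The wrong assumption (that the amount is a number) exists in both versions; the annotation only makes the checker report it.

```typescript
// Untyped version: nothing stops a string from being passed in.
function addFeeUntyped(amount: any): any {
  return amount + 5; // with a string argument this concatenates instead of adding
}

// Typed version: same body, but the assumption is now written down.
function addFeeTyped(amount: number): number {
  return amount + 5;
}

// Hypothetical value as it might arrive from a text field.
const fromForm: any = "10";

// The bug is already here, it just goes undetected.
const untypedResult = addFeeUntyped(fromForm);

// addFeeTyped("10"); // rejected by the compiler: a string is not a number
const typedResult = addFeeTyped(Number("10"));

console.log(untypedResult, typedResult);
```

Note that the annotation did not create the mismatch between the string input and the numeric addition. It surfaced it.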

> Would have been interesting too to analyse how long it took for the bug fix commit

They only considered bugs they could fix within a highly constrained time frame, and only looking at local code around the bug. Hence, again, why the 15% is an extreme underestimate. Furthermore,

> So you would know the real cost of them.

The study calculated the token cost of adding types (and provides a justification for this metric). In a nutshell, the cost is negligible, especially in languages with strong type inference capabilities (Flow outperforms TypeScript here).

> Clojure and Ruby were both outliers that outdid almost all static languages.

Right, and I’d expect the same to still hold. But it shows one of the crucial shortcomings of previous studies: they primarily did not examine the effect of static vs dynamic typing. Rather, they examined the effect of different programming languages, to which typing is just one contributor amongst many. It should come as no surprise to experienced programmers that some languages (regardless of static vs dynamic discipline) vastly outperform others. For me, this is all the more reason to be excited about the headline topic: If done well, statically type checked Ruby could be an amazing language, with the added benefit of static typing at (as shown) virtually no cost.

[1] https://blog.acolyer.org/2017/09/19/to-type-or-not-to-type-q...

didibus · on June 6, 2018

> If done well, statically type checked Ruby could be an amazing language, with the added benefit of static typing at (as shown) virtually no cost.

I think this is where I feel the jump in causation is too big. No one knows why Ruby and Clojure match up to Scala and Haskell. The prior study couldn't isolate the impact of a particular feature, but it did a pretty good job, at least similar to your study, in finding overall defect rates.

Ruby with types, if we take your conclusion, should end up even outperforming Haskell by almost 10% in overall defect rates. That just doesn't sound believable to me.

I fear types are not independent variables. They affect other variables. And that's where I argue that they might in fact come at some cost which we still do not understand.

I still strongly feel they should have added types without knowing the bug. There are many ways to type data, and those choices do matter. There's code that can't be typed precisely or requires considerable effort to type, and there's even code that cannot be typed at all and must be rewritten differently to be typed.

Now I'm not going to argue JavaScript doesn't benefit immensely from static type checking. What I argue is that doesn't mean every language will see benefits from static types.

That said, I'm a big fan of gradual typing, especially when it can add runtime contracts over boundaries between typed and untyped code. But I'm not sure how successful it would be in Ruby. It has failed in Clojure: while no real analysis was performed, the community rolled back most adoption of it due to the impression that it provided no value while adding extra effort. Obviously, the Clojure gradual typing system isn't as well maintained as Flow and TypeScript, so maybe that played a role.
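By runtime contracts over a boundary I mean something like the following rough sketch, written by hand in TypeScript. The `User` shape and the `asUser` function are invented for the example, and this is not how any particular gradual typing system implements it.

```typescript
// The shape the typed side of the program expects.
interface User {
  name: string;
  age: number;
}

// Contract at the boundary: untyped data comes in as `unknown` and is
// checked at runtime before the typed code is allowed to rely on it.
function asUser(value: unknown): User {
  if (typeof value !== "object" || value === null) {
    throw new TypeError("expected an object");
  }
  const record = value as Record<string, unknown>;
  const userName = record["name"];
  const userAge = record["age"];
  if (typeof userName !== "string" || typeof userAge !== "number") {
    throw new TypeError("expected { name: string, age: number }");
  }
  return { name: userName, age: userAge };
}

// Untyped input, for instance parsed JSON, crossing into typed code.
const validUser = asUser(JSON.parse('{"name": "Ada", "age": 36}'));
console.log(validUser.name, validUser.age);
```

Data that doesn't match the contract fails at the boundary instead of somewhere deep inside the typed code.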

nailer · on June 4, 2018

No problem! Thanks for clarifying.