
Before I started work on Spiral I spent way too long, about a year, working on an ML library before throwing in the towel. I skimmed through the source of DeepML over a year ago and my impression was that the author was struggling against the limitations of F#'s type system. There were a bunch of places where he was doing the equivalent of telling the compiler to fuck itself by downcasting to `System.Object`.

The way these projects go is that you start off with an AD library, and then when the type system and the language start getting in the way you either do a lot, and I mean a LOT, of engineering with inferior tools or you make something better.

The 'make something better' option is always picked, and it can take various forms. Generally, people would not decide to spend almost a year working on a language for an ML library before writing the first line of the library itself. They instead take the middle road of starting with an AD library and gradually extending it.

First they realize that they cannot really express tensors properly within the confines of the language, so they make everything symbolic. Then they build engines to run such symbolic ASTs. Then they realize that they need JITs and other optimizers to make everything run fast. TensorFlow and PyTorch are currently at this stage. There are more stages after this.
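
To make the "symbolic AST plus an engine to run it" stage concrete, here is a toy sketch in Python. It is not how TensorFlow or PyTorch are implemented; the node types and the `evaluate` and `diff` functions are names I made up for illustration, and it only handles scalars with addition and multiplication.

```python
from dataclasses import dataclass

# Hypothetical AST node types for a tiny symbolic expression language.
@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class Sym:
    name: str

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Mul:
    left: object
    right: object

def evaluate(expr, env):
    """The 'engine': walk the AST and compute a number."""
    if isinstance(expr, Const):
        return expr.value
    if isinstance(expr, Sym):
        return env[expr.name]
    if isinstance(expr, Add):
        return evaluate(expr.left, env) + evaluate(expr.right, env)
    if isinstance(expr, Mul):
        return evaluate(expr.left, env) * evaluate(expr.right, env)
    raise TypeError(f"unknown node: {expr!r}")

def diff(expr, wrt):
    """Symbolic differentiation: returns a new AST, not a number."""
    if isinstance(expr, Const):
        return Const(0.0)
    if isinstance(expr, Sym):
        return Const(1.0 if expr.name == wrt else 0.0)
    if isinstance(expr, Add):
        return Add(diff(expr.left, wrt), diff(expr.right, wrt))
    if isinstance(expr, Mul):
        # Product rule: (f * g)' = f' * g + f * g'
        return Add(Mul(diff(expr.left, wrt), expr.right),
                   Mul(expr.left, diff(expr.right, wrt)))
    raise TypeError(f"unknown node: {expr!r}")

x = Sym("x")
expr = Add(Mul(x, x), Mul(Const(3.0), x))  # x*x + 3*x
value = evaluate(expr, {"x": 2.0})
slope = evaluate(diff(expr, "x"), {"x": 2.0})
```

Note that the derivative AST is full of multiplications by 0 and 1 that nobody simplified, which is a small-scale version of why the optimizer stage shows up next.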

Of course, working with ASTs is what a compiler does. I think it is a great pity that TensorFlow and PyTorch are written in C++ under the hood. C++ is probably the worst language I can imagine for working on compilers, so I applaud the authors of DeepML for picking F# instead. Had the TF team picked a statically typed functional language for this, they could have cut the size of TF by over 10x and saved themselves over a million lines of code.

[ML](https://en.wikipedia.org/wiki/Standard_ML) derivatives like F# were made for that sort of thing, and I cannot imagine doing Spiral in a non-ML-styled language.



Thanks for the response. I always thought that the point of "making everything symbolic" is that you can differentiate a symbolic expression and thus implement backpropagation automatically. No? How do you handle this in Spiral?


Yes, it is, but there exists an alternative approach called automatic differentiation, which is more imperative and flexible in nature than symbolic differentiation.

The idea is to keep a tape (you can think of it as a list) and then record all the operations on it as you step forward through the program. Then at the end you execute the operations backwards from the tape.

I take this approach in Spiral's ML library.
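
To make the tape idea concrete, here is a minimal sketch in Python. This is not Spiral's actual code; `Tape`, `Var`, `sin` and `backprop` are hypothetical names for illustration, and it only covers scalars with three operations.

```python
import math

class Tape:
    """A list of backward steps, appended in the order the forward ops ran."""
    def __init__(self):
        self.steps = []  # zero-argument closures

class Var:
    """A value plus its accumulated gradient (adjoint)."""
    def __init__(self, value, tape):
        self.value = value
        self.grad = 0.0
        self.tape = tape

    def __add__(self, other):
        out = Var(self.value + other.value, self.tape)
        def backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        self.tape.steps.append(backward)
        return out

    def __mul__(self, other):
        out = Var(self.value * other.value, self.tape)
        def backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad
        self.tape.steps.append(backward)
        return out

def sin(x):
    out = Var(math.sin(x.value), x.tape)
    def backward():
        x.grad += math.cos(x.value) * out.grad
    x.tape.steps.append(backward)
    return out

def backprop(result):
    """Seed the output gradient, then run the tape in reverse."""
    result.grad = 1.0
    for step in reversed(result.tape.steps):
        step()

tape = Tape()
x = Var(2.0, tape)
y = Var(3.0, tape)
z = x * y + sin(x)  # the forward pass records three steps
backprop(z)
# Now x.grad holds dz/dx = y + cos(x) and y.grad holds dz/dy = x.
```

The forward pass is just ordinary code, so loops, branches and recursion work without any symbolic representation of them. That is the flexibility I was referring to.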



