Lua is a language I really want to love. I like the emphasis on simplicity and minimalism, and the Scheme-like semantics, which mix imperative and functional styles, really hit a sweet spot IMO. LuaJIT is a crazy impressive feat of software engineering. However, there are some specific issues which hold Lua back. First, as LuaJIT author Mike Pall famously noted, the Lua authors constantly break compatibility between releases. Lua is really several different, incompatible languages (Lua 5.1, 5.2, etc). LuaJIT is still at Lua 5.1, IIRC. Second, there are a bunch of minor nitpicks (1-based indexing, anyone?) which turn off a bunch of people. Lastly, because Lua is so minimal and focused on portability, people end up reimplementing their own abstractions (such as object systems) from scratch, further fracturing the ecosystem. I think there's space for a new project which takes LuaJIT as a starting point and addresses some of the issues I described. It would also be great if this hypothetical new language had better support for Unicode and concurrency.
If you’re not already familiar, check out Tcl. It’s well established, similar domain, and addresses some of your concerns head-on - very stable API/semantics, not 1-based, comes with quite a few “batteries” included, excellent concurrency and parallelism (lovely thread model (apartment [0])), native Unicode... but still portable and actively developed.
Tcl is... semantically, amazing. It is syntactically hideous. As much as we like to claim that we're technical people and value function over form, blah blah blah... aesthetics and readability matter.
The bigger problem is that "everything is a string" semantics hinders garbage collection. Ever noticed how all OO libraries for Tcl require explicit disposal of objects?
I don’t use Tcl OO these days (I used to use [incr tcl], quite a while ago), but I don’t know how EIAS (everything is a string) affects this. It’s more accurately described as “everything has a string representation”... but that’s the same for an int, a list, a float... and they don’t need special destruction, and Tcl doesn’t leak because of that.
They don't need destruction because they're all value types, and assignments are copies. But for OO, you need some form of references. And if your reference is a string, how do you know that it is a reference? Worse yet, if it's a part of a larger string, how do you know that said string contains a reference?
Keep in mind that while, yes, Tcl lists have a more efficient internal representation when Tcl knows that it's a list, a string that happens to be a valid representation of a list is also a list, just by virtue of existing - but it won't get a special list representation until you try to use it like one! It's just a string like "foo bar baz". Or, say, "objref#1 objref#2 objref#3".
And the semantics of the language is EIAS through and through - all those internal representations are optimizations that are not supposed to change the observed behavior, only performance. A GC cannot treat lists without internal representation differently from those with it without breaking this. So it has to assume that any string is potentially a list that contains object references.
So a proper Tcl GC would have to be incredibly pessimistic about what it considers to be roots of the object graph - basically, any substring that looks like an object reference, has to be considered as one. Now imagine just how much string scanning such a GC would have to do in a non-trivial app for a single sweep!
> So a proper Tcl GC would have to be incredibly pessimistic about what it considers to be roots of the object graph - basically, any substring that looks like an object reference, has to be considered as one. Now imagine just how much string scanning such a GC would have to do in a non-trivial app for a single sweep!
This is how the garbage collector in Jim Tcl works. It scans the string representation of all objects [1] for the reference syntax [2]. In practice, it does not cause performance problems. Since Jim Tcl is intended as a smaller, more easily embedded counterpart to Tcl 8, nobody runs it with a large heap.
How slow is it? I have just tried a naive benchmark on a first-gen Raspberry Pi Model B. With a live list of one million strings and 50 thousand references, garbage collection took around 750 ms. On a Core 2 Duo laptop running Linux it took around 120 ms.
. for {set i 0} {$i < 1000000} {incr i} {
set v$i "<reference.<--NOT padding padding $i"
}
. for {set i 0} {$i < 50000} {incr i} {
set r$i [ref $i test]
}
. time collect 5 ;# Average of five runs
765443 microseconds per iteration
The tracing garbage collector only affects references. Normal strings are reference-counted like in Tcl 8. References are the low-level basis of Jim's OO system, which is implemented in pure Tcl.
I used "all objects" to mean the Jim_Obj objects in the C code I linked to, and int_19h used "object" for OO objects, but we are talking about the same kind of GC. int_19h's comment explains how implementing tracing garbage collection for OO object references without breaking EIAS would require a very pessimistic garbage collector. My reply is about how Jim Tcl implements this exact type of pessimistic tracing garbage collection for references, the basis of its OO system.
I should have just said "all values". The name Tcl_Obj/Jim_Obj is a legacy artifact that leads to lots of confusion. Perhaps Tcl 9 will rename Tcl_Obj to Tcl_Value.
I’ll have to dig in and study a bit - I don’t know how [the object system] works off the top of my head.
It sounds like you’re familiar with Tcl internals, so I presume you know (if we stick w/ the list as a prototypical “complex” structure) that the management is reference counts, freed when count==0, and may be >1 if this is a shared item (I’m going to refrain from calling it an object, though internally they’re a Tcl_Obj type, which has nothing to do with OO programming). In a list, the list is an item, and ea. element is an item, with (of course) incr/decr of references when appropriate, as determined by the language rules themselves.
Wrt strings vs lists, a string could be considered a single “blob”, but coercion to a proper list (“shimmering” in Tcl parlance) does the tokenization with requisite ref. counting as part of the conversion.
...I think the above may be entertainment more for anybody else who’s following along, but if it’s illuminating for you too, that’s a bonus.
Perhaps you could show me by example:
What’s a piece of (eg) Ruby and what you think a Tcl workalike would appear as, and where that falls down.
An object is really just a namespace with a generated name, variables in which correspond to fields of the object. A reference to the object, then, is simply a string that is the name of its namespace.
(I'm oversimplifying and ignoring the command part of it, because it doesn't really matter for any of this.)
For the sake of readability, let's assume that those object IDs / namespace names are just numbers, e.g. ::42 (in practice, it's something like ::oo::Obj42). Since that number was generated by TclOO machinery under the hood, said machinery can properly tag it as a reference at that point, and return such tagged value from [... new].
Now, let's say that we have some class, and some code that constructs an instance of that class, and then builds a list that references that instance from one of the elements. One way to do it would be start with an empty list, and build it element by element:
set foo [Foo new]
set lst {}
lappend lst blah
lappend lst $foo
lappend lst blah
unset foo
This produces the list {blah ::42 blah}. Because we built it element-by-element, our Tcl-with-GC tags it as a list, and uses an optimized representation for it, in which ::42 is a separate element that's known to be an object reference. Even after we remove the direct reference by unsetting foo, GC can use that metadata to trace a path to ::42 via lst, and know to keep the object alive. If we want to recover foo later, we can safely do:
set foo [lindex $lst 1]
So far, so good. But... that wasn't idiomatic Tcl. We're much more likely to build the list thus:
set lst "blah $foo blah"
unset foo
It's still the same list {blah ::42 blah}. But, since we haven't performed any list operations on it, Tcl had no reason to tag it as one, and to switch it to the optimized representation - it's really just a string at this point. And because the elements haven't been separated out, we've also lost the "this is an object reference" tag on ::42 - there's no separate element as yet, so there's nothing to tag!
So then, how would GC know to look at lst, treat it as a list, and find the object reference ::42 inside? Note that if it doesn't do that, then the object will possibly be garbage collected by the time we get to:
set foo [lindex $lst 1]
Tcl finally realizes that lst is a list because of our use of lindex here, and converts it to the efficient representation that separates the elements... but at this point, how would it know that the element ::42 originated as an object reference, rather than a random string that looks like one? And even if it just pessimistically assumes that it's a reference, that doesn't really help - the object is already collected, and the reference is invalid anyway. So, we just made shimmering observable, and thus broke EIAS semantics. Our Tcl-with-GC is no longer Tcl.
So, to remain true to EIAS, our GC can't rely on the presence or absence of efficient list representation for a value to decide whether to scan it for references, or not. It has to proactively treat every variable in the program as a potential list with potential object references in it - because, well, it could be; and if it is, then it's also a GC root that must be traced!
And yet this is a trivial case. We actually have to consider nested lists, and dicts, and unevaluated Tcl code stored in a string, and all possible permutations thereof...
Pretty close. It was a professor who was tired of people writing languages to embed in their applications, so he wrote a language to embed in applications.
The guy who also brought us Raft[0], log-structured file systems[1], and Ousterhout's Dichotomy[2], among other things, in case you were thinking “not a bunch of grad students, but ‘some professor’.”
I quite like Tcl, but had my issues with the syntax as well, until I read about how it works in the Tcl book by Ousterhout (highly recommended). The key thing is that the syntax is extremely simple and systematic - a little bit like Lisp s-expressions - which allows for a very simple general parser and also makes it possible to write Tcl extensions in pure Tcl. This is a very Lisp-like trait of Tcl.
It does matter. Aesthetics do matter, but for a programming language, technical capabilities matter much more. The regular syntaxes of Lisp and Tcl give them many powers not present in languages with more "fancy" syntaxes.
_ph_, in this case I think it’s fair to say the aesthetics and ergonomics are the technical capability, to a degree. For example, building a custom looping control benefits from good aesthetics and ergonomics. They could be viewed as two sides of the same coin.
I'd like to live in a world where that is true, but we just don't. Languages only need to be powerful enough to solve the problems actually being solved by 9 to 5 programmers. Aesthetics and ergonomics matter much more.
In Tcl everything is a string. If you work with numbers for instance it's quite slow. Or maybe loop or set operations are slow. Dunno. Compare these two:
I got an order of magnitude improvement on Tcl just by bracing the expressions:
Bad: expr $i +$j + 2
Good: expr {$i + $j + 2}
... it limits parsing and evaluation to occurring once, hoisted into the [expr] engine; it’s idiomatic. The Lua version still appeared faster, though I’ve only spent 2 minutes looking at this.
Again, strings should play no role here; values must be able to generate a string representation, but it’s computed lazily, and in this case the $i, $j, etc. in Tcl are working with native integers.
I’m happy we have things like Tcl and Lua; either one of them makes development enjoyable. I appreciate you taking time to write your code samples.
> I think there's a space for a new project, which takes LuaJIT as a starting point
Unfortunately, LuaJIT is tightly-coupled to the Lua language and its code is very complex. I'd be surprised if anyone other than Mike Pall was able to retarget it to support a different language.
Also, is JIT compilation really necessary for a scripting language? Projects like the PyPy JIT for Python have seen little adoption, and many platforms (e.g. iOS and game consoles) don't support JIT at all.
PyPy JIT is unsuccessful because CPython makes it hard for PyPy, not because people don't want it. When there's a need and support for it, we have extremely good JITs spring into existence–JavaScript has at least three, for example.
I don't think they're doing anything in particular to hinder PyPy -- the damage was done long ago, when things like the C API exposed too much detail of CPython's internals, which severely limits PyPy's potential to replace them with more optimal systems/data structures.
The issue with crowding out alternative Pythons is a socio-political one. The C-API issue is as well; as you said, Python extensions are over-coupled to the runtime. ctypes and subsequently cffi fixed this 10+ years ago.
1. Decouple the extension mechanism. Use cffi
2. Modularize the batteries
Each alternative Python has to re-implement large swaths of the stdlib. The code is the spec in Python.
Stackless got the PyPy treatment before PyPy even existed. Besides PyPy, we now have graalpython, rustpython and ...
The point is, the core CPython devs need to stop being so selfish with their toys. The greater Python ecosystem is diverse in spite of their actions, not because of them.
Sorry, that's not quite what I had in mind. I was considering the feasibility of modifying LuaJIT to directly target a different language.
The projects you link to all either compile to, or are written in, Lua source code or LuaJIT bytecode. That's less efficient than retargeting the LuaJIT VM. For example, LuaJIT includes bytecode instructions specifically tailored to Lua tables. It looks like these projects build, say, their array data structures on top of those existing instructions, rather than adding array-specific support to LuaJIT.
> That's less efficient than retargeting the LuaJIT VM
That's like saying LLVM is inefficient because all the frontends compile to LLVM IR.
E.g. the Oberon compiler directly generates efficient LuaJIT bytecode, also using the FFI, and even reuses the VM's features for its source-level debugger. A benchmark written in Oberon has nearly the same performance running on LuaJIT as when compiled to native code.
You have all my upvotes. I really would like to like Lua, but the lack of the mentioned basic infrastructure elements, like powerful Unicode support and platform-independent parallelism abstractions, makes every larger Lua project a DSL of its own.
In Openresty you don't end up facing much trouble around parallelism. In other contexts, I might go for love2d's thread module, which can be loaded from liblove on its own https://love2d.org/wiki/love.thread
I've had an OK time using Unicode libraries, but I haven't had to do much with them so far.
You don't if you're writing low level algorithms. It's difficult for me to articulate the exact problem but I liked 1-based indexing until I needed to implement some algorithms that operated on multi-dimensional arrays element wise. Things just don't compose the right way.
It's actually the primary reason I don't make more use of Julia. I like the syntax and overall design more than Python but it just isn't worth it to me.
(If you're wondering what I'm doing in a Lua comments section, it's because I really like the ideas behind some of the Lua-based languages such as Terra.)
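To make the "things don't compose" complaint concrete, here is a minimal sketch of the classic friction point: flattening a 2D index into a 1D array. With 0-based indexing the formula is simply `i*ncols + j`; with 1-based indexing you carry a `-1` adjustment through every conversion (the array contents here are arbitrary illustration values).

```lua
-- Flatten a nrows x ncols grid into a single Lua table (1-based).
local nrows, ncols = 3, 4
local flat = {}
for i = 1, nrows do
  for j = 1, ncols do
    -- 1-based flattening: subtract 1 from the row before scaling
    flat[(i - 1) * ncols + j] = i * 10 + j
  end
end

-- Element at row 2, column 3:
assert(flat[(2 - 1) * ncols + 3] == 23)
```

Each such `-1`/`+1` is harmless alone, but in element-wise multi-dimensional code they accumulate, which is the composition problem described above.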
I primarily work in 1-indexed languages, but I think everyone is missing the forest for the trees in this discussion: indexing should be selectable by the programmer.
One- or zero-based indexing is a question of domain and communication. In signal processing, it is quite normal to need a negative index (autocorrelations, e.g.). Ada (and I believe Pascal and Julia) has support for specifying the index type, which can then be selected to match the problem domain.
I don't get that 1-based indexing debate in regard to Lua and Julia. Both came up recently on Hacker News.
In Lua you can use the metatable functionality to implement 0-based indexes.
In Julia you can overload the array access operation too. There are packages which allow zero-based indexing. Or even StarWars-based indexing, including the machete viewing order. https://github.com/giordano/StarWarsArrays.jl
Julia's array interface has functionality for querying what the first/last index is, so you can use it in other people's code. If they hard-coded 1 as the first index, just make a pull request to fix it.
> In Lua you can use the metatable functionality to implement 0-based indexes.
I tried to do that a while back and kept coming across corner cases that didn’t work. Is there a complete implementation of zero-based arrays for Lua available anywhere?
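For anyone curious what the metatable approach looks like, here is a minimal sketch (the `ZeroArray` name is hypothetical, not a real library). It illustrates both the idea and why the corner cases bite: `__index`/`__newindex` handle element access, but a real implementation must also cover `#` (via `__len`, Lua 5.2+ only), `ipairs`, `table.insert`, and every stdlib function that assumes 1-based sequences.

```lua
-- Sketch of a 0-based array wrapper via metatables.
local function ZeroArray(...)
  local data = {...}  -- stored internally 1-based
  return setmetatable({}, {
    __index    = function(_, i) return data[i + 1] end,
    __newindex = function(_, i, v) data[i + 1] = v end,
    __len      = function() return #data end,  -- honored in Lua 5.2+
  })
end

local a = ZeroArray("x", "y", "z")
assert(a[0] == "x")
assert(a[2] == "z")
a[3] = "w"
assert(a[3] == "w")
```

The wrapper works for plain indexing, but any library code that loops `for i = 1, #t` over it will silently skip element 0, which is exactly the kind of corner case that keeps these implementations incomplete.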
Visual Basic’s Option Base comes to mind. And Pascal had very nice facilities for arbitrary array bases. I think you could define an array with indices from 18 to 22 if you liked, which can definitely simplify code doing the indexing.
Lua was originally invented as a way to describe complex data. At some point it was enhanced with executable data and from there it’s a small step to become a programming language.
I find that the global by default thing is not as much of a problem if using a linter like Luacheck. If the text editor is configured to show linter warnings it catches typos before they become a problem.
Global-by-default and block-level scope seem like the right choice. In a popular dynamic language that uses local by default and function-level scope, the language had to include two keywords for accessing variables other than locals, and the following becomes a bug when it would not be a bug in any sane language:
x = calculate_n_dogs()
# later
ok = all(x.status_code == 200 for x in my_other_responses)
# later
let_the_dogs_out(x)
# oh no, why isn't x n_dogs anymore
This is clearly about Python, so I will note that this is not actually the case: The generator expression introduces a new scope, and so it does not reassign the value of x. Python does not require the "global" and "nonlocal" keywords in order to access variables, but to reassign names brought in from an outer scope.
Python 2 did have the issue where list comprehensions (though still not generator expressions) didn't introduce a new scope, and so you could indeed get issues quite similar to your example above. This was corrected in Python 3.
All that said, I will agree that Python's scoping rules are pants-on-head crazy. In very nearly every other programming language under the sun, you declare variables at the point where they exist. In Python, you declare a variable when it exists somewhere else (and you then want to reassign it). This does mean that you don't need to put a "local" or "var" or what-have-you in front of each new variable you declare (which is basically the reason Python did it this way), but the semantics of the thing can take some getting used to.
I appreciate the correction and I'm glad to know this is fixed in python 3!
I've run into similar issues where the variable was used in a for loop declaration rather than a list comprehension, but these are a bit easier to spot. Also, maybe people who don't use lua constantly won't be in the habit of thinking they can safely shadow locals with their looping variables.
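For contrast, here is Lua's side of the tradeoff discussed above, in a minimal sketch: global-by-default means a typo silently creates a fresh global rather than raising an error, which is exactly what Luacheck catches.

```lua
-- Global-by-default in action: a misspelled assignment creates a
-- new global instead of updating the intended local.
local count = 0

local function bump()
  cuont = count + 1   -- typo: silently assigns a GLOBAL named "cuont"
end

bump()
print(count)  --> 0   (the local was never touched)
print(cuont)  --> 1   (a stray global appeared)
```

The fix is a `local` declaration plus a linter; the cost is that every typo'd name is valid code until something reads the wrong variable.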
Lua was initially designed as a data description language for Oil and Gas simulations written in Fortran. 1 based indexing makes perfect sense in that context.
To me it's the lack of some functional features for 1-liners that really trip me up. Verbose function declaration, no null-coalesce operator, no ternary operators, stuff like that.
I love the platform and ideals and goals - a simple, lightweight embedding language... but the language details rub me the wrong way.
A minor hobby of mine is collecting proposed changes for Lua. Let me make sure I have these right:
- some syntax like JS arrow functions (this is tough because {} are not block delimiters; maybe single-expression lambdas like in Python? I kind of hate those, though.)
- `??` operator which does the same thing as `or` except when the left operand is exactly `false`
- `?:` operator. Should this evaluate to exactly one value or any number of values? e.g. x?2,3:4,5
Theoretically I could get this stuff by using Moonscript, but then I'm using a somewhat esoteric language that brings in a lot of other very unusual ideas.
Although for ternary, I honestly think the "if/elseif/else can be used as an expression" has always been my favorite syntax. Or the CASE...WHEN syntax of SQL.
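The `??` vs `or` distinction in the list above matters precisely because of `false`. A minimal sketch of the trap (the `get_flag`/`coalesce` helpers are hypothetical names for illustration):

```lua
-- `or` can't serve as a null-coalescing operator: it also swallows
-- a legitimate `false`.
local function get_flag(opts)
  return opts.flag or true   -- bug: opts.flag == false yields true
end
assert(get_flag({flag = false}) == true)  -- not what the caller meant

-- The usual and/or "ternary" idiom has the same trap:
local x = true and false or "fallback"
assert(x == "fallback")  -- wanted false, got the else-branch

-- A safe spelling today needs an explicit nil check:
local function coalesce(v, default)
  if v == nil then return default end
  return v
end
assert(coalesce(false, true) == false)
assert(coalesce(nil, 5) == 5)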
Some compile-to-lua languages include if expressions, but they commonly compile to IIFEs to support evaluating to a number of values other than 1. It's troublesome. I think the bytecode can express what you want.
> To mark a node at the third position, containing one character, we say {3,3}.
> But what if there's a node at the third position, that contains no character? Well, thats {3,2}
Why not shift over indexing by one consistently? So you'd use {3, 4} and {3, 3} for each, so the length property is preserved.
That's an argument against end-inclusive ranges more so than against 1-based indexing. If they were end-exclusive, like in Python, then {3,3} would be an empty string, and {3,4} would be a single character at 3. Note how it suddenly doesn't matter at all whether the indices are 0-based or 1-based - all of the above is valid regardless.
FWIW, I think that the real mistake was to make inclusivity/exclusivity implicit to begin with. If the syntax is explicit, and the choice is captured in the resulting value (i.e. it's more than just a pair of numbers), then you just use whatever is more convenient for the task at hand. Nim almost gets there: (x .. y) is end-inclusive, and (x ..< y) is end-exclusive - but the syntax still exhibits a preference.
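Worth noting that Lua's own ranges are end-inclusive: `string.sub(s, i, j)` takes both endpoints, so the examples from the earlier {3,3}/{3,2} discussion map onto it directly.

```lua
local s = "abcde"
assert(s:sub(3, 3) == "c")   -- one character at position 3
assert(s:sub(3, 2) == "")    -- j < i yields the empty string
-- An end-exclusive convention would write these as (3, 4) and (3, 3).
```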
I don't find it particularly persuasive. End-exclusive and end-inclusive can both be natural in different scenarios, regardless of the base - the article simply cherry-picks the ones that it needs to "prove" its case. That's why it's best when the language just offers both.
Yep, pretty much. I'm a strong proponent of "explicit is better than implicit" in all cases where there's no default that's obviously preferable, and this is one of those cases - that different languages picked different defaults is all the evidence I need.
I really don't like seeing this cited as a "proof" that 0-based indexing of lists is better. It is too shallow and ignores too many important facets of the question.
The main contrast Dijkstra is drawing is between closed and half-open intervals. Typically, the choice "starting from one" is tied to "closed" intervals, but that's not strictly necessary. Nothing stops you from using half-open intervals on Lua lists, any more than anything stops you from using closed intervals to index into C arrays.
The main advantage of the typical coupling is avoiding `+`s and `-`s in your "typical" loops over the first n indices:
for i = 1, n do -- closed
for (int i = 0; i < n; i++) { // half open
but what if you want to go in reverse? Getting the _final_ indexes requires arithmetic with half-open intervals, but it doesn't with closed (which also requires changing the operator from exclude-eq to include-eq):
for i = n, 1, -1 do -- closed
for (int i = n - 1; i >= 0; i--) { // half open
Of course, probably the most important property of half-open intervals is the way that they can be broken down into two disjoint intervals: `[A, C) = [A, B) U [B, C)`. But if your indexes are integers this is straightforward with closed intervals too: `[A, C] = [A, B - 1] U [B, C]` or `[A, C] = [A, B] U [B + 1, C]`.
But again, if you're using this property, you probably don't actually care about the "absolute" indexes at all; indexing with `[1, #list + 1)` isn't a problem _at all_ in Lua.
Plus, between the operations of "get final index" and "split interval into non-overlapping pieces", "get final index" is far more common (it's crucial to stacks, queues, etc, in addition to various uses like iterating backwards). Choosing "closed" intervals is then the cleaner option to avoid an ugly "+" or "-"!
Dijkstra mentions that requiring -1 to refer to an initial interval as "ugly" since -1 is not a valid index, but he has no qualms with using the length of the list, n, and allowing it to not be an index!
This is all said without even going into the awkwardness of having [k] being the {k+1}th element (e.g., [4] is how we write "fifth" in C -- I distinctly remember at least one conversation at work where the wrong thing was communicated because someone said "fourth" to mean [4])
Now, don't take this as a serious argument that 0-indexed-lists are inferior to 1-indexed lists. It's just a demonstration that it's easy to argue either side. That's because it's really just an unimportant convention. Pick one and stick with it and everything will work fine, modulo confusion about what "fourth" means.
> indexing with `[1, #list + 1)` isn't a problem _at all_ in Lua.
That's not what string.sub accepts though.
> Plus, between the operations of "get final index" and "split interval into non-overlapping pieces", "get final index" is far more common (it's crucial to stacks, queues, etc, in addition to various uses like iterating backwards). Choosing "closed" intervals is then the cleaner option to avoid an ugly "+" or "-"!
It (rather unnecessarily) takes log time, so you probably shouldn't do it if you can avoid it.
Will it still give the wrong answer if your table isn't sequentially indexed?
That is, does #({[1]="a", [3]="b"}) return 1 or 2?
edit: I see that it still returns 1. That's a shame. This trips up so many newcomers who've been told that "#" means "length". IMO they need to either start calling it something other than the "length operator" or make it _actually_ return the length. Sadly at this point it would just be yet another breaking change.
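To be precise about why this trips people up: per the reference manual, `#` on a table returns a "border" (an index n where t[n] is non-nil and t[n+1] is nil), and for a table with holes any border is a legal answer. A minimal sketch:

```lua
-- `#` is well-defined only for proper sequences.
local seq = {"a", "b", "c"}
assert(#seq == 3)            -- no holes: unambiguous

local holey = {[1] = "a", [3] = "b"}
local n = #holey
-- Both 1 and 3 are borders here, so either answer is legal; which
-- one you get depends on the table's internal layout.
assert(n == 1 or n == 3)
```

So "length operator" really is a misleading name once holes enter the picture.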
The ideal fix would to allow tables to store nil values, instead of always treating nil as a "hole". Instead of assigning nil to remove a value, you would call a separate "delete" function or operator. This way there would still be only one table type instead of separate types for tables and arrays.
Some early alpha versions of Lua 5.4 played around with this idea but it turns out that it would break too many things. I'd expect that if Lua ever makes a major jump to Lua 6.0 then nils in tables will likely be one of the main changes.
I'd like to read more about this. It would be a little unfortunate to end up with Python's "x in y" and "del y[x]" just to differentiate between missing entries and present entries set to nil.
On the other hand, my not-very-well-considered preferred approach creates some asymmetry between arrays and maps as to what values can be stored. Or it perpetuates the existing asymmetry between varargs and tables...
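Until tables can store nil directly, the common workaround is a unique sentinel value standing in for a stored nil, so "missing" and "present but nil" stay distinguishable. A minimal sketch (the `NIL`/`store`/`fetch` names are hypothetical):

```lua
local NIL = {}  -- unique table address; can't collide with user data

local function store(t, k, v)
  if v == nil then v = NIL end
  t[k] = v
end

local function fetch(t, k)
  local v = t[k]
  if v == NIL then return nil, true end   -- present, value is nil
  if v == nil then return nil, false end  -- genuinely absent
  return v, true
end

local t = {}
store(t, "a", nil)
local v, present = fetch(t, "a")
assert(v == nil and present == true)
local _, present2 = fetch(t, "b")
assert(present2 == false)
```

This is essentially the Python "x in y" distinction implemented by hand, which is why first-class nils in tables would be such a welcome change.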
Isn't "weird == not seen it before" anyway? I get what you're trying to say, but MATLAB is not really a programming language SWEs would use, so if that's the best example you can think of, Lua's bound to confuse most people in the industry.
May work better for mathematicians, non-SW engineers etc, but I wouldn't know.
> I get what you're trying to say, but MATLAB is not really a programming language SWEs would use
It may not be a language that website jockeys use, but "I don't use it therefore probably no one does" is almost never correct. MATLAB is huge in state machine and dynamic systems programming as well as being the original commercial powerhouse for linear algebra algorithms.
I meant to say most SWEs, which you seemed to have inferred anyway.
Why do you think that's wrong? I simply claimed MATLAB is much more commonly used by scientists and non-SW engineers. Most SWE jobs are in B2B, web development, embedded, etc fields. Which one of these sees a lot of MATLAB usage?
The tilde ~ char is pretty annoying to type on Swedish keyboards, at least: Alt + ¨ and a space on my Mac. Having to type this regularly in my main languages would be off-putting.
> weird choice of certain operators (~= for not equal)
It's a little unconventional, but then again ~ is a fairly standard (bitwise) negation operator.
> arrays start at 1
Lua doesn't do arrays, it does tables. Though they have similar syntax in certain cases, and are occasionally used for the same things, they are fundamentally different things.
I've heard that a major use case for Lua is to enable scripting in game engines - for something like that, it sounds perfect, because the entire runtime is just a couple thousand lines of C with few dependencies, that can be literally copy-pasted into your codebase if need be.
But honestly, I can't recommend Lua for any other use case. You'll be chugging along and then run into a brick wall when you need e.g. real regexes instead of Lua's comparatively crippled pattern matching, or a networking library that isn't hosted on one professor's personal web page that hasn't been updated in years. Take it from someone who had to throw out and port a bunch of Lua scripts when they just couldn't keep up with new business requirements: Python or Perl or Ruby is almost always a better choice.
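For readers unfamiliar with what "comparatively crippled" means here: Lua patterns handle captures and character classes well, but lack standard regex features like alternation (`|`), grouping with quantifiers, and bounded repetition `{m,n}`. A small sketch:

```lua
-- What patterns do well: classes and captures.
local date = "2021-05-04"
local y, m, d = date:match("(%d+)%-(%d+)%-(%d+)")
assert(y == "2021" and m == "05" and d == "04")

-- No alternation: matching "cat|dog" needs two separate attempts.
local s = "I have a dog"
local hit = s:match("cat") or s:match("dog")
assert(hit == "dog")
```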
It was. At peak perhaps 10 years ago. It is telling that the game engine citation in that article is from 2009.
Now the ecosystem of game engines is very different. Very few games are written in raw C/C++ and have to have their own tiny scripting language packaged. They are mostly based on more established engines. 'Small studios' have gone, niched out of end-to-end dev, or re-invented as the 'indie game' world, and pretty much none have their own multi-game engine.
In fact, I've only come across Lua used in Pico-8 gamejam games in the last 2 yrs, personally. (Though I'm retired now.)
It was a phenomenal language for gamedev. I commissioned a Lua consultant to extend the language in a small way for a game, I ended up paying for about half the hours I expected.
Some more recent titles I know of are Pocket Rumble and Blue Revolver.
A lot of game devs use frameworks or engines that strongly encourage or require the user to use C#, javascript, gdscript, or gml. Solar2d and love2d use Lua but don't seem very popular. Fewer people are throwing together their own thing in C++ and putting Lua inside.
Lua is not designed to be used standalone. Many people get frustrated assuming that it's a Python alternative. Or, by trying to make it into one by writing stuff like socket libraries for it.
Lua is designed specifically to be embedded within a larger program to manipulate only the features of that specific program. It is for scripting your program. Not, for scripting-up a program in Lua.
Trying to do the opposite - embed Python as a scripting language inside of a larger program - is certainly possible. But it is a much, much larger endeavor that almost certainly drags in 100 pieces of functionality that you do not want inside your program for every 1 that you actually want to use.
Sounds like for your situation Python/Ruby/Perl are much better matches.
I've heard that some of the new WebAssembly implementations are actually nice to embed. But, that just moves the problem to "What language to compile to WebAssembly?" ;)
You chose a bad example, because Lua has lpeg as a "blessed" library.
lpeg is leaps and bounds beyond regex. When I have to work with mere regex, I miss lpeg, the same way someone who is used to regex would feel when stuck with Lua patterns.
Networking isn't a problem for me personally, I just use luv and I'm very happy with it. But it's true that this is under-documented, and you do have to do more thinking and less Stack Overflow copypasting than you would with a language with a larger user base.
I'd probably use PCRE there. Every C library becomes a LuaJIT library once you run the relevant header through the C preprocessor and paste the result into an `ffi.cdef` block. Sometimes you have to do this yourself, which makes it more work than it would be in Python, though.
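A minimal sketch of that workflow, with a single libm prototype pasted by hand instead of a whole preprocessed header (the `pcall` guard lets the snippet run on plain Lua too, where the FFI demo is simply skipped):

```lua
-- "ffi" ships with LuaJIT only; guard so plain Lua doesn't error out.
local ok, ffi = pcall(require, "ffi")
local result
if ok then
  -- Normally this cdef text comes from a preprocessed C header.
  ffi.cdef[[ double fabs(double x); ]]
  result = ffi.C.fabs(-3.5)  -- calls straight into the C library
end
```

Under LuaJIT, `result` ends up as `3.5` with no binding code written at all.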
Don't tell my coworkers this - we use Lua to process SIP messages for the Oligarchist Cell Phone Company. I found LPEG to be way nicer than Lua's patterns (or even regex) for parsing, and there are other networking modules besides that one professor's version (which one is it, by the way?).
It's also used in Redis, and there's at least one major network router that uses it.
Lua is just great. In Fluent Bit[0] we expose Lua filtering capabilities, so end users can write their own scripts to manipulate log records and adjust keys/values with simple programming logic.
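A sketch of what such a filter can look like, following the callback signature in Fluent Bit's Lua filter docs (the function name is whatever the filter's `call` option points at; the record fields here are made up for illustration):

```lua
-- Fluent Bit calls this for every record. Return code 1 means the
-- timestamp and record were modified; 0 keeps the record as-is;
-- -1 drops it.
function cb_filter(tag, timestamp, record)
  record["hostname"] = nil                 -- strip a noisy key
  record["env"] = record["env"] or "prod"  -- fill in a default value
  return 1, timestamp, record
end
```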
I am happy to see performance being improved. For now we stick with LuaJIT; since our use case is pretty simple, we haven't found a good reason to move away from it.
I love Lua for its architecture and goals, but not for its linguistics... and this is more of the same. The platform is getting even better, but I feel like the language is getting worse. <const> and <close> are kinda hideous, aren't they?
I had a knee-jerk reaction against it as well. But when you think about it, it kind of makes sense.
A) these are properties of the variable, not the value. <close> triggers when the variable goes out of scope, and <const> supposedly was a feature that came out of implementing <close>. Both only work with local variables.
B) the angle brackets leave room for extension with more properties in the future without creating an additional new syntax.
C) you can declare several variables with different attributes in one statement, e.g. `local a <const>, b, c <const> = "apple", "banana", "carrot"`, with only a and c being const.
D) no new keyword(s) added to the language
Compared to the simple syntax of the rest of the language it may be a little jarring, but it was well thought out, and IMO it's still better than weird stuff like "spaceship operators", "hashrockets", or other syntax oddities in more popular languages. Would you really prefer chains of keywords, as in C's `static const char *strprbrk`? We have angle brackets now. If it turns out to be a mistake, I'm sure Roberto and his team will just take it back out of the language.
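For the curious, a small sketch of both attributes in action. This requires Lua 5.4 (it won't even parse on 5.3 or LuaJIT); the table and log names are made up:

```lua
-- <const>: any later assignment to the variable is a compile-time error.
local limit <const> = 10
-- limit = 11  -- uncommenting this fails to compile:
--             -- "attempt to assign to const variable 'limit'"

-- <close>: the value's __close metamethod runs when the variable
-- goes out of scope, even if the scope is exited by an error.
local log = {}
do
  local guard <close> = setmetatable({}, {
    __close = function() log[#log + 1] = "closed" end,
  })
  log[#log + 1] = "working"
end
assert(log[1] == "working" and log[2] == "closed")
```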
I had heard that Lua was adding const support, which I was excited for, but I hadn't seen the syntax until now. I don't care for it. I'm guessing this is due to ambiguity as to whether variables declared as const would be local or global, but I would rather have something like
The problem with putting the "const" first is that it is ambiguous when there is more than one variable. For example, suppose that you want to open a file and test for errors:
```lua
local f <close>, err = io.open("foo.txt")
if not f then error(err) end
```
vs
```lua
local <close> f, err = io.open("foo.txt")
if not f then error(err) end
```
In the first version it is clearer that only the "f" variable is marked as a to-be-closed variable. The second version looks nicer when there is a single variable but is more confusing when there are multiple variables.
That wouldn't work because it is already valid Lua code with a different meaning. Lua parses it as a declaration of a local variable called f followed by an assignment to a variable named close:
```lua
local f
close = io.open("foo.txt")
```
Despite the lack of semicolons, Lua is a free-form language and newlines are treated the same as any other whitespace character.
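A quick way to see this for yourself (plain Lua, any version; throwaway variable names):

```lua
-- Statement boundaries need neither newlines nor semicolons:
local a = 1
local b = 2

-- ...is parsed identically to this one-liner:
local c = 1 local d = 2

assert(a == 1 and b == 2 and c == 1 and d == 2)
```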
The new `const` and RAII-style scoped-resource features remind me of the golang discussion yesterday, where someone linked Pike saying that JS/TS, C++, Hack, etc. keep borrowing features from each other. Of course we won’t fully converge on some perfect syntax any time soon, or ever, but we seem to be settling into a general consensus on syntax, and on the semantics of dynamic languages.