Lua is a language I really want to love. I like the emphasis on simplicity and minimalism, and the Scheme-like semantics, which mix imperative and functional styles, really hit a sweet spot IMO. LuaJIT is a crazy impressive feat of software engineering. However, there are some specific issues which hold Lua back. First, as LuaJIT author Mike Pall famously noted, the Lua authors constantly break compatibility between releases. Lua is really several different, incompatible languages (Lua 5.1, 5.2, etc). LuaJIT is still at Lua 5.1, IIRC. Second, there are a bunch of minor nitpicks (1-based indexing, anyone?) which turn off a bunch of people. Lastly, because Lua is so minimal and focused on portability, people end up reimplementing their own abstractions (such as object systems) from scratch, further fracturing the ecosystem. I think there's space for a new project which takes LuaJIT as a starting point and addresses some of the issues I described. It would also be great if this hypothetical new language had better support for Unicode and concurrency.
If you’re not already familiar, check out Tcl. It’s well established, similar domain, and addresses some of your concerns head-on - very stable API/semantics, not 1-based, comes with quite a few “batteries” included, excellent concurrency and parallelism (lovely thread model (apartment [0])), native Unicode... but still portable and actively developed.
Tcl is... semantically, amazing. It is syntactically hideous. As much as we like to claim that we're technical people and value function over form, blah blah blah... aesthetics and readability matter.
The bigger problem is that "everything is a string" semantics hinders garbage collection. Ever noticed how all OO libraries for Tcl require explicit disposal of objects?
I don’t use Tcl OO these days (I used to use [incr tcl], quite a while ago), but I don’t know how EIAS (everything is a string) affects this. It’s more accurately described as “everything has a string representation”... but that’s the same for an int, a list, a float... and they don’t need special destruction, and Tcl doesn’t leak because of that.
They don't need destruction because they're all value types, and assignments are copies. But for OO, you need some form of references. And if your reference is a string, how do you know that it is a reference? Worse yet, if it's a part of a larger string, how do you know that said string contains a reference?
Keep in mind that while, yes, Tcl lists have a more efficient internal representation when Tcl knows that it's a list, a string that happens to be a valid representation of a list is also a list, just by virtue of existing - but it won't get a special list representation until you try to use it like one! It's just a string like "foo bar baz". Or, say, "objref#1 objref#2 objref#3".
And the semantics of the language is EIAS through and through - all those internal representations are optimizations that are not supposed to change the observed behavior, only performance. A GC cannot treat lists without internal representation differently from those with it without breaking this. So it has to assume that any string is potentially a list that contains object references.
So a proper Tcl GC would have to be incredibly pessimistic about what it considers to be roots of the object graph - basically, any substring that looks like an object reference, has to be considered as one. Now imagine just how much string scanning such a GC would have to do in a non-trivial app for a single sweep!
> So a proper Tcl GC would have to be incredibly pessimistic about what it considers to be roots of the object graph - basically, any substring that looks like an object reference, has to be considered as one. Now imagine just how much string scanning such a GC would have to do in a non-trivial app for a single sweep!
This is how the garbage collector in Jim Tcl works. It scans the string representation of all objects [1] for the reference syntax [2]. In practice, it does not cause performance problems. Since Jim Tcl is intended as a smaller, more easily embedded counterpart to Tcl 8, nobody runs it with a large heap.
How slow is it? I have just tried a naive benchmark on a first-gen Raspberry Pi Model B. With a live list of one million strings and 50 thousand references, garbage collection took around 750 ms. On a Core 2 Duo laptop running Linux it took around 120 ms.
. for {set i 0} {$i < 1000000} {incr i} {
set v$i "<reference.<--NOT padding padding $i"
}
. for {set i 0} {$i < 50000} {incr i} {
set r$i [ref $i test]
}
. time collect 5 ;# Average of five runs
765443 microseconds per iteration
The tracing garbage collector only affects references. Normal strings are reference-counted like in Tcl 8. References are the low-level basis of Jim's OO system, which is implemented in pure Tcl.
I used "all objects" to mean the Jim_Obj objects in the C code I linked to, and int_19h used "object" for OO objects, but we are talking about the same kind of GC. int_19h's comment explains how implementing tracing garbage collection for OO object references without breaking EIAS would require a very pessimistic garbage collector. My reply is about how Jim Tcl implements this exact type of pessimistic tracing garbage collection for references, the basis of its OO system.
I should have just said "all values". The name Tcl_Obj/Jim_Obj is a legacy artifact that leads to lots of confusion. Perhaps Tcl 9 will rename Tcl_Obj to Tcl_Value.
I’ll have to dig in and study a bit - I don’t know how [the object system] works off the top of my head.
It sounds like you’re familiar with Tcl internals, so I presume you know (if we stick w/ the list as a prototypical “complex” structure) that the management is reference counts, freed when count==0, and may be >1 if this is a shared item (I’m going to refrain from calling it an object, though internally they’re a Tcl_Obj type, which has nothing to do with OO programming). In a list, the list is an item, and ea. element is an item, with (of course) incr/decr of references when appropriate, as determined by the language rules themselves.
Wrt strings vs lists, a string could be considered a single “blob”, but coercion to a proper list (“shimmering” in Tcl parlance) does the tokenization with requisite ref. counting as part of the conversion.
...I think the above may be entertainment more for anybody else who’s following along, but if it’s illuminating for you too, that’s a bonus.
Perhaps you could show me by example:
What’s a piece of (eg) Ruby and what you think a Tcl workalike would appear as, and where that falls down.
An object is really just a namespace with a generated name, variables in which correspond to fields of the object. A reference to the object, then, is simply a string that is the name of its namespace.
(I'm oversimplifying and ignoring the command part of it, because it doesn't really matter for any of this.)
For the sake of readability, let's assume that those object IDs / namespace names are just numbers, e.g. ::42 (in practice, it's something like ::oo::Obj42). Since that number was generated by TclOO machinery under the hood, said machinery can properly tag it as a reference at that point, and return such tagged value from [... new].
Now, let's say that we have some class, and some code that constructs an instance of that class, and then builds a list that references that instance from one of the elements. One way to do it would be start with an empty list, and build it element by element:
set foo [Foo new]
set lst {}
lappend lst blah
lappend lst $foo
lappend lst blah
unset foo
This produces the list {blah ::42 blah}. Because we built it element-by-element, our Tcl-with-GC tags it as a list, and uses an optimized representation for it, in which ::42 is a separate element that's known to be an object reference. Even after we remove the direct reference by unsetting foo, GC can use that metadata to trace a path to ::42 via lst, and know to keep the object alive. If we want to recover foo later, we can safely do:
set foo [lindex $lst 1]
So far, so good. But... that wasn't idiomatic Tcl. We're much more likely to build the list thus:
set lst "blah $foo blah"
unset foo
It's still the same list {blah ::42 blah}. But, since we haven't performed any list operations on it, Tcl had no reason to tag it as one, and to switch it to the optimized representation - it's really just a string at this point. And because the elements haven't been separated out, we've also lost the "this is an object reference" tag on ::42 - there's no separate element as yet, so there's nothing to tag!
So then, how would GC know to look at lst, treat it as a list, and find the object reference ::42 inside? Note that if it doesn't do that, then the object will possibly be garbage collected by the time we get to:
set foo [lindex $lst 1]
Tcl finally realizes that lst is a list because of our use of lindex here, and converts it to the efficient representation that separates the elements... but at this point, how would it know that the element ::42 originated as an object reference, rather than a random string that looks like one? And even if it just pessimistically assumes that it's a reference, that doesn't really help - the object is already collected, and the reference is invalid anyway. So, we just made shimmering observable, and thus broke EIAS semantics. Our Tcl-with-GC is no longer Tcl.
So, to remain true to EIAS, our GC can't rely on the presence or absence of efficient list representation for a value to decide whether to scan it for references, or not. It has to proactively treat every variable in the program as a potential list with potential object references in it - because, well, it could be; and if it is, then it's also a GC root that must be traced!
And yet this is a trivial case. We actually have to consider nested lists, and dicts, and unevaluated Tcl code stored in a string, and all possible permutations thereof...
Pretty close. It was a professor who was tired of people writing languages to embed in their applications, so he wrote a language to embed in applications.
The guy who also brought us Raft[0], log-structured file systems[1], and Ousterhout's Dichotomy[2], among other things, in case you were thinking “not a bunch of grad students, but ‘some professor’.”
I quite like Tcl, but had my issues with the syntax as well, until I read about how it works in the Tcl book by Ousterhout (highly recommended). The key thing is that the syntax is extremely simple and systematic - a little bit like Lisp s-expressions - which allows for a very simple general parser and also makes it possible to write Tcl extensions in pure Tcl. This is a very Lisp-like trait of Tcl.
It does matter. Aesthetics do matter, but for a programming language, technical capabilities matter much more. The regular syntaxes of Lisp and Tcl give them many powers not present in languages with more "fancy" syntaxes.
_ph_, in this case I think it’s fair to say the aesthetics and ergonomics are the technical capability, to a degree. For example, building a custom looping control benefits from good aesthetics and ergonomics. They could be viewed as two sides of the same coin.
I'd like to live in a world where that is true, but we just don't. Languages only need to be powerful enough to solve the problems actually being solved by 9 to 5 programmers. Aesthetics and ergonomics matter much more.
In Tcl everything is a string. If you work with numbers for instance it's quite slow. Or maybe loop or set operations are slow. Dunno. Compare these two:
I got an order of magnitude improvement on Tcl just by bracing the expressions:
Bad: expr $i +$j + 2
Good: expr {$i + $j + 2}
... it limits parsing and evaluation to occurring once, hoisted into the [expr] engine; it’s idiomatic. The Lua version still appeared faster, though I’ve only spent 2 minutes looking at this.
Again, strings should play no role here; values must be able to generate a string representation, but it’s computed lazily, and in this case the $i, $j, etc. in Tcl are working with native integers.
I’m happy we have things like Tcl and Lua; either one of them makes development enjoyable. I appreciate you taking time to write your code samples.
> I think there's a space for a new project, which takes LuaJIT as a starting point
Unfortunately, LuaJIT is tightly-coupled to the Lua language and its code is very complex. I'd be surprised if anyone other than Mike Pall was able to retarget it to support a different language.
Also, is JIT compilation really necessary for a scripting language? Projects like the PyPy JIT for Python have seen little adoption, and many platforms (e.g. iOS and game consoles) don't support JIT at all.
PyPy JIT is unsuccessful because CPython makes it hard for PyPy, not because people don't want it. When there's a need and support for it, we have extremely good JITs spring into existence–JavaScript has at least three, for example.
I don't think they're doing anything in particular to hinder PyPy -- the damage was done long ago, when things like the C API exposed too much detail of CPython's internals, which severely limits PyPy's potential to replace them with more optimal systems/data structures.
The issue with crowding out alternative Pythons is a socio-political one. The C-API issue is as well; as you said, Python extensions are over-coupled to the runtime. ctypes and subsequently cffi fixed this 10+ years ago.
1. Decouple the extension mechanism. Use cffi
2. Modularize the batteries
Each alternative Python has to re-implement large swaths of the stdlib. The code is the spec in Python.
Stackless got the PyPy treatment before PyPy even existed. Besides PyPy, we now have graalpython, rustpython and ...
The point is, the core CPython devs need to stop being so selfish with their toys. The greater Python ecosystem is diverse in spite of their actions, not because of them.
Sorry, that's not quite what I had in mind. I was considering the feasibility of modifying LuaJIT to directly target a different language.
The projects you link to all either compile to, or are written in, Lua source code or LuaJIT bytecode. That's less efficient than retargeting the LuaJIT VM. For example, LuaJIT includes bytecode instructions specifically tailored to Lua tables. It looks like these projects build, say, their array data structures on top of those existing instructions, rather than adding array-specific support to LuaJIT.
> That's less efficient than retargeting the LuaJIT VM
That's like saying LLVM is inefficient because all the frontends compile to LLVM IR.
E.g. the Oberon compiler directly generates efficient LuaJIT bytecode, also using the FFI, and even reuses the VM's features for its source-level debugger. A benchmark written in Oberon has nearly the same performance running on LuaJIT as when compiled to native code.
You have all my upvotes. I really would like to like Lua, but the lack of the mentioned basic infrastructure elements, like powerful Unicode support and platform-independent parallelism abstractions, makes every larger Lua project a DSL of its own.
In Openresty you don't end up facing much trouble around parallelism. In other contexts, I might go for love2d's thread module, which can be loaded from liblove on its own https://love2d.org/wiki/love.thread
I've had an OK time using Unicode libraries, but I haven't had to do much with them so far.
You don't if you're writing low level algorithms. It's difficult for me to articulate the exact problem but I liked 1-based indexing until I needed to implement some algorithms that operated on multi-dimensional arrays element wise. Things just don't compose the right way.
It's actually the primary reason I don't make more use of Julia. I like the syntax and overall design more than Python but it just isn't worth it to me.
(If you're wondering what I'm doing in a Lua comments section, it's because I really like the ideas behind some of the Lua-based languages such as Terra.)
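To make the "things don't compose" complaint concrete, here is a minimal sketch of the classic friction point: flattening a 2D index into a 1D array. With 0-based indexing the formula is simply `i*ncols + j`; with 1-based indexing you carry a `-1` adjustment through every conversion (the array contents here are arbitrary illustration values).

```lua
-- Flatten a nrows x ncols grid into a single Lua table (1-based).
local nrows, ncols = 3, 4
local flat = {}
for i = 1, nrows do
  for j = 1, ncols do
    -- 1-based flattening: subtract 1 from the row before scaling
    flat[(i - 1) * ncols + j] = i * 10 + j
  end
end

-- Element at row 2, column 3:
assert(flat[(2 - 1) * ncols + 3] == 23)
```

Each such `-1`/`+1` is harmless alone, but in element-wise multi-dimensional code they accumulate, which is the composition problem described above.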
I primarily work in 1-indexed languages, but I think everyone is missing the forest for the trees in this discussion: indexing should be selectable by the programmer.
One- or zero-based indexing is a question of domain and communication. In signal processing, it is quite normal to need a negative index (autocorrelations, e.g.). Ada (and I believe Pascal and Julia) has support for specifying the index type, which can then be selected to match the problem domain.
I don't get that 1-based indexing debate in regard to Lua and Julia. Both came up recently on Hacker News.
In Lua you can use the metatable functionality to implement 0-based indexes.
In Julia you can overload the array access operation too. There are packages which allow zero-based indexing. Or even StarWars-based indexing, including the machete viewing order. https://github.com/giordano/StarWarsArrays.jl
Julia's array interface has functionality for querying what the first/last index is, so you can use it in other people's code. If they hard-coded 1 as the first index, just make a pull request to fix it.
> In Lua you can use the metatable functionality to implement 0-based indexes.
I tried to do that a while back and kept coming across corner cases that didn’t work. Is there a complete implementation of zero-based arrays for Lua available anywhere?
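For anyone curious what the metatable approach looks like, here is a minimal sketch (the `ZeroArray` name is hypothetical, not a real library). It illustrates both the idea and why the corner cases bite: `__index`/`__newindex` handle element access, but a real implementation must also cover `#` (via `__len`, Lua 5.2+ only), `ipairs`, `table.insert`, and every stdlib function that assumes 1-based sequences.

```lua
-- Sketch of a 0-based array wrapper via metatables.
local function ZeroArray(...)
  local data = {...}  -- stored internally 1-based
  return setmetatable({}, {
    __index    = function(_, i) return data[i + 1] end,
    __newindex = function(_, i, v) data[i + 1] = v end,
    __len      = function() return #data end,  -- honored in Lua 5.2+
  })
end

local a = ZeroArray("x", "y", "z")
assert(a[0] == "x")
assert(a[2] == "z")
a[3] = "w"
assert(a[3] == "w")
```

The wrapper works for plain indexing, but any library code that loops `for i = 1, #t` over it will silently skip element 0, which is exactly the kind of corner case that keeps these implementations incomplete.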
Visual Basic’s Option Base comes to mind. And Pascal had very nice facilities for arbitrary array bases. I think you could define an array with indices from 18 to 22 if you liked, which can definitely simplify code doing the indexing.
Lua was originally invented as a way to describe complex data. At some point it was enhanced with executable data and from there it’s a small step to become a programming language.
I find that the global by default thing is not as much of a problem if using a linter like Luacheck. If the text editor is configured to show linter warnings it catches typos before they become a problem.
Global-by-default and block-level scope seem like the right choice. In a popular dynamic language that uses local by default and function-level scope, the language had to include two keywords for accessing variables other than locals, and the following becomes a bug when it would not be a bug in any sane language:
x = calculate_n_dogs()
# later
ok = all(x.status_code == 200 for x in my_other_responses)
# later
let_the_dogs_out(x)
# oh no, why isn't x n_dogs anymore
This is clearly about Python, so I will note that this is not actually the case: The generator expression introduces a new scope, and so it does not reassign the value of x. Python does not require the "global" and "nonlocal" keywords in order to access variables, but to reassign names brought in from an outer scope.
Python 2 did have the issue where list comprehensions (though still not generator expressions) didn't introduce a new scope, and so you could indeed get issues quite similar to your example above. This was corrected in Python 3.
All that said, I will agree that Python's scoping rules are pants-on-head crazy. In very nearly every other programming language under the sun, you declare variables at the point where they exist. In Python, you declare a variable when it exists somewhere else (and you then want to reassign it). This does mean that you don't need to put a "local" or "var" or what-have-you in front of each new variable you declare (which is basically the reason Python did it this way), but the semantics of the thing can take some getting used to.
I appreciate the correction and I'm glad to know this is fixed in python 3!
I've run into similar issues where the variable was used in a for loop declaration rather than a list comprehension, but these are a bit easier to spot. Also, maybe people who don't use lua constantly won't be in the habit of thinking they can safely shadow locals with their looping variables.
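For contrast, here is Lua's side of the tradeoff discussed above, in a minimal sketch: global-by-default means a typo silently creates a fresh global rather than raising an error, which is exactly what Luacheck catches.

```lua
-- Global-by-default in action: a misspelled assignment creates a
-- new global instead of updating the intended local.
local count = 0

local function bump()
  cuont = count + 1   -- typo: silently assigns a GLOBAL named "cuont"
end

bump()
print(count)  --> 0   (the local was never touched)
print(cuont)  --> 1   (a stray global appeared)
```

The fix is a `local` declaration plus a linter; the cost is that every typo'd name is valid code until something reads the wrong variable.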
Lua was initially designed as a data description language for Oil and Gas simulations written in Fortran. 1 based indexing makes perfect sense in that context.
To me it's the lack of some functional features for 1-liners that really trip me up. Verbose function declaration, no null-coalesce operator, no ternary operators, stuff like that.
I love the platform and ideals and goals - a simple, lightweight embedding language... but the language details rub me the wrong way.
A minor hobby of mine is collecting proposed changes for Lua. Let me make sure I have these right:
- some syntax like JS arrow functions (this is tough because {} are not block delimiters; maybe single-expression lambdas like in Python? I kind of hate those, though.)
- `??` operator which does the same thing as `or` except when the left operand is exactly `false`
- `?:` operator. Should this evaluate to exactly one value or any number of values? e.g. x?2,3:4,5
Theoretically I could get this stuff by using Moonscript, but then I'm using a somewhat esoteric language that brings in a lot of other very unusual ideas.
Although for ternary, I honestly think the "if/elseif/else can be used as an expression" has always been my favorite syntax. Or the CASE...WHEN syntax of SQL.
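The `??` vs `or` distinction in the list above matters precisely because of `false`. A minimal sketch of the trap (the `get_flag`/`coalesce` helpers are hypothetical names for illustration):

```lua
-- `or` can't serve as a null-coalescing operator: it also swallows
-- a legitimate `false`.
local function get_flag(opts)
  return opts.flag or true   -- bug: opts.flag == false yields true
end
assert(get_flag({flag = false}) == true)  -- not what the caller meant

-- The usual and/or "ternary" idiom has the same trap:
local x = true and false or "fallback"
assert(x == "fallback")  -- wanted false, got the else-branch

-- A safe spelling today needs an explicit nil check:
local function coalesce(v, default)
  if v == nil then return default end
  return v
end
assert(coalesce(false, true) == false)
assert(coalesce(nil, 5) == 5)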
Some compile-to-lua languages include if expressions, but they commonly compile to IIFEs to support evaluating to a number of values other than 1. It's troublesome. I think the bytecode can express what you want.
> To mark a node at the third position, containing one character, we say {3,3}.
> But what if there's a node at the third position, that contains no character? Well, thats {3,2}
Why not shift over indexing by one consistently? So you'd use {3, 4} and {3, 3} for each, so the length property is preserved.
That's an argument against end-inclusive ranges more so than against 1-based indexing. If they were end-exclusive, like in Python, then {3,3} would be an empty string, and {3,4} would be a single character at 3. Note how it suddenly doesn't matter at all whether the indices are 0-based or 1-based - all of the above is valid regardless.
FWIW, I think that the real mistake was to make inclusivity/exclusivity implicit to begin with. If the syntax is explicit, and the choice is captured in the resulting value (i.e. it's more than just a pair of numbers), then you just use whatever is more convenient for the task at hand. Nim almost gets there: (x .. y) is end-inclusive, and (x ..< y) is end-exclusive - but the syntax still exhibits a preference.
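Worth noting that Lua's own ranges are end-inclusive: `string.sub(s, i, j)` takes both endpoints, so the examples from the earlier {3,3}/{3,2} discussion map onto it directly.

```lua
local s = "abcde"
assert(s:sub(3, 3) == "c")   -- one character at position 3
assert(s:sub(3, 2) == "")    -- j < i yields the empty string
-- An end-exclusive convention would write these as (3, 4) and (3, 3).
```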
I don't find it particularly persuasive. End-exclusive and end-inclusive can both be natural in different scenarios, regardless of the base - the article simply cherry-picks the ones that it needs to "prove" its case. That's why it's best when the language just offers both.
Yep, pretty much. I'm a strong proponent of "explicit is better than implicit" in all cases where there's no default that's obviously preferable, and this is one of those cases - that different languages picked different defaults is all the evidence I need.
I really don't like seeing this cited as a "proof" that 0-based indexing of lists is better. It is too shallow and ignores too many important facets of the question.
The main contrast Dijkstra is drawing is between closed and half-open intervals. Typically, the choice "starting from one" is tied to "closed" intervals, but that's not strictly necessary. Nothing stops you from using half-open intervals on Lua lists, any more than anything stops you from using closed intervals to index into C arrays.
The main advantage of the typical coupling is avoiding `+`s and `-`s in your "typical" loops over the first n indices:
for i = 1, n do -- closed
for (int i = 0; i < n; i++) { // half open
but what if you want to go in reverse? Getting the _final_ indexes requires arithmetic with half-open intervals, but it doesn't with closed (which also requires changing the operator from exclude-eq to include-eq):
for i = n, 1, -1 do -- closed
for (int i = n - 1; i >= 0; i--) { // half open
Of course, probably the most important property of half-open intervals is the way that they can be broken down into two disjoint intervals: `[A, C) = [A, B) U [B, C)`. But if your indexes are integers this is straightforward with closed intervals too: `[A, C] = [A, B - 1] U [B, C]` or `[A, C] = [A, B] U [B + 1, C]`.
But again, if you're using this property, you probably don't actually care about the "absolute" indexes at all; indexing with `[1, #list + 1)` isn't a problem _at all_ in Lua.
Plus, between the operations of "get final index" and "split interval into non-overlapping pieces", "get final index" is far more common (it's crucial to stacks, queues, etc, in addition to various uses like iterating backwards). Choosing "closed" intervals is then the cleaner option to avoid an ugly "+" or "-"!
Dijkstra mentions that requiring -1 to refer to an initial interval as "ugly" since -1 is not a valid index, but he has no qualms with using the length of the list, n, and allowing it to not be an index!
This is all said without even going into the awkwardness of having [k] being the {k+1}th element (e.g., [4] is how we write "fifth" in C -- I distinctly remember at least one conversation at work where the wrong thing was communicated because someone said "fourth" to mean [4])
Now, don't take this as a serious argument that 0-indexed-lists are inferior to 1-indexed lists. It's just a demonstration that it's easy to argue either side. That's because it's really just an unimportant convention. Pick one and stick with it and everything will work fine, modulo confusion about what "fourth" means.
> indexing with `[1, #list + 1)` isn't a problem _at all_ in Lua.
That's not what string.sub accepts though.
> Plus, between the operations of "get final index" and "split interval into non-overlapping pieces", "get final index" is far more common (it's crucial to stacks, queues, etc, in addition to various uses like iterating backwards). Choosing "closed" intervals is then the cleaner option to avoid an ugly "+" or "-"!
It (rather unnecessarily) takes log time, so you probably shouldn't do it if you can avoid it.
Will it still give the wrong answer if your table isn't sequentially indexed?
That is, does #({[1]="a", [3]="b"}) return 1 or 2?
edit: I see that it still returns 1. That's a shame. This trips up so many newcomers who've been told that "#" means "length". IMO they need to either start calling it something other than the "length operator" or make it _actually_ return the length. Sadly at this point it would just be yet another breaking change.
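To be precise about why this trips people up: per the reference manual, `#` on a table returns a "border" (an index n where t[n] is non-nil and t[n+1] is nil), and for a table with holes any border is a legal answer. A minimal sketch:

```lua
-- `#` is well-defined only for proper sequences.
local seq = {"a", "b", "c"}
assert(#seq == 3)            -- no holes: unambiguous

local holey = {[1] = "a", [3] = "b"}
local n = #holey
-- Both 1 and 3 are borders here, so either answer is legal; which
-- one you get depends on the table's internal layout.
assert(n == 1 or n == 3)
```

So "length operator" really is a misleading name once holes enter the picture.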
The ideal fix would to allow tables to store nil values, instead of always treating nil as a "hole". Instead of assigning nil to remove a value, you would call a separate "delete" function or operator. This way there would still be only one table type instead of separate types for tables and arrays.
Some early alpha versions of Lua 5.4 played around with this idea but it turns out that it would break too many things. I'd expect that if Lua ever makes a major jump to Lua 6.0 then nils in tables will likely be one of the main changes.
I'd like to read more about this. It would be a little unfortunate to end up with Python's "x in y" and "del y[x]" just to differentiate between missing entries and present entries set to nil.
On the other hand, my not-very-well-considered preferred approach creates some asymmetry between arrays and maps as to what values can be stored. Or it perpetuates the existing asymmetry between varargs and tables...
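Until tables can store nil directly, the common workaround is a unique sentinel value standing in for a stored nil, so "missing" and "present but nil" stay distinguishable. A minimal sketch (the `NIL`/`store`/`fetch` names are hypothetical):

```lua
local NIL = {}  -- unique table address; can't collide with user data

local function store(t, k, v)
  if v == nil then v = NIL end
  t[k] = v
end

local function fetch(t, k)
  local v = t[k]
  if v == NIL then return nil, true end   -- present, value is nil
  if v == nil then return nil, false end  -- genuinely absent
  return v, true
end

local t = {}
store(t, "a", nil)
local v, present = fetch(t, "a")
assert(v == nil and present == true)
local _, present2 = fetch(t, "b")
assert(present2 == false)
```

This is essentially the Python "x in y" distinction implemented by hand, which is why first-class nils in tables would be such a welcome change.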
Isn't "weird == not seen it before" anyway? I get what you're trying to say, but MATLAB is not really a programming language SWEs would use, so if that's the best example you can think of, Lua's bound to confuse most people in the industry.
May work better for mathematicians, non-SW engineers etc, but I wouldn't know.
> I get what you're trying to say, but MATLAB is not really a programming language SWEs would use
It may not be a language that website jockeys use, but "I don't use it therefore probably no one does" is almost never correct. MATLAB is huge in state machine and dynamic systems programming as well as being the original commercial powerhouse for linear algebra algorithms.
I meant to say most SWEs, which you seemed to have inferred anyway.
Why do you think that's wrong? I simply claimed MATLAB is much more commonly used by scientists and non-SW engineers. Most SWE jobs are in B2B, web development, embedded, etc fields. Which one of these sees a lot of MATLAB usage?
The tilde ~ char is pretty annoying to type on Swedish keyboards, at least: Alt + ¨ and a space on my Mac. Having to type this regularly in my main languages would be off-putting.
> weird choice of certain operators (~= for not equal)
It's a little unconventional, but then again ~ is a fairly standard (bitwise) negation operator.
> arrays start at 1
Lua doesn't do arrays, it does tables. Though they have similar syntax in certain cases, and are occasionally used for the same things, they are fundamentally different things.
I've heard that a major use case for Lua is to enable scripting in game engines - for something like that, it sounds perfect, because the entire runtime is just a couple thousand lines of C with few dependencies, that can be literally copy-pasted into your codebase if need be.
But honestly, I can't recommend Lua for any other use case. You'll be chugging along and then run into a brick wall when you need e.g. real regexes instead of Lua's comparatively crippled pattern matching, or a networking library that isn't hosted on one professor's personal web page that hasn't been updated in years. Take it from someone who had to throw out and port a bunch of Lua scripts when they just couldn't keep up with new business requirements: Python or Perl or Ruby is almost always a better choice.
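For readers unfamiliar with what "comparatively crippled" means here: Lua patterns handle captures and character classes well, but lack standard regex features like alternation (`|`), grouping with quantifiers, and bounded repetition `{m,n}`. A small sketch:

```lua
-- What patterns do well: classes and captures.
local date = "2021-05-04"
local y, m, d = date:match("(%d+)%-(%d+)%-(%d+)")
assert(y == "2021" and m == "05" and d == "04")

-- No alternation: matching "cat|dog" needs two separate attempts.
local s = "I have a dog"
local hit = s:match("cat") or s:match("dog")
assert(hit == "dog")
```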
It was. At peak perhaps 10 years ago. It is telling that the game engine citation in that article is from 2009.
Now the ecosystem of game engines is very different. Very few games are written in raw C/C++ and have to have their own tiny scripting language packaged. They are mostly based on more established engines. 'Small studios' have gone, niched out of end-to-end dev, or re-invented as the 'indie game' world, and pretty much none have their own multi-game engine.
In fact, I've only come across Lua used in Pico-8 gamejam games in the last 2 yrs, personally. (Though I'm retired now.)
It was a phenomenal language for gamedev. I commissioned a Lua consultant to extend the language in a small way for a game, I ended up paying for about half the hours I expected.
Some more recent titles I know of are Pocket Rumble and Blue Revolver.
A lot of game devs use frameworks or engines that strongly encourage or require the user to use C#, javascript, gdscript, or gml. Solar2d and love2d use Lua but don't seem very popular. Fewer people are throwing together their own thing in C++ and putting Lua inside.
Lua is not designed to be used standalone. Many people get frustrated assuming that it's a Python alternative. Or, by trying to make it into one by writing stuff like socket libraries for it.
Lua is designed specifically to be embedded within a larger program to manipulate only the features of that specific program. It is for scripting your program. Not, for scripting-up a program in Lua.
Trying to do the opposite - embed Python as a scripting language inside of a larger program - is certainly possible. But it is a much, much larger endeavor that almost certainly drags in 100 pieces of functionality that you do not want inside your program for every 1 that you actually want to use.
Sounds like for your situation Python/Ruby/Perl are much better matches.
I've heard that some of the new WebAssembly implementations are actually nice to embed. But, that just moves the problem to "What language to compile to WebAssembly?" ;)
You chose a bad example, because Lua has lpeg as a "blessed" library.
lpeg is leaps and bounds beyond regex. When I have to work with mere regex, I miss lpeg, the same way someone who is used to regex would feel when stuck with Lua patterns.
Networking isn't a problem for me personally, I just use luv and I'm very happy with it. But it's true that this is under-documented, and you do have to do more thinking and less Stack Overflow copypasting than you would with a language with a larger user base.
I'd probably use PCRE there. Every C library becomes a LuaJIT library once you run the relevant header through the C preprocessor and paste the result into an `ffi.cdef` block. Sometimes you have to do this yourself, which makes it more work than it would be in Python, though.
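A minimal sketch of that workflow, with a single libm prototype pasted by hand instead of a whole preprocessed header (the `pcall` guard lets the snippet run on plain Lua too, where the FFI demo is simply skipped):

```lua
-- "ffi" ships with LuaJIT only; guard so plain Lua doesn't error out.
local ok, ffi = pcall(require, "ffi")
local result
if ok then
  -- Normally this cdef text comes from a preprocessed C header.
  ffi.cdef[[ double fabs(double x); ]]
  result = ffi.C.fabs(-3.5)  -- calls straight into the C library
end
```

Under LuaJIT, `result` ends up as `3.5` with no binding code written at all.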
Don't tell my coworkers this - we use Lua to process SIP messages for the Oligarchist Cell Phone Company. I found LPEG to be way nicer than Lua's patterns (or even regex) for parsing, and there are other networking modules besides that one professor's version (which one is it, by the way?).
It's also used in Redis, and there's at least one major network router that uses it.
Lua is just great. In Fluent Bit[0] we expose Lua filtering capabilities, so end users can write their own scripts to manipulate log records and adjust keys/values with simple programming logic.
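A sketch of what such a filter can look like, following the callback signature in Fluent Bit's Lua filter docs (the function name is whatever the filter's `call` option points at; the record fields here are made up for illustration):

```lua
-- Fluent Bit calls this for every record. Return code 1 means the
-- timestamp and record were modified; 0 keeps the record as-is;
-- -1 drops it.
function cb_filter(tag, timestamp, record)
  record["hostname"] = nil                 -- strip a noisy key
  record["env"] = record["env"] or "prod"  -- fill in a default value
  return 1, timestamp, record
end
```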
I am happy to see performance being improved. For now we stick with LuaJIT; since our use case is pretty simple, we haven't found a good reason to move away from it.
I love Lua for its architecture and goals, but not for its linguistics... and this is more of the same. The platform is getting even better, but I feel like the language is getting worse. <const> and <close> are kinda hideous, aren't they?
I had a knee-jerk reaction against it as well. But when you think about it, it kind of makes sense.
A) these are properties of the variable, not the value. <close> triggers when the variable goes out of scope, and <const> supposedly was a feature that came out of implementing <close>. Both only work with local variables.
B) the angle brackets leave room for extension with more properties in the future without creating an additional new syntax.
C) you can declare several variables with different attributes in one statement, e.g. `local a <const>, b, c <const> = "apple", "banana", "carrot"`, with only a and c being const.
D) no new keyword(s) added to the language
Compared to the simple syntax of the rest of the language it may be a little jarring, but it was well thought out, and IMO it's still better than weird stuff like "spaceship operators", "hashrockets", or other syntax oddities in more popular languages. Would you really prefer chains of keywords, as in C's `static const char *strprbrk`? We have angle brackets now. If it turns out to be a mistake, I'm sure Roberto and his team will just take it back out of the language.
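For the curious, a small sketch of both attributes in action. This requires Lua 5.4 (it won't even parse on 5.3 or LuaJIT); the table and log names are made up:

```lua
-- <const>: any later assignment to the variable is a compile-time error.
local limit <const> = 10
-- limit = 11  -- uncommenting this fails to compile:
--             -- "attempt to assign to const variable 'limit'"

-- <close>: the value's __close metamethod runs when the variable
-- goes out of scope, even if the scope is exited by an error.
local log = {}
do
  local guard <close> = setmetatable({}, {
    __close = function() log[#log + 1] = "closed" end,
  })
  log[#log + 1] = "working"
end
assert(log[1] == "working" and log[2] == "closed")
```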
I had heard that Lua was adding const support, which I was excited for, but I hadn't seen the syntax until now. I don't care for it. I'm guessing this is due to ambiguity as to whether variables declared as const would be local or global, but I would rather have something like
The problem with putting the "const" first is that it is ambiguous when there is more than one variable. For example, suppose that you want to open a file and test for errors:
```lua
local f <close>, err = io.open("foo.txt")
if not f then error(err) end
```
vs
```lua
local <close> f, err = io.open("foo.txt")
if not f then error(err) end
```
In the first version it is clearer that only the "f" variable is marked as a to-be-closed variable. The second version looks nicer when there is a single variable but is more confusing when there are multiple variables.
That wouldn't work because it is already valid Lua code with a different meaning. Lua parses it as a declaration of a local variable called f followed by an assignment to a variable named close:
```lua
local f
close = io.open("foo.txt")
```
Despite the lack of semicolons, Lua is a free-form language and newlines are treated the same as any other whitespace character.
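A quick way to see this for yourself (plain Lua, any version; throwaway variable names):

```lua
-- Statement boundaries need neither newlines nor semicolons:
local a = 1
local b = 2

-- ...is parsed identically to this one-liner:
local c = 1 local d = 2

assert(a == 1 and b == 2 and c == 1 and d == 2)
```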
The new `const` and RAII-style scoped-resource features remind me of the golang discussion yesterday, where someone linked Pike saying that JS/TS, C++, Hack, etc. keep borrowing features from each other. Of course we won’t fully converge on some perfect syntax any time soon, or ever, but we seem to be settling into a general consensus on syntax, and on the semantics of dynamic languages.