A syntax comparison across many languages (rigaux.org)
96 points by Rickasaurus on July 20, 2012 | 30 comments


Eleven ways to comment to the end of the line; five of them used only by one language each.



And the most common way is one of the worst (most inconvenient), although J seems to win the prize.


Well, that depends on your layout. That might be the case on a US keyboard, but on many others (for example German, Spanish, the Nordic layouts, and many other European and non-European layouts) '#' would be preferred over, for example, "//". For "//" you need a modifier (just as with '#'), but you also need to stretch more (or use both hands) and tap the key twice. That's one of the reasons why I prefer to code using an English layout in my IDE/editor :)


"Inconvenient" because typing a # requires a modifier key on most keyboards?


Also relevant: http://rosettacode.org/wiki/Main_Page Some parts are about language features, others are about implementations of algorithms in many languages.


This is quite an amazing achievement.

There are a few other dimensions to explore besides syntax, like "are arrays 0- or 1-based?" etc.

Apparently the author is now part of the opalang effort. They used to have a slightly weird syntax for their language and later changed that to have it look more like JavaScript. That's a strange move when concise syntaxes à la Ruby or CoffeeScript seem popular.


I haven't looked at everything, but the Perl variable identifier regexp is incorrect, as Unicode characters are allowed as of 5.12, which was released in 2010. For that matter, same goes for Java, except Unicode letters are allowed.


Java also allows currency symbols (e.g. £€$) as identifiers.

I know this because I had a colleague who was quite proud of having switched to Dvorak, and another colleague who immediately wrote a code-generation tool that emitted € characters in variable names...which are untypeable on a Dvorak keyboard. :)


Java allows a vast array of odd characters, including nulls.

http://stackoverflow.com/questions/4838507/why-does-java-all...


Dvorak doesn't support the Alt-<type unicode number on number block> trick?


They are allowed in PHP as well. The set of regexes looks odd for a number of the languages actually.


My big takeaway is that Ruby generally does things the "most-popular" way, and that maybe a lot of those obscure languages are obscure for a reason (object["method"](params) to invoke a method?!?)


You mean like JavaScript? Not sure why it's not listed there next to Pike.

    var f = []
    f['push'](2)
    // f -> [2]
Pike also has the more familiar object->method(params) syntax.

It's actually a pretty useful bit of syntax. I've been using it recently when combining multiple objects deserialised from json.

    var report = ['name', 'dob']
    var person = { name: 'Andrew', dob: '1980/03/23' }
    _.each(report, function(prop) {
        console.log(person[prop])
    })


Good call...and for those not necessarily using something underscore.js-like (though almost always nice-to-have):

  var report = ['name', 'dob'];
  var person = { name: 'Andrew', dob: '1980/03/23' };

  report.forEach(function(prop){
    console.log(person[prop]);
  });
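
The same bracket lookup works for methods as well as data properties, which is where object["method"](params) earns its keep: the method name can be decided at runtime. A minimal sketch, assuming a made-up helper called `callByName` (it is not part of any library):

```javascript
// Hypothetical helper: invoke a method whose name is only known at runtime.
function callByName(target, methodName, ...args) {
  // target[methodName] is the same lookup as target.push when methodName === 'push'
  const method = target[methodName];
  if (typeof method !== 'function') {
    throw new TypeError(methodName + ' is not a method of the target');
  }
  // apply() keeps `this` bound to target, as in target[methodName](...args)
  return method.apply(target, args);
}

const stack = [];
callByName(stack, 'push', 2);
callByName(stack, 'push', 3);
const last = callByName(stack, 'pop');
// stack is now [2] and last is 3
```

The `typeof` check is optional; without it a misspelled name fails with the engine's own "is not a function" error.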


Not sure about that. The regex for variable identifiers shows that Ruby is alone on the island of languages that force a lowercase first character.


Go uses casing for semantic meaning as well. Also, you can use non-A-Z in ruby source.

http://rosettacode.org/wiki/Unicode_variable_names#Ruby


Go is missing from the analysis (sadly, because it's definitely more popular than some of the languages listed!) - but yes, casing is used to determine the scope of your variables. It's actually a rather elegant solution (I find Go's syntax to be generally elegant, but YMMV)


object["method"](params) is just an alternative syntax. Normally you would do object->method(params).


This glosses over some important syntactic differences. For instance, a semicolon is used differently in C than it is in Pascal. In C, it's a terminator; in Pascal, it's a separator.


What is "Assembler"? Is that supposed to mean assembly? If so what architecture uses '!' as a line comment token?


Assembler is what translates assembly code into executables, a compiler for assembly if you so wish.

In many countries like mine, both terms tend to be used interchangeably.


Not an architecture, an assembler.


Fair enough; my underlying point is that `assembly` isn't a language, it's a family of languages whose characteristics are tied very tightly to the instruction set used (often called an architecture). Comments would be a characteristic of the compiler, which you may call an 'assembler' if you wish; however, an assembler is not a language.


An assembler is a utility with its own dialect regardless of target hardware.


#if 0 ... #endif in C/C++ is not a true comment -- the commented text is still parsed by the preprocessor.


If there was ever a project that required an emergency typographer, this would be one. That's a terrible presentation for a lot of great information.


Oh poor XSLT. I can't decide whether to laugh or cry.


http://hyperpolyglot.org/scripting is a similar resource for PHP, Perl, Python and Ruby only.


http://hyperpolyglot.org/ provides comparisons between various "families" of languages - not just the above. But there isn't a comparison between, say, Ruby and Haskell.



