Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

They don't need destruction because they're all value types, and assignments are copies. But for OO, you need some form of references. And if your reference is a string, how do you know that it is a reference? Worse yet, if it's a part of a larger string, how do you know that said string contains a reference?

Keep in mind that while, yes, Tcl lists have a more efficient internal representation when Tcl knows that it's a list, a string that happens to be a valid representation of a list is also a list, just by virtue of existing - but it won't get a special list representation until you try to use it like one! It's just a string like "foo bar baz". Or, say, "objref#1 objref#2 objref#3".

And the semantics of the language is EIAS through and through - all those internal representations are optimizations that are not supposed to change the observed behavior, only performance. A GC cannot treat lists without internal representation differently from those with it without breaking this. So it has to assume that any string is potentially a list that contains object references.

So a proper Tcl GC would have to be incredibly pessimistic about what it considers to be roots of the object graph - basically, any substring that looks like an object reference, has to be considered as one. Now imagine just how much string scanning such a GC would have to do in a non-trivial app for a single sweep!



> So a proper Tcl GC would have to be incredibly pessimistic about what it considers to be roots of the object graph - basically, any substring that looks like an object reference, has to be considered as one. Now imagine just how much string scanning such a GC would have to do in a non-trivial app for a single sweep!

This is how the garbage collector in Jim Tcl works. It scans the string representation of all objects [1] for the reference syntax [2]. In practice, it does not cause performance problems. Since Jim Tcl is intended as a smaller, more easily embedded counterpart to Tcl 8, nobody runs it with a large heap.

How slow is it? I have just tried a naive benchmark on a first-gen Raspberry Pi Model B. With a live list of one millions strings and 50 thousand references garbage collection took around 750 ms. On a Core 2 Duo laptop running Linux it took around 120 ms.

  . for {set i 0} {$i < 1000000} {incr i} {
        set v$i "<reference.<--NOT padding padding $i"
    }
  . for {set i 0} {$i < 50000} {incr i} {
        set r$i [ref $i test]
    }
  . time collect 5  ;# Average of five runs
  765443 microseconds per iteration
The tracing garbage collector only affects references. Normal strings are reference-counted like in Tcl 8. References are the low-level basis of Jim's OO system, which is implemented in pure Tcl.

[1] "Objects" in the sense of Jim_Obj (Tcl_Obj). https://github.com/msteveb/jimtcl/blob/da293a1eef2ddd709f10b...

[2] http://jim.tcl.tk/fossil/doc/trunk/Tcl_shipped.html#_garbage...


I could be wrong but I don’t think “object” and “GC” are being used in the same manner for these posts.


I used "all objects" to mean the Jim_Obj objects in the C code I linked to, and int_19h used "object" for OO objects, but we are talking about the same kind of GC. int_19h's comment explains how implementing tracing garbage collection for OO object references without breaking EIAS would require a very pessimistic garbage collector. My reply is about how Jim Tcl implements this exact type of pessimistic tracing garbage collection for references, the basis of its OO system.

I should have just said "all values". The name Tcl_Obj/Jim_Obj is a legacy artifact that leads to lots of confusion. Perhaps Tcl 9 will rename Tcl_Obj to Tcl_Value.


I’ll have to dig in and study a bit - I don’t know how [the object system] works off the top of my head.

It sounds like you’re familiar with Tcl internals, so I presume you know that (if we stick w the list as a prototypical “complex” structure) that the management is reference counts, freed when count==0, and may be >1 if this is a shared item (I’m going to refrain from calling it an object, though internally they’re a Tcl_Obj type, which has nothing to do with OO-programming). In a list, the list is an item, and ea. element is an item, with (of course) incr/decr references at when appropriate, as determined by the language rules itself.

Wrt strings vs lists, a string could be considered a single “blob”, but coercion to a proper list (“shimmering” in Tcl parlance) does the tokenization with requisite ref. counting as part of the conversion.

...I think the above may be entertainment more for anybody else who’s following along, but if that illuminating for you too, that’s a bonus.

Perhaps you could show me by example:

What’s a piece of (eg) Ruby and what you think a Tcl workalike would appear as, and where that falls down.


An object is really just a namespace with a generated name, variables in which correspond to fields of the object. A reference to the object, then, is simply a string that is the name of its namespace.

(I'm oversimplifying and ignoring the command part of it, because it doesn't really matter for any of this.)

For the sake of readability, let's assume that those object IDs / namespace names are just numbers, e.g. ::42 (in practice, it's something like ::oo::Obj42). Since that number was generated by TclOO machinery under the hood, said machinery can properly tag it as a reference at that point, and return such tagged value from [... new].

Now, let's say that we have some class, and some code that constructs an instance of that class, and then builds a list that references that instance from one of the elements. One way to do it would be start with an empty list, and build it element by element:

   set foo [Foo new]
   set lst {}
   lappend lst blah
   lappend lst $foo
   lappend lst blah
   unset foo
This produces the list {blah ::42 blah}. Because we built it element-by-element, our Tcl-with-GC tags it as a list, and uses an optimized representation for it, in which ::42 is a separate element that's known to be an object reference. Even after we remove the direct reference by unsetting foo, GC can use that metadata to trace a path to ::42 via lst, and know to keep the object alive. If we want to recover foo later, we can safely do:

   set foo [lindex lst 1]
So far, so good. But... that wasn't idiomatic Tcl. We're much more likely to build the list thus:

   set lst "blah $foo blah"
   unset foo
It's still the same list {blah ::42 blah}. But, since we haven't performed any list operations on it, Tcl had no reason to tag it as one, and to switch it to the optimized representation - it's really just a string at this point. And because the elements haven't been separated out, we've also lost the "this is an object reference" tag on ::42 - there's no separate element as yet, so there's nothing to tag!

So then, how would GC know to look at lst, treat it as a list, and find the object reference ::42 inside? Note that if it doesn't do that, then the object will possibly be garbage collected by the time we get to:

   set foo [lindex lst 1]
Tcl finally realizes that lst is a list because of our use of lindex here, and converts it to the efficient representation that separates the elements... but at this point, how would it know that the element ::42 originated as an object reference, rather than a random string that looks like one? And even if it just pessimistically assumes that it's a reference, that doesn't really help - the object is already collected, and the reference is invalid anyway. So, we just made shimmering observable, and thus broke EIAS semantics. Our Tcl-with-GC is no longer Tcl.

So, to remain true to EIAS, our GC can't rely on the presence or absence of efficient list representation for a value to decide whether to scan it for references, or not. It has to proactively treat every variable in the program as a potential list with potential object references in it - because, well, it could be; and if it is, then it's also a GC root that must be traced!

And yet this is a trivial case. We actually have to consider nested lists, and dicts, and unevaluated Tcl code stored in a string, and all possible permutations thereof...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: