That's how I see it, too. I used these benchmarks early in Hoot's development as a rough measure of R7RS compliance and only occasionally as a guide for improving performance. I never published my results, but I had Hoot passing more of the benchmarks than Guile itself, which I found funny.
Not bad at comparing compilers and optimisation. Great for checking R7RS compliance. Not what I'd trust to make decisions with.