This is a great lesson in "knowing your data" vs. "creating a model". The validity of the generated model is undermined by the GIGO (garbage in, garbage out) principle. Said another way, modeling bad data will get you bad models.
I see this shockingly often in my professional life. A data scientist will spend days, weeks, or months building a "perfect" model that replicates unclean, biased, or bad data. And when they are done, it gets thrown away because it cannot solve any real-world problem.
This kind of analysis sells because it is like a Rorschach test. People see a book they know, have some feelings about it, and assume that gwern and his algorithm felt the same thing.
Creating multiple lists is a good bet on his part: with no objective criterion for success, more lists mean more ways to win.