Gemini models are very good, but in my experience they tend to overdo things. When I give them context and something specific to rework, Gemini often reworks far more than the problem at hand.
For software that makes it barely useful, because you want small commits for specific fixes, not a whole refactor or rewrite. I've tried many prompts, but it's hard. Even when I give it the function signatures of the APIs the code I want fixed uses, Gemini rewrites the API functions themselves.
If anybody knows a prompt hack to avoid this, I'm all ears. Meanwhile I'm staying with Claude Pro.
Yes, it will add INSANE amounts of "robust error handling" to quick scripts where I can be confident about my assumptions. It turns my clean 40 lines of Python, where I KNOW the JSONL I am parsing is valid, into 200+ lines with ten new try/except blocks. Even when I tell it not to do this, it loves to "find and help" in other ways. Quite annoying. But overall it is pretty dang good: it even spotted a bug I'd missed the other day in a big, complex 400+ line data-processing file.
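To make the contrast concrete, here is a toy sketch of the kind of direct parsing meant here, assuming (as the comment does) that the input is known-valid JSONL. The sample data is made up for illustration:

```python
import json

# Known-valid JSONL input (inline here for illustration).
sample = '{"id": 1}\n{"id": 2}\n{"id": 3}\n'

# No defensive handling: the input is trusted, so just parse each
# non-empty line directly. A bad line would raise, and that's fine.
records = [json.loads(line) for line in sample.splitlines() if line.strip()]
print(len(records), "records loaded")
```

If the data were untrusted, the extra try/except would be justified; the complaint is about adding it when the precondition is already known to hold.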
I didn't realize this was a broader trend. I asked it to write a simple testing script that POSTed a string to a local HTTP server as JSON, and it wrote a 40-line script handling every possible error. I just wanted two lines.
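For reference, the POST itself really is only a couple of lines with the stdlib. The sketch below wraps it in a throwaway in-process server purely so it runs self-contained; the endpoint, port, and payload are made up:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Throwaway stand-in for the local server under test: echoes the POST body.
class Echo(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Echo)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The actual test script: build the request, send it. No error handling.
req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}/",
    data=json.dumps({"text": "hello"}).encode(),
    headers={"Content-Type": "application/json"},
)
reply = urllib.request.urlopen(req).read()
print(reply)  # b'{"text": "hello"}'

server.shutdown()
```

With the third-party `requests` package it collapses further, to roughly `requests.post(url, json={"text": "hello"})`.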
Exactly! Wrapping code in try/except just to print the error is, in 99% of cases, the EXACT same as the default traceback Python already prints to stderr. It just assumes we need that or something will get hurt.
I wonder how much of that sort of thing is driven by the models having been trained on their own internal codebases? If that's the case, careful and defensive being the default would be unsurprising.
Here's what I've found works (not 100%, but it gives much better and more consistent results).
Basically, I ask it to repeat some rules at the start of each message:
"From now on, you must repeat and comply with the following rules at the top of all your messages onwards:
- I will never rewrite API functions. Even if I think it's a good idea, it is a bad idea. I will keep the API function as it is and it is perfect like that.
- I will never add extra input validation. Even if I think it's a good idea, it is a bad idea. I will keep the function without validation and it is perfect like that.
- ...
- If I violate any of those rules, I did a bad job.
"
Forcing it to repeat things makes the model's output more aligned and focused, in my experience.
The model is good at solving problems, but it's very difficult to control the unnecessary changes it makes to the rest of the code. It also adds a lot of unnecessary comments, even when I explicitly tell it not to.
For now, DeepSeek R1 and V3 are working better for me, producing more predictable results and capturing my intentions better (I haven't tried Claude yet).