Hacker News

If you bake the model weights onto the chip itself, which is what should eventually happen for local LLMs once a good-enough one is trained, you'd be looking at an orders-of-magnitude reduction in power consumption at the same inference speed.
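For a rough sense of where that claim comes from: in a conventional setup, every weight has to be streamed from DRAM for each token, and off-chip memory access costs far more energy than the multiply-accumulate it feeds. Below is a back-of-envelope sketch, assuming a dense 7B-parameter model with 8-bit weights and order-of-magnitude per-operation energies in the spirit of commonly cited ~45 nm figures (roughly 640 pJ per 32-bit DRAM access, a fraction of a picojoule per 8-bit MAC); the specific constants are assumptions for illustration, not measurements.

```python
# Back-of-envelope: energy per token for a dense 7B-parameter model,
# comparing weights streamed from DRAM vs. weights baked on-chip.
# All constants are rough, illustrative order-of-magnitude values.

PARAMS = 7e9            # weights; also ~MACs per token for a dense pass
DRAM_PJ_PER_BYTE = 160  # ~640 pJ per 32-bit DRAM access -> ~160 pJ/byte
MAC_PJ = 0.2            # ~8-bit multiply-accumulate, well under a picojoule

# Conventional: every 1-byte weight fetched from DRAM each token.
fetch_j = PARAMS * DRAM_PJ_PER_BYTE * 1e-12
compute_j = PARAMS * MAC_PJ * 1e-12
streamed_j = fetch_j + compute_j

# Weights hardwired on the die: fetch energy ~vanishes, compute remains.
baked_j = compute_j

print(f"DRAM-streamed: {streamed_j:.2f} J/token")
print(f"weights on-chip: {baked_j:.4f} J/token")
print(f"ratio: {streamed_j / baked_j:.0f}x")
```

Under these assumed constants the weight movement, not the arithmetic, dominates by a few hundred times, which is the mechanism behind the orders-of-magnitude claim. Real designs wouldn't capture all of it (activations still move, and fixed weights mean no model updates without new silicon), but the direction holds.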

