I wonder how much alternative number representations have been studied for use in matrices. Like, storing the log of values instead so that multiplication can be done by just adding. Or something like Zech's logarithm. Or even, take the log of whole matrices (which is a thing apparently [0]), then adding them to compute their multiplication. I wonder if the logarithm of a matrix can be computed using ML.
There's been some work on doing adds instead of multiplies (e.g., https://arxiv.org/abs/2012.03458). And I think float8 will roughly be doing this under the hood.
Personally, I'm not sure whether this is the path to go down or not. Doing everything in log space could help from a hardware perspective, since multiply-adds are much more complex than adds. But it 1) doesn't reduce the number of operations being performed, 2) might increase the necessary bit width for a given level of fidelity (depending on your input distribution and other factors), and 3) might make accumulate operations expensive, since these aren't cleanly expressed in the log domain.
[0]: https://en.wikipedia.org/wiki/Logarithm_of_a_matrix