> If you are sending the frames via some protocol over the network, like WebRTC, it will be touching the CPU anyway. Software encoding of 4K h264 is real time on a single thread on 65w, decade old CPUs, with low latency.
This is valid for a single stream, but the equation changes when you're trying to squeeze the highest # of simultaneous streams into the least amount of CapEx possible. Sure, you still have to transfer it to the CPU cache just before you send it over WebRTC/HTTP/whatever, but you unlock a lot of capacity by using all the rest of the silicon as much as you can. Being able to use a budget/midrange GPU instead of a high-end ultra-high-core-count CPU could make a big difference to a business with the right use-case.
That said, TFA doesn't seem to be targeting that kind of high stream density use-case either. I don't think e.g. Frigate NVR users are going to switch to any of the mentioned technologies in this blog post.
> This is valid for a single stream, but the equation changes when you're trying to squeeze the highest # of simultaneous streams into the least amount of CapEx possible... Being able to use a budget/midrange GPU instead of a high-end ultra-high-core-count CPU could make a big difference to a business with the right use-case.
okay well, the $5, 9 year old i5-9500T i benchmarked has 6 cores, and 1080p H264 decode runs at 2x realtime per thread, so it would support 12 realtime streams. as for the scenario you are talking about, with frigate NVR users: that thing supports a lot of ways to display many thumbnails simultaneously very cheaply. a human being can't even see more than ~3K resolution on a typical display at typical viewing distance, and it is pretty acceptable to drop frames, because the user isn't going to be able to comprehend simultaneous, realtime video from more than 6 camera feeds.
i think you are proving my point that this is all a bunch of overengineering. the ancient intel chip has QSV, which will work just fine. nobody needs GPUs to be doing this work.
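for what it's worth, the 12-stream figure above is just cores times per-thread speed. a rough sketch of that arithmetic, assuming one decode thread per core and perfectly linear scaling (both are simplifying assumptions, and the function name is made up for illustration):

```python
import math


def realtime_streams(cores: int, speed_per_thread: float) -> int:
    """Estimate how many streams can be decoded in real time.

    Assumes one decode thread per core and linear scaling across cores.
    It ignores memory bandwidth, network I/O, and anything else on the box,
    so treat the result as an upper bound, not a measurement.
    """
    return math.floor(cores * speed_per_thread)


# Figures quoted in the comment: 6 cores, 1080p H264 decode at 2x realtime per thread.
print(realtime_streams(cores=6, speed_per_thread=2.0))
```

real capacity would likely come in lower once everything else on the machine is competing for the same cores.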