Well… You need what, 3 floats for position and 4 more for orientation (a quaternion). Multiply that by 3 to cover velocity and acceleration as well, so 21 floats. Then, I don’t know, a few more floats per sensor, and you have your whole state space in a hundred bytes or so at 4 bytes per float.
Meanwhile a single image is like a megabyte so yeah.
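If anyone wants the napkin math spelled out, here's a rough sketch. Everything concrete in it is my assumption, not a measurement: 32-bit floats, a made-up 8 sensors with 3 floats each, and an uncompressed 640x480 RGB frame at one byte per channel. The names `state_bytes` and `image_bytes` are just for illustration.

```python
import struct

# Assumption: every value is stored as a 32-bit float (4 bytes).
FLOAT_BYTES = struct.calcsize("<f")


def state_bytes(num_sensors=8, floats_per_sensor=3):
    """Rough size of a low-dimensional state vector, in bytes.

    The sensor count and floats-per-sensor defaults are hypothetical.
    """
    pose = 3 + 4          # 3 floats position + 4 floats orientation
    kinematics = pose * 3  # same layout again for velocity and acceleration
    sensors = num_sensors * floats_per_sensor
    return (kinematics + sensors) * FLOAT_BYTES


def image_bytes(width=640, height=480, channels=3):
    """Size of one uncompressed image at 1 byte per channel (assumed)."""
    return width * height * channels


if __name__ == "__main__":
    state = state_bytes()
    image = image_bytes()
    print(f"state: {state} bytes")
    print(f"image: {image} bytes")
    print(f"image is roughly {image // state}x larger")
```

With those assumed numbers the state comes out to 180 bytes against about 0.9 MB for the image, so a gap of three to four orders of magnitude. Change the sensor count or resolution and the ratio moves, but not enough to change the conclusion.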
Source: it’s past midnight and I should have gone to sleep ages ago
Oh shit can’t unsee