The essay that became Silicon Valley’s gospel may not say what the industry thinks it says. And the technology it supposedly validates may be a counterexample.
"It is lumpy, dependent on whether the new data happens to contain novel physical phenomena or merely more instances of phenomena the model has already seen."
Not sure I get the difference from scaling language data - there are a lot of texts that say the same thing
To me the biggest limitation of large scale video data is that:
- It's 2D - there aren't many multi-cam captures with calibration data
- We can't capture forces, so limited to kinematics rather than understanding dynamics
- Frame rates are too low in most videos, e.g. we see aliasing on wheels for 30fps
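The wheel aliasing in the last point can be made concrete with a small sketch. It assumes an idealized wheel with identical, evenly spaced spokes and an instantaneous shutter; the function name `apparent_rotation_hz` and the example numbers are illustrative, not taken from any real footage.

```python
def apparent_rotation_hz(true_hz: float, fps: float, spokes: int = 1) -> float:
    """Apparent rotation rate (rev/s) of a spoked wheel sampled at `fps`.

    Assumption: all spokes look identical, so the image repeats every
    1/spokes of a revolution and the visible pattern frequency is
    true_hz * spokes.
    """
    pattern_hz = true_hz * spokes
    # Fold the pattern frequency into the range [-fps/2, fps/2),
    # which is all a camera sampling at `fps` can distinguish.
    aliased_hz = (pattern_hz + fps / 2) % fps - fps / 2
    # Convert back from pattern cycles to wheel revolutions.
    return aliased_hz / spokes


if __name__ == "__main__":
    # A 5-spoke wheel at 6 rev/s advances exactly one spoke per frame at 30 fps,
    # so it looks frozen; at 5.5 rev/s it appears to turn slowly backward.
    for true_hz in (1.0, 5.5, 6.0):
        print(true_hz, apparent_rotation_hz(true_hz, fps=30, spokes=5))
```

A negative result means the wheel appears to spin backward. Under these assumptions, the true speed cannot be recovered from the frames alone once the pattern frequency exceeds half the frame rate.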
Really good point, and you're right that the original framing was weak. Your three points are sharper. I've actually updated the piece to reflect this.
Thanks for making the piece better. If you spot anything else or have other suggestions, please let me know. Thank you again
That's very kind!
My pleasure - love all the updates on these new paradigms