The world we see is constantly changing: how do intelligent systems generalize to new observations?
This question led me to seek an understanding of the mechanisms underlying spatial intelligence
and to develop methods for endowing artificial intelligence with this remarkable capability.
Specifically, I am investigating how generalizability can emerge from reusable 3D & 4D representations,
how these representations of the dynamic 3D world could be learned from images & videos,
and how inductive biases could serve as expert knowledge to reduce the number of unknown parameters and make learning more efficient.
Equal Contribution *, Corresponding Author †, Project Lead ⚑
A unified framework to probe "texture and geometry awareness" of visual foundation models. Novel view synthesis serves as an effective proxy for 3D evaluation.
We recover NeRF from tourism images with variable appearance and occlusions,
and consistently render occlusion-free views with hallucinated appearances.