Marko Mihajlovic

I am a final-year PhD candidate at ETH Zurich, specializing in creating digital models of physical environments from visual data. During my PhD, I spent over a year working on immersive digital representations at Meta's research labs in Zurich and Pittsburgh. My research focuses on developing systems that reconstruct the world's structure and dynamics from visual inputs to enable precise digital replicas. I am particularly interested in enhancing machines' ability to perceive and interact with their surroundings with minimal supervision.

Publications

VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions

Marko Mihajlovic, Siwei Zhang, Gen Li, Kaifeng Zhao, Lea Müller, Siyu Tang

ICCV (Highlight), 2025

VolumetricSMPL is a lightweight, plug-and-play extension for SMPL(-X) models that adds volumetric functionality via Signed Distance Fields (SDFs). With minimal integration (a single line of code), users gain access to fast, differentiable SDF queries, collision detection, and self-intersection resolution.

Spline Deformation Field

Mingyang Song, Yang Zhang, Marko Mihajlovic, Siyu Tang, Markus Gross, Tunc Aydin

SIGGRAPH, 2025

A spline-based trajectory representation that enables efficient analytical derivation of velocities, preserving spatial coherence and accelerations while mitigating temporal fluctuations. Our method demonstrates superior performance in temporal interpolation for fitting continuous fields with sparse inputs.

SplatFormer: Point Transformer for Robust 3D Gaussian Splatting

Yutong Chen, Marko Mihajlovic, Xiyi Chen, Yiming Wang, Sergey Prokudin, Siyu Tang

ICLR (spotlight), 2025

SplatFormer is a data-driven 3D transformer that refines 3D Gaussian splats to improve the quality of novel views rendered from extreme camera viewpoints.

FreSh: Frequency Shifting for Accelerated Neural Representation Learning

Adam Kania, Marko Mihajlovic, Sergey Prokudin, Jacek Tabor, Przemysław Spurek

ICLR, 2025

FreSh aligns the frequencies of an implicit neural representation with those of its target signal to speed up convergence.

RISE-SDF: A Relightable Information-Shared Signed Distance Field for Glossy Object Inverse Rendering

Deheng Zhang*, Jingyu Wang*, Shaofei Wang, Marko Mihajlovic, Sergey Prokudin, Hendrik P.A. Lensch, Siyu Tang

3DV, 2025

RISE-SDF reconstructs the geometry and material of glossy objects while achieving high-quality relighting.

SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction

Marko Mihajlovic, Sergey Prokudin, Siyu Tang, Robert Maier, Federica Bogo, Tony Tung, Edmond Boyer

ECCV, 2024

SplatFields regularizes 3D Gaussian splats for sparse 3D and 4D reconstruction.

Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation

Xiyi Chen, Marko Mihajlovic, Shaofei Wang, Sergey Prokudin, Siyu Tang

CVPR, 2024

Morphable Diffusion enables consistent, controllable novel view synthesis of humans from a single image.

3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting

Zhiyin Qian, Shaofei Wang, Marko Mihajlovic, Andreas Geiger, Siyu Tang

CVPR, 2024

Given a monocular video, 3DGS-Avatar learns clothed human avatars with short training times and interactive rendering frame rates.

Inferring Dynamics from Point Trajectories

Yan Zhang, Sergey Prokudin, Marko Mihajlovic, Qianli Ma, Siyu Tang

CVPR, 2024

How to infer scene dynamics from sparse point trajectory observations? We show a simple yet effective solution using a spatiotemporal MLP with carefully designed regularizations. No need for scene-specific priors.

ResFields: Residual Neural Fields for Spatiotemporal Signals

Marko Mihajlovic, Sergey Prokudin, Marc Pollefeys, Siyu Tang

ICLR (spotlight), 2024

ResField layers incorporate time-dependent weights into MLPs to effectively represent complex temporal signals.

KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints

Marko Mihajlovic, Aayush Bansal, Michael Zollhoefer, Siyu Tang, Shunsuke Saito

ECCV, 2022

KeypointNeRF is a generalizable neural radiance field for virtual avatars. Given 2-3 input images, KeypointNeRF generates a volumetric radiance representation that can be rendered from novel views.

COAP: Compositional Articulated Occupancy of People

Marko Mihajlovic, Shunsuke Saito, Aayush Bansal, Michael Zollhoefer, Siyu Tang

CVPR, 2022

COAP is a novel neural implicit representation for articulated human bodies that provides an efficient mechanism for modeling self-contact and interactions with the environment.

MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images

Shaofei Wang, Marko Mihajlovic, Qianli Ma, Andreas Geiger, Siyu Tang

NeurIPS, 2021

Generalizable and controllable neural signed distance fields (SDFs) that represent clothed humans from monocular depth observations.

LEAP: Learning Articulated Occupancy of People

Marko Mihajlovic, Yan Zhang, Michael J. Black, Siyu Tang

CVPR, 2021

LEAP is a neural network architecture for representing volumetric animatable human bodies. It follows traditional human body modeling techniques and leverages a statistical human prior to generalize to unseen humans.

DeepSurfels: Learning Online Appearance Fusion

Marko Mihajlovic, Silvan Weder, Marc Pollefeys, Martin R. Oswald

CVPR, 2021

DeepSurfels is a novel 3D representation for geometry and appearance information that combines planar surface primitives with a voxel grid representation for improved scalability and rendering quality.