Goutam Bhat

I am a second-year PhD student at the Computer Vision Lab at ETH Zurich, under the supervision of Prof. Luc Van Gool. I am also advised by Dr. Martin Danelljan and Dr. Radu Timofte. My main research interests are computer vision and its applications, especially visual object tracking and segmentation.

I received my M.Sc. degree in Computer Science from Linköping University, Sweden, in 2019. I also worked as a research assistant at the Computer Vision Lab, Linköping University, from 2016 to 2018. Prior to that, I completed my B.Tech in Electrical Engineering at IIT Bombay, India, in 2016.

Email / GitHub / Google Scholar

Publications

Learning What to Learn for Video Object Segmentation


Goutam Bhat*, Felix Järemo Lawin*, Martin Danelljan, Andreas Robinson, Michael Felsberg, Luc Van Gool, Radu Timofte
ECCV, 2020   (Oral Presentation)
arXiv / code

We introduce an end-to-end trainable video object segmentation architecture that integrates a differentiable few-shot learning module to segment the target object. We further go beyond standard few-shot learning techniques by learning what the few-shot learner should learn, in order to maximize segmentation accuracy. Our approach sets a new state-of-the-art on the large-scale YouTube-VOS 2018 dataset.

Know Your Surroundings: Exploiting Scene Information for Object Tracking


Goutam Bhat, Martin Danelljan, Luc Van Gool, Radu Timofte
ECCV, 2020
arXiv / code

We propose a novel tracking architecture that exploits knowledge about the presence and locations of other objects in the surrounding scene to prevent tracking failure. Our tracker represents this information as dense localized state vectors. These state vectors are propagated through the sequence by computing a dense correspondence, and are combined with the appearance model output to localize the target.

Energy-Based Models for Deep Probabilistic Regression


Fredrik K. Gustafsson, Martin Danelljan, Goutam Bhat, Thomas B. Schön
ECCV, 2020
arXiv / code

We propose a general and conceptually simple regression method with a clear probabilistic interpretation. We create an energy-based model of the conditional target density p(y|x), using a deep neural network to predict the un-normalized density from (x, y). This model of p(y|x) is trained by directly minimizing the associated negative log-likelihood, approximated using Monte Carlo sampling.
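
As a rough sketch of the objective described above, the model and its sampled negative log-likelihood could be written as follows. The notation is an assumption introduced here for illustration and is not taken from this page: f_θ denotes the network output, q a proposal distribution over targets, and M the number of Monte Carlo samples.

```latex
p(y \mid x; \theta) = \frac{e^{f_\theta(x, y)}}{Z(x, \theta)},
\qquad
Z(x, \theta) = \int e^{f_\theta(x, \tilde{y})} \, d\tilde{y}

-\log p(y_i \mid x_i; \theta)
  = \log Z(x_i, \theta) - f_\theta(x_i, y_i)
  \approx \log\!\left( \frac{1}{M} \sum_{m=1}^{M}
      \frac{e^{f_\theta(x_i, y^{(m)})}}{q(y^{(m)})} \right)
    - f_\theta(x_i, y_i),
\qquad y^{(m)} \sim q
```

The intractable normalizer Z is the only term that needs sampling; the second term is a single network evaluation at the ground-truth target.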

Learning Discriminative Model Prediction for Tracking


Goutam Bhat*, Martin Danelljan*, Luc Van Gool, Radu Timofte
ICCV, 2019   (Oral Presentation)
arXiv / code

We propose an end-to-end trainable tracking architecture, capable of fully exploiting both target and background appearance information for learning the target model. Our architecture is derived from a discriminative learning loss by designing a dedicated optimization process. The proposed tracker sets a new state-of-the-art on six tracking benchmarks, while running at over 40 FPS.

ATOM: Accurate Tracking by Overlap Maximization


Martin Danelljan*, Goutam Bhat*, Fahad Shahbaz Khan, Michael Felsberg
CVPR, 2019   (Oral Presentation)
arXiv / code

We address the problem of estimating an accurate bounding box in generic object tracking. We train a target estimation component offline to predict the overlap between the target object and an estimated bounding box, conditioned on the initial target appearance. We further introduce a target classification component that is trained online to guarantee high discriminative power in the presence of distractors.
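
For readers unfamiliar with the overlap measure, here is a minimal sketch of the intersection-over-union (IoU) between two axis-aligned boxes. It only illustrates the quantity being predicted, not the tracker itself; the function name `box_iou` and the `(x_min, y_min, x_max, y_max)` box format are assumptions made for this example.

```python
def box_iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this format is an
    assumption for the example, not a convention from the paper.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b

    # Width and height of the intersection rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h

    # Union area = sum of both areas minus the shared part.
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

Identical boxes give an overlap of 1 and disjoint boxes give 0, so a predictor of this quantity can be used to score candidate boxes.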

Unveiling the Power of Deep Tracking


Goutam Bhat, Joakim Johnander, Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg
ECCV, 2018
arXiv

We investigate the causes behind the limited success of using deep features for tracking. We identify the limited data and low spatial resolution of deep features as the main challenges, and propose strategies to counter these issues. Furthermore, we propose a novel adaptive fusion approach that leverages the complementary properties of deep and shallow features to improve both robustness and accuracy.

ECO: Efficient Convolution Operators for Tracking


Martin Danelljan, Goutam Bhat, Fahad Shahbaz Khan, Michael Felsberg
CVPR, 2017
arXiv / code

We tackle the key causes behind the problems of computational complexity and over-fitting in advanced DCF trackers. We introduce: (i) a factorized convolution operator, which drastically reduces the number of model parameters; (ii) a compact generative model of the training sample distribution; (iii) a conservative model update strategy. Our approach can operate at 60 Hz on a single CPU, while obtaining competitive tracking performance.





Design and source code from Jon Barron's website and Leonid Keselman's Jekyll fork.