Bio

I'm a student in Engineering Science at the University of Oxford, co-advised by Philip Torr and Patrick Perez. My main research interests lie at the intersection of computer vision and machine learning applied to mobile robotics.

Before that, I worked in the Center for Machine Perception (CMP) at the Czech Technical University in Prague with Jiri Matas, Tomas Vojir and Jan Sochman on short-term tracking.

During my studies, I was a visitor at the Robotics Institute at Carnegie Mellon University, supervised by Martial Hebert, where I collaborated closely with Daniel Munoz and J. Andrew (Drew) Bagnell on temporal consistency for scene understanding. I was also a visitor at the Centre for Vision, Speech and Signal Processing (CVSSP) at the University of Surrey, where I worked with Krystian Mikolajczyk on the evaluation of local features.

I received my MSc. and Bc. degrees from Brno University of Technology, where I worked in the Robotics and Artificial Intelligence Research Group with Ludek Zalud and Petr Petyovsky on vision-based navigation for semi-autonomous robots.

Awards

2012 - Werner von Siemens Excellence Award 2012
2012 - Czech & Slovak ACM Chapter - Student Project of the Year
2012 - The Master's Thesis of the Year 2012 in Informatics
2012 - The Prize of the Dean (outstanding master’s thesis)
2012 - ABB University Award, the best MSc project (Robotics)
2011 - Student EEICT 2011, best paper award (MSc projects: cybernetics and automation)
2010 - The Prize of the Dean (outstanding bachelor's thesis)
2007 - Ceska hlava, national prize for the best pre-college scientific publication
2007 - The Herbert Hoover Young Engineer Award
2007 - The Prize of the Dean

Collaborators

Jiri (George) Matas, Martial Hebert, Drew Bagnell, Krystian Mikolajczyk, Tomas Vojir, Jan Sochman, Daniel Munoz, Petr Petyovsky, Ludek Zalud, Pavel Jura, Miloslav Richter, ...

Publications

Miksik O., Vineet V., Perez P. and Torr P.H.S.: Distributed Non-Convex ADMM-inference in Large-scale Random Fields
In Proceedings of the British Machine Vision Conference (BMVC) 2014, Nottingham, UK
PDF

Abstract:

We propose a parallel and distributed algorithm for solving discrete labeling problems in large-scale random fields. Our approach is motivated by the following observations: i) very large-scale image and video processing problems, such as labeling dozens of millions of pixels with thousands of labels, are routinely faced in many application domains; ii) the computational complexity of current state-of-the-art inference algorithms makes them impractical for such large-scale problems; iii) modern parallel and distributed systems provide high computational power at low cost. At the core of our algorithm is a tree-based decomposition of the original optimization problem, which is solved using a non-convex form of the alternating direction method of multipliers (ADMM). This allows the resulting sub-problems to be solved efficiently in parallel. We evaluate the efficiency and accuracy of our algorithm on several benchmark low-level vision problems, on both CPU and Nvidia GPU. We consistently achieve a speed-up compared to the dual decomposition (DD) approach and other ADMM-based approaches.
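
As a rough illustration of the consensus structure that such decompositions build on (independent sub-problem updates, an averaging step, a dual update), here is a minimal sketch on a toy problem with quadratic sub-problems. It is an assumption-laden stand-in, not the paper's algorithm: the block objectives, rho and iteration count are arbitrary, and the paper's discrete, tree-structured sub-problems would be handled very differently (e.g. by dynamic programming).

# Minimal consensus-ADMM sketch (NumPy only); toy quadratic sub-problems,
# not the discrete tree-structured sub-problems used in the paper.
import numpy as np

rng = np.random.default_rng(0)
d, n_blocks, rho, iters = 5, 4, 1.0, 100

# Each "block" b owns a local term 0.5 * ||A_b x - c_b||^2 over the shared variable x.
A = [rng.standard_normal((8, d)) for _ in range(n_blocks)]
c = [rng.standard_normal(8) for _ in range(n_blocks)]

x = [np.zeros(d) for _ in range(n_blocks)]  # local copies of the shared variable
u = [np.zeros(d) for _ in range(n_blocks)]  # scaled dual variables
z = np.zeros(d)                             # consensus variable

for _ in range(iters):
    # x-update: each block solves its own sub-problem (this loop is what gets parallelized).
    for b in range(n_blocks):
        lhs = A[b].T @ A[b] + rho * np.eye(d)
        rhs = A[b].T @ c[b] + rho * (z - u[b])
        x[b] = np.linalg.solve(lhs, rhs)
    # z-update: consensus by averaging local solutions plus duals.
    z = np.mean([x[b] + u[b] for b in range(n_blocks)], axis=0)
    # dual update: accumulate the remaining disagreement with the consensus.
    for b in range(n_blocks):
        u[b] += x[b] - z

print("max consensus residual:", max(np.linalg.norm(x[b] - z) for b in range(n_blocks)))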

@inproceedings{miksik2014bmvc,
  author = {Ondrej Miksik and Vibhav Vineet and Patrick Perez and Philip H. S. Torr},
  title = {Distributed Non-Convex ADMM-inference in Large-scale Random Fields},
  booktitle = {British Machine Vision Conference (BMVC)},
  year = {2014}
}

Miksik O., Munoz D., Bagnell J. A. and Hebert M.: Efficient Temporal Consistency for Streaming Video Scene Analysis
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2013, Karlsruhe, Germany
PDF | Project Page (with datasets)

Abstract:

We address the problem of image-based scene analysis from streaming video, as would be seen from a moving platform, in order to efficiently generate spatially and temporally consistent predictions of semantic categories over time. In contrast to previous techniques, which typically address this problem in batch and/or through graphical models, we demonstrate that by learning visual similarities between pixels across frames, a simple filtering algorithm is able to achieve high-quality predictions in an efficient and online/causal manner. Our technique is a meta-algorithm that can be efficiently wrapped around any scene analysis technique that produces a per-pixel semantic label distribution. We validate our approach over three different scene analysis techniques on three different datasets that contain different semantic object categories. Our experiments demonstrate that our approach is very efficient in practice and substantially improves the quality of predictions over time.
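
As a rough illustration of the idea (not the paper's method), the sketch below propagates the previous frame's label distribution along optical flow and blends it with the current per-frame prediction, weighted by a simple appearance similarity. The exponentiated color distance is an assumed stand-in for the learned similarity metric, and all inputs (frames, backward flow, per-frame label distributions) are taken as given.

# Hedged sketch of flow-guided recursive filtering of per-pixel label
# distributions; the similarity weight below is a hand-crafted stand-in
# for the learned metric used in the paper.
import numpy as np

def temporal_filter(prev_smooth, cur_pred, prev_frame, cur_frame, flow, sigma=10.0):
    """prev_smooth, cur_pred: (H, W, K) label distributions;
    prev_frame, cur_frame: (H, W, 3) images; flow: (H, W, 2) backward flow (x, y)."""
    H, W, K = cur_pred.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Follow the backward flow to the corresponding pixel in the previous frame.
    py = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, H - 1)
    px = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, W - 1)
    warped = prev_smooth[py, px]                 # temporally propagated beliefs
    matched = prev_frame[py, px].astype(float)

    # Appearance similarity in [0, 1]; a learned metric would replace this.
    diff = np.linalg.norm(cur_frame.astype(float) - matched, axis=-1)
    w = np.exp(-diff / sigma)[..., None]

    # Recursive average: trust the propagated beliefs where appearance agrees.
    out = w * warped + (1.0 - w) * cur_pred
    return out / out.sum(axis=-1, keepdims=True)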

@inproceedings{miksik2013icra,
  author = {Ondrej Miksik and Daniel Munoz and J. Andrew Bagnell and Martial Hebert},
  title = {Efficient Temporal Consistency for Streaming Video Scene Analysis},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2013}
}

Miksik O. and Mikolajczyk K.: Evaluation of Local Detectors and Descriptors for Fast Feature Matching
In Proceedings of the International Conference on Pattern Recognition (ICPR) 2012, Tsukuba Science City, Japan
PDF

Abstract:

Local feature detectors and descriptors are widely used in many computer vision applications, and various methods have been proposed during the past decade. There have been a number of evaluations focused on various aspects of local features, matching accuracy in particular; however, there have been no comparisons considering the accuracy and speed trade-offs of recent extractors such as BRIEF, BRISK, ORB, MRRID, MROGH and LIOP. This paper provides a performance evaluation of recent feature detectors and compares their matching precision and speed in a randomized kd-trees setup, as well as an evaluation of binary descriptors with efficient computation of the Hamming distance.
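
For readers who want a feel for the binary-descriptor part, here is a minimal OpenCV sketch that matches ORB descriptors with the Hamming distance; the file names and parameters are placeholders, and it does not reproduce the paper's evaluation protocol (e.g. the randomized kd-trees setup for floating-point descriptors).

# Minimal ORB + Hamming-distance matching sketch (assumes OpenCV is installed
# and img1.png / img2.png are placeholder image paths).
import cv2

img1 = cv2.imread("img1.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("img2.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Binary descriptors are compared with the Hamming distance (XOR + popcount).
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=False)
matches = matcher.knnMatch(des1, des2, k=2)

# Ratio test to keep only distinctive matches.
good = [m for m, n in matches if m.distance < 0.8 * n.distance]
print(f"{len(good)} putative matches")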

@inproceedings{miksik2012icpr,
  author = {Ondrej Miksik and Krystian Mikolajczyk},
  title = {Evaluation of Local Detectors and Descriptors for Fast Feature Matching},
  booktitle = {International Conference on Pattern Recognition (ICPR)},
  year = {2012}
}

Miksik O.: Dynamic Scene Understanding for Mobile Robot Navigation
Master's thesis, advisor: Dr Ludek Zalud, supervisor: Prof. Martial Hebert
consultants: Daniel Munoz, Dr J. Andrew Bagnell
Brno University of Technology, 2012
PDF

Abstract:

The thesis deals with dynamic scene understanding for mobile robot navigation. In the first part, we propose a novel approach to self-supervised learning - a fusion of frequency-based vanishing point estimation and probabilistic color segmentation. Detection of the vanishing point is based on the estimation of a texture flow produced by a bank of Gabor wavelets and a voting function. Next, the vanishing point defines the training area, which is used for self-supervised learning of color models. Finally, road patches are selected by measuring a roadness score. A few rules deal with dark cast shadows, overexposed highlights and the adaptation speed. In addition, the whole vanishing point estimation is refined: the Gabor filters are approximated by Haar-like box functions, which enables efficient filtering via the integral image trick, and the tightest bottleneck, the voting scheme, is modified to a coarse-to-fine scheme, which provides a significant speed-up (more than 40x) while we lose only 3-5% in precision.

The second part proposes a smoothing filter for the spatio-temporal consistency of structured predictions, which is useful for more mature systems. The key part of the proposed smoothing filter is a new similarity metric, which is more discriminative than the standard Euclidean distance and can be used for various computer vision tasks. The smoothing filter first estimates optical flow to define a local neighborhood; this neighborhood is then used for recursive averaging based on the similarity metric. The total accuracy of the proposed method, measured on pixels whose labels differ between the raw and smoothed predictions, is almost 18% higher than that of the original predictions. Although we have used SHIM, the algorithm can be combined with any other system for structured predictions (MRF/CRF, ...). The proposed smoothing filter represents a first step towards full inference.
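
As a small illustration of the integral-image trick mentioned in the first part of the abstract, the sketch below computes box (Haar-like) responses in constant time per rectangle; the box sizes and positions are arbitrary, and it is not the thesis implementation.

# Integral-image (summed-area table) sketch: any rectangle sum costs four lookups,
# which is what makes Haar-like approximations of Gabor filters cheap.
import numpy as np

def integral_image(img):
    # ii[y, x] = sum of img[:y, :x]; padded with a zero row/column for clean indexing.
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, y0, x0, y1, x1):
    # Sum of img[y0:y1, x0:x1], independent of the box size.
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]

img = np.random.rand(240, 320)
ii = integral_image(img)
# A crude two-lobe Haar-like response: right rectangle minus left rectangle.
response = box_sum(ii, 100, 160, 140, 180) - box_sum(ii, 100, 140, 140, 160)
print(response)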

Results:

Results of the proposed systems: self-supervised learning (a), spatio-temporal consistency (b).

Videos and paper are coming soon.

@mastersthesis{miksik2012MscThesis,
  author = {Ondrej Miksik},
  title = {Dynamic Scene Understanding for Mobile Robot Navigation},
  school = {Brno University of Technology},
  year = {2012},
  type = {Master's Thesis}
}

Miksik O.: Rapid Vanishing Point Estimation for General Road Detection
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2012, St. Paul, USA
PDF

Richter M., Petyovsky P. and Miksik O.: Adapting Polynomial Mahalanobis Distance for Self-supervised Learning in an Outdoor Environment
In Proceedings of the IEEE International Conference on Machine Learning and Applications (ICMLA) 2011, Honolulu, USA
PDF | Poster

Miksik O., Petyovsky P., Zalud L. and Jura P.: Robust Detection of Shady and Highlighted Roads for Monocular Camera Based Navigation of UGV
In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA) 2011, Shanghai, China
PDF | Poster | Errata

Abstract:

This paper addresses the problem of UGV navigation in various environments and lighting conditions. Previous approaches use a combination of different sensors, or work well only in scenarios with noticeable road markings or borders. Our robot is used for chemical, nuclear and biological contamination measurement; thus, to avoid complications with decontamination, only a monocular camera serves as a sensor, since the robot is already equipped with one. In this paper, we propose a novel approach - a fusion of frequency-based vanishing point estimation and probabilistic color segmentation. Detection of the vanishing point is based on the estimation of a texture flow produced by a bank of Gabor wavelets and a voting function. Next, the vanishing point defines the training area, which is used for self-supervised learning of color models. Finally, road patches are selected by measuring the roadness score. A few rules deal with dark cast shadows, overexposed highlights and the adaptation speed. In addition to being robust, our system is easy to use since no calibration is needed.
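
To give a feel for the self-supervised color-model step, here is a hedged sketch that fits a simple per-channel Gaussian to pixels inside a given training region (in the paper this trapezoid is derived from the estimated vanishing point) and scores the remaining pixels; the Gaussian model and the threshold are illustrative assumptions, not the paper's exact formulation.

# Hedged sketch of self-supervised road scoring from a training-region mask.
import numpy as np

def roadness_map(image, train_mask, eps=1e-6):
    """image: (H, W, 3) array; train_mask: (H, W) boolean training-area mask."""
    samples = image[train_mask].astype(float)      # training pixels from the trapezoid
    mean = samples.mean(axis=0)
    var = samples.var(axis=0) + eps
    # Per-pixel log-likelihood under an axis-aligned Gaussian color model.
    diff = image.astype(float) - mean
    return -0.5 * ((diff ** 2) / var + np.log(2 * np.pi * var)).sum(axis=-1)

def road_mask(image, train_mask, thresh=-20.0):
    # Pixels whose color is likely under the self-supervised model are kept as road.
    return roadness_map(image, train_mask) > thresh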

Results:

The blue star denotes the estimated vanishing point, the yellow trapezoid is the training area, the green and blue areas are the shadow and highlight preprocessors, and the red area is the thresholded non-road region.

Videos: Video 1 | Video 2 (long)

@inproceedings{miksik2011icra,
  author = {Ondrej Miksik and Petr Petyovsky and Ludek Zalud and Pavel Jura},
  title = {Robust Detection of Shady and Highlighted Roads for Monocular Camera Based Navigation of UGV},
  booktitle = {IEEE International Conference on Robotics and Automation (ICRA)},
  year = {2011}
}