Generative models and Deep Reinforcement Learning for Geospatial Computer Vision

Consortium

Presagis Inc

Presagis is a Montreal-based software company that supplies the top 100 defense and aeronautic companies in the world with simulation and graphics software. Over the last decade, Presagis has built a strong reputation for recreating the complexity of the real world in virtual environments. Their deep understanding of the defense and aeronautic industries, combined with expertise in synthetic environments, simulation and visualization, human-machine interfaces, and sensors, positions them to meet today's goals and prepare for tomorrow's challenges. Today, Presagis is investing heavily in research and innovation in virtual reality, artificial intelligence, and big-data analysis. By leveraging their experience and recognizing emerging trends, their pioneering team of experts, former military personnel, and programmers are challenging the status quo and building tomorrow's technology today.

Concordia University, Montreal, Quebec

Immersive & Creative Technologies Lab

The Immersive and Creative Technologies lab (ICT lab) was established in late 2011 as a premier research lab committed to fostering academic excellence, groundbreaking research, and innovative solutions in the field of Computer Science. Our talented team of researchers concentrates on specialized areas such as computer vision, computer graphics, virtual/augmented reality, and creative technologies, while exploring their applications across a diverse array of disciplines. At the ICT Lab, we pursue ambitious long-term objectives centered on the development of highly realistic virtual environments: (a) creating virtual worlds that are virtually indistinguishable from the real-world locations they represent, and (b) employing these sophisticated digital twins to produce a wide range of impactful visualizations for various applications. Through our dedication to academic rigor, inventive research, and creative problem-solving, we aim to push the boundaries of technological innovation and contribute to the advancement of human knowledge.

Researchers

People who have worked on the project, sorted by graduation date where applicable:

Harshitha Voleti - MSc

Saikumar Iyer - MSc

Amirhossein Sorour - MSc

Ahmad Shabani - PhD

Amin Karimi - PhD

Naghmeh Shafiee Roudbari - PhD

Bodhiswatta Chatterjee - PhD

Sacha Lepretre - Presagis Inc (CTO)

Charalambos Poullis - Concordia (PI)

Research Objectives

Generative Modeling

Deep Reinforcement Learning

PHOENIX research programme

Publications

ISVC2023

Strategic Incorporation of Synthetic Data for Performance Enhancement in Deep Learning: A Case Study on Object Tracking Tasks

Jatin Katyal, Charalambos Poullis
18th International Symposium on Visual Computing (ISVC), 2023
Obtaining training data for machine learning models can be challenging. Capturing or gathering the data, followed by its manual labelling, is an expensive and time-consuming process. In cases where there are no publicly accessible datasets, this can significantly hinder progress. In this paper, we analyze the similarity between synthetic and real data. Focusing on an object tracking task, we investigate the quantitative improvement influenced by the concentration of synthetic data and the variation it induces in the distribution of training samples. Through examination of three well-known benchmarks, we reveal guidelines that lead to performance gains. We quantify the minimum variation required and demonstrate its efficacy on a prominent object-tracking neural network architecture.
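The concentration experiments described in the abstract can be sketched as a simple sampling routine that controls what fraction of a fixed-size training set is synthetic. This is an illustrative sketch only; the function and argument names are assumptions, not the paper's code:

```python
import random

def mix_training_set(real_samples, synthetic_samples, synthetic_fraction, seed=0):
    """Build a training set in which roughly `synthetic_fraction` of the
    samples are synthetic, while keeping the total size equal to the
    number of real samples available."""
    rng = random.Random(seed)
    n_total = len(real_samples)
    n_synth = int(round(synthetic_fraction * n_total))
    n_real = n_total - n_synth
    # Draw without replacement from each pool, then shuffle the mixture
    mixed = rng.sample(real_samples, n_real) + rng.sample(synthetic_samples, n_synth)
    rng.shuffle(mixed)
    return mixed
```

Sweeping `synthetic_fraction` over, say, 0.0 to 0.5 and retraining the tracker at each setting is one way to measure how performance responds to synthetic-data concentration.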
CVIU2023

Tractable large-scale deep reinforcement learning

Nima Sarang, Charalambos Poullis
CVIU, 2023
Reinforcement learning (RL) has emerged as one of the most promising and powerful techniques in deep learning. The training of intelligent agents requires a myriad of training examples, which imposes a substantial computational cost. Consequently, RL is seldom applied to real-world problems and has historically been limited to computer vision tasks, similar to supervised learning. This work proposes an RL framework for complex, partially observable, large-scale environments. We introduce novel techniques for tractable training on commodity GPUs, significantly reducing computational costs. Furthermore, we present a self-supervised loss that improves learning stability in applications with a long time horizon, shortening the training time. We demonstrate the effectiveness of the proposed solution on the application of road extraction from high-resolution satellite images. We present experiments on satellite images of fifteen cities that demonstrate performance comparable to state-of-the-art methods. To the best of our knowledge, this is the first time RL has been applied to extracting road networks. The code is publicly available at https://github.com/nsarang/road-extraction-rl.
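One standard way to make RL tractable on large-scale imagery is to let the agent observe only a fixed-size local window around its current position rather than the full satellite image. The sketch below illustrates that idea in NumPy; the function name, window size, and zero-padding scheme are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def local_observation(image, center, window=64, pad_value=0.0):
    """Return a fixed-size (window x window) crop around `center`,
    zero-padded at the borders, so the agent only ever processes a
    small, GPU-friendly patch of an arbitrarily large image."""
    h, w = image.shape[:2]
    half = window // 2
    r, c = center
    # Allocate the padded observation, then copy in the valid region
    padded = np.full((window, window) + image.shape[2:], pad_value, dtype=image.dtype)
    r0, r1 = max(r - half, 0), min(r + half, h)
    c0, c1 = max(c - half, 0), min(c + half, w)
    pr0, pc0 = r0 - (r - half), c0 - (c - half)
    padded[pr0:pr0 + (r1 - r0), pc0:pc0 + (c1 - c0)] = image[r0:r1, c0:c1]
    return padded
```

Because every observation has the same shape regardless of the underlying image size, observations batch cleanly on a commodity GPU, which is the essence of the tractability argument.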
ISVC2022

Unsupervised Structure-Consistent Image-to-Image Translation

Shima Shahfar, Charalambos Poullis
ISVC, 2022
The Swapping Autoencoder achieved state-of-the-art performance in deep image manipulation and image-to-image translation. We improve on this work by introducing a simple yet effective auxiliary module based on gradient reversal layers. The auxiliary module's loss forces the generator to learn to reconstruct an image with an all-zero texture code, encouraging better disentanglement between the structure and texture information. The proposed attribute-based transfer method enables refined control in style transfer while preserving structural information without using a semantic mask. To manipulate an image, we encode both the geometry of the objects and the general style of the input images into two latent codes, with an additional constraint that enforces structure consistency. Moreover, due to the auxiliary loss, training time is significantly reduced. The superiority of the proposed model is demonstrated in complex domains such as satellite images, where state-of-the-art methods are known to fail. Lastly, we show that our model improves the quality metrics for a wide range of datasets while achieving results comparable with multi-modal image generation techniques.
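The gradient reversal idea underlying the auxiliary module can be stated framework-agnostically: the layer is the identity on the forward pass and multiplies incoming gradients by a negative constant on the backward pass, so the features feeding the auxiliary head are trained adversarially against it. A minimal sketch with illustrative names (not the paper's code):

```python
class GradientReversal:
    """Identity in the forward pass; scales incoming gradients by
    -lam in the backward pass, so whatever feeds this layer is
    optimized to *increase* the auxiliary head's loss."""

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x  # features pass through unchanged

    def backward(self, grad_output):
        return -self.lam * grad_output  # flip (and scale) the gradient
```

In an autograd framework this would be registered as a custom function so the sign flip happens automatically during backpropagation; the two methods above just make the forward/backward semantics explicit.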

Contact

Charalambos Poullis
Immersive and Creative Technologies Lab
Department of Computer Science and Software Engineering
Concordia University
1455 de Maisonneuve Blvd. West, ER 925,
Montréal, Québec,
Canada, H3G 1M8