Machine Learning (ML), as a branch of Artificial Intelligence (AI), plays a crucial role in almost every engineering discipline. On this page, we gradually cover the most interesting educational material for photogrammetric applications. We assume that readers have a basic understanding of AI and ML topics.

This page will hopefully grow into an interesting collection of material on basic ML methods, deep learning methods, and their usage in photogrammetric tasks.

We will gradually cover the following topics:

1- Basics of Machine Learning

Machine learning (ML) is a branch of artificial intelligence that enables computers to learn from data and improve over time without being explicitly programmed for every task. It relies on algorithms that can detect patterns, make decisions, or predict outcomes based on the information provided. By training these algorithms on large datasets, ML systems “learn” to make accurate predictions or decisions on new, unseen data. ML is broadly classified into supervised, unsupervised, and reinforcement learning, each serving different purposes. From recommendation systems to self-driving cars, machine learning powers a wide range of modern applications, making it a key field in today’s data-driven world. Take a look at this tutorial, where you will find basic definitions and concepts.

2- Linear Regression

Linear regression is one of the simplest and most widely used algorithms in machine learning and statistics for modeling the relationship between a dependent variable and one or more independent variables. In its most basic form, simple linear regression, it aims to find a line that best fits the data, represented by the equation y = mx + c, where m is the slope and c is the intercept. The algorithm minimizes the difference between the actual data points and the predicted values on this line, often by reducing the sum of squared errors. Linear regression is especially useful for identifying trends and making predictions when there is a linear relationship between variables. When multiple independent variables are involved, the technique is extended to multiple linear regression, which fits a plane or hyperplane to the data. Despite its simplicity, linear regression is a foundational tool in predictive modeling and a useful benchmark for more complex algorithms.

This is a useful resource for learning linear regression.
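As a concrete illustration, here is a minimal sketch of fitting a simple linear regression with NumPy; the data is synthetic and the numbers are only for illustration:

```python
# A minimal simple-linear-regression sketch with NumPy (synthetic data).
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.5 * x + 1.0 + rng.normal(0, 1.0, size=50)  # true slope 2.5, intercept 1.0

# Least-squares fit of y = m*x + c: np.polyfit minimizes the sum of squared errors.
m, c = np.polyfit(x, y, deg=1)
print(f"estimated slope m = {m:.3f}, intercept c = {c:.3f}")

# Predict on new, unseen inputs.
x_new = np.array([2.0, 7.5])
print("predictions:", m * x_new + c)
```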

3- Supervised Learning

Supervised learning is a core type of machine learning where a model is trained on labeled data to make predictions or classify data points. In this process, each training example consists of an input and a corresponding output label, enabling the model to learn the mapping from input to output. The algorithm’s goal is to minimize the error in its predictions, improving accuracy by adjusting to patterns within the data. Supervised learning is often divided into two main types: classification, where the model categorizes inputs into discrete classes (e.g., identifying spam emails), and regression, where it predicts continuous values (e.g., forecasting sales). This method is widely used in real-world applications, from recommendation systems to image recognition, and forms a strong foundation for complex AI systems.

Take a look at this link.
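To make the train/predict workflow concrete, here is a minimal supervised-classification sketch, assuming scikit-learn is available; the Iris dataset simply stands in for any labeled data:

```python
# A minimal supervised-learning sketch: a classifier is trained on labeled
# examples and evaluated on held-out data.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                # inputs and class labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)                        # learn the input-to-label mapping
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```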

4- Unsupervised Classification

Unsupervised machine learning is a type of machine learning where the model is trained on unlabeled data, meaning it learns to identify patterns, structures, or groupings in the data without explicit guidance. Unlike supervised learning, there are no predefined categories or outcomes for the model to predict. Instead, it autonomously explores the data to discover underlying patterns, making it ideal for clustering and association tasks. For example, clustering algorithms group similar data points, often used in market segmentation to identify customer groups with similar purchasing behavior. Unsupervised learning is also foundational in anomaly detection, dimensionality reduction, and recommendation systems, where understanding inherent patterns in data can lead to valuable insights without requiring labeled datasets. This approach is particularly useful in situations where data is abundant but labels are scarce or costly to obtain, highlighting its role in exploratory data analysis and feature discovery in modern machine learning applications. An application of unsupervised machine learning to domain adaptation can be seen on this page. An article discussing an unsupervised domain adaptation (UDA) method is reviewed on this page.

This link refers to an interesting article about unsupervised image classification. You can find more about the concepts here.
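As a small illustration of learning from unlabeled data, here is a minimal dimensionality-reduction sketch using PCA from scikit-learn; the library choice and the synthetic data are assumptions:

```python
# A minimal unsupervised-learning sketch: PCA dimensionality reduction.
# No labels are used; the model discovers structure from the inputs alone.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))          # 200 unlabeled samples, 10 features

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)             # project onto the 2 main directions of variance
print("reduced shape:", X_2d.shape)
print("explained variance ratio:", pca.explained_variance_ratio_)
```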

5- Clustering Methods

Clustering is an unsupervised machine learning method that organizes data points with similar attributes into clusters, facilitating the discovery of hidden patterns and insights within a dataset. By categorizing data based on similarity, clustering uncovers structures that may not be immediately apparent, such as customer segments in marketing, patterns in medical imaging, or themes in text analysis. Some common clustering algorithms include k-means, which divides data into a specified number of clusters; hierarchical clustering, which creates nested groupings; and DBSCAN, which identifies clusters of various shapes and is especially useful for detecting outliers. Clustering plays a crucial role in exploratory data analysis, allowing analysts to interpret complex, unlabeled datasets, derive meaningful insights, and guide decision-making across various domains.

An overview of clustering methods can be found here.
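Here is a minimal k-means sketch, assuming scikit-learn; the synthetic blobs stand in for any unlabeled dataset:

```python
# A minimal k-means clustering sketch: three synthetic blobs are grouped
# into three clusters without any labels.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)              # cluster index assigned to each sample
print("cluster centers:\n", km.cluster_centers_)
```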

6- Bayesian Methods

Bayesian machine learning methods offer a powerful framework for modeling uncertainty and making predictions in complex environments. By incorporating prior knowledge and beliefs into the learning process, these methods enable practitioners to update their models as new data becomes available, leading to more robust and adaptive systems. Bayesian approaches excel in situations where data is scarce or noisy, as they provide a principled way to quantify uncertainty and enhance decision-making. Techniques such as Bayesian inference and probabilistic graphical models allow for flexible modeling of relationships among variables, making them suitable for a wide range of applications, from natural language processing to healthcare. Ultimately, Bayesian machine learning empowers analysts to derive deeper insights and develop models that are not only accurate but also interpretable, fostering trust and understanding in automated decision-making processes.

You can find good information regarding basic concepts here. A relevant book about probabilistic graphical models is Probabilistic Graphical Models: Principles and Techniques, written by Daphne Koller and Nir Friedman.
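As a small worked example of Bayesian updating, here is a sketch of the conjugate Beta-Binomial case; the prior and the observed counts are illustrative assumptions:

```python
# A minimal Bayesian-updating sketch: a Beta prior over a coin's bias is
# updated with observed flips (Beta-Binomial conjugacy).
# Prior: Beta(alpha, beta); after h heads and t tails the posterior is
# Beta(alpha + h, beta + t).
alpha, beta = 2.0, 2.0          # prior belief: probably a fair coin
heads, tails = 7, 3             # new evidence from 10 flips

alpha_post = alpha + heads
beta_post = beta + tails
posterior_mean = alpha_post / (alpha_post + beta_post)
print(f"posterior: Beta({alpha_post}, {beta_post}), mean = {posterior_mean:.3f}")
```

Because the prior and likelihood are conjugate, the update is a simple addition of counts, which illustrates how Bayesian methods fold new evidence into an existing belief.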

7- Neural Networks

Neural networks are fundamental to contemporary machine learning, drawing inspiration from the structure and operation of the human brain. These networks consist of layers of interconnected nodes, or neurons, which enable them to learn intricate patterns and representations from extensive datasets. Their architecture ranges from straightforward feedforward networks to complex deep learning models with numerous hidden layers. This adaptability allows neural networks to perform exceptionally well in various applications, including image and speech recognition, natural language processing, and even game-playing. Through training, they adjust the weights of their connections to capture complex relationships in the data, making them especially effective for tasks involving unstructured inputs. Consequently, neural networks have greatly advanced artificial intelligence, leading to significant breakthroughs that were once considered out of reach.

Book: An Introduction to Neural Networks, by Kevin Gurney.

Book: Neural Networks: An Introduction, by B. Müller, J. Reinhardt, and M. T. Strickland.
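To make the basic mechanics concrete, here is a minimal feedforward-network sketch in PyTorch (the framework choice and the synthetic data are assumptions); it shows how connection weights are adjusted during training:

```python
# A minimal two-layer feedforward network trained for a few steps on
# synthetic data.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(64, 4)                       # 64 samples, 4 input features
y = torch.randint(0, 2, (64,))               # binary class labels

model = nn.Sequential(
    nn.Linear(4, 16),   # input layer -> hidden layer
    nn.ReLU(),          # nonlinearity lets the network learn complex patterns
    nn.Linear(16, 2),   # hidden layer -> class scores
)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(100):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()                          # backpropagate the error
    opt.step()                               # adjust the connection weights
print("final training loss:", loss.item())
```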

8- Deep Learning

Deep learning classification has emerged as a pivotal technique in the field of artificial intelligence, enabling the automated categorization of data into predefined classes with remarkable accuracy. Utilizing multi-layered neural networks, deep learning models can learn complex features and representations from large datasets, making them particularly effective in tasks such as image recognition, natural language processing, and speech analysis. Recent trends in deep learning classification include the development of more efficient architectures, such as convolutional neural networks (CNNs) and transformers, which have significantly improved performance while reducing computational costs. Additionally, there has been a growing emphasis on transfer learning, allowing pre-trained models to be fine-tuned for specific tasks with limited labeled data, enhancing accessibility for smaller organizations and researchers. The rise of explainable AI is also noteworthy, as it aims to provide transparency into the decision-making processes of deep learning models, addressing concerns about trust and accountability in automated systems. Together, these trends are shaping the future of deep learning classification, driving innovation across various industries and applications.

Take a look at this article.
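As a concrete illustration of the transfer-learning trend mentioned above, here is a minimal fine-tuning sketch using a pretrained ResNet-18 from torchvision (an assumption; the same pattern applies to other backbones); the number of target classes is hypothetical:

```python
# A minimal transfer-learning sketch: freeze a pretrained backbone and
# train only a new classification head. Requires a recent torchvision
# (older versions use pretrained=True instead of the weights argument).
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

for param in model.parameters():
    param.requires_grad = False              # freeze the pretrained features

num_classes = 5                              # hypothetical target task
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new trainable head

# Only model.fc.parameters() would then be passed to the optimizer.
```

This pattern lets a model pretrained on a large dataset be adapted to a new task with comparatively little labeled data.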

Free online AI resources

1- Practical Deep Learning: A free course designed for people with some coding experience who want to learn how to apply deep learning and machine learning to practical problems.

2- Free NLP course by Hugging Face: A free online course that describes the principles of using Hugging Face to address NLP problems.

3- Microsoft BitNet: the official inference framework for 1-bit LLMs.

4- Llama models: Llama is a family of open-weight large language models released by Meta. The model weights are freely available for download, so the models can be run locally and fine-tuned for specific tasks, making capable language models accessible to researchers and smaller organizations that cannot train such models from scratch.

Hardware

Currently, NVIDIA is the main provider of AI chips, with its A100 GPUs being the flagship in the AI computation market; however, rivals such as Huawei (with the Ascend 910B GPU) are trying to fill the monopoly gap. We should note that many regular AI methods can be safely executed on CPUs, but more computationally demanding tasks, such as training Large Language Models (LLMs), require specialized computing units to train models and perform highly sophisticated inference.

Stereo Matching

Stereo matching is a crucial process in computer vision that involves determining the correspondence between points in stereo images captured from slightly different perspectives. By identifying matching features in the left and right images, stereo matching enables the computation of depth information, allowing machines to perceive the three-dimensional structure of a scene. This technique typically employs algorithms that analyze the pixel intensities, textures, and shapes to find correspondences and minimize disparities between the images. Various methods, such as block matching, semi-global matching, and graph cuts, can be used to enhance accuracy and handle challenges like occlusions and varying illumination. The resulting depth maps are invaluable in numerous applications, including robotics, autonomous driving, 3D reconstruction, and augmented reality, as they provide essential spatial information for navigating and interacting with complex environments. By enabling machines to understand depth perception, stereo matching plays a vital role in advancing technologies that rely on accurate environmental sensing.
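For a hands-on starting point, here is a minimal semi-global matching sketch using OpenCV's StereoSGBM (the image file names are placeholders, and a rectified stereo pair is assumed):

```python
# A minimal stereo-matching sketch with OpenCV's semi-global matcher.
import cv2

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # placeholder paths
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

stereo = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=64,     # search range; must be divisible by 16
    blockSize=5,           # size of the matching window
)
# compute() returns fixed-point disparities scaled by 16.
disparity = stereo.compute(left, right).astype("float32") / 16.0
# Depth then follows from depth = f * B / disparity, given the focal
# length f and stereo baseline B from calibration.
```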

Take a look at the following link:

Real-Time Self-Adaptive Deep Stereo

OpenCV Documentation: The OpenCV Stereo Matching documentation provides comprehensive information on stereo vision and various stereo matching algorithms available in the OpenCV library, including code examples and tutorials.

Stereo Matching Algorithms Comparison: The paper titled A Comparative Review of Stereo Matching Algorithms provides an overview and comparison of various stereo matching techniques, discussing their strengths, weaknesses, and applications.

Computer Vision: Algorithms and Applications: The book by Richard Szeliski is available online here. It includes chapters on stereo vision and matching, providing theoretical insights and practical implementations.

Stereo Vision and Depth Estimation with PyTorch: This tutorial on Depth Estimation using Stereo Vision covers stereo matching concepts using PyTorch, complete with code examples and explanations.

Depth Estimation via Stereo Matching: The GitHub repository Depth Estimation contains implementations of various depth estimation techniques, including stereo matching algorithms, along with datasets and results.

CVPR Conference Papers: The Computer Vision and Pattern Recognition (CVPR) conference features numerous papers and presentations on stereo matching and related topics. You can search for specific papers on their official site.

GitHub Repositories: You can find various open-source implementations of stereo matching algorithms on GitHub. For example, Stereo Matching Algorithms provides a collection of projects and resources shared by the community.