
The master's thesis of researcher Raja Wajoud Ali was defended at the Department of Computer Engineering, College of Engineering, University of Basra, entitled:

With the increasing demand for efficient and reliable autonomous systems,
the development of autonomous navigation for mobile robots in indoor
environments has become a vital research focus. As mobile robots are increasingly
deployed in industries such as warehousing, healthcare, and manufacturing, the need
for precise, adaptable navigation systems in complex and dynamic spaces continues
to grow. A crucial component of such systems is the integration of object detection
and mapping technologies, enabling robots to perceive their surroundings and
navigate effectively while minimizing human intervention. Simultaneous
Localization and Mapping (SLAM) serves as a foundational technique in these
efforts, allowing mobile robots to build maps of unknown environments while
tracking their location in real time.
This thesis explores visual SLAM (V-SLAM) approaches. Each
method is examined in terms of its architectural components, operational pipeline,
and typical limitations when applied to indoor environments. A comparative analysis
of four widely used V-SLAM algorithms (ORB-SLAM2, ORB-SLAM3, RTAB-Map,
and DynaSLAM) is presented, with attention to their effectiveness in
handling dynamic elements in indoor settings. System evaluations were conducted
using benchmark datasets, including KITTI, TUM RGB-D, and Bonn RGB-D
Dynamic; accuracy was measured using Absolute Trajectory Error (ATE) and
Relative Pose Error (RPE). The results provide a comprehensive understanding
of the strengths, limitations, and practical considerations involved in
deploying V-SLAM systems for indoor mobile robot navigation.

This study presents
a modified approach to ORB-SLAM3 by incorporating a real-time object detection
mechanism aimed at improving performance in dynamic environments. The original
ORB-SLAM3 framework relies on ORB features for extracting keypoints, matching
them through descriptors, and estimating the camera's motion. While effective in
static scenes, its accuracy can significantly decline when moving objects are present,
as these may introduce misleading feature correspondences. To counter this issue,
the enhanced system integrates a real-time object detection model—specifically
YOLOv5—capable of identifying potentially dynamic regions within video frames.
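The thesis's own masking routine is not reproduced here, but the underlying idea of discarding features that fall inside detected dynamic regions can be sketched as follows. The function name, the (x, y) keypoint tuples, and the (class, x1, y1, x2, y2) box format are illustrative assumptions, not the system's actual interface:

```python
def filter_dynamic_keypoints(keypoints, detections, dynamic_classes=("person",)):
    """Keep only keypoints lying outside bounding boxes of classes
    considered dynamic.

    keypoints:  iterable of (x, y) pixel coordinates
    detections: iterable of (class_name, x1, y1, x2, y2) boxes, e.g. as
                parsed from a YOLOv5-style detector's output (assumed format)
    """
    boxes = [(x1, y1, x2, y2)
             for cls, x1, y1, x2, y2 in detections
             if cls in dynamic_classes]
    static = []
    for x, y in keypoints:
        # Drop the keypoint if it falls inside any dynamic-object box.
        if not any(x1 <= x <= x2 and y1 <= y <= y2
                   for x1, y1, x2, y2 in boxes):
            static.append((x, y))
    return static
```

Only the surviving keypoints would then enter descriptor matching and pose estimation.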
Detected areas associated with motion are excluded from the feature matching
process, thereby minimizing their influence on pose computation. The approach was
validated using established dynamic datasets, including Bonn RGB-D Dynamic and TUM RGB-D,
and demonstrated a notable improvement in pose estimation accuracy over the
standard ORB-SLAM3. Further evaluation was conducted using the Intel RealSense
D435i camera in live, real-world scenarios with both RTAB-Map and the YOLOv5-
enhanced ORB-SLAM3.
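
For reference, the two accuracy metrics cited above can be sketched in a few lines of NumPy. This simplified version aligns trajectories by centroid only (full ATE evaluation uses a rigid-body alignment such as Umeyama's method) and treats RPE as frame-to-frame translational drift; it is an illustrative sketch, not the evaluation code used in the thesis:

```python
import numpy as np

def ate_rmse(gt, est):
    """Absolute Trajectory Error: RMSE of position differences after a
    simplified translation-only alignment of the estimate to ground truth."""
    gt, est = np.asarray(gt, float), np.asarray(est, float)
    est_aligned = est - est.mean(axis=0) + gt.mean(axis=0)
    return float(np.sqrt(np.mean(np.sum((gt - est_aligned) ** 2, axis=1))))

def rpe_rmse(gt, est):
    """Relative Pose Error: RMSE of frame-to-frame translation differences."""
    gt, est = np.asarray(gt, float), np.asarray(est, float)
    drift = np.diff(gt, axis=0) - np.diff(est, axis=0)
    return float(np.sqrt(np.mean(np.sum(drift ** 2, axis=1))))
```

A constant offset between the two trajectories cancels out in both metrics, so they measure shape and drift rather than absolute placement.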