Session: 06-17-05 AI Technology for Ocean Engineering V
Paper Number: 127557
127557 - Avoiding Collision in Congested Maritime Environments Using Reinforcement Learning Algorithms
The growth in intercontinental trade has increased the demand for maritime transportation, and with it the number of accidents and ship collisions. A 2020 study reported that out of 31,412 accidents recorded between 1993 and 2012, 19,175 were attributed to human error. The Ever Given incident in 2021, in which the ship blocked the vital Suez Canal for six days after being caught in strong winds, highlighted the importance of autonomy in marine navigation. Better solutions are needed to manage busy marine areas with numerous underactuated ships in motion. Deep Reinforcement Learning (Deep RL) algorithms have emerged as a potentially transformative solution.
This research delves into the effectiveness and adaptability of various Deep RL algorithms in congested marine environments with moving ships. This study focuses on reducing the number of state values for interpretable and computationally efficient deep-learning networks. The goal is to pinpoint the best configurations and algorithmic approaches to enhance maritime safety and efficiency, especially in challenging situations like port overcrowding, open-sea traffic, and the unpredictable movements of multiple underactuated vessels.
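As an illustration of what reducing the number of state values could look like, the following is a minimal sketch and not the authors' formulation. It assumes the agent observes only the `k` nearest vessels, each described by a normalized range and a relative bearing, so the state length stays fixed regardless of traffic density. The function name `compact_state` and the parameters `k` and `max_range` are hypothetical.

```python
import math

def compact_state(own_xy, own_heading, others_xy, k=3, max_range=1000.0):
    """Return [range_1, bearing_1, ..., range_k, bearing_k] for the k nearest vessels.

    Hypothetical encoding: ranges are scaled by max_range, bearings by pi.
    Vessels beyond max_range are ignored; empty slots are padded with (1.0, 0.0).
    """
    feats = []
    for ox, oy in others_xy:
        dx, dy = ox - own_xy[0], oy - own_xy[1]
        rng = math.hypot(dx, dy)
        if rng > max_range:
            continue  # too far away to matter for collision avoidance
        # Bearing relative to own heading, wrapped to [-pi, pi).
        bearing = math.atan2(dy, dx) - own_heading
        bearing = (bearing + math.pi) % (2 * math.pi) - math.pi
        feats.append((rng / max_range, bearing / math.pi))
    feats.sort(key=lambda f: f[0])           # nearest first
    feats = feats[:k]                        # keep only k vessels
    feats += [(1.0, 0.0)] * (k - len(feats)) # pad to a fixed length
    return [v for pair in feats for v in pair]
```

With `k=3` the network input has six values however many ships are present, which is the kind of compact, interpretable state the paragraph above describes.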
Two primary strategies are widely recognized within reinforcement learning: on-policy and off-policy methods. On-policy learning evaluates and improves the same policy that the agent uses to select actions. In contrast, off-policy learning evaluates and improves a policy different from the one used for action selection. Algorithms like Double Deep Q-Network (DDQN) and Dueling DQN are notable off-policy methods for discrete action spaces; their adaptability allows them to function in intricate scenarios, such as navigating areas dense with moving vessels. The strength of these algorithms lies in their ability to learn from previously collected experience. This feature is valuable in dynamic environments characterized by many moving vessels within ports or on active marine routes.
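The two value-based ideas named above can be sketched in a few lines. This is a minimal illustration using plain lists of Q-values and not the networks used in the paper. The function names and the default `gamma` are assumptions. In Double DQN the online network selects the next action and the target network evaluates it. In Dueling DQN the Q-values are assembled from a state value and mean-centered advantages.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)

def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: online net picks the action, target net scores it."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]

def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]
```

Because both targets depend only on a stored transition and the current network outputs, they can be computed from replayed experience, which is what makes these methods off-policy.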
In contrast, algorithms like Twin Delayed Deep Deterministic Policy Gradient (TD3) and Proximal Policy Optimization (PPO) cater to continuous action spaces. TD3, an off-policy algorithm, is well suited to continuous control. PPO, an on-policy method, optimizes the policy directly from fresh interactions with the simulation environment. One of the main advantages of PPO is its simplicity and stability: it does not require complex hyperparameter tuning or sophisticated optimization techniques.
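The core update rules of these two algorithms can be sketched as follows. This is a minimal sketch in scalar form, not the paper's implementation. The function names and default values (`clip_eps`, `noise_clip`, `gamma`) are assumptions. PPO maximizes a clipped surrogate objective over probability ratios. TD3 builds its target from the smaller of two critic estimates, evaluated at a target action perturbed by clipped noise.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over a batch."""
    total = 0.0
    for r, adv in zip(ratios, advantages):
        clipped = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        total += min(r * adv, clipped * adv)
    return total / len(ratios)

def smoothed_target_action(action, noise, noise_clip=0.5, low=-1.0, high=1.0):
    """TD3 target policy smoothing: add clipped noise, then clip to action bounds."""
    noise = max(-noise_clip, min(noise_clip, noise))
    return max(low, min(high, action + noise))

def td3_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """TD3 clipped double-Q target: use the smaller of the two critic values."""
    if done:
        return reward
    return reward + gamma * min(q1_next, q2_next)
```

The clipping in `ppo_clipped_objective` limits how far a single update can move the policy, which is one common explanation for the stability attributed to PPO.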
This research examines off-policy and on-policy algorithms, together with various approaches to minimizing the number of state values and designing effective reward functions. The resulting insights offer concrete avenues to enhance safety and efficiency under the fluctuating congestion and persistent vessel traffic typical of ports and busy maritime routes.
Presenting Author: Amar Nath Singh Indian Institute of Technology Madras
Presenting Author Biography: I'm a third-year undergraduate in Naval Architecture and Ocean Engineering, and my passion for the convergence of marine systems and cutting-edge technology has profoundly shaped my academic and practical pursuits. Specializing in Reinforcement Learning (RL) and Computer Vision, I have cultivated expertise that bridges the gap between naval engineering and advanced AI-driven solutions. Serving as the Software Lead for Team Aritra during the Njord Autonomous Boat Competition was a highlight, where our team achieved a commendable 3rd-place finish. My role allowed me to implement and fine-tune AI and machine learning solutions for marine robotics, reinforcing my commitment to the field. With a solid foundation in naval engineering and a proven record in software leadership, I am enthusiastic about driving innovation and advancing the capabilities of autonomous marine systems.
Authors:
Amar Nath Singh Indian Institute of Technology Madras
Akash Vijayakumar Indian Institute of Technology Madras
Shankruth Balasubramaniyam Indian Institute of Technology Madras
Abhilash Somayajula Indian Institute of Technology Madras
Submission Type
Technical Paper Publication