Reinforcement Learning in Financial Services: Modelling Payment Switching as a Multi-Armed Bandit Problem

Ishaya Gambo; Christopher Agbonkhese; Segun Aina; Mogboluwaga Tayo Otegbayo; Johnson Bayo Adekunle; Israel Odetola; Omobola Gambo; Tolulope Oluwadare; Oluwatoni Odetola

doi:10.3844/jcssp.2024.1519.1529

Research Article Open Access

Ishaya Gambo¹, Christopher Agbonkhese², Segun Aina¹, Mogboluwaga Tayo Otegbayo³, Johnson Bayo Adekunle⁴, Israel Odetola¹, Omobola Gambo⁵, Tolulope Oluwadare¹ and Oluwatoni Odetola¹

¹ Department of Computer Science and Engineering, Obafemi Awolowo University, Ile-Ife, Nigeria
² Department of Digital and Computational Studies, Bates College, Lewiston, United States
³ Vitruvian Shield PT, LDA, Portugal
⁴ Venture Garden Group, Ikeja, Lagos, Nigeria
⁵ Department of Arts and Social Science Education, Lead City University, Nigeria

Abstract

Digital payments evolve quickly, and the systems behind them must keep improving. This study responds by simulating a model for payment routing, a crucial part of the digital payment ecosystem. Interviews with industry professionals informed the approach and pointed to data randomization as an effective means of data collection. A randomized dataset was created in Python, and three Reinforcement Learning (RL) algorithms were implemented and evaluated on it: Epsilon Greedy, Upper Confidence Bound (UCB), and Thompson Sampling. The paper adopts the Multi-Armed Bandit (MAB) framework to model payment routing as a resource allocation problem, offering a computational approach to real-world resource allocation dilemmas. Because the evaluation is simulated, it incurs no real-time transaction costs, so the algorithms can be compared without consequences for customers, businesses, or payment providers. Among the RL algorithms studied, UCB emerges as the most effective in addressing this MAB problem, corroborating findings from prior research. The study suggests both the potential of modeling real-world problems as MAB problems and the superior performance of the UCB algorithm in solving them. The paper underscores the need for greater attention to the non-consumer-facing side of the financial services industry and calls for cross-disciplinary research to build infrastructure and software solutions. Researchers can extend this study by applying MAB algorithms in other domains where a system must choose among several options. The simulation-based approach offers a cost-effective way to test system performance and hypotheses across a range of industries.
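To make the setup in the abstract concrete, the following is a minimal Python sketch of payment routing treated as a Multi-Armed Bandit problem. It is not the authors' implementation. It assumes that each payment processor is an arm with a fixed, hidden success probability (a Bernoulli reward), and the names `simulate`, `epsilon_greedy`, `ucb1`, and `thompson`, the example success rates, and the value `epsilon=0.1` are all hypothetical choices made for illustration. UCB is shown in its common UCB1 form, which may differ from the variant used in the paper.

```python
import math
import random


def epsilon_greedy(counts, wins, t, rng, epsilon=0.1):
    """Explore a random processor with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    # Untried processors are given an estimated success rate of 0.0.
    means = [w / c if c else 0.0 for w, c in zip(wins, counts)]
    return max(range(len(counts)), key=means.__getitem__)


def ucb1(counts, wins, t, rng):
    """Pick the processor with the highest upper confidence bound (UCB1)."""
    for arm, c in enumerate(counts):
        if c == 0:  # try every processor once before using the bound
            return arm
    scores = [w / c + math.sqrt(2 * math.log(t) / c)
              for w, c in zip(wins, counts)]
    return max(range(len(counts)), key=scores.__getitem__)


def thompson(counts, wins, t, rng):
    """Sample each success rate from its Beta posterior; pick the largest."""
    samples = [rng.betavariate(w + 1, c - w + 1)
               for w, c in zip(wins, counts)]
    return max(range(len(counts)), key=samples.__getitem__)


def simulate(policy, success_rates, rounds=5000, seed=0):
    """Route `rounds` simulated transactions and return (pulls, successes).

    success_rates holds the assumed hidden success probability of each
    processor; the policy only ever sees the observed outcomes.
    """
    rng = random.Random(seed)
    counts = [0] * len(success_rates)  # transactions sent to each processor
    wins = [0] * len(success_rates)    # successful transactions per processor
    for t in range(1, rounds + 1):
        arm = policy(counts, wins, t, rng)
        success = rng.random() < success_rates[arm]
        counts[arm] += 1
        wins[arm] += int(success)
    return counts, sum(wins)


if __name__ == "__main__":
    rates = [0.60, 0.75, 0.90]  # hypothetical processor success rates
    for policy in (epsilon_greedy, ucb1, thompson):
        counts, successes = simulate(policy, rates)
        print(f"{policy.__name__:15s} pulls={counts} successes={successes}")
```

Running the script prints, for each policy, how many simulated transactions were routed to each processor and how many succeeded. Because no real transactions are involved, the policies can be compared at no cost. Which policy comes out ahead in this sketch depends on the assumed success rates, the number of rounds, and the seed.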

Journal of Computer Science

Volume 20 No. 11, 2024, 1519-1529

DOI: https://doi.org/10.3844/jcssp.2024.1519.1529

Submitted On: 7 March 2024
Published On: 12 October 2024

How to Cite: Gambo, I., Agbonkhese, C., Aina, S., Otegbayo, M. T., Adekunle, J. B., Odetola, I., Gambo, O., Oluwadare, T. & Odetola, O. (2024). Reinforcement Learning in Financial Services: Modelling Payment Switching as a Multi-Armed Bandit Problem. Journal of Computer Science, 20(11), 1519-1529. https://doi.org/10.3844/jcssp.2024.1519.1529

Copyright: © 2024 Ishaya Gambo, Christopher Agbonkhese, Segun Aina, Mogboluwaga Tayo Otegbayo, Johnson Bayo Adekunle, Israel Odetola, Omobola Gambo, Tolulope Oluwadare and Oluwatoni Odetola. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Keywords

Multi-Armed Bandit Problem
Reinforcement Learning
Digital Payments
Transaction
Simulation