Machine Learning Software Architecture and Model Workflow. A Case of Django REST Framework

Corresponding Author: Kennedy Ochilo Hadullo (Technical University of Mombasa), Institute of Computing and Informatics, Mombasa, Kenya. Email: khadullo@gmail.com

Abstract: The purpose of this study was to identify the challenges facing Machine Learning (ML) software development and to create a design architecture and a workflow for successful deployment. Despite the promise of ML technology, more than 80% of ML software projects never make it to production. As a result, a majority of companies around the world with investments in ML software are making significant losses. Current studies show that data scientists and software engineers are concerned by the challenges involved in these systems, such as: a shortage of qualified and experienced ML software experts, a lack of collaboration between experts from the two domains, a lack of published literature on ML software development using established platforms such as Django REST Framework, and cloud software tools that are difficult to use. Several attempts have been made to address these issues, such as proposing new software models and architectures, frameworks and design patterns. However, in the absence of a clear breakthrough in overcoming the challenges, this study investigates the problem further with a view to proposing an ML software design architecture and a development workflow. The study concludes by discussing how the proposed remedies help to meet the study objectives.


Introduction
Artificial Intelligence (AI) has become an important area of research in the 21st century in many fields, including marketing, education, banking, finance, agriculture, healthcare, space exploration, autonomous vehicles and law (Keshari, 2020; Hull, 2020). AI has also long been a major focus for tech leaders such as Facebook, Amazon, Microsoft, Google and Apple (FAMGA), who have all been aggressively acquiring AI startups in an effort to integrate machine learning into their products and services (Pathak, 2017).
It is noteworthy that FAMGA have announced a shift from a mobile-first world to an AI-first world (Allad, 2016). The shift implies that the focus of Information and Communication Technology (ICT) has moved from optimizing user experience through mobile phone interfaces to maximizing predictive accuracy through the use of AI.
The AI domain consists of several subfields, such as Machine Learning (ML), Deep Learning (DL), natural language processing, image processing and data mining which are also important topics in computing research and technology industries (Zhang and Tsai, 2005;Zhang et al., 2019). ML is an application of AI that provides systems with the ability to automatically learn and improve from experience without being explicitly programmed.
Alongside the interest generated by ML owing to its wide applications and benefits in computing technology, DL, a subfield of machine learning, is attracting much attention as well. DL uses artificial neural networks to mimic the workings of the human brain in processing data and creating patterns for use in decision making (Zhang et al., 2019). However, despite the potential of both ML and DL in data science projects, there is evidence that a majority of these projects do not make it to production (Redapt Marketing, 2019; Ameisen, 2020), with failure rates of up to approximately 90% being reported.

Motivation
Although significant strides have been made in the field of ML software development, there are still a considerable number of pitfalls that slow down the application of the technology in investments worldwide. Some of the issues that have motivated this study are stated below.
First, most ML work published by master's and PhD students in computer science and information technology has been implemented in the Python programming language and deployed using Tkinter (Grayson, 2000), a graphical user interface toolkit for Python desktop applications, rather than as web applications.
Secondly, the use of REST frameworks (Django or Flask) for machine learning web applications is quite complex and requires a good architecture and a clear workflow, both of which are currently lacking (Jordon, 2019).
Thirdly, there is a general lack of clear software engineering principles for successfully integrating ML models and web applications.
Finally, the reportedly high failure rate of ML software projects, of up to 80%, calls for further research into how design, development and deployment may be enhanced to improve ML software engineering.

Background
Despite the perceived benefits of ML applications, the process of developing, deploying and continuously improving them is more complex than for traditional software such as web services and mobile applications (Geron, 2019; Chen, 2015). Deployment, or simply putting a model into production, means making it available to others, whether users, management or other systems. When successfully deployed, ML projects enable users to send data and receive their predictions via web or mobile interfaces.
During the development of ML software, developers undertake three major tasks: the creation of the ML model, the design of a web application for running the model, and the successful deployment of the product as intelligent software (Chen et al., 2020; Li et al., 2015; Washizaki et al., 2019). These tasks are complex and demanding, and require the skills and input of both ML engineers and software engineers.
Task one requires a thorough knowledge of ML modeling using a machine learning programming language such as Python. Task two requires knowledge of web development using a REST framework and of integrating the model with the web application. Finally, task three involves successfully deploying the application so that it produces reliable outputs.
The challenges faced by ML engineers have prompted more research in this area with a view to alleviating them. As a result, new software design patterns and new development platforms have emerged (Ameisen, 2020; Zhang and Tsai, 2005; Zhang et al., 2019; Geron, 2019). However, these platforms have both advantages and disadvantages.
The advantages include: Better data visualization, scalability, pipelining and code debugging options. On the flipside, the use of these tools requires fundamental knowledge of advanced calculus and linear algebra along with a good understanding of web based software engineering in order to create a sustainable ML software.
Secondly, the field of data science is known to focus mainly on writing ML algorithms and developing models using data mining software such as WEKA, RapidMiner and Orange (Mikut and Reischl, 2011) and/or ML programming languages such as Python and R (Moroney, 2020), preferably with labeled data of minimal dimensionality, while optimizing the performance and accuracy of the model (Schröer et al., 2021).
Another cause of concern in ML software development is that the principles used in software engineering and ML modeling are quite divergent: while ML is concerned more with algorithm writing, testing and accuracy, software engineering deals mainly with scalability, extensibility, configuration, consistency, modularity and security (Sculley et al., 2015). It is thus difficult to produce software that seamlessly satisfies the constraints of both domains.
Lastly, there is no clear formula or procedure for integrating ML models with web applications created with Django or Flask. Moreover, while a majority of data scientists are good at creating ML models using data mining tools, very few are good at creating the same models using languages such as R or Python.
The problem is further compounded by the need to design and develop a web application and merge it with an ML application as one application (Plonski, 2019;Bajpai, 2020).

Purpose
The purpose of the study was to identify the challenges that hinder the development and deployment of machine learning software models and thereafter to create a software architecture and a deployment workflow implementable using Python's Django REST Framework (DRF).

Study Objectives
Identify the challenges facing data scientists and software engineers during Machine Learning Software Development and Deployment:

i) Develop a suitable Machine Learning Software Architecture that is deployable with Python's DRF
ii) Integrate a Machine Learning Software Deployment Workflow based on the software architecture created in objective (i).
To meet the study objectives, we propose a Software Design Architecture (SDA) to better understand the basic structure of ML software, and a Software Deployment Workflow (SDW) to guide the development and deployment of ML software and help overcome the challenges identified in the study.

Literature Review
The study reviewed the relevant literature using the framework proposed by Murad (2020) and illustrated in Fig. 1. Applying this framework, we conducted a systematic literature review and scoped the existing literature on ML software to help us define the Research Problem (RP). Once this was done, the RP was specified in a clear and structured manner by framing it using specific keywords.
The keywords used included: machine learning software development, machine learning software deployment, machine learning engineering, machine learning web applications, data science engineering, machine learning software architecture, machine learning software workflow, Django REST framework and the challenges of deploying machine learning models.
To capture as many relevant articles as possible, a range of journals, books and grey literature in the mentioned areas was searched extensively for articles containing these keywords. In total, 25 journals, 16 books and 13 grey literature sources were scoped. Of these, only 18 journals, 15 books and 8 grey literature sources were found to be relevant for review. The journals included: Journal of Systems and Software, SSRN Electronic Journal, International Journal for Research in Applied Science and Engineering Technology, Journal of Data Warehousing, Journal of Systems, Software and Wiley Online Library. The review enabled us to identify some of the processes, models, frameworks and related work within the scope of the study topic, as described in the next sections.

Machine Learning as a Model (MLaaM)
MLaaM is the output of writing ML algorithms that run on data and represents what was learned by the algorithm on training data. An algorithm in ML is a procedure that is run on data to create a machine learning model. Examples of ML algorithms include: K-nearest neighbors for classification, linear regression for regression and k-means for clustering (McClendon and Meghanathan, 2015).
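The distinction between an algorithm and the model it produces can be made concrete with a minimal sketch in plain Python (standard library only). The function names `fit_knn` and `predict_knn` and the toy data are assumptions introduced here for illustration; they do not come from the study or from any ML library.

```python
from collections import Counter
from math import dist


def fit_knn(features, labels, k=3):
    """Run the 'algorithm' on training data and return the resulting 'model'.

    For k-nearest neighbors the model is simply the stored training data plus k.
    """
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    return {"k": k, "features": list(features), "labels": list(labels)}


def predict_knn(model, point):
    """Classify one point by majority vote among its k nearest training points."""
    ranked = sorted(
        zip(model["features"], model["labels"]),
        key=lambda pair: dist(pair[0], point),
    )
    votes = Counter(label for _, label in ranked[: model["k"]])
    return votes.most_common(1)[0][0]


# Toy, made-up data: two well-separated clusters labeled "low" and "high".
X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
y = ["low", "low", "low", "high", "high", "high"]

model = fit_knn(X, y, k=3)
print(predict_knn(model, (1.1, 1.0)))  # point near the "low" cluster
print(predict_knn(model, (5.1, 5.0)))  # point near the "high" cluster
```

Here `fit_knn` plays the role of the algorithm, while the dictionary it returns is the model: the artifact that is later saved and used to make predictions.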
The model is a file that is saved after running the algorithm and represents the data, the rules and the procedures for using the data to make a prediction (Geron, 2019). The most popular programming language for MLaaM is Python, while TensorFlow (TS) is the software framework most preferred by developers for both DL and ML (Jaxenter, 2018).
ML models can be created using three techniques: supervised learning, unsupervised learning and reinforcement learning. Supervised learning algorithms, which are the most common, are trained using labeled examples, that is, inputs for which the desired output is known, while unsupervised learning is used on data that has no historical labels (Sharma, 2020).

Machine Learning as a Service (MLaaS)
Machine learning as a service (MLaaS) refers to a number of services that offer machine learning tools as part of cloud computing services (Singh, 2021; Geron, 2019). The main benefit of these tools is that customers can get started with machine learning applications quickly without installing specific software or provisioning their own servers. MLaaS providers offer services for the development and deployment of ML software projects that support data transformation, predictive analytics, data visualization and advanced ML algorithms (Geron, 2019; Zhang and Tsai, 2005; Zhang et al., 2019; Singh, 2021).
MLaaS providers normally guarantee to their clients all stages of the machine learning process, including data storage and management, model development and deployment, performance monitoring and support and ensuring maximum efficiency of the whole machine learning process (Zhang and Tsai, 2005;Zhang et al., 2019).
Providers may vary slightly in their cloud services; however, most of them offer environments that can be used to prepare data and to train, test, deploy and monitor models. Some of the popular providers include Amazon Web Services (Bankar, 2018), Google (Sanderson, 2012), IBM (Miller, 2019), Microsoft Azure (Ranjeetsingh, 2014) and Uber (Oppegaard, 2021).

ML Model Software Deployment
Software deployment is all of the activities that make a software system available for use. It is the mechanism through which applications modules are delivered from developers to users. The methods used by developers to build, test and deploy new code will impact how fast a product can respond to changes in customer preferences or requirements and the quality of each change (Fitzgerald and Stol, 2017).
In the context of ML, the process of taking a trained model and making its predictions available to users is known as deployment. ML deployment is not very well understood among data scientists who lack a background in software engineering. Conversely, most software engineers are not proficient in ML model development. Plonski (2019) highlighted four methods of deployment, outlining the requirements, merits and demerits of each. The methods are summarized in Table 1.

Django REST Framework (DRF)
Django is a free and open source, high-level Python web framework that encourages rapid development and clean, pragmatic design. Django REST Framework (DRF), where REST stands for Representational State Transfer, is a powerful and flexible toolkit built on top of Django for rapidly building web APIs based on Django database models (Jordon, 2019; Bajpai, 2020). Its advantages include security, scalability, customizability and serialization that supports both Object Relational Mapping (ORM) and non-ORM data sources (Jordon, 2019; Bajpai, 2020).
Because most ML models are created using the Python programming language, DRF is a preferred platform for ML software development.

ML Model Software Architecture (MMSA)
Building an ML software application is a complex process that brings together several components of the software engineering life cycle: requirements engineering, analysis, design, development, testing, deployment and maintenance (McGovern et al., 2004).
Thus, there is a need for a software architecture that supports both the ML model component and the web application components without negatively affecting the performance of the software (Binge, 2020).
IEEE CS (2000) defines Software Architecture (SA) as the fundamental organization of a software system embodied in its components, their relationships to each other and the principles guiding its design and evolution.
The SA for this study consists of the following components: the architectural pattern, which defines the granularity of a component; system interaction, which defines how the components communicate with each other; and software quality attributes such as scalability, extensibility, maintainability, portability, adaptability and resilience.
However, it is important to note that the type of architecture used in a software project is normally determined by the project objectives, the proposed budget, the skill set of the development team, infrastructure limits and the stakeholders' interests (Binge, 2020).

Machine Learning Operations
Machine Learning Operations, or "MLOps", is defined as the practice of collaboration between data scientists and software engineers in automatically managing the deployment of ML and DL software lifecycles (Wang, 2019). MLOps can be manual or automatic. The manual MLOps process, illustrated in Fig. 2, includes data analysis, data preparation, model training and validation, all carried out by data scientists in Jupyter Notebook. The data scientists then hand over a trained model as an artifact to the software engineering team for deployment by placing it in a code repository (Singh, 2021). The software engineers deploy the model as a prediction service using a microservice architecture with REST APIs.

Related Work
A study by Runyu (2020) to create a design pattern for ML deployment ascertained that although data scientists have come up with many good algorithms and trained models, putting those models into production is still a challenge. The key obstacles hindering ML software production are: the lack of a clear methodology for moving ML models to production, the use of monolithic programming or a lack of modularization when writing ML code, and obscure best practices in ML software development.
To overcome these challenges, Runyu (2020) developed a system design pattern named Model-Service-Client + Retraining (MSC/R), illustrated in Fig. 3. This design pattern incorporates the principles of modularization and separation of concerns and uses a microservice RESTful API architecture.
The MSC/R design pattern relies on three distinct teams of developers: data scientists working on the model, MLOps engineers working on the service and client developers working on the front end. The other important part of the design is the set of connectors linking the four main system components: model, service, retraining and client. The connectors' main function is to provide guidelines for collaboration between the system components during development.
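The separation of concerns described above can be sketched in a few lines of plain Python. This is only one possible reading of the pattern, not Runyu's implementation: the class names `MeanModel` and `PredictionService`, the function `client_request` and the JSON contract are all hypothetical, and an in-process call stands in for the RESTful API.

```python
import json


class MeanModel:
    """Model component (data scientists): a toy model that predicts the training mean."""

    def __init__(self, values):
        self.mean = sum(values) / len(values)

    def predict(self, _features):
        return self.mean


class PredictionService:
    """Service component (MLOps engineers): wraps a model behind a JSON interface."""

    def __init__(self, model):
        self.model = model

    def handle(self, request_body: str) -> str:
        payload = json.loads(request_body)
        prediction = self.model.predict(payload["features"])
        return json.dumps({"prediction": prediction})

    def swap_model(self, new_model):
        # Retraining connector: a newly trained model replaces the old one
        # without any change to the client.
        self.model = new_model


def client_request(service, features):
    """Client component (front-end developers): only knows the JSON contract."""
    response = service.handle(json.dumps({"features": features}))
    return json.loads(response)["prediction"]


service = PredictionService(MeanModel([1.0, 2.0, 3.0]))
before = client_request(service, [0.5])

# Retraining: fit a model on new data, then hand it to the service.
service.swap_model(MeanModel([10.0, 20.0, 30.0]))
after = client_request(service, [0.5])
print(before, after)
```

Because the client depends only on the JSON contract, the model can be retrained and replaced behind the service without touching front-end code, which is the kind of modularization the pattern aims for.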
In a related study by O'Leary and Uchida (2020) to identify the common problems with creating ML pipelines from existing code, data was collected via face-to-face meetings in coding workshop settings averaging 100 companies, data scientists, researchers, ML platform owners and software engineers. The companies interviewed were in the process of transforming their business through the use of ML.
The projects involved migrating existing ML models to MLaaS using Kubeflow Pipelines (KFP) and TensorFlow Extended (TFX). The study identified three problems. Firstly, owing to the highly iterative nature of ML model development, the code does not usually follow object-oriented principles such as modularization and code reuse, which makes it unsuitable for deployment using software engineering principles. As a result, engineers often need to re-implement the model from scratch as deployable software. During the re-implementation, many of the implicit assumptions made by data scientists during modeling get lost, resulting in unexpected inconsistencies and issues in production.
Secondly, most ML model development uses "monolithic programming approaches", i.e., building applications that are "single-tiered" in nature. A single-tier architecture, when used in ML, combines data, business logic and user interface code in a single logical structure. This results in a tightly coupled application that is inefficient to run and difficult to maintain.
In a related study by Sculley et al. (2015) that set out to explore the several specific design risk factors to account for in ML software deployment, the output was the Technical Debt Framework (TDF) illustrated in Fig. 4. Technical debt is an analogy used to describe a situation in software development where a workaround is used to solve a software problem (Kruchten et al., 2012;Zazworka et al., 2011). Several technical problems (debts) and potential workarounds (repayment approaches) were identified and used to create the TDF (Fig. 4).
Failure to repay technical debts may hinder successful deployment. The debts include issues related to design, coding, testing, documentation, versioning and infrastructure. Repayment can be done via automation, re-writing, refactoring, re-engineering, re-packaging, bug fixing and improving documentation. Repayment results in improved software quality. ML systems have a tendency to incur technical debts because of the already stated problems related to the domains of ML and software engineering.
Another study, by Chen et al. (2020), set out to identify the challenges in deploying DL software and proposed an ML deployment process consisting of four phases: DL data collection, DL model training, model conversion and exportation to TensorFlow, and platform configuration and deployment (Fig. 5).
The DDDM has two facets: DL software development and DL software deployment. The first facet makes use of TensorFlow and Keras to integrate models into software applications for real usage after validation and testing. The second facet involves deploying the model on a cloud-based server platform such as AWS SageMaker or Google Cloud. The deployment challenges identified include: converting models to platform formats, configuration errors encountered during integration, limited skills in ML software development, and data processing challenges when converting raw data into the input format needed by the model software. To obtain the data relevant for the study, over 3,023 posts from Stack Overflow (2020), specifically those concerning TensorFlow Serving, Google Cloud ML and Amazon SageMaker, were collected and analyzed.
In another related study, Esmaeilzadeh (2017) designed an architecture and developed a testable, scalable and efficient web-based application that models and implements machine learning applications in cancer prediction. The main components of the system architecture included a server, a database, a programming language, the Django web framework, front-end design, testability, scalability, performance and a design pattern (Fig. 6).
The data set for the study's application was a subset of the Surveillance, Epidemiology and End Results (SEER) Program of the National Cancer Institute. The application was implemented with Python as the back-end programming language, Django as the web framework and MySQL as the database server. The front layer of the application was built using HTML, CSS and JavaScript.
The study used automated testing approaches to: ensure that the application works as expected before deployment, ensure that new functionality does not change the behavior of the application in unexpected ways, find and fix bugs, and test the performance of the application under heavy loads.
Washizaki et al. (2019) embarked on a study with the purpose of collecting, classifying and discussing the best practices for designing quality and complex ML systems (Fig. 7). The study set out to collect good and bad design patterns for ML software so as to provide developers with a comprehensive classification of such patterns. Using a questionnaire-based survey, the study established that ML engineers lack expertise in developing architectures and design patterns. The study formulated a design pattern based on the Model View Controller (MVC) pattern with three layers: the presentation layer, the logic layer and the data layer.

Technical Debts and Descriptions
Requirement: Not specified
Architectural TD
Documentation TD: Incomplete and insufficient documentation
Infrastructure TD: Old infrastructure, lack of integration and lack of automated deployment
Versioning TD: Unnecessary code forks, multi-version support

Table 2

| No. | Model or framework reviewed | Advantages | Disadvantages |
|---|---|---|---|
| 1 | Model-Service-Client + Retraining (MSC/R) design pattern (Runyu, 2020) | Incorporates the principles of modularization and separation of concerns and uses a microservice architecture | Does not clearly explain how the model, service and client components are integrated |
| 2 | A test-driven approach to develop web-based machine learning applications (Esmaeilzadeh, 2017) | Uses automated testing approaches to make sure the application is working as expected before deployment | No clear explanation of how the rest of the application was developed using Python, Django, MySQL, CSS and JavaScript. No mention of how deployment was done |
| 3 | DL software deployment model (Chen et al., 2020) | Uses TensorFlow and Keras to integrate models into software applications, deploying the model on a cloud-based server platform such as AWS SageMaker | No clear methodology on how TensorFlow and Keras were used. No clarity on how deployment was done |
| 4 | Software engineering design pattern for designing machine learning systems (Washizaki et al., 2019) | Exposed the lack of expertise of ML engineers in ML software development; created an MVC for ML software | No clear methodology on how the logic, data and presentation layers were created, integrated and deployed together with the ML model |
| 5 | The Technical Debt Framework (adopted from Li et al., 2015) | Identified some of the risks in ML software deployment, called technical debts. Identified debt repayment approaches | No mention of an architecture or a deployment workflow |
| 6 | Common problems with creating machine learning pipelines from existing code (O'Leary and Uchida, 2020) | Used Kubeflow Pipelines (KFP) and TensorFlow Extended (TFX) | Methodology for both development and deployment not clear |

Summary of Literature Review
After a comprehensive literature review, the results are summarized based on the model or framework reviewed, in terms of the advantages and disadvantages of each framework and model (Table 2).

Proposed ML Software Model Deployment Architecture (DFMSA)
The proposed architecture describes the major components of both the ML model and the Django part, their relationships (structures) and how they interact with each other. This architecture is known as the Django

SA1: User Interface
The user interface connects the administrator and ordinary users to the system through the Admin Panel and the Client Interface. Beneath this SA lie the static and template folders containing the CSS, HTML, JavaScript and JSON files. The SA connects with the rest of the application through the application's URLs file.

SA2: Django API
The Django API is made up of the following files: views.py for logic, models.py for database code, apps.py for application configuration, urls.py for providing paths, admin.py for administrative functions and tests.py for writing tests. The files work in conjunction to make the application accept user data and return predictions.

SA3: System Configuration Files
The configuration files such as settings.py and urls.py are vital in linking the system files together.
For example, they are useful in creating paths and importing files, linking the static and template files, defining database credentials and middleware components and linking the installed apps and security key.

SA4: Serialization/De-Serialization
Object serialization is the process of saving an ML model, for example as a Pickle or Joblib file, or by manually saving and restoring it using a JSON approach. Serialization represents an object as a stream of bytes so that it can be stored on disk, sent over a network or saved to a database. Deserialization is the reverse process: the stored byte stream is loaded back into an in-memory model object, for example inside a Jupyter Notebook (ipynb) or the Django application.
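A minimal sketch of the save and restore round trip, using only Python's standard `pickle` module, is given below. The dictionary standing in for a trained model, the file name `model.pkl` and the helper names `save_model` and `load_model` are assumptions made for illustration; a real project would pickle the fitted estimator object produced in the notebook.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a trained model: learned parameters kept in a dict.
trained_model = {"algorithm": "linear_regression", "slope": 2.0, "intercept": 1.0}


def save_model(model, path):
    """Serialize the model object to a byte stream on disk."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Deserialize the byte stream back into an in-memory object.

    Only unpickle files you trust: pickle can execute code while loading.
    """
    with open(path, "rb") as handle:
        return pickle.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"  # hypothetical file name
    save_model(trained_model, model_path)
    restored = load_model(model_path)

# The restored object behaves exactly like the original one.
prediction = restored["slope"] * 3.0 + restored["intercept"]
print(restored == trained_model, prediction)
```

In the proposed architecture the saving step would typically happen in the notebook and the loading step in the web application's start-up code, so that the model is read from disk once rather than on every request.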

SA5: Server and Repository
Heroku is a cloud Platform as a Service (PaaS) supporting several programming languages, such as Ruby, Java, Node.js, Scala, Python and PHP. One advantage of Heroku is that if the project has already been pushed to GitHub, automatic deployments from the project's GitHub repository can easily be set up from the Heroku dashboard.

SA6: Command Line Utility
This component contains two major utilities: manage.py, a command-line utility that lets you interact with the Django project in various ways, and django-admin.py, Django's command-line utility for administrative tasks.

Proposed ML Software Model Deployment Workflow (SMDW)
The proposed SMDW is derived from the proposed architecture and the literature review summary (Table 2) and is illustrated in Fig. 9.

Phase 1: Start
During this phase, the software engineer starts by setting up a GitHub account, installing the Python virtual environment, creating a Django project and adding the application files to the project, and then commits the code to the GitHub Repository (GHR). This prepares the software engineering part of the project.

Phase 2: Build ML Model
During this phase, an ML engineer or a data scientist installs Jupyter Notebook and installs and loads all the initial packages required for the project. This is followed by loading and pre-processing the data file, then writing, training and saving the algorithms before adding the code to the GHR.

Phase 3: Build Django App
During this phase, the software engineer continues the work started in Phase 1 by adding the database models, creating the REST APIs for the models, adding DRF serializers, adding views and URLs and adding the code to the GHR.

Phase 4: Integrate ML Model in Django App
During this phase, the software engineer continues the work of Phase 3 by writing the ML server code for the model, writing test code, creating a registry, adding the algorithms to the registry and then adding the code to the GHR.
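The registry step can be sketched in plain Python as follows. This is an illustrative, framework-free assumption of what such a registry might look like: the names `MLRegistry`, `ThresholdClassifier` and `compute_prediction`, and the endpoint, name and version values, are hypothetical. In a Django project the registry would usually also record its entries in database models.

```python
class MLRegistry:
    """Keeps track of the algorithm objects available to the web application."""

    def __init__(self):
        self._algorithms = {}

    def add_algorithm(self, endpoint, name, version, algorithm):
        """Register an object exposing compute_prediction() under a unique key."""
        if not callable(getattr(algorithm, "compute_prediction", None)):
            raise TypeError("algorithm must define compute_prediction()")
        key = (endpoint, name, version)
        if key in self._algorithms:
            raise ValueError(f"{key} is already registered")
        self._algorithms[key] = algorithm

    def get(self, endpoint, name, version):
        return self._algorithms[(endpoint, name, version)]

    def __len__(self):
        return len(self._algorithms)


class ThresholdClassifier:
    """Hypothetical ML server code: wraps input handling and prediction."""

    def __init__(self, threshold):
        self.threshold = threshold

    def compute_prediction(self, input_data):
        try:
            score = float(input_data["score"])
        except (KeyError, TypeError, ValueError) as exc:
            return {"status": "Error", "message": str(exc)}
        label = "positive" if score >= self.threshold else "negative"
        return {"status": "OK", "label": label}


registry = MLRegistry()
registry.add_algorithm("classifier", "threshold", "0.0.1", ThresholdClassifier(0.5))

# Test code in the spirit of the "write test code" step of this phase.
algorithm = registry.get("classifier", "threshold", "0.0.1")
assert len(registry) == 1
assert algorithm.compute_prediction({"score": 0.9})["label"] == "positive"
assert algorithm.compute_prediction({})["status"] == "Error"
```

Keying the registry on endpoint, name and version allows several variants of an algorithm to coexist, which is what the later A/B testing phase relies on.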

Phase 5: Make Predictions
During this phase, the software engineer continues the work of Phase 4 by creating views for predictions, creating DB models for tests, creating REST APIs for tests, writing scripts for sending requests and adding the code to the GHR.
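The logic that a prediction view typically carries (parse the request, validate it, call the model, build a response) can be sketched without any web framework. The sketch below is not DRF code: `predict_view`, `toy_model` and the field names `age` and `income` are hypothetical, and the view is called directly instead of over HTTP so that the example runs on its own.

```python
import json


def toy_model(features):
    """Hypothetical stand-in for a deserialized model: a fixed linear rule."""
    return 0.5 * features["age"] + 2.0 * features["income"]


def predict_view(request_body, model=toy_model, required=("age", "income")):
    """Mimic what a prediction view does: parse, validate, predict, respond.

    Returns an (HTTP status code, JSON string) pair.
    """
    try:
        payload = json.loads(request_body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "request body is not valid JSON"})
    if not isinstance(payload, dict):
        return 400, json.dumps({"error": "request body must be a JSON object"})
    missing = [name for name in required if name not in payload]
    if missing:
        return 400, json.dumps({"error": f"missing fields: {missing}"})
    try:
        features = {name: float(payload[name]) for name in required}
    except (TypeError, ValueError):
        return 400, json.dumps({"error": "all fields must be numeric"})
    return 200, json.dumps({"prediction": model(features)})


# A script for sending requests would POST this body to the prediction URL;
# here the view function is called directly instead.
status, body = predict_view(json.dumps({"age": 40, "income": 3.0}))
print(status, body)
print(predict_view("{not json")[0])
print(predict_view(json.dumps({"age": 40}))[0])
```

Rejecting malformed input with a 400 response before the model is called keeps model errors from surfacing as server errors, which also makes the request scripts of this phase easier to write and check.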

Phase 6: A/B Testing
A/B testing in the context of this study is the process of comparing two outputs of the ML software's predictions and concluding which of the two outputs or variants is more effective or accurate. The other steps of the project are then repeated: creating views for predictions, creating DB models for tests, creating REST APIs for tests, writing scripts for sending requests and adding the code to the GHR.
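As a rough illustration of the comparison step, the sketch below scores two variants by accuracy on the requests each one served. The function names `accuracy` and `ab_test` and the logged values are made up for this example; a real comparison would need a sufficiently large number of requests, and ideally a statistical significance test, before declaring a winner.

```python
def accuracy(predictions, actuals):
    """Fraction of predictions that match the observed outcomes."""
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    correct = sum(p == a for p, a in zip(predictions, actuals))
    return correct / len(actuals)


def ab_test(results_a, results_b, actuals_a, actuals_b):
    """Compare two variants on the requests each one served."""
    acc_a = accuracy(results_a, actuals_a)
    acc_b = accuracy(results_b, actuals_b)
    if acc_a == acc_b:
        winner = "tie"
    else:
        winner = "A" if acc_a > acc_b else "B"
    return {"accuracy_a": acc_a, "accuracy_b": acc_b, "winner": winner}


# Made-up logged predictions and the outcomes observed later.
summary = ab_test(
    results_a=[1, 0, 1, 1], actuals_a=[1, 0, 0, 1],
    results_b=[1, 1, 0, 1], actuals_b=[1, 1, 0, 1],
)
print(summary)
```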

Conclusion and Recommendations
This study investigated the challenges that hinder the development and deployment of ML software models in order to create an architecture and a deployment workflow implementable using Python's DRF. After a systematic literature review, the main challenges were found to be: unethical programming practices, a lack of software development skills that integrate both data science and software engineering, difficulty in using the software and tools for developing ML software, and the lack of a clear methodology for deployment. A suitable ML software architecture and model workflow are presented as a solution to deployment problems within ML engineering. The study aims to benefit ML software engineers in industry by helping to increase the rate at which ML software reaches production, as well as master's and PhD students in IT and computer science who are writing theses on ML software. It is recommended that the proposed architecture and deployment workflow be used to deploy an ML software application as a test.