Pedagogical Resources Indexation Based on Ontology in Intelligent Recommendation System for Contents Production in d-Learning Environment

Abstract: Learning environments are evolving rapidly today. In parallel, a problem has emerged: how to capitalize on the resources already produced when switching from one environment to another. The heterogeneity of the environments, the evolution of the platforms and the wish to reuse existing educational resources led us to design an intelligent case-based system. In this study, we focus on the need for resource indexing, which makes it easier to search for and recommend educational resources to authors regardless of the learning environment used. In the literature, the representation used for indexing can take two forms: standards or ontologies. Standards only partially solve our problem, since they are mainly beneficial for systems that are still under construction. For systems that are already designed and that we wish to reuse, ontologies are more appropriate, especially since they have given authors great satisfaction in the field of knowledge management. Indeed, their use does not require any investment in the environments concerned by the reuse.


Introduction
In the platforms that are the subject of our study, the wealth of educational resources is growing continuously. It is therefore worthwhile for a resource producer, before starting production, to run a search in order to reuse existing resources. However, the diversity of the platforms (e-Learning, Massive Open Online Courses (MOOC) and Open Educational Resources (OER)) and of the resource structures makes searching these platforms very difficult, if not impossible.
Our paper focuses on the indexing step, which is the first important step of our architecture. The goal of indexing is to make it possible to search, classify and organize objects. When a search engine queries an indexed document base, the system returns all documents indexed by the same keywords of the documentary language used, regardless of the language of the documents and of whether the keywords actually appear in them.
In order to widen the scope of the search in a Linked Open Data context, the different vocabularies can be aligned using semantic web technologies. This makes it possible to compare common or differing terms and to enrich one vocabulary with others (equivalent terms, translations into other languages, etc.). Processing a data set indexed with controlled vocabularies makes it easy to search for information and to link data using these vocabularies (Faqihi et al., 2018).
The indexing of educational resources involves several actors around a learning information system with pedagogical aims. A case-based system allows a high degree of resource reuse. In an environment of indexed educational resources, a search for a given learning purpose can retrieve resources similar to the expected goal. Moreover, successive iterations and successful experiences enrich the system over time and make its use increasingly beneficial to authors.
Several questions arise: How to index already existing educational resources to be able to find them for reuse? How to ensure the sustainability of educational resources? How to search across the concepts in heterogeneous platforms?
This paper is organized as follows. The first section recalls the architecture of the intelligent system for producing educational resources (Faqihi et al., 2018); we briefly detail its stages, focusing on the step addressed in this paper, namely indexing. We then describe the contribution of metadata to the indexing of educational resources and present the indexing process in an intelligent production system with heterogeneous environments, including its contribution to the sustainability of educational resources in e-learning and MOOC settings, based on similar work. Next, through a state of the art, we present the contribution of ontologies to the indexing of educational resources. We then propose our global ontology, its structure and its validation; this ontology will ensure the indexing of educational resources for our intelligent production system. Finally, we conclude with perspectives.

Content Production System Based on Case-Based Systems
For the production of educational resources, we opted for an architecture based on case-based systems.
We presented our vision for making pedagogical resources exploitable: we want to capitalize on existing productions. For example, a tutor who wants to produce a course must first perform a search in order to reuse existing resources, and must be able to reuse a resource despite the structural heterogeneity between the existing resource and the one being produced. The multitude of environments generates a multitude of educational resources, and each resource has its own educational objectives.
How can an author make the best use of existing pedagogical resources when producing courses in an educational environment?
Our goal is to pool efforts and minimize the investment required from authors. We propose that the author reuse existing resources through communication between learning environments (e-learning, OER and MOOC). This is the foundation of the concept of interoperability. Since we are in the pedagogical field, semantics is of great importance: pedagogical objectives have a meaning that must be preserved.
Open Educational Resources (OERs) and MOOCs are also rich learning content that needs to be reused. Several studies (Faqihi et al., 2018) show the importance of case-based systems in the field of Artificial Intelligence. We proposed using these systems to solve our problem. The principle of case-based systems is essentially based on use cases.
Fig. 1: Scenario of production of learning content through case-based reasoning (Faqihi et al., 2018). Figure labels: new problem (case of new production of learning content); integration of as much information as possible relating to the educational resource sought; indexing on the basis of ontology; similarity search (semantic similarity with respect to the criteria introduced by the producer).
An author who wishes to reuse an already existing resource can also rely on the use cases of the resource in question and on their number. The author can propose a new use case for this resource. This positively impacts its importance, its adaptability and, ultimately, its relevance.
As a result, when an author wants to produce educational content, it is important to first capitalize on existing production. To do so, the author must first perform a search, according to a predefined educational objective, in the various platforms covered by the search. The result obtained is a set of resources that do not necessarily meet the needs initially expressed by the author.
To bring the result closer to the pedagogical objectives, an indexing step is essential in order to sort the results by relevance. During this phase, a pedagogical intervention by an expert is a prerequisite. Fig. 1 illustrates our vision.
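The retrieve, reuse and retain steps of this scenario can be sketched as follows. This is only a minimal illustration under stated assumptions, not the system described in (Faqihi et al., 2018): the `Case` and `CaseBase` structures, the keyword-overlap (Jaccard) similarity and the sample identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """A past production: the objective it served and the resource reused."""
    objective_keywords: frozenset
    resource_id: str


@dataclass
class CaseBase:
    cases: list = field(default_factory=list)

    @staticmethod
    def similarity(a: frozenset, b: frozenset) -> float:
        # Jaccard overlap between two keyword sets (illustrative choice).
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    def retrieve(self, keywords: frozenset, top_k: int = 3) -> list:
        # Rank stored cases by similarity to the new problem.
        ranked = sorted(
            self.cases,
            key=lambda c: self.similarity(keywords, c.objective_keywords),
            reverse=True,
        )
        return ranked[:top_k]

    def retain(self, keywords: frozenset, resource_id: str) -> None:
        # A validated reuse becomes a new case for future searches.
        self.cases.append(Case(keywords, resource_id))


# Hypothetical case base with two earlier productions.
base = CaseBase()
base.retain(frozenset({"sql", "joins", "beginner"}), "mooc:db-101")
base.retain(frozenset({"python", "loops"}), "oer:py-intro")

# New problem: the author looks for material on SQL joins.
best = base.retrieve(frozenset({"sql", "joins"}), top_k=1)[0]

# Once the expert has validated the candidate, the reuse is stored as a new case.
base.retain(frozenset({"sql", "joins"}), best.resource_id)
```

In this sketch the expert validation is reduced to a comment; in practice it is the step that decides whether the retrieved case is retained at all.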

An Overview of the Contribution of Metadata for Indexing
The evolution of educational resources and objects has led to the development of many metadata standards, whose goal is to index and reference educational resources. In the field of learning, there are several standards for structuring metadata. However, for learning environments to be able to communicate and exchange with one another, a common agreement must be reached to overcome the problem of schema heterogeneity, and tools and mechanisms must be developed that can reach a high level of interoperability and alignment between metadata.
The goal of a metadata schema is to define the fields that describe a resource. For each field, it also defines its type, whether it is mandatory, its default value, whether it is modifiable, etc.
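As an illustration of what such a field definition covers, the sketch below models a schema whose fields carry a type, a mandatory flag, a default value and a modifiable flag. The `FieldSpec` structure and the three-field schema are assumptions made for the example; they do not reproduce any particular standard.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class FieldSpec:
    name: str
    type_: type
    mandatory: bool = False
    default: Optional[Any] = None
    modifiable: bool = True


# Hypothetical three-field schema, for illustration only.
SCHEMA = [
    FieldSpec("title", str, mandatory=True),
    FieldSpec("language", str, default="en"),
    FieldSpec("identifier", str, mandatory=True, modifiable=False),
]


def validate(record: dict) -> dict:
    """Return a completed copy of `record`, or raise ValueError."""
    result = {}
    for spec in SCHEMA:
        value = record.get(spec.name, spec.default)
        if value is None:
            if spec.mandatory:
                raise ValueError(f"missing mandatory field: {spec.name}")
            continue
        if not isinstance(value, spec.type_):
            raise ValueError(f"wrong type for field: {spec.name}")
        result[spec.name] = value
    return result


def update(record: dict, name: str, value: Any) -> dict:
    """Return a copy of `record` with one field changed, if the schema allows it."""
    spec = next(s for s in SCHEMA if s.name == name)
    if not spec.modifiable:
        raise ValueError(f"field is not modifiable: {name}")
    return validate({**record, name: value})


# The optional "language" field is filled in with its default value.
record = validate({"title": "Intro to SQL", "identifier": "res-001"})
```

An application profile, in this view, would amount to a local variant of `SCHEMA` with different mandatory flags or defaults.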
In the literature, there are a large number of metadata schemas. Their objective is to describe the attributes of the different types of resources and to find a common agreement between the existing structures. In other words, they aim to propose a schema that is as complete as possible without making it specific to a domain or to a type of educational resource.
Another important factor when choosing a schema is the time needed to collect the metadata and fill in the resource form. Ideally, the metadata should be collected at the same time as the data, so that the resource can be tracked as soon as it is published. However, the obligation to complete a very detailed form may reduce the willingness to publish. This explains the use, in specific contexts, of application profiles.
In the literature related to our context, three major metadata schemas serve as references: Dublin Core, LOM and MLR.

Dublin Core
Dublin Core is a descriptive format that is both simple and generic. It comprises 15 elements and was created in 1995 in Dublin (Ohio) by OCLC and the National Center for Supercomputing Applications (NCSA).
The elements provided by Dublin Core are all repeatable and optional. Fig. 2 shows the pedagogical scheme of the Dublin Core standard.
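To make the "repeatable and optional" property concrete, the following sketch builds a record restricted to the 15 elements of the Dublin Core Metadata Element Set. The helper `make_dc_record` and the sample values are hypothetical; only the element names come from the standard.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_dc_record(**values):
    """Build a record; every element is optional and repeatable (a list)."""
    record = {}
    for name, value in values.items():
        if name not in DC_ELEMENTS:
            raise KeyError(f"not a Dublin Core element: {name}")
        # Accept a single value or a list of values for any element.
        record[name] = list(value) if isinstance(value, (list, tuple)) else [value]
    return record


# Hypothetical resource description: two creators, no publisher, no date.
record = make_dc_record(
    title="Introduction to Databases",
    creator=["A. Author", "B. Author"],
    language="en",
)
```

Storing every element as a list keeps single and repeated values uniform, which simplifies later indexing.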

LOM
LOM is the acronym for "Learning Object Metadata". Aimed at the needs of e-learning, the LOM standard is a descriptive schema for representing digital and non-digital pedagogical resources in the context of teaching, training or learning. It is based on the principle of granularity of learning objects, which means that it is very detailed. The LOM profile defines sixty elements structured in nine categories to describe any educational resource accurately. The educational nature of the resource is represented by 11 elements. The main objective of the LOM standard is to provide a common framework at the international level to ensure the interoperability of existing description schemes. Fig. 3 shows the LOM structure as well as the set of attributes and their level of arrangement.
The scheme of the LOM standard is described as follows:
• General: Groups the characteristics of a resource that are independent of the context of use (title, language, description, etc.)
• Life Cycle: Describes the current state of a resource and who has contributed to its evolution (version, contributing entities, etc.)
• Meta-Metadata: Collects the data describing the descriptive record itself rather than the resource
• Technical: Describes the technical characteristics (format, size, location, etc.)
• Educational: Describes the pedagogical characteristics of a resource (type of resource, role of the user, context of use, etc.)
• Rights: Specifies the conditions of use of a resource (costs, copyrights, etc.)
• Relation: Describes the relationships, if any, between the resource and other resources
• Annotation: Offers comments on the pedagogical use of a resource
• Classification: Describes the location of the resource in a given classification system
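The nine categories listed above can be pictured as nine groups of fields attached to one resource. The sketch below is a simplified illustration: the attribute names follow the category names, while the keys placed inside each category and the sample values are assumptions, not the complete LOM element set.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class LomRecord:
    """One dictionary per LOM category; keys inside are illustrative."""
    general: dict = field(default_factory=dict)
    life_cycle: dict = field(default_factory=dict)
    meta_metadata: dict = field(default_factory=dict)
    technical: dict = field(default_factory=dict)
    educational: dict = field(default_factory=dict)
    rights: dict = field(default_factory=dict)
    relation: dict = field(default_factory=dict)
    annotation: dict = field(default_factory=dict)
    classification: dict = field(default_factory=dict)

    def filled_categories(self) -> list:
        # Names of the categories that carry at least one value.
        return [name for name, content in asdict(self).items() if content]


# Hypothetical description of a course module: only three categories are filled.
lom = LomRecord(
    general={"title": "Intro to SQL", "language": "en"},
    technical={"format": "text/html"},
    educational={"learning_resource_type": "exercise", "context": "higher education"},
)
```

Leaving most categories empty, as here, reflects the criticism that a very detailed schema is rarely filled in completely by producers.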

MLR
MLR is an innovative new standard whose purpose is the description of educational resources in an international context that is both multilingual and multicultural. It includes the latest developments in the dissemination of open and linked data.
Published in 2011, the multipart international standard ISO/IEC 19788, Metadata for Learning Resources (MLR), is a new educational standard initiated and produced by subcommittee 36 of ISO, the International Organization for Standardization.
The MLR standard has been supported by the French AFNOR delegation in order to cope with the gradual loss of international pedagogical interoperability that resulted from the proliferation of LOM application profiles and of other, completely independent schemes.
The structure of the MLR standard is illustrated in Fig. 4.

Indexing of Educational Resources
Why is it necessary to index pedagogical resources? In a distance learning platform the constraints are stronger, since we are dealing not just with simple indexing but with a transfer of content from one platform to another. In other words, the issue is the interoperability of learning content between e-learning systems. Standardizing the method used to index this content will ease the migration and conversion constraints. Re-creating content that is integrated into a single platform is a tedious and expensive step. The idea is that the designers of content and learning environments use a common indexing format to facilitate interoperability between these different platforms. The production of a pedagogical resource is both long and costly, hence the importance of sharing and exchanging it in order to capitalize on the efforts already made during its creation. Therefore, during its development, we must meet requirements and standards for its description, in this case its indexing. In this way we make it profitable and optimize its reuse. During design, we have to think about both substance and form, that is, about the content and its descriptors.
Indexing and referencing the pedagogical resources produced has therefore become an obligation in order to allow them to be shared and pooled, and thus to ensure the evolution towards a high-quality, relevant, multi-resource learning system that meets the needs of tutors and learners. In this context, the concept of metadata plays a central role: it covers technical, semantic and organizational aspects and facilitates the classification, description and indexing of pedagogical resources. These aspects are essential where the relevance of learning depends not only on access to content based on criteria such as the subject, title, author or date of publication of a resource, but also on additional elements such as the learning context, the learner profile, the level of education, the hourly volume, etc. These elements, among others, form the basis of the description and indexing patterns and models around which there is consensus at all levels: local, national, regional and international. Indeed, the aim is to harmonize resource description rules in order to optimize search, sharing, reuse and interoperability when switching between heterogeneous learning environments.
The indexing process is very useful because it facilitates the search, classification and organization of pedagogical objects. It is a key component in the field of knowledge management. Indeed, when a search engine queries an indexed resource database, the objects indexed by the same keywords are found, even if the keyword does not appear in these resources or if they are written in other languages. In a Linked Open Data context, vocabulary alignment with Semantic Web technologies allows the comparison and enrichment of common terms and vocabularies (equivalent term, translation into another language, etc.). The indexing of pedagogical resources involves several actors around a system with educational objectives. There are several definitions of the concept of indexing; we quote the following: "Process to represent, by means of terms or indices of a documentary language or by means of free language elements, the characteristic notions of a document (resource, collection) or a question, in order to facilitate their search, after having identified them by the analysis. The possible combinations of the identified notions are represented explicitly (pre-coordinated indexing) or not (post-coordinated indexing) according to the possibilities of the documentary language used." (Viviane and Céline, 2013).
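The behaviour described above, where a resource is found through its index terms even when the keyword is absent from its text or the resource is in another language, can be sketched with a small controlled vocabulary. The vocabulary, the concept identifiers and the resource identifiers below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary: surface terms (any language) -> concept.
VOCABULARY = {
    "database": "C_DATABASE",
    "base de données": "C_DATABASE",
    "algorithm": "C_ALGORITHM",
    "algorithme": "C_ALGORITHM",
}

index = defaultdict(set)  # concept -> ids of the resources indexed by it


def index_resource(resource_id, concepts):
    """Record the concepts an indexer assigned to a resource."""
    for concept in concepts:
        index[concept].add(resource_id)


def search(term):
    """Map the query term to its concept, then return the indexed resources."""
    concept = VOCABULARY.get(term.lower())
    return sorted(index[concept]) if concept in index else []


# The French resource may never contain the word "database" in its text,
# yet the indexer files it under the same concept as the English one.
index_resource("oer:cours-sgbd-fr", ["C_DATABASE"])
index_resource("mooc:db-101-en", ["C_DATABASE"])
index_resource("elearn:algo-1", ["C_ALGORITHM"])

hits = search("database")
```

Because the lookup goes through the concept rather than the words of the resource, an English query and a French query return the same set.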
Fig. 5 illustrates the participation of all actors of a learning system in the process of indexing pedagogical resources.

According to a review of the literature, the key solutions to the problem of indexing pedagogical resources are either standards or ontologies. In this section, we present how each is used, together with some work done within our research team.

Standardization
This is one of the solutions adopted in the literature to overcome the problem of indexing. It is effective for two main reasons. First, it can be implemented very quickly, since standards are applied when the system is designed. Second, it is less expensive, because the standards are already developed and in place and are ready to be reused. On the other hand, it is very difficult to apply standards when capitalizing on a history or an existing system. Therefore, there must be another way to represent knowledge. Before going further, the following terms must be clarified:
• A norm is a set of compliance rules issued by a standardization body at the national or international level
• A standard is a set of recommendations emanating from a representative group of users gathered around a forum, such as the Internet Engineering Task Force (IETF), the World Wide Web Consortium (W3C), or the Dublin Core working group
• An application profile is a local adaptation of a norm or standard, based on the particular needs and practices of a community

Ontology
An ontology defines the common vocabulary for different entities that want to share information in a specific area. It is "an explicit formal description of concepts in an area of discourse". It therefore allows two entities to exchange information, eliminate conflicts and improve communication and the sharing of meaning. Existing research has found ontologies to be among the best ways to represent knowledge (Naçima, 2007; Battou, 2012; Izza, 2006). Finding matches between learning environments amounts to finding matches between the concepts of the ontologies that represent them. To identify these correspondences, Davies et al. (2006) distinguish three broad categories:
1. Mapping: Its principle is to represent the correspondences between the ontologies. The best-known tools are MAFRA (Maedche et al., 2002), IF-MAP (Kalfoglou and Schorlemmer, 2003) and C-OWL (Bouquet et al., 2004)
2. Fusion: Its aim is to create a new ontology based on the knowledge of the original ontologies. Tools proposed for this category include CHIMAERA (Mcguinness et al., 2000) and PROMPT (Noy and Musen, 2000)
3. Alignment: As with the two previous categories, it consists in discovering the correspondences between the ontologies. Methodologies for this category include QOM (Ehrig and Staab, 2004), ASCO (Bach Thanh et al., 2004) and Anchor-PROMPT (Natalya and Musen, 2000)
Among the works in the literature that have used ontologies, we find for example:
• Battou (2012), whose main purpose is to examine the interest of fine granularity for the adaptation of courses in an adaptive dynamic hypermedia system. The work concerns the automatic generation of courses adapted to a particular learner, from a set of educational resources and according to the learner's needs, preferences and prerequisites. The pedagogical resources, currently known as pedagogical objects, are indexed using educational metadata norms and standards such as LOM and SCORM. These pedagogical objects, which constitute the content to be learned, are assembled from pedagogical grains and then combined to constitute individual training courses with a hypermedia-type presentation
• Naçima (2007), a work that deals with the problem of the semantic alignment of goals in a distributed environment. To solve it, the authors proposed an approach whose objective is to establish links between the ontologies of goals based on distributed logics. In order to automate this alignment, they relied on the IF (Information Flow) model, which provides a basic theory for the formalization of connections between systems. Thus, goals represented in terms of ontologies can be connected semantically if they satisfy a certain number of rules
The literature review also shows (Du et al., 2019) that the contribution of ontologies, and more broadly of knowledge bases, to the domain of recommendation systems is well established. Ontologies can be integrated into a recommendation system to improve its performance. Two main uses have been explored so far. The first aims to represent the profiles of the users of the system, either by concepts defined in the ontology or by instances of the class to which the recommendation refers; the values associated with each user profile element are adjusted based on user actions and feedback. The second aims mainly to measure the semantic proximity of items, represented by instances of the classes of the ontology, by taking into account the properties that characterize them and their semantic relationships.
An extract from existing recommendation systems shows that ontologies seem especially relevant in the context of content-based filtering (CBF) or hybrid approaches (see summary).
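One common way to measure the semantic proximity of two ontology classes is the Wu-Palmer similarity, which compares the depth of their deepest common ancestor with their own depths. The source does not say which measure the cited systems use, so the choice of Wu-Palmer here is an assumption made for illustration, as is the small class hierarchy.

```python
# Hypothetical class hierarchy: child -> parent (the root has parent None).
PARENT = {
    "Resource": None,
    "Course": "Resource",
    "Assessment": "Resource",
    "VideoLecture": "Course",
    "TextLesson": "Course",
    "Quiz": "Assessment",
}


def path_to_root(cls):
    """Return [cls, parent, ..., root]."""
    path = []
    while cls is not None:
        path.append(cls)
        cls = PARENT[cls]
    return path


def depth(cls):
    # The root has depth 1.
    return len(path_to_root(cls))


def wu_palmer(a, b):
    """2 * depth(lcs) / (depth(a) + depth(b)), lcs = deepest common ancestor."""
    ancestors_a = path_to_root(a)
    # Walking upward from b, the first shared class is the deepest one.
    lcs = next(c for c in path_to_root(b) if c in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


siblings = wu_palmer("VideoLecture", "TextLesson")  # share the class Course
cousins = wu_palmer("VideoLecture", "Quiz")         # share only the root
```

Two kinds of lecture score higher than a lecture and a quiz, which matches the intuition that items under the same class are closer.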

Synthesis
Standardization would be the ideal choice, but it cannot be adopted here, since it is really only suited to systems under construction. Our work, by contrast, deals with an existing situation in which the learning systems are already in place and contain very rich and varied content. It would be difficult to rebuild this content according to a set of standards.
The LOM standard, also called IEEE-LOM, is widely used. It has been customized in some countries, such as France and Canada, to distinguish between mandatory and optional attributes. The only criticism found in the literature is that it is so detailed that producers sometimes prefer not to use it. Over the last ten years, a new standard has emerged, called Metadata for Learning Resources (MLR). In order to integrate the existing Dublin Core and IEEE-LOM standards and subsequently open up to the Web of data, or Semantic Web, it was based essentially on the RDF formalism.
In brief, while e-learning platforms are widely based on the LOM norm, this is not the case for the other environments, MOOCs and OERs. The use of ontologies is therefore indispensable, since they are very widespread in the field of knowledge representation. That said, the norm will provide the basis for the concepts that will form our future ontology.
The strength of ontologies lies in their ability to represent knowledge that is strongly linked by syntactic or semantic relations. This representation can be built independently of the structural or descriptive constraints of the learning environments. The designers of these environments therefore opt for this solution, because it does not require their systems to evolve but only requires a representation of the knowledge managed by those systems. Beyond the fact that such developments are generally very expensive in terms of time and money, these systems contain an existing base, a rich history and content designed in a certain way, all of which must be reused. With ontologies, this evolution is achievable at a lower cost than modifying the existing systems.

Specification of the Generic Ontology
The purpose of this ontology is to specify the concepts, attributes, relationships and variables necessary to represent a pedagogical objective that meets a given author's need in a given field. The idea is that this ontology will make it possible to represent the essential elements of an educational resource so that it can be searched for in any learning platform: OER, MOOC or e-Learning. It will be responsible for indexing educational resources, regardless of the specificities of each environment. Initially it will represent the educational content; later, it will also represent the structures, to ensure both the reuse of the resource and the enrichment of a database of use cases for each resource.

Conceptualization of the Generic Ontology
In this section we first identify a glossary of the terms, concepts, instances and attributes frequently used in the target ontology, each with a description in natural language. Table 2 shows an extract of the concepts identified. In addition, to be able to place these concepts in their contexts, it is necessary to organize them into a hierarchy.

Description of the Process of Our System
Each educational resource in e-Learning, MOOC or OER is characterized, at least, by a goal and keywords. Indexing is essentially based on these two elements and aims at structuring resources so that they can be easily found when a search is launched.
The producer can then carry out a search. Depending on the elements entered, the extraction returns the pedagogical resources of the various platforms that are similar to the educational objectives set by the author. Resources are ranked according to the chosen method: decimal or universal decimal.
The search, or similarity, function is designed to launch the search in the three environments studied. In order to improve the quality of the search, we must either carry it out on the elements that are most common to the three environments or find matches between their structures.
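A search over the three environments based on their common elements (goal and keywords) could look like the sketch below. Everything specific in it is hypothetical: the field names attributed to each platform, the catalogue entries and the Jaccard similarity are assumptions chosen to illustrate the idea of matching structures, not the similarity function of the system.

```python
# Hypothetical field names per platform, mapped onto the two common elements.
FIELD_MAP = {
    "elearning": {"goal": "objective", "keywords": "keywords"},
    "mooc": {"goal": "learning_outcome", "keywords": "tags"},
    "oer": {"goal": "purpose", "keywords": "subjects"},
}

# Hypothetical catalogue entries in each platform's native structure.
CATALOGUE = {
    "elearning": [{"id": "e1", "objective": "write sql queries",
                   "keywords": ["sql", "database"]}],
    "mooc": [{"id": "m1", "learning_outcome": "design relational schemas",
              "tags": ["database", "modelling"]}],
    "oer": [{"id": "o1", "purpose": "learn python loops",
             "subjects": ["python"]}],
}


def normalise(platform, raw):
    """Project a native record onto the common (goal, keywords) structure."""
    mapping = FIELD_MAP[platform]
    terms = set(raw[mapping["goal"]].lower().split())
    terms |= {k.lower() for k in raw[mapping["keywords"]]}
    return {"id": raw["id"], "platform": platform, "terms": terms}


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def search(goal, keywords):
    """Rank resources from all three platforms against the author's query."""
    query = set(goal.lower().split()) | {k.lower() for k in keywords}
    scored = []
    for platform, records in CATALOGUE.items():
        for raw in records:
            item = normalise(platform, raw)
            scored.append((jaccard(query, item["terms"]), item["id"]))
    # Highest similarity first; ties broken by id for a stable order.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


results = search("write sql queries", ["database"])
```

The `FIELD_MAP` table plays the role of the "matches between the structures": once each platform's fields are projected onto the same two elements, one similarity function serves all three.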
Fig. 6 describes the structure identified for the search in our global ontology, as built with the PROTEGE tool. Once the search is finished, a human intervention by a domain expert is mandatory in order to identify the resources that fully meet the need initially expressed by the author. To refine the result, we can sort it based on the use of the resource and on its tags, comments or annotations, if they are exploitable, but human intervention remains essential. Indeed, the author, who is also a pedagogue, is the only actor able to judge the usefulness of a retrieved resource for the context at hand, regardless of its position in the search results. Once identified and deemed relevant, the resource is reused by the author when creating pedagogical content. For us this is a new use case of the resource, which we add to a bank of resource uses. A use case is a new successful experience with the resource, and it can serve as an indicator for content producers. In the future, the search can focus on these already reused and enriched resources. Searching in an environment whose resources have already been the subject of use cases should yield very satisfying results for the author.
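The use-case bank described above can be sketched as a simple counter that refines the order of candidates with equal similarity. The function names, the scores and the identifiers are hypothetical, and the resulting order is only a hint for the expert, who still makes the final choice.

```python
from collections import Counter

use_case_bank = Counter()  # resource id -> number of recorded reuses


def record_use_case(resource_id):
    """Called once an author has validated and reused a resource."""
    use_case_bank[resource_id] += 1


def rerank(candidates):
    """Sort (similarity, resource_id) pairs: similarity first, then reuse count."""
    return sorted(
        candidates,
        key=lambda c: (c[0], use_case_bank[c[1]]),
        reverse=True,
    )


# Hypothetical history: "m1" has been reused twice, "e1" never.
record_use_case("m1")
record_use_case("m1")

ranked = rerank([(0.5, "e1"), (0.5, "m1"), (0.2, "o1")])
```

With equal similarity, the resource that already has successful use cases is listed first.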

Verification of the Coherence of the Ontology
We implemented our ontology using the PROTEGE editor. We chose it because it offers a modular interface allowing the editing, visualization, checking and semi-automatic merging of ontologies, as well as extraction from a textual source. Figure 7 shows the design of our global ontology in the PROTEGE tool.
The great interest of PROTEGE is that it can check whether the created ontology is coherent and free of contradictory definitions. For a very simple ontology this verification is easy and can even be done manually. For fairly complex ontologies, however, manual verification is practically impossible.
In our case, in order to verify this consistency, we illustrate the check by creating a class that would be both a Comments and a Tags.
We add a new class named CommenTag under owl:Thing and set its SubClass Of expression to Comments and Tags. The CommenTag class is then placed under both Comments and Tags.
The PROTEGE editor provides a tool capable of checking the coherence of the ontology. To launch it, we first choose the HermiT option in the Reasoner menu, then start the tool via Reasoner/Start reasoner. The Class hierarchy (inferred) tab then lists the anomalies detected by the tool. After correcting them, we synchronize the ontology via Reasoner/Synchronize reasoner.
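The kind of anomaly this test is meant to reveal can be mimicked outside PROTEGE with a toy check. The sketch below is not HermiT and does not perform description-logic reasoning; it only flags a class that inherits from two classes declared disjoint. It assumes that Comments and Tags are declared disjoint in the ontology, which the text implies but does not state.

```python
# Hypothetical fragment of the ontology: class -> direct superclasses.
SUPERCLASSES = {
    "Comments": {"Thing"},
    "Tags": {"Thing"},
    "CommenTag": {"Comments", "Tags"},
}

# Assumption: Comments and Tags are declared disjoint in the ontology.
DISJOINT = {frozenset({"Comments", "Tags"})}


def ancestors(cls):
    """All superclasses of `cls`, direct or inherited."""
    found = set()
    stack = list(SUPERCLASSES.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(SUPERCLASSES.get(parent, ()))
    return found


def unsatisfiable_classes():
    """Classes that inherit from two classes declared disjoint."""
    bad = []
    for cls in SUPERCLASSES:
        lineage = ancestors(cls) | {cls}
        if any(pair <= lineage for pair in DISJOINT):
            bad.append(cls)
    return bad


anomalies = unsatisfiable_classes()
```

A full reasoner detects many more kinds of contradiction; this sketch only shows why a class placed under two disjoint classes cannot have instances.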
Fig. 8 shows the placement of the created class.

Conclusion
In this study, we presented the architecture of the case-based content production system. We then reviewed the different standards and their contributions to the indexing of resources, focusing on the best-known models. Next, we set out the basic principles of indexing pedagogical resources in different platforms, covering standardization and ontologies before giving a synthesis. We then proposed an ontology design that will serve as the global ontology for indexing and searching in the platforms covered by our study: OER, MOOC and e-Learning. Finally, we validated our design and verified its consistency through the test offered by the PROTEGE editor. We consider that this contribution is in line with the progress of the field of artificial intelligence.
We believe that what we have achieved is a critical step towards the next stage of the research, currently in progress, which concerns degrees of similarity in the three learning environments. This work should also be complemented and enriched by an analysis of internationally recognized standards in the field of MOOCs and OERs, in order to identify, from a structural point of view, the similarities that could further improve the quality of content production for authors. Admittedly, the intervention of the pedagogue in the process is of great importance, but the quality of the search also matters and makes it easier for the author to produce pedagogical content.

Acknowledgement
This work is supported by its authors. It was carried out in the SSL laboratory at the National School of Computer Science and Systems Analysis, Mohammed V University, Rabat, Morocco.

Author's Contributions
Najima Daoudi: Conceptualization: Ideas; formulation or evolution of general research objectives and goals.
Methodology: Development or design of methodology; creation of models.
Validation: Verification, whether in the context of the activity in relation to experiments and other research results.
Brahim Faqihi: Writing, preparation of the original project: Creation and presentation of a draft article, in particular the writing of the initial project.
Imane Hilal: Writing, revision: Revision and presentation of the work carried out within the original research group, in particular critical review, comment or revision, including the stages of pre- or post-publication.
Visualization: Preparation, creation and/or presentation of the published work, in particular visualization/presentation of the data.
Rachida Ajhoun: Project administration: Management and coordination responsibility for the planning and execution of research activities.