Educational Advertising Ontology: A Domain-Dependent Ontology for Semantic Advertising Networks

Problem statement: Advertising networks currently connect Web site owners (publishers) that want to host advertisements with advertisers who want to run them. Because these networks rely only on the keywords in the content, without an accurate interpretation of the context of the page, they display irrelevant and unappealing ads on the Web page. Approach: Ontologies provide a shared and common understanding of a domain that can be communicated between people and across application systems. Our objective was to create a domain-dependent Ontology to support information exchange processes in semantic advertising networks. Results: Results for a prototype that matches ads with publishers are presented in terms of precision and recall. High precision was obtained and the results are analyzed in detail. Conclusion: The proposed Ontology is effective for advertising networks at a semantic level.


INTRODUCTION
Ontologies play a major role in supporting information exchange processes in various areas (Fensel, 2001). Many definitions of Ontologies have been given in the last decade. Ontologies were best defined by Gruber (1993): an Ontology is an explicit formal specification of the terms in a domain and the relations among them. Ontologies were developed in Artificial Intelligence to facilitate knowledge sharing and reuse. Recently the use of Ontologies has moved from Artificial Intelligence laboratories to real-world applications.
Ontologies have been applied to the World Wide Web, creating what is called the semantic web (Berners-Lee, 2000). General-purpose Ontologies such as WordNet (Fellbaum, 1998) and UNSPSC (www.unspsc.org) were developed. General-purpose Ontologies contain mistakes in specialized domains (McCrae and Collier, 2008), which has led some researchers and application developers to construct their own domain-specific Ontologies. Many disciplines have developed standardized Ontologies that domain experts can use to share and annotate information in their fields, such as SNOMED in medicine (Price and Spackman, 2000) and the semantic network of the Unified Medical Language System (Humphreys and Lindberg, 1993). Both general-purpose (domain-independent) and domain-dependent Ontologies serve the same purpose, that is, developing a common vocabulary that applications can use to share information.
One of the main application areas of Ontology technology is electronic business (e-commerce). Ontologies have been used in e-commerce by many providers, such as Amazon (www.amazon.com) and eBay (www.ebay.com), to categorize products for sale and their features. Advertising networks represent the most sophisticated application of Internet database capabilities to date (Laudon and Traver, 2008). In advertising networks, or ad networks, different advertisers need to extract information from different Web sites, called publishers. "Ad networks are companies that pay software developers as well as web sites money for allowing their ads to be shown when people use their software or visit their sites" (Wikipedia, 2009). In advertising networks, software agents are used to extract information in order to identify target publishers on which to publish ads (http://www.emarketer.com). The common use of an Ontology is necessary for sharing a common understanding of the structure of information among people or software agents (Musen, 1992; Gruber, 1993).
In addition, the Ontology serves the purpose of separating the domain knowledge from the operation, so that programs can be implemented independently of a product (McGuinness and Wright, 1998). In this study we develop a domain-dependent Ontology of advertising in the education domain. This Ontology can then be used as a basis for other applications. One application could be building online shopping aggregators; another is online retail stores.
It is impossible to cover in one Ontology all the issues that an advertising network may need to deal with. Instead, we try to provide an initial Ontology specialized in the education advertising domain. Enabling reuse of domain knowledge was one of the driving forces behind the recent surge in ontology research (Noy and McGuinness, 2000). Others can simply reuse the developed Ontology and extend it to describe their domain or serve their application of interest.

Semantic-based advertising networks:
In the early years of e-commerce, firms placed ads on the few popular Web sites in existence. With the increased number of Web sites, firms do not have the capability to place ads and marketing messages on the millions of sites that now exist and to monitor them. Ad networks, such as Google AdSense (https://www.google.com/adsense), DoubleClick (DoubleClick.com) and aQuantive (http://advertising.microsoft.com/aquantive), appeared to help firms take advantage of the potential marketing opportunities on the Internet.
Context-based ad networks present users with banner advertisements and marketing messages based on the publisher's context. Nevertheless, it is a mistake for these ad networks to think that users are willing to sift through Web pages before they stumble on appealing ads. Ad networks can only be effective when they are able to bring resources and products that are of interest to the Web user. However, ad networks' reliance on only the keywords in the content, without an accurate interpretation of the context of the page, results in displaying irrelevant and unappealing ads on the Web page (Allemang and Hendler, 2008). This is unfortunate because, with the amount of information on the Internet continuing to increase at an enormous rate, it is imperative that businesses, organizations and associations find better approaches for reaching and attracting people who use computers for shopping, browsing the Internet, or interacting with others.
In most cases, the Web is a vast set of static and dynamically generated Web pages linked together. Pages are written in HTML, a language that is useful for publishing information intended only for human consumption. Humans can read Web pages and understand them, but the inherent meaning is not available in a way that allows interpretation by computers (Allemang and Hendler, 2008). Unlike humans, computers do not possess a range of vocabulary understanding. The computer itself cannot understand the information; it cannot read, see relationships, or make decisions like a human can. People see connections between different words and concepts and infer meanings based on contexts.
The introduction of semantics into the Web will address some of advertising networks' failed attempts at reaching target audiences. Briefly defined, semantics is a field that studies the meaning of words, phrases, sentences and larger units of discourse. Semantic technology will improve current advertising networks by adding semantics to Web sites and ads, thereby enabling machine-to-machine exchange and automated processing in a way that computers can understand. The added semantics is expressed as structured information that can be read and understood by computers (Wilson, 2009). However, in order to understand what words mean and what the relationships between words are, it is not enough to upload a dictionary or a set of encyclopedias and let the computer learn all this on its own. A computer has to have documents that describe all the words and the logic needed to make the necessary connections (Wilson, 2009).
These tools represent a paradigm shift in advertising optimization because Web 3.0 technology will be able to discover more accurate information on the Web through proper interpretation of context. Whereas Web 2.0 emphasizes factors like keyword matching, Web 3.0 will tap into more important elements, such as understanding how an audience thinks and behaves online and the latent or hidden relationships between ideas and the ways people express those ideas (Marshall, 2009).
With the aid of semantic technology, the corporate community stands to benefit by spending less energy, time and money pursuing the wrong prospects and marketing to the wrong channels. The goal of the developed Ontology is to increase the precision of ad networks' matching results by developing a context-based ad network that incorporates semantic Web technology and by importing an ontology that describes the concepts and relationships from an educational perspective.
Currently, only a few advertising networks utilize semantic technology. Two examples are PEER39 (https//www.peer39.com/) and iSense (http//www.isense.net/), both developed in 2009. The former has built technology that automatically connects content to advertisements. It takes into account the meaning of the entire Web page instead of portions of it. It references a virtual database of potential meanings and literal connections for the keyword. PEER39 does not deal with RDF or Ontologies. Instead, to achieve the desired semantics it relies on Natural Language Processing (NLP) and Machine Learning, building algorithms that simulate the human mind by allowing computers to process and understand human languages.
iSense, too, moves beyond keywords by analyzing and understanding the entire content and sentiment of the page in order to direct highly relevant advertisements. iSense has extensive machine training to identify relationships between terms. It takes advantage of expert linguistic knowledge to identify the context on any given page.
Ontologies: In the philosophical domain, the word ontology has been defined as the philosophical study of what exists: the study of the kinds of entities in reality and the relationships that these entities bear to one another (Spear, 2006). Philosophers since Plato and Aristotle have been greatly concerned with knowing what exists and how to describe it; once it becomes known that something exists, the next step is to find a place for it among all other things. In the domain of philosophy, this effort is called ontology.
But ontology is not restricted to philosophers alone; computer scientists have also realized its significance. The term was adopted by early Artificial Intelligence (AI) researchers, who recognized the applicability of the work from mathematical logic and argued that AI researchers could create new Ontologies as computational models that enable certain kinds of automated reasoning. In the 1980s the AI community came to use the term ontology to refer to both a theory of a modeled world and a component of knowledge systems (Liu and Özsu, 2010).
Two types of Ontologies exist: (i) general-purpose Ontologies and (ii) domain-specific or material Ontologies. General-purpose Ontologies aim to provide conceptualizations of general notions. Since vertical applications on the Web are gaining a lot of attention lately, domain-specific Ontologies form the majority of Ontologies: they are intended for sharing concepts and relations in a particular area of interest. Communities of practice in many domains have published shared sets of concepts in the form of vocabularies and thesauri (Schreiber, 2008).
An Ontology is a formal explicit description of concepts in a domain of discourse (sometimes called classes), properties of each concept describing various features and attributes of the concept (sometimes called slots) and restrictions on properties (sometimes called role restrictions). An Ontology together with a set of individual instances of classes constitutes a knowledge base. In reality, there is a fine line where the ontology ends and the knowledge base begins (Noy and McGuinness, 2000).
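As a minimal illustration of these terms, the following Python sketch models classes, slots and a simple type restriction as plain data structures, and treats a list of instances as the knowledge base. All names (OntClass, Slot, Instance, the Pen class and its slots) are hypothetical and chosen for illustration; they are not taken from the EAO.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """A property (slot) of a class with a restriction on the type of its value."""
    name: str
    value_type: type


@dataclass
class OntClass:
    """A concept (class) in the domain of discourse."""
    name: str
    slots: dict = field(default_factory=dict)

    def add_slot(self, slot: Slot) -> None:
        self.slots[slot.name] = slot


@dataclass
class Instance:
    """An individual instance of a class."""
    of_class: OntClass
    values: dict

    def violations(self) -> list:
        """List every value that breaks a restriction declared on the class."""
        problems = []
        for name, value in self.values.items():
            slot = self.of_class.slots.get(name)
            if slot is None:
                problems.append(f"unknown slot: {name}")
            elif not isinstance(value, slot.value_type):
                problems.append(f"wrong type for slot: {name}")
        return problems


# The ontology: one class with two slots (hypothetical example).
pen = OntClass("Pen")
pen.add_slot(Slot("color", str))
pen.add_slot(Slot("price", float))

# The ontology plus its instances forms the knowledge base.
knowledge_base = [Instance(pen, {"color": "blue", "price": 1.5})]

# An instance that violates the restrictions declared above.
bad = Instance(pen, {"color": 7, "weight": 10})
```

Calling violations() on the first instance returns an empty list, while the second instance reports a wrongly typed color and an undeclared weight slot.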
A further definition of Ontology, given by Smith et al. (2006), adds important elements for discussion in relation to the word "representational": "An Ontology is a representational artifact whose representational units are intended to designate universals in reality and the relations between them." This definition has two parts. The first identifies an Ontology as a representational artifact consisting of representational units, while the second has to do with what the representational units in such an artifact are intended to refer to or be about. However, representations by themselves are not yet Ontologies. Rather, Ontologies have the important further feature of being representational artifacts. A representational artifact is an entity which makes pre-existing cognitive representations in the mind of its author publicly available (Spear, 2006). Thus an important point in developing the EAO is that the aim is not to represent, in a publicly accessible way, the cognitive representations or concepts that exist in the minds of the authors, but rather the things in reality that these representations are representations of.
When such a representational artifact is formalized, that is, when such an artifact is expressed in a logical or programming language of some sort, it can be called a "formalized representational artifact." Formalized representational artifacts have the advantage, normally, of being both rigorously formulated and computer implementable (Smith et al., 2006). Thus Ontologies, in the sense of formalized representational artifacts, enable computers to help human researchers deal with the constant growth of information.
The goal of this study is to create order and describe the relationships of things important to the Web application, namely to build an Education Advertising Ontology (EAO) that will advance the capabilities of semantic advertising networks. Developing the EAO includes defining classes, arranging the classes in a hierarchy, defining properties, describing allowed values for these properties and filling in the values for properties (instances).

MATERIALS AND METHODS
The Educational Advertising Ontology (EAO) model: EAO was implemented based on the WSMO framework (Fensel et al., 2001) for modeling semantic web services. A model-driven architecture was used (Miller et al., 2001) and a forward engineering approach was adopted: we started by modeling the ontology and then used this ontology as a domain model that forms the basis for generating the Semantic Advertising Network. Since the Ontology is the core element, this study is dedicated to discussing the modeling approach that was adopted to cover all aspects needed in creating the Semantic Advertising Network based on this model. The Ontology contains all the concepts, logical rules and requirements that form the basis of the application.
The core of any ontology language is its hierarchy of class declarations, stating for example that University is a sub-class of Institution. Classes can be defined, which indicates that the stated properties are conditions for membership of the class. Instead of using single types in expressions, classes can also be combined in logical expressions indicating intersection, union and complement of classes. For example, a Phone PDA is both a Phone and a Computer and consequently inherits the properties of both these classes. Properties can be declared, and range restrictions can be stated as part of a property declaration, as can the number of distinct values that a property is allowed to have. Properties can be further restricted by value-type or has-value restrictions. For example, the Size of any PDA is restricted to be 1024×600.
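The constructs described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the WSMO or OWL encoding used for the EAO: the class names follow the examples in the text, while the "carrier" and "memory" properties and all function names are hypothetical.

```python
# Each class maps to its direct super-classes (names follow the text).
SUPERCLASSES = {
    "Institution": [],
    "University": ["Institution"],
    "Phone": [],
    "Computer": [],
    "Phone_PDA": ["Phone", "Computer"],  # sub-class of both Phone and Computer
}

# Properties declared directly on each class (carrier and memory are hypothetical).
DECLARED_PROPERTIES = {
    "Phone": {"carrier"},
    "Computer": {"memory"},
    "Phone_PDA": {"size"},
}

# has-value restriction taken from the example in the text.
HAS_VALUE = {("Phone_PDA", "size"): "1024x600"}


def ancestors(cls):
    """Return all direct and indirect super-classes of cls."""
    found = set()
    stack = list(SUPERCLASSES[cls])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(SUPERCLASSES[parent])
    return found


def is_subclass_of(cls, other):
    """True if cls is other or has other among its super-classes."""
    return cls == other or other in ancestors(cls)


def properties_of(cls):
    """Properties declared on cls plus those inherited from its super-classes."""
    props = set(DECLARED_PROPERTIES.get(cls, set()))
    for parent in ancestors(cls):
        props |= DECLARED_PROPERTIES.get(parent, set())
    return props


def satisfies_has_value(cls, prop, value):
    """Check a value against a has-value restriction, if one is declared."""
    required = HAS_VALUE.get((cls, prop))
    return required is None or value == required
```

Here properties_of("Phone_PDA") collects the properties of both super-classes, and satisfies_has_value rejects any size other than the restricted value.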

Creating an Educational Advertising Ontology (EAO):
The first step taken in constructing the ontology was to explicitly determine its intended domain, that is, to answer the question "what part of reality is this ontology an Ontology of?" Providing an explicit statement of the intended subject matter of the ontology at the outset helped to focus the effort of constructing the ontology by indicating what principles and information needed to be included, while at the same time ruling out other information as unimportant and unnecessary for constructing an Ontology of the given domain.
At the beginning, the authors expected to reuse an existing ontology of education domain knowledge while implementing the semantic ad network. On the Web, there are more than 1500 available Ontologies in the education domain (swoogle.umbc.edu). Existing education Ontologies proved to be unsatisfactory for the specific purpose that centers on advertising. They were deemed inadequate because they did not enable the reuse of domain knowledge: they had been developed for their own specific purposes. For instance, they are mainly concerned with universities and the internal structure and division of these institutions. They give no consideration to educational concepts that are related to writing instruments, such as pencil, notebook, study, or sketch pad. Existing Ontologies have not been developed with advertising in mind. Although no ontology is complete or all-inclusive, an ontology must be complete in the sense that it can serve its objective, which in our case covers anything in the field of education that can be advertised and ultimately sold. An existing Education Ontology that emphasizes the classes and sub-classes of a person in a university is shown in Fig. 1; there is little here that an advertiser can benefit from.
EAO not only had to focus on education, but had to focus on it in such a way that it would substantially benefit advertising. What was considered relevant to the education ontology determined what and how much information about this domain had to be included. Indeed, one major consideration was that what is relevant to the ontology should be determined by the purpose for which it is being designed. This ontology is designed with two very distinct disciplines in mind: education and commerce. The former has to benefit the latter.
The general method followed in constructing the EAO is summarized in the steps below. The maximum amount of clarity and precision is maintained throughout each step in the process of identifying and defining concepts and the relations amongst them in the given domain. Such careful and thorough organization ensures that all of the information can be kept track of and be understood by computers (Spear, 2006). Ontologies require ongoing revision. Below we describe the methodology adopted for developing EAO.
Step 1: Listing important terms in the EAO ontology: Developing the ontology started by listing all the terms an advertiser would look for to explain a Web site's content, without worrying about the relations among the terms, any properties that the concepts may have, or whether the concepts are classes or properties.
EAO-relevant terms were gathered from sources like Merriam Webster's Collegiate Dictionary and Merriam Webster's Collegiate Thesaurus. The thesaurus was important because it helped in finding more suitable words for the terms already found. The process gained momentum after subscribing to Visualthesaurus.com, an interactive tool that shows connections between words. We also looked at sites that specialize in stationery and office/school supplies, such as Staples.com, Officedepot.com, Officemax.com and Myschool.co.nz. These sites proved quite useful because they not only sorted most of the words already found under specific categories, but also provided even more concepts for the education ontology. Nevertheless, the terms and definitions in the initial terminology did not represent the final state of the terms and definitions that were included in the domain ontology. They were rather a first draft or gloss for the sake of getting the relevant information organized and assembled in a single place.
The next two steps are developing the class hierarchy and defining properties of concepts. Typically, we create a few definitions of the concepts in the hierarchy and then continue by describing properties of these concepts and so on.
Step 2: Define the classes and the class hierarchy: There are several possible approaches to developing a class hierarchy: top-down, bottom-up and a combination development process (Uschold and Gruninger, 1996). A combination development process is used in developing the EAO class hierarchy. It is a combination of the top-down and bottom-up approaches: we define the more significant concepts first and then generalize and specialize them accordingly. We can begin with a few top-level concepts such as Classroom_supply and a few specific concepts, such as study or notebook. We can then relate them to a middle-level concept, such as Stationery.
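The combination process can be illustrated with a small Python sketch. The concept names come from the example above, but where each concept is placed in the hierarchy is an assumption made for illustration, and the parent map and helper function are hypothetical.

```python
# Parent links, filled in as the hierarchy is developed.
parent = {}

# 1. Start from a top-level concept and a specific concept.
parent["Classroom_supply"] = None  # top-level concept: no parent
parent["notebook"] = None          # specific concept, not yet placed

# 2. Relate them through a middle-level concept.
parent["Stationery"] = "Classroom_supply"
parent["notebook"] = "Stationery"


def path_to_root(concept):
    """Walk the parent links from a concept up to its top-level concept."""
    path = [concept]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path
```

After the second step, path_to_root("notebook") runs from the specific concept through Stationery up to Classroom_supply.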
Seeking the advice of domain experts is both a necessary and an important step. The goal of consulting a domain expert is both to assist in establishing the maximum amount of clarity, consistency and coherence in the domain information that is to be represented and to determine the relevant relationships among the concepts. Our expert helps to ensure that the ontology is maximally effective in representing the concepts and relationships that exist in the domain that it is intended to be about. For instance, in an ontology of education, concepts include "school," "department," "science," etc., while relations of interest include the relation has_a, as when we say "school" has_a "Science_department." Our expert made the observation that writing "Science" without "department" is incorrect because it would make our education ontology unintelligible: the relation "school" has_a "science" makes no sense. Therefore, this process includes a great deal of fact-checking and extensive consultation with the domain expert in order to develop a logically coherent and unambiguous ontology.
Step 3: Define the properties of classes: Once we have defined the classes and the class hierarchy, the internal structure of concepts must be defined. Most of the remaining terms from previous steps are likely to be properties of these classes, for example, color, price, model and size. For each property in the list, we must determine which class it describes. These properties become properties attached to classes.
Ontology editors: Ontology editors help ontology designers and developers to build Ontologies. Ontology editors support the definition of concept hierarchies, the definition of properties for concepts and the definition of constraints. Ontology editors provide graphical interfaces and must conform to existing standards in web-based software development. They enable inspecting, browsing, codifying and modifying Ontologies and in this way support the ontology development and maintenance task (Fensel et al., 2001).
Form generation: A relevant input form is used to transform the input data into the relevant semantic representation. In our prototype application, the Java library wsmo4j (for details see http://wsmo4j.sourceforge.net) was used to access the created Ontology. We used Java Server Faces (JSF) as our web technology. Figure 4 describes EAO in the OWL language. Since each concept is characterized by several properties, these properties have to be further explored by advertisers and publishers within the generated form.
For each concept in the generated form, a button is added to further specify the related concepts. Clicking such a button generates another form with all properties of the previously selected concept. As shown in Fig. 4, a "pen" product is characterized by price and color. Thus, Fig. 5 shows the representation of the concept pen. After the user has filled in all required data, a subgraph of EAO is generated and existing axioms are used to validate that it is in a correct state.
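The idea of generating a form from a concept's properties and validating the submitted data can be sketched as follows. This is a language-neutral illustration in Python, not the wsmo4j or JSF code of the prototype: the property types, the function names and the simple checks that stand in for the ontology's axioms are all assumptions.

```python
# Hypothetical fragment of the ontology: concept -> {property: expected type}.
CONCEPT_PROPERTIES = {
    "pen": {"price": float, "color": str},
}


def generate_form(concept):
    """Return one form field description per property of the concept."""
    return [
        {"label": prop, "type": expected.__name__, "required": True}
        for prop, expected in sorted(CONCEPT_PROPERTIES[concept].items())
    ]


def validate(concept, data):
    """Return a list of problems with the submitted form data (empty if valid)."""
    problems = []
    for prop, expected in sorted(CONCEPT_PROPERTIES[concept].items()):
        if prop not in data:
            problems.append(f"missing: {prop}")
        elif not isinstance(data[prop], expected):
            problems.append(f"wrong type: {prop}")
    return problems
```

For the "pen" concept, generate_form yields one field each for color and price, and validate reports a missing price if only the color is filled in.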

RESULTS
The proposed Ontology was designed and constructed to describe knowledge of the association rules for the application of advertising in the educational domain. The Ontology was created with the Protégé Ontology engineering environment, and OWL was used as the output language.
This study focuses on the detection of classes and properties. For testing, we developed an Ontology-based prototype to demonstrate the validity of the proposed Ontology. The experiments described here used a manually annotated set of classes and properties for Web sites and ads.
Experiments were carried out using six educational Web sites. Table 1 shows "Precision" and "Recall" for detection of the semantic content. "Actual Num" is the actual number of concepts across all matches, which is set manually; "True Num" is the number of correct matches detected and "False Num" is the number of false matches.
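The text does not state the formulas behind the two measures. Assuming the standard definitions, precision is the share of detected matches that are correct and recall is the share of actual concepts that were detected. The Python sketch below computes both from the three counts defined above; the function name is hypothetical and the numbers in the usage note are illustrative only, not values from Table 1.

```python
def precision_recall(actual_num, true_num, false_num):
    """Return (precision, recall) in percent, assuming the standard definitions.

    precision = True Num / (True Num + False Num)
    recall    = True Num / Actual Num
    """
    detected = true_num + false_num
    precision = 100.0 * true_num / detected if detected else 0.0
    recall = 100.0 * true_num / actual_num if actual_num else 0.0
    return precision, recall
```

For example, with 20 actual concepts, 12 correct matches and 3 false matches, the function returns a precision of 80.0 and a recall of 60.0.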
It can be seen from Table 1 that the precision results of the semantic match are higher than 85.7%, but the recall results are relatively low compared to precision, as we would expect. Based on the above experimental results, we believe semantic advertising networks have considerable potential.

DISCUSSION
In this study, we have described the development of an Educational Advertising Ontology (EAO) to serve semantic advertising network applications on the Web. We listed the steps in the Ontology-development process and explained why general education Ontologies are unsuitable for the advertising application. One of the most important things to remember is that it is not enough to check the domain knowledge; one must also understand the purpose of the Ontology, the types of questions the information in the Ontology should provide answers for, and who its users will be. The potential applications of the semantic web in the advertising domain will undoubtedly affect Ontology design choices. An application is only as good as the Ontology it is built on.
In order to create a domain-dependent Ontology for advertising networks, OWL is used as the Ontology description language. The Ontology is constructed using Protégé to demonstrate the validity of the proposed Ontology.

CONCLUSION
Experiments have shown that the proposed Ontology is effective for advertising networks at the semantic level. Results for the prototype that matches ads with publishers have been presented in terms of precision and recall. High precision but relatively low recall is shown, and an analysis of the results is given in detail.
Future work includes the enhancement of the domain Ontology with more complex model representations and the definition of semantically more important and complex events in the domain of educational advertisement, as well as the use of automatically determined low level features.

Step 4: Create instances: The last step is creating individual instances of classes in the hierarchy. Defining an individual instance of a class requires choosing a class, creating an individual instance of that class and filling in the property values. For example, we can create an individual instance BlackBerry to represent a specific type of Phone PDA and supply data for size, model and price.
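A minimal Python sketch of this step is given below. It assumes that the Phone PDA class has exactly the three properties named in the example; the function name, the dictionary layout and the placeholder values for model and price are hypothetical, since the text gives no concrete values.

```python
# Assumed property list for the Phone_PDA class, following the text.
CLASS_PROPERTIES = {"Phone_PDA": ("size", "model", "price")}


def create_instance(name, of_class, **values):
    """Create an individual of a class, requiring a value for every property."""
    expected = set(CLASS_PROPERTIES[of_class])
    missing = expected - set(values)
    unknown = set(values) - expected
    if missing or unknown:
        raise ValueError(f"missing={sorted(missing)} unknown={sorted(unknown)}")
    return {"name": name, "class": of_class, **values}


# Placeholder values: the source gives no concrete model or price.
blackberry = create_instance(
    "BlackBerry", "Phone_PDA", size="1024x600", model="example-model", price=0.0
)
```

Leaving out a property, or supplying one that the class does not declare, raises a ValueError instead of creating an incomplete instance.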

Fig. 5: A snapshot of form representation of EAO properties

Table 1: Precision and recall for a semantic advertising network
Sample No. | Actual Num. | True Num. | False Num. | Precision (%) | Recall (%)