
OntBot: Ontology based ChatBot


2011 Fourth International Symposium on Innovation in Information & Communication Technology

Hadeel Al-Zubaide, and Ayman A. Issa
Abstract: A new ontology-based approach is proposed to model and operate chatbots (OntBot). OntBot uses an appropriate mapping technique to transform ontologies and knowledge into a relational database and then uses that knowledge to drive its chats. The proposed approach overcomes a number of drawbacks of traditional chatbots, including the need to learn and use a chatbot-specific language such as AIML, heavy botmaster interference, and the use of immature technology. OntBot has the additional power of easy user interaction in natural language and seamless support for different application domains. This gives the proposed approach a number of unique scalability and interoperability properties that will be evaluated in future phases of this research project.

Hadeel Al-Zubaide, Department of Computer Science, Princess Sumaya University for Technology, Jordan (H_alzubaidi@yahoo.com). Ayman A. Issa, Software Engineering Department, Philadelphia University, Jordan (aissa@philadelphia.edu.jo).

978-1-61284-675-0/11/$26.00 ©2011 IEEE

I. INTRODUCTION

In recent years the development of ontologies has been moving from the realm of Artificial-Intelligence laboratories to the desktops of domain experts. Ontologies have become common on the World Wide Web and a modeling trend in Information Systems development, where we can make use of the great benefits they provide. Ontologies are human readable, comprehensive, sharable and formal, which means that they are expressed in a language that has well-defined semantics. Ontologies are important to application integration solutions because they provide a shared and common understanding of the data that exist within an application integration problem domain. Ontologies also facilitate communication between people and information systems. However, while today there is an unprecedented wealth of information available on the Web, fully realizing the power of ontologies and enabling efficient and flexible information gathering requires persistent storage of ontologies and their subsequent retrieval.

On the other hand, relational database technology provides the best facilities for storing, updating and manipulating the information of a problem domain. Relational databases have also proved capable of coping with large amounts of data [1], and these data can be represented by the ontologies themselves. In addition, features of relational database management systems (e.g. transaction management, security and integrity control) make them much preferred over traditional file systems. This is one of the strong motivations behind our proposed approach of utilizing relational databases as a good storage candidate. A number of tools and techniques handle the mapping in both directions: ontology to relational database and vice versa.

One of the main issues facing the large number of ontologies stored in databases is the lack of easy-to-use interfaces for data retrieval, due to the need to use special query languages or applications. Currently, users who wish to utilize ontology repositories need to know the contents of the ontology, which means understanding OWL (Web Ontology Language) or RDF (Resource Description Framework), and need to know how to query these ontologies using one of the ontology query languages, e.g., SPARQL. Such requirements are among the major reasons the Semantic Web has not become mainstream, as fewer users than expected are utilizing such knowledge.

A chatbot is a computer program that interacts with users using natural languages [2]. Chatbot systems make it simple to realize a dialogue system based on natural language. Therefore, they can be used as interfaces to a wide range of applications including entertainment applications, educational applications, e-learning platforms, search engines, and e-commerce web-site navigation.

The usefulness and complexity of ontology-based data retrieval, the features of relational database management systems, the lack of an easy-to-use query interface, and the features of chatbots have directed this research to investigate the possibility of a usable ontology-based query interpreter and responder chatbot. In this paper, an ontology-based chatbot (OntBot) is proposed to provide an easy-to-use, domain-independent, scalable, dynamic and smart conversational agent. In the proposed approach, the ontology is first converted into a relational database as the basis for a strong chatbot knowledge base. Sections 2 and 3 present some background and related work, respectively. Section 4 presents the details of the proposed approach. The conclusions and future work are outlined in Section 5.

II. BACKGROUND

This section describes components that are not the main contributions of this paper yet are important for the proposed approach: chatbots, ontologies, and the mapping of ontologies to relational databases.

A. Chatbot

A chatbot (or chatterbot) is a type of conversational agent: a computer program designed to simulate an intelligent conversation with one or more human users via auditory or textual methods. These computer programs are also known as Artificial Conversational Entities (ACE). Though many appear to be intelligently interpreting the human input prior to providing a response, most chatterbots simply scan for keywords within the input and pull a reply with the most matching keywords or the most similar wording pattern from a local database. The intelligent behavior of a chatbot is reflected in responses that convince the human that he is chatting with another human instead of a computer program. The degree of intelligent behavior depends on the knowledge base (the information that the bot knows): poor knowledge bases lead to weak chatbot responses while strong ones do the opposite. Such strong knowledge bases may require years to create.

Most chatbots rely on fairly simple tricks to appear lifelike. Richard Wallace, producer of the top-ranked chatbot ALICE [3] (Artificial Linguistic Internet Computer Entity), has handwritten a database of thousands of possible conversational gambits. The ALICE software utilizes AIML (Artificial Intelligence Mark-up Language), an XML-like language designed for creating stimulus-response chat robots. The ALICE chatbot's knowledge base is composed of question-answer modules, called categories, structured with AIML. The model of learning in ALICE is called supervised learning because a person, called the botmaster, plays a crucial role. The botmaster monitors the robot's conversations and creates new AIML content to make the responses more appropriate, accurate, or believable.

Many chatbots have been deployed in strictly limited domains for information seeking, site guidance, and FAQ answering. Most existing chatbots consist of dialog management modules that control the conversation process and chatbot knowledge bases that respond to user input. A typical chatbot knowledge base contains a set of templates that match user inputs and generate responses. The templates currently used in chatbots, however, are hand coded. Therefore, the construction of chatbot knowledge bases is time consuming, and the knowledge bases are difficult to adapt to new domains.

The proposed OntBot approach does not utilize AIML; rather, it is being built using a general-purpose programming language such as VB.Net. In addition, a relational database will be used to store OntBot's knowledge instead of files. Further, the source of this knowledge is the ontologies available on the WWW, which will be automatically converted to entities that can be stored in the database. Therefore, there will be no need for hand writing. OntBot thus needs neither special AIML skills nor knowledge-archiving skills, which overcomes the main drawbacks of traditional chatbots.

B. Ontologies

Ontology is the key enabling technology for the semantic web. An ontology is a specification of a conceptualization; it can be viewed as a vocabulary used to describe a world model in the semantic web. The main purpose of an ontology is to enable communication between computer systems in a way that is independent of the individual system technologies, information architectures and application domains. Ontologies are human readable, comprehensive, sharable and formal, which means that they are expressed in a language that has well-defined semantics. Ontology population has been identified as a key enabler of practical semantic applications in industry. When an ontology is populated, it contains not only the schema or definition of the classes/concepts and relationship names but also a large number of entities that constitute the instance population of the ontology. Another important factor related to the population of the ontology is that it should be possible to capture instances that are highly connected (i.e., the knowledge base should be deep, with many explicit relationships among the instances).

In our suggested approach, ontologies represent the source of the knowledge that OntBot knows. Regardless of its domain, an ontology should first be mapped and transformed into a relational database, as explained in the next section.

C. Ontology to Relational Database Mapping

To store ontology data and execute queries on that data in databases, several alternatives exist, such as storing them in relational, object or object-relational databases. Storing ontologies in relational databases is less straightforward than storing them in object or object-relational databases, because relational database management systems do not support inheritance. However, relational database management systems have significant advantages over object or object-relational database management systems. In particular, relational database management systems provide maturity, performance, robustness, reliability, and availability, which is what led us to choose the relational database option. Several studies [10, 11, 12] have been conducted regarding the mapping between ontologies and relational databases.
Going into the details of each study is outside our scope, since the proposed approach works on the result of this mapping and goes further in its process. Transformation of ontologies to relational databases is based on a set of rules, called mapping rules, that specify how to map constructs of the ontological model to the relational model. The mapping rules are then applied to an ontology to produce a relational database. Figure 1 shows the flow of this transformation.

In OWL, a class can be regarded as a relational table. Properties of a class can be regarded as the attributes of a relational table. The inheritance relation between classes can be realized by a foreign key between relational tables [13]. Transformation of an ontology into a relational database involves a series of transformations. First, the ontology classes are transformed into relational database tables. Next, with the OWL classes mapped to tables, the ontology object properties are transformed into relational database relations. After this, the ontology datatype properties are transformed into relational database data columns. Finally, the ontology constraints are transformed into relational database metadata tables [12].
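The mapping rules described above can be illustrated with a minimal Python sketch. It is not the mapping tool used by the authors: the `OntClass` structure, the `class_to_ddl` function, the surrogate `id` key and the column-naming scheme are all assumptions made for illustration, and the final step (constraints to metadata tables) is left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OntClass:
    """Hypothetical, simplified stand-in for an OWL class (a real tool would parse OWL)."""
    name: str
    parent: Optional[str] = None                         # superclass name, if any
    datatype_props: dict = field(default_factory=dict)   # property -> SQL type
    object_props: dict = field(default_factory=dict)     # property -> target class


def class_to_ddl(cls: OntClass) -> str:
    """Apply the mapping rules to one class and return a CREATE TABLE statement."""
    # Rule: class -> table (a surrogate key is assumed here).
    cols = ["id INTEGER PRIMARY KEY"]
    # Rule: datatype property -> data column.
    cols += [f"{prop} {sql_type}" for prop, sql_type in cls.datatype_props.items()]
    # Rule: object property -> relation, realized as a foreign key to the target table.
    cols += [f"{prop}_id INTEGER REFERENCES {target}(id)"
             for prop, target in cls.object_props.items()]
    # Rule: inheritance -> foreign key to the parent class table.
    if cls.parent:
        cols.append(f"{cls.parent.lower()}_id INTEGER REFERENCES {cls.parent}(id)")
    return f"CREATE TABLE {cls.name} ({', '.join(cols)});"


# Made-up example classes.
person = OntClass("Person", datatype_props={"name": "TEXT", "age": "INTEGER"})
doctor = OntClass("Doctor", parent="Person",
                  datatype_props={"specialty": "TEXT"},
                  object_props={"works_at": "Hospital"})

for ont_class in (person, doctor):
    print(class_to_ddl(ont_class))
```

Keeping each rule as one step in the function mirrors the order of the transformation series, so a different mapping technique could replace a single step without touching the others.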

Traditional chatbots are domain dependent: the botmaster is responsible for statically handwriting thousands or more of possible conversational scenarios, questions and their corresponding answers for a specific domain, so that the chatbot appears lifelike to humans. Each time the domain changes, a new chatbot has to be created and separate scenarios have to be built by the botmaster. The power of OntBot lies in its capability to handle knowledge from different domains automatically, without human intervention. Whatever the type of ontology files that are mapped, after the conversion into database tables OntBot should be able to handle different question-answer scenarios by itself without the need to predefine them.

Fig. 1: Transformation of ontology to relational database.

III. RELATED WORK

A practical approach for enhancing a language-independent conversational agent for question answering using Semantic Web knowledge has been developed [14]. It has a cascade-type architecture that is divided into several components. The first component is a chatbot built on top of AliceBot; it relies on converting Semantic Web knowledge to AIML format, which is motivated by the work of Freese [15]. The second component builds answers from the ontology by parsing and categorizing the user's input. This approach uses XML files to store the generic patterns for each predefined question domain (six types of questions are predefined). The patterns are generated by querying the ontology graph with the Protege API and then replacing the template tags defined in the language files. The process of finding an answer by querying the ontology is done in three steps.

IV. THE PROPOSED APPROACH: ONTBOT

This section discusses the proposed approach; Figure 2 shows the general architecture of OntBot.

A. OWL to Database Mapping

This part represents a novel step in the chatbot world: OntBot depends on any of the available ontology to relational database mapping techniques to obtain its knowledge. The transformed ontology is stored in tables to form the base of chatbot knowledge, which is manipulated by an inference engine to suit chatbot needs and methodologies.
Fig. 2: OntBot Architecture.

B. OntBot Knowledge Base

Botmasters use AIML to create questions and answers in the form of categories and store the resulting scenarios in AIML files. These files represent the knowledge base of the chatbot [8, 9]. In OntBot, by contrast, the knowledge base contains the tables that result from mapping the ontology. This empowers our chatbot with all the facilities and advantages of databases and DBMSs over file systems. The OntBot knowledge base could be integrated with any existing database. It may also contain extra optional tables. These tables could be predefined or added at any time; they may contain scenarios to handle greetings and out-of-scope topics, or session information to be managed or used later by the botmaster.

C. Natural Language Processing Module

This module is responsible for processing user input in a way that facilitates finding the needed answer. Several functions must be applied before searching for a match in the OntBot knowledge base and then getting the best answer. These functions are input tokenization, stopper filtering, stemming and synonym handling. Figure 3 illustrates the four functions and their sequence.

Fig. 3: Natural Language Processing Module.

Tokenization, or splitting the input into words, is an important first step in natural language processing. It involves operations that are necessary to facilitate the target matching process. A set of splitters is used to break down user input into tokens. Such splitters include spaces, punctuation, special symbols and others. Both user input and stored words are tokenized.

Stopper filtering is the process of removing the set of fluff words that may exist in user input. The stop list consists of common function words such as determiners (a, the, this), prepositions (in, from, to), conjunctions (after, since, as) and coordinators (and, or), as well as words that occur frequently but contribute little meaning, such as about, them and only.

Word stemming is an important feature, especially in indexing and search systems. In our case it can also be thought of as a fault-tolerance technique, since we match the roots of words (on both sides: the input word and the stored word to be matched with). Thus, if the entry in our database, which comes from the WWW through ontology mapping, is in the present tense while the user writes it in the past tense, it is not considered a mismatch. There are many available stemming algorithms [9]; for example, OntBot can employ one of the most effective and widely used, the Porter stemming algorithm (or 'Porter stemmer').

Synonyms/alternatives of a word are obtained in order to be checked. This is required if no match is found between a user input token and a stored word. In this case OntBot assumes that the user may have used a synonym of the stored word, so it retrieves all synonyms of the token. These synonyms are in turn treated as new entries and searched one by one against the target stored words. If again no match is found, the word is considered a mismatch. WordNet [16] is going to be used to find word alternatives.

D. OntBot Inference Engine

This is the main component of the architecture; it represents the brain of OntBot. The input to this module is the normalized user question that results from the Natural Language Processing Module, while the final output is the target answer. This answer is then forwarded to the next module, which prepares and decorates it for final presentation. Three sub-modules, as shown in Figure 4, make up the engine: Scope Specifier, Rule Matcher, and Query Processor.

Fig. 4: OntBot Inference Engine.
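The normalized question that the Natural Language Processing Module hands to the Inference Engine can be illustrated with a minimal Python sketch of the four preprocessing steps. Everything in it is an assumption made for illustration: the stop list is a tiny sample, `stem` is a crude suffix stripper standing in for the Porter stemmer, and the `SYNONYMS` dictionary is a hypothetical stand-in for a WordNet lookup.

```python
import re

# Tiny sample stop list (assumption); a real list would be much longer.
STOP_WORDS = {"a", "the", "this", "in", "from", "to", "and", "or",
              "is", "of", "about", "them", "only"}

# Hypothetical stand-in for a WordNet lookup: word -> alternatives.
SYNONYMS = {"wage": ["salary", "pay"], "physician": ["doctor"]}


def tokenize(text):
    """Split on spaces, punctuation and special symbols; lower-case the tokens."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", text.lower()) if t]


def remove_stop_words(tokens):
    """Drop the fluff words that carry little meaning."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(word):
    """Crude suffix stripping, NOT the Porter algorithm; for illustration only."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(question):
    """Tokenize the question and filter stop words."""
    return remove_stop_words(tokenize(question))


def match_token(token, stored_words):
    """Match a token against stored words by stem first, then by synonym."""
    stems = {stem(w.lower()): w for w in stored_words}
    for candidate in [token] + SYNONYMS.get(token, []):
        hit = stems.get(stem(candidate))
        if hit is not None:
            return hit
    return None  # considered a mismatch


tokens = normalize("What is the wage of Eric?")
print(tokens)
print(match_token("wage", ["Salary", "Name", "Age"]))
```

Trying the token itself before its synonyms follows the order given in the text: synonyms are consulted only when the direct, stem-based comparison fails.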

1) Scope Specifier: This module attempts to act like a human in understanding what the user is talking about. It tries to specify the scope of a question so that it can handle the conversation correctly and supply the next module with all needed information regarding that scope. When it finds that the user is talking randomly or out of scope, it helps the user get to the point by notifying him about the domain of the conversation; as a secondary service of this module, he can then ask OntBot for more details on how to benefit from it.

Technically speaking, given the normalized user input tokens that result from the previous phase (the Natural Language Processing Module), this module relies on OntBot's knowledge base (data and/or metadata) to find any match with the user input, whether with a table name, an attribute or a specific instance. If a match is found, the Scope Specifier feeds the next module, the Rule Matcher, with the information needed to continue the job of getting the response, as we will see next.

2) Rule Matcher: After the scope of the conversation has been specified and the needed information obtained from the previous module, the Rule Matcher tries to find a matching rule, taking the normalized user input tokens as another input. The Rule Matcher performs some manipulation of both inputs to form the basis on which it searches for an appropriate rule. Such rules have the following form: Question Format => Query Format [That = Value].

The left-hand side of each rule represents a style of question that users may ask. The questions that OntBot can handle range from simple to complex. The degree of complexity reflects how deep or detailed the question is, and therefore how complicated the corresponding query will be. User questions are matched against the question on this left-hand side of a rule. The right-hand side represents the corresponding query, which is passed to the Query Processor for execution if a match is found. The last part, [That = Value], is used as it is in AIML for traditional chatbots: it enables the bot to remember what it said in the previous interaction so that conversations become more meaningful and human-like.

OntBot keeps track of the current output of the Scope Specifier module (table names, attributes and instances), so that when the user uses a pronoun to refer to one of them in his next question, OntBot uses it to understand and continue the conversation, as in the following example.

User: Who is Eric?
OntBot: He is Doctor.
User: How old is him? (the user used "him" to refer to Eric)
OntBot: 31. (OntBot resolved the reference)

Simple question domains are direct questions. Such questions can be definition questions, measure questions, list questions, comparison questions, yes/no questions and many others. Complex question domains include any questions that need extra analysis or indirect manipulation. Examples of complex questions include those that are mathematically based or those that need to access more than one table to get the answer. Here, we benefit from all the facilities of the relational database and its DBMS. Sample rules are shown in Table 1 below.
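The interplay of Scope Specifier, rule form and the remembered "That" value can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Rule` class, the regular-expression encoding of the question format, the `SCOPE` metadata dictionary and the pronoun list are all assumptions, and only the "What is Xi?" rule is modeled.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: str  # question format, encoded here as a regular expression (assumption)
    query: str    # query format with placeholders for table Y and attributes A, B


# Hypothetical encoding of the rule "What is Xi? => Select B from Y where A = Xi".
RULES = [
    Rule(r"what is (?P<xi>\w+)\s*\?*$", "SELECT {b} FROM {y} WHERE {a} = ?"),
]

# Hypothetical metadata the Scope Specifier would read from the knowledge base:
# lower-cased instance -> (table Y, attribute A, attribute B, stored instance).
SCOPE = {"stack": ("Data_Structures", "Name", "Definition", "Stack")}

PRONOUNS = {"it", "he", "she", "him", "her"}


def specify_scope(word):
    """Return scope information for a known instance, or None when out of scope."""
    return SCOPE.get(word.lower())


def match_rule(question, context):
    """Return (sql, params) for the first matching rule and remember 'that'."""
    for rule in RULES:
        m = re.match(rule.pattern, question.strip(), re.IGNORECASE)
        if m is None:
            continue
        xi = m.group("xi")
        # A pronoun refers back to what was remembered from the previous turn.
        if xi.lower() in PRONOUNS and "that" in context:
            xi = context["that"]
        scope = specify_scope(xi)
        if scope is None:
            return None
        y, a, b, instance = scope
        context["that"] = instance  # plays the role of [That = Value]
        # Table and column names come from metadata; the user value is a parameter.
        return rule.query.format(y=y, a=a, b=b), (instance,)
    return None


context = {}
print(match_rule("What is Stack?", context))
print(match_rule("What is it?", context))
```

Passing the instance as a query parameter instead of pasting it into the SQL text is a design choice of this sketch; the paper only specifies the query format.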

OntBot rules are dynamic in terms of the way they are defined and the domains they cover. The botmaster of OntBot can define a new rule each time he identifies a suitable one; he can dynamically add any kind or number of rules, from simple to complex, at any time, as long as they can be translated into a real SQL query. What distinguishes OntBot is its ability to deal with various question domains, as long as the questions can be answered from the knowledge available in the OntBot knowledge base, while the other ontology-based chatbot is limited to fixed, predefined question domains [14].

3) Query Processor: This module is responsible for the actual physical execution of queries. It takes the queries from the Rule Matcher and checks their correctness before executing them against the OntBot knowledge base. Retrieved results are then passed in a suitable form to the Answer Formalism unit, which takes care of displaying readable and understandable answers to the user.
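The check-then-execute behavior of the Query Processor can be illustrated with a minimal Python sketch that uses the standard `sqlite3` module as a stand-in knowledge base. The paper does not say what the correctness check consists of, so `is_safe_select` is an assumption (it accepts only a single SELECT statement), and the rows in `build_demo_kb` are made-up demo data loosely modeled on the Employee example of Table I.

```python
import sqlite3


def is_safe_select(sql):
    """Rough correctness check (assumption): accept one read-only SELECT statement."""
    stripped = sql.strip().rstrip(";")
    return stripped.lower().startswith("select") and ";" not in stripped


def run_query(conn, sql, params=()):
    """Check the query, execute it, and return the rows as a list of tuples."""
    if not is_safe_select(sql):
        raise ValueError("only single SELECT statements are executed")
    return conn.execute(sql, params).fetchall()


def build_demo_kb():
    """Create an in-memory knowledge base with made-up Employee rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Employee "
        "(Name TEXT PRIMARY KEY, Address TEXT, Salary INTEGER, Age INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Employee VALUES (?, ?, ?, ?)",
        [("Eric", "London", 900, 31), ("Mona", "Amman", 650, 28)],
    )
    return conn


conn = build_demo_kb()
# "List me all Employee's Names and Ages who have Salary more than 700"
rows = run_query(conn, "SELECT Name, Age FROM Employee WHERE Salary > ?", (700,))
print(rows)
```

The rows come back as plain tuples, which is the raw form that an answer-formatting step would still have to turn into readable sentences.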

E. Answer Formalism
Before displaying answers to users, it is vital to ensure that they are readable and free of spelling and grammatical errors. The way results are displayed should also be friendly and close to a naïve user's understanding, especially since these answers simply represent table entries. We may need to form the plural of a word; JBoss DNA, which is implemented in Java, can be used to find the plural of a given word. We can also use MorphAdorner, implemented in Java, together with Verbix, an online conjugation resource, to perform verb conjugation. Verbix can be accessed from within code by sending HTTP requests and parsing the result. Finally, if we need to know the gender of a word (male or female) in order to form the answer, we can use the WordNet tool for this purpose.

V. CONCLUSION

A new approach to developing an ontology-based chatbot (OntBot) is proposed in this paper. In OntBot, the ontology is first mapped automatically into a relational database to form the knowledge base. Users can interact with OntBot easily using their natural language, so there is no need to learn any query language or to know the contents of the underlying ontology. OntBot provides a friendly, easy-to-use and efficient user interface. OntBot will be ontology-portable: the ontology is a plug-in component that can be replaced with another ontology that models a completely different domain without any change to the system, which makes OntBot more dynamic and flexible. The OntBot botmaster can extend the capabilities of OntBot's brain by defining new rules whenever he wants, which increases the range of conversations OntBot can handle.


TABLE I. Inference Engine's Sample Rules.

Rule 1
Question Format: What is Xi? Ex. What is Stack?
Scope Specifier Output: Xi: instance in Table Y that has A and B attributes. Xi belongs to A. Ex. Stack: instance in Table Data Structures that has Name and Definition as attributes. Stack belongs to the Name attribute.
Query: Select B from Y where A = Xi. Ex. Select Definition from Data_Structures where Name = "Stack".
That: Xi. Ex. Stack.

Rule 2
Question Format: How many numbers of Y are there? Ex. How many number of data structures are there?
Scope Specifier Output: Y: Table Name. Ex. Y: Data Structures Table that has Name and Definition as attributes.
Query: Select Count(A) from Y, where A is the PK of Y. Ex. Select Count(Name) from Data_Structures.
That: Y. Ex. Data Structures.

Rule 3
Question Format: List me all Y's As and Bs who have C more than Xi? Ex. List me all Employee's Names and Ages who have Salary more than 700$?
Scope Specifier Output: Y: Table Name that has A, B, C and D attributes. Xi: instance. Ex. Y: Employee Table that has Name, Address, Job Title, Salary and Age as attributes. 700 belongs to the Salary attribute.
Query: Select A, B from Y where C > Xi. Ex. Select Name, Age from Employee where Salary > 700.
That: Y, A, B, C and Xi. Ex. Employee, Name, Age, Salary and 700.

Rule 4
Question Format: If there is any Y where A is Xi, display B. Ex. If there is any Employee where Address is London, display the name?
Scope Specifier Output: Y: Table Name that has A, B, C and D attributes. Xi: instance. Ex. Y: Employee Table that has Name, Address, Job Title, Salary and Age as attributes. London belongs to the Address attribute.
Query: Select B from Y where exists (select A from Y where A = Xi). Ex. Select Name from Employee where exists (select Address from Employee where Address = "London").
That: Y, A, B and Xi. Ex. Employee, Name, Address and London.

A prototype of OntBot is being developed and evaluated against a number of application domains to demonstrate its generalizability. Further, an evaluation with users associated with the research project is being planned.

REFERENCES

[1] Goodwin, R., Lee, J., and Stanoi, G., (2005). M.I. Leveraging Relational Database Systems for Large-Scale Ontology Management in Proceeding of CIDR Conference.
[2] Abu Shawar, B. and Atwell, E., (2007). Chatbots: Are They Really Useful? in Proceedings of LDV-Forum 2007 Band 22 (1) pp.31-50.
[3] ALICE, (2011). A.L.I.C.E. Artificial Intelligence Foundation [online]. Available from: http://alice.pandorabots.com/ [Accessed 05/25/2011].
[4] Freese, E., (2007). Enhancing AIML Bots Using Semantic Web Technologies in Proceeding of Extreme Markup Languages.
[5] Falbo, R., Menezes, C., and Rocha, A., (1998). A Systematic Approach for Building Ontologies in Proceedings of 6th IberoAmerican Conference on AI, number LNCS1484 in Lecture Notes in Artificial Intelligence, Lisbon, Portugal pp.349-360.
[6] Falbo, R., Guizzardi, G., and Duarte, K., (2002). An Ontological Approach to Domain Engineering in Proceedings of 14th International Conference on Software Engineering and Knowledge Engineering (SEKE'02), Ischia, Italy pp.351-358.
[7] Liao, L., Qu, Y., and Leung, H., (2005). A Software Process Ontology and Its Application in Proceedings of Workshop on Semantic Web Enabled Software Engineering (SWESE), Galway, Ireland.
[8] Ruy, F., Bertollo, G., and Falbo, R., (2004). Knowledge-Based Support to Process Integration in ODE. CLEI Electronic Journal, 7 (1).
[9] Abran, A., Cuadrado, J., García-Barriocanal, E., Mendes, O., Sánchez-Alonso, S., and Sicilia, M., (2006). Ontologies for Software Engineering and Software Technology. Berlin Heidelberg: Springer-Verlag.
[10] Gali, A., Chen, C., and Kajal, T., Claypool and Rosario Uceda-Sosa. From Ontology to Relational Databases [online]. Hawthorne, NY 10532, U.S.A.: IBM T. J. Watson Research Center.
[11] Astrova, I., Korda, N., and Kalja, A., (2007). Storing OWL Ontologies in SQL Relational Databases in Proceedings of World Academy of Science, Engineering and Technology.
[12] Vysniauskas, E. and Nemuraite, L., (2006). Transforming Ontology Representation From OWL to Relational Database. Information Technology and Control, 35 (3A).
[13] Zhuge, H., Xing, Y., and Shi, P., Resource Space Model, OWL and Database: Mapping and Integration [online]. Beijing, China: Chinese Academy of Sciences.
[14] Alexandru Dobrila, T., (2010). From Semantic Web Knowledge To A Functional Conversational Agent: A Practical Approach [online]. Available from: http://airtudor.com/semantic_chatbot.pdf [Accessed 05/21/2011].
[15] Fernández, M., Gómez-Pérez, A., Pazos, J., and Pazos, A., (1999). Building a Chemical Ontology Using MethOntology and the Ontology Design Environment. IEEE Intelligent Systems Applications, 4 (1), pp. 37-45.
[16] WordNet, (2010). WordNet Project [online]. WN Team. Available from: http://wordnet.princeton.edu/ [Accessed 05/21/2011].
