
Knowledge Integration for Building Organisational Memories

Ulrich Reimer
Swiss Life
Information Systems Research Group
CH-8022 Zürich, Switzerland
phone: +41-1-7114061, fax: +41-1-7115007, email:


The paper starts with a discussion of the roles an organisational memory (OM) should play and what kind of knowledge should go into it. We then identify two kinds of integration problems. The first one is concerned with integrating the knowledge bases of different knowledge-based systems employed in an organisation into one physically or virtually unified knowledge base which is to be considered as part of the organisation's OM. The second problem concerns the integration of several representations of the same knowledge with different degrees of formalization, ranging from formally represented knowledge via semi-structured text to plain text. This is an issue because formally represented knowledge, e.g. company regulations, often also exists in textual form, and both representations are needed for different kinds of tasks. It is argued that the two integration problems mentioned can only be solved by making use of a high-level language whose representation constructs are on the conceptual level (in the sense of Brachman) and which covers all representational needs. We argue that such a language can be made easy to use despite its being extremely comprehensive if the representational ontology underlying its constructs is represented explicitly.



Introduction

It is increasingly acknowledged that knowledge is one of the most important assets of organisations. Especially in industrialised countries with expensive but highly educated employees, products and services must be outstanding in terms of innovation, flexibility, and creativity. A prerequisite for being able to face current and future challenges is the systematic management of the knowledge assets. Advanced knowledge management requires what is called an organisational or corporate memory. It is the central repository of all the knowledge relevant for an organisation. Building up such organisational memories (OM) and making them available to people and application systems with quite diverging needs is a big challenge which can only be met by an integration of approaches from various fields of computer science.

There are two major roles an organisational memory can in principle play. In one role it has a more passive function and acts as a container of knowledge relevant for the organisation (including meta-knowledge like knowledge about knowledge sources). It can be queried by a user who has some specific information need.

The second role an OM can adopt is as an active system that disseminates knowledge to users wherever they need it for their work. This second functionality is not a mere luxury but of considerable importance, as users often do not know that an OM may contain knowledge currently helpful to them. Furthermore, querying an OM whenever the user thinks it might contain relevant knowledge is not practical, because the user does not always think of querying the OM when it might actually be helpful and because it would be too time-consuming (it interrupts the user's primary work and takes time for searching and browsing the OM).

For the OM to be able to actively provide the user with the appropriate knowledge it needs to know what the user is currently doing. Unfortunately, this is usually not the case. How can this be achieved at all in a realistic way? In our view, the only practical approach is to give people systems which help them do their work wherever this makes sense. These systems (partly) know what the user is doing and what kind of information she needs and thus can provide her (often implicitly, cf. [Cole et al. 97]) with relevant knowledge whose existence she may not be aware of. These systems may be knowledge-based, i.e., have their knowledge explicitly represented in a knowledge base, or be based on document management systems with a keyword-oriented and/or free-text retrieval component. Note that this scenario is not primarily motivated by the idea of having an OM but by the aim of providing people with as much support as is possible and meaningful. When such systems get installed in an organisation their knowledge bases begin to form an OM - so to speak as a side effect.

A closer analysis of the scenario described above yields the following implications and conclusions:

  1. All knowledge bases of the knowledge-based systems used in an organisation should be part of its OM.
  2. Certain parts of the OM only come into existence via knowledge-based application systems in the organisation.
  3. Other parts of the OM do not fall under the above category. They represent knowledge that may be relevant for some user at a future time and can be queried whenever needed. This knowledge is not related to any application system in use and is accessed only by human users via the query interface of the OM.
  4. The knowledge in the OM falling into category 2 is the formalized knowledge, whereas the knowledge belonging to the parts of the OM mentioned in 3 is not formalized, or only rudimentarily. This is because the enormous effort required for an extensive formalization of knowledge is only spent when it is clear that the knowledge will indeed be extensively used. This is usually only the case with an application system that exploits that knowledge. When the formalisation process becomes cheaper in the future (e.g., due to the employment of automatic text understanding) this situation may gradually change.
  5. A user not only needs to be able to query the less formalized knowledge in an OM but also the formalized knowledge. The shift in the degree of formalisation should be transparent to the user, who accesses all parts of the OM in the same way via a uniform query interface.

From the conclusions above it becomes clear that one of the main research problems to be solved is how to integrate the various pieces of knowledge into a coherent OM, and how to ensure its extensibility. In particular, the following integration problems arise:

Integration of distinct knowledge bases:

From point 1 above it follows that the knowledge bases of the various application systems must in some way be integrated to become part of the OM. This can be done physically by making one big knowledge base out of them, or virtually by coupling them via an overall framework. To make things worse, the knowledge bases are typically not disjoint. At least with respect to the terminology there is an overlap, and possibly also with respect to the represented business rules, office tasks, organisational structure, etc. This means that even without the aim of fully integrating these knowledge bases into one OM they should at least be integrated to such an extent that knowledge is not represented repeatedly, avoiding problems with maintenance and consistency.

Integration of several representations of the same knowledge with different degrees of formalization:

It may very well be the case that part of the knowledge also resides in the OM in a less formalized state - typically as semi-structured (hyper)text. For example, company regulations may be given in the OM as text but also in a formal representation, e.g., for use by an intelligent workflow system. It should be possible to link all the more or less formalized versions of the same knowledge together such that different kinds of queries become possible. The query system decides which query to evaluate on which representational form(s).

In this paper, we outline a solution to both kinds of integration problems and to gradually building a comprehensive OM. As we have realized a knowledge-based system for supporting office work, called EULE2, that already comprises knowledge which should be part of an OM, it is important to have an approach that ensures the integration of various knowledge sources into an OM. We believe that this kind of situation, where certain parts of (not necessarily formally represented) knowledge that should go into an OM already exist, is quite typical. Thus, the next section gives a concise introduction to EULE2, while the section after it motivates the usage of a high-level representation language to build up and maintain the knowledge in EULE2. This high-level representation language then serves as a starting point for solving the integration problems mentioned above for building an OM (see the section on tackling the integration problem). While EULE2 and the basic constructs of the high-level language are implemented, the approach to integration is currently in a conceptual phase. The final section concludes the paper.


EULE2: A Knowledge-Based System for Supporting Office Work

At Swiss Life, as in many other companies, office workers for customer support are no longer specialists dealing with certain kinds of office tasks only, but are becoming generalists who must deal with all kinds of tasks. The work of this new generation of office workers is quite demanding and calls for better support. For this purpose, the Information Systems Research Group of Swiss Life has developed a knowledge-based system, called EULE2, that aims at providing a user with maximal guidance in performing office tasks she may not be familiar with.

Figure:  An Office Task Description as the User Sees it

An office task can be visualised as a graph (cf. the figure above). Its nodes stand for (a sequence of) actions the user can perform, while its links are associated with conditions that must be fulfilled for the subsequent action to be permitted. The conditions result from the law and the company regulations. An office worker starts work with EULE2 by selecting an office task and entering task-specific data as requested by the system (EULE2 takes most of the data needed from various data bases and does not request it from the user). As long as the office task is not completed each action has one or more possible subsequent actions. From the data given EULE2 decides which path to follow in the graph. However, nothing is done automatically. The control of what to do next stays with the user, but she cannot go on to actions that are not permitted. Some of the actions (like generating letters) are performed by EULE2 (possibly by delegating them to another application system); the others are done by the office worker, who tells EULE2 when they have been completed. Subsequent actions may be illegal, permitted, or obligatory. The office worker may decide to initiate a permitted action, may inquire why an obligatory action must be executed, or may ask why an action is illegal. Finally, the office worker selects an action for execution, thus causing new instances to be created or existing instances to be modified. This leads to a new situation where again one or more alternative actions are possible until a terminal node in the graph is reached.
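
The graph view of an office task can be made concrete with a small sketch. The Python below is only an illustration under our own assumptions: the names Action, Link, and classify_successors as well as the example task are hypothetical and do not reflect EULE2's actual data structures. In particular, the way EULE2 distinguishes permitted from obligatory actions is reduced here to a simple flag on the link.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Situation = Dict[str, object]  # task-specific data known so far (assumed shape)


@dataclass
class Link:
    target: str                             # name of the subsequent action
    condition: Callable[[Situation], bool]  # must hold for the action to be permitted
    obligatory: bool = False                # assumed flag: action must be executed if permitted


@dataclass
class Action:
    name: str
    links: List[Link] = field(default_factory=list)  # an empty list marks a terminal node


def classify_successors(action: Action, situation: Situation) -> Dict[str, str]:
    """Label each possible subsequent action as illegal, permitted, or obligatory."""
    status = {}
    for link in action.links:
        if not link.condition(situation):
            status[link.target] = "illegal"
        elif link.obligatory:
            status[link.target] = "obligatory"
        else:
            status[link.target] = "permitted"
    return status


# Hypothetical fragment of an office task, not taken from EULE2.
check_claim = Action("check claim", [
    Link("pay out", lambda s: s["client_bankrupt"] is False),
    Link("notify bankruptcy office",
         lambda s: s["client_bankrupt"] is True,
         obligatory=True),
])

example_status = classify_successors(check_claim, {"client_bankrupt": True})
```

In this sketch the system only labels the alternatives; choosing among them is left to the caller, mirroring the point that control stays with the user.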

When we regard the knowledge EULE2 makes use of as being part of an OM, then EULE2 makes that OM an active system (with respect to that knowledge) in the sense described in the introduction: in guiding the user through an office task, the system supplies her with exactly the knowledge that she needs at a certain moment (cf. the idea of an ``electronic performance support system'' as discussed in [Cole et al. 97]), namely what to do next and why (the latter only if she is interested to know). Since EULE2 provides people with the knowledge they need and ensures that every user always gets up-to-date knowledge, EULE2 serves the purpose of knowledge management.

Figure:  The Architecture of EULE2

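To achieve its functionality EULE2 requires the representation of

  * knowledge about the office tasks,
  * knowledge about the concepts and instances involved, and
  * knowledge about the law and the company regulations.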

Each of the three kinds of knowledge requires a representation formalism of its own. The knowledge about the office tasks is represented in a first-order language based on the situation calculus [McCarthy/Hayes 69], and knowledge about concepts and instances in a terminological logic [Woods/Schmolze 92, Reimer 85]. Knowledge about law and regulations is encoded in a syntactically restricted variant of first-order logic where we distinguish integrity constraints that must not be violated, and so-called auto-corrective integrity constraints which trigger corrective updates when they are violated. The latter are used like deduction rules with a complex condition part (cf. the example in the figures showing the original text of SchKG 232 and its representation as an auto-corrective integrity constraint). Accordingly, the architecture of EULE2 (cf. the figure ``The Architecture of EULE2'') provides for a knowledge base with three sub-components, each of which offers its own inference services. As certain inferences needed for EULE2 require a combination of inferences of its sub-knowledge-bases, the sub-components are integrated into a hybrid reasoning system.
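
The difference between the two kinds of constraints can be sketched as follows. This is a minimal illustration, not EULE2's restricted first-order language: the class Constraint, the function apply_constraints, and both example constraints are hypothetical, and constraints are modelled as plain Python predicates that are checked once, in the order given.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, object]


@dataclass
class Constraint:
    name: str
    holds: Callable[[State], bool]                      # True if the state satisfies the constraint
    correct: Optional[Callable[[State], State]] = None  # set only for auto-corrective constraints


def apply_constraints(state: State, constraints: List[Constraint]) -> State:
    """Return a copy of the state with violated auto-corrective constraints repaired.

    A violated plain integrity constraint cannot be repaired and raises an error.
    Each constraint is checked once, in order (no fixpoint iteration in this sketch).
    """
    current = dict(state)
    for c in constraints:
        if c.holds(current):
            continue
        if c.correct is None:
            raise ValueError(f"integrity constraint violated: {c.name}")
        current = c.correct(current)  # corrective update, used like a deduction rule
    return current


# Hypothetical constraints, not taken from the EULE2 knowledge base.
no_negative_payment = Constraint("payment >= 0", lambda s: s["payment"] >= 0)
bankrupt_implies_blocked = Constraint(
    "bankrupt client => payments blocked",
    lambda s: not s["client_bankrupt"] or s["payments_blocked"],
    correct=lambda s: {**s, "payments_blocked": True},
)

repaired = apply_constraints(
    {"payment": 100, "client_bankrupt": True, "payments_blocked": False},
    [no_negative_payment, bankrupt_implies_blocked],
)
```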

EULE2 had to be integrated with several existing data bases where data needed for the office tasks resides. To this end we mapped the schemas of those data bases to a newly defined, integrated schema. Every relation schema belongs to a concept in EULE2's terminology, while the relation tuples are seen by EULE2 as instances of the corresponding concepts. Thus, for EULE2 it is completely transparent which concepts and instances come from one of the data bases and which reside in its terminology component only. The data bases are only read. Since no updates are made by EULE2 we avoid the problem of having long transactions with long locks. Data is changed by an office clerk in her usual way, namely through the application systems that already exist.
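
As a rough illustration of this read-only mapping, the sketch below exposes the tuples of a relation as instances of a concept. It uses Python's sqlite3 module with an in-memory data base purely for demonstration; the table client, the concept name Client, the mapping RELATION_TO_CONCEPT, and the function instances_of are all assumptions of ours and say nothing about the data bases or the schema mapping actually used at Swiss Life.

```python
import sqlite3
from typing import Dict, Iterator, Tuple

# Assumed mapping from relation names in an existing data base to concept names.
RELATION_TO_CONCEPT: Dict[str, str] = {"client": "Client"}


def instances_of(conn: sqlite3.Connection, concept: str) -> Iterator[Tuple[str, dict]]:
    """Yield (concept, attribute dict) pairs for every tuple of the mapped relation."""
    for relation, mapped in RELATION_TO_CONCEPT.items():
        if mapped != concept:
            continue
        # The relation name comes from the fixed mapping above, never from user input.
        cursor = conn.execute(f"SELECT * FROM {relation}")  # read only, no updates
        columns = [d[0] for d in cursor.description]
        for row in cursor:
            yield concept, dict(zip(columns, row))


# Demonstration data standing in for an existing data base.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE client (id INTEGER, name TEXT)")
conn.execute("INSERT INTO client VALUES (1, 'Example AG')")

clients = list(instances_of(conn, "Client"))
```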

A further integration with workflow systems will probably become necessary in the future. We are currently investigating what kind of additional interface EULE2 would need for this. For more details on EULE2 see [Reimer et al. 97].

Figure:  Fragment of the Original Text of the Law SchKG 232

Figure:  Auto-Corrective Integrity Constraint Representing SchKG 232(2)(4)


Employing a High-Level Representation Language for Modelling the Knowledge in EULE2

As can be seen, the EULE2 knowledge base captures quite some knowledge important to Swiss Life. Via EULE2 this knowledge is made available to an office worker in such a way that she is always offered the knowledge which is relevant in the current situation. Besides supporting office work, the knowledge EULE2 has available is also useful for other people and for other purposes, e.g., for tutoring new employees, for inquiring about the effect of certain company regulations on office tasks, or for finding out about past instances of office tasks performed. Reusing the knowledge represented in the EULE2 knowledge base for other systems will be quite hard since the formalisms used have been selected to efficiently support the kind of reasoning that occurs in EULE2. Thus, they may not suit the purposes of another application very well. In order to facilitate reuse and to make building and maintaining the EULE2 knowledge base easier we are developing a high-level representation language (HLL) that abstracts away from the representation formalisms actually used in EULE2. In terms of the representational levels introduced in [Brachman 79], HLL is on the conceptual level while the EULE2 formalisms are on the logical level (only its terminological logic being on the epistemological level). Thus, the representation constructs offered by HLL already introduce certain fundamental concepts, like obligations, rights, regulations, and actions. They are especially tailored to the representational tasks encountered in developing EULE2. The move from the logical level of the EULE2 representation formalisms to the conceptual level of HLL introduces a representation ontology (like the Frame Ontology in [Gruber 92] - not to be confused with a domain ontology) which is reflected by the constructs of HLL. This ontology can and should be formally represented [Guarino et al. 94] (cf. the footnote at the end of the paper).

The knowledge modelled in HLL will be compiled down into the formalisms actually used in the EULE2 system. Due to its being on the conceptual level HLL offers the following advantages:

Thus, HLL, as it currently exists, is a domain-specific language because it only supports the kinds of knowledge needed for EULE2 and similar systems.


Making Use of the High-Level Language for Tackling the Integration Problem for an Organisational Memory

Integration of Distinct Knowledge Bases

As discussed in the introduction there are two kinds of integration problems with respect to building an OM. One of them is the integration of the knowledge bases of several application systems into one physically or virtually integrated knowledge base which would form a part of the OM. The integration yields considerable added value for the following reasons:

The integration of knowledge can only be achieved when it is represented either in the same language or in different languages that can be mapped to each other. Therefore, we intend to take HLL, which has already been developed for EULE2, and extend it to a language we can use for representing the OM (the knowledge of EULE2 would then just be a small part of the OM). However, as the inferential requirements can be quite distinct for different application systems, their knowledge can only be uniformly described in one representation language if the language is on the conceptual level (rather than on the logical level), thus abstracting from the low-level representational views which reflect the measures taken for efficient inferences. This is the case with HLL. The extension of HLL to represent other kinds of knowledge as well pushes it more in the direction of a general-purpose language. Still, for a given application system only a certain subset is needed. By specifying the underlying representation ontology explicitly [Guarino et al. 94] the representational impacts of all constructs and their possible interrelationships stay clear (similar ideas underlie the meta-modelling approach for customizing modelling languages as described, e.g., in [Nissen et al. 96]). Due to such a formal ontology it is, for example, possible to have generalizations between constructs, like a general construct for representing actions with several specializations of it which serve the specific needs of representing actions in different application systems. The different constructs for representing actions may even be based on different conceptualizations of the world as long as the formal ontology keeps track of this so that a unified view can be generated (which is necessarily more general in order to capture all the different conceptualizations). Different conceptualizations that are not on the level of constructs but affect how knowledge is actually represented can, of course, not be handled.

We think that the resulting language will not be bulky and monstrous because for a certain representation task only a subset is needed. The sum of all subsets, properly put together via the underlying formal ontology, makes up the language. The development of such a language is still future work.
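
The idea of generalizations between constructs, and of a unified view that is necessarily more general, can be sketched with a toy construct hierarchy. Everything in the following Python is hypothetical: the construct names, the table GENERALIZES, and the functions ancestors and unified_view are our own illustration, since the extended language and its formal ontology do not exist yet.

```python
from typing import Dict, Optional, Set

# Hypothetical construct hierarchy: construct -> its direct generalization (None = root).
GENERALIZES: Dict[str, Optional[str]] = {
    "Action": None,
    "OfficeAction": "Action",      # e.g. actions as needed by an office support system
    "WorkflowAction": "Action",    # e.g. actions as needed by a workflow system
    "Obligation": None,
}


def ancestors(construct: str) -> Set[str]:
    """All constructs that generalize the given one, including the construct itself."""
    result = set()
    current: Optional[str] = construct
    while current is not None:
        result.add(current)
        current = GENERALIZES[current]
    return result


def unified_view(a: str, b: str) -> Optional[str]:
    """Most specific construct that generalizes both a and b, or None if there is none."""
    common = ancestors(a) & ancestors(b)
    if not common:
        return None
    # In a tree, the most specific common generalization has the longest ancestor chain.
    return max(common, key=lambda c: len(ancestors(c)))
```

Here two application-specific action constructs can be viewed jointly through the more general construct Action, while constructs without a common generalization have no unified view.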

Integration of Several Representations of the Same Knowledge with Different Degrees of Formalization

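The second integration problem concerns the linking of representations of the same pieces of knowledge in notations that have a different degree of formalisation. We illustrate the need for doing this by the example of knowledge about company regulations, which are represented in three different formalisations in the OM (cf. the figure ``The Architecture of an Organizational Memory''):

  1. the regulation texts themselves, i.e., the original natural language formulation;
  2. a content characterization of the regulation texts in terms of concept descriptions over the terminology;
  3. a content representation of the regulations, i.e., their formal representation.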

Figure:  The Architecture of an Organizational Memory

The OM as outlined in the figure ``The Architecture of an Organizational Memory'' additionally contains a content representation of the office tasks and a comprehensive terminology. The content representations of the regulations and the office tasks, as well as part of the terminology, also form the intensional part of the knowledge base of EULE2.

The different representation components of the OM fit quite well into the representational levels of an OM as discussed in [Abecker et al. 97]. On their object level is the primarily interesting knowledge, in our case the terminology, both content representations, and the regulation texts. The content characterization (of the regulation texts and thus, transitively, of the content representation of the regulations) belongs to their knowledge description level. The authors additionally suggest a relevance description level where the task-specific relevance of knowledge is represented so that it becomes possible to actively deliver exactly that knowledge which people need at a given time. This level has (currently) no direct correspondence in our OM architecture. However, the knowledge is implicitly present as part of the office task representations, but not independently on a meta-level.

A wide range of queries concerning company regulations can be posed to the OM. We give just a few examples (cf. the figure ``The Architecture of an Organizational Memory''):

  1. The user looks for regulations that deal with how to react in the case of the bankruptcy of a Swiss Life client. To formulate the query the user selects concepts in the terminology given with the OM and sets relationships between them, resulting in a set of concept descriptions. The query is evaluated against the content characterization of the regulation texts.
  2. The user looks for regulations that deal with certain underwriting issues (i.e., when to conclude an insurance contract, possibly with a risk supplement). No appropriate concept can be found in the terminology to formulate the query. Thus, she tries a free-text retrieval on the regulation texts by specifying which words must occur in the text of a regulation.
  3. The user wants to know which kinds of office tasks are affected by a certain regulation. This query is evaluated against the content representation of the regulations and office tasks of the OM. This is a meta-inference on the content representations because the regulations are not used to find out if a given office task instance is to be executed in a certain way; instead, the representations must be inspected to find out where there are references from a regulation representation to an office task. Such references are found by identifying which obligation the given regulation would deduce under the proper circumstances and by checking which office tasks refer to this obligation in their precondition.

    For some of the office tasks retrieved the user may then want an explanation in what aspects the regulation influences the way the task is to be performed. This request is satisfied with the help of the explanation component of EULE2.

  4. The user wants to find out which regulations concern only one office task (maybe because she looks for possible ways to optimize the office work). This, too, requires a meta-inference on the content representations of the regulations and office tasks.
  5. The user wishes to know which regulations override federal law (this happens in certain special cases where jurisdiction deviates from the literal interpretation of the law). Again, the query is to be evaluated on the content representation of the regulations. A similar query would ask for regulations that are exceptions to other regulations.
  6. The user requests those office task instances of the last three years where a certain regulation was relevant for the way the office task was performed. This query is evaluated against the historized extensional knowledge base of EULE2 where the data of all formerly executed office tasks is kept.
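
The meta-inferences of examples 3 and 4 can be illustrated on a drastically simplified content representation in which only the references matter: which obligations a regulation may deduce, and which obligations an office task mentions in its preconditions. The regulation identifiers, obligation names, and task names below are invented for the sketch, as are the two functions; the real content representations are first-order formulas, not lookup tables.

```python
from typing import Dict, List, Set

# Hypothetical content representations, reduced to the references between them.
REGULATION_DEDUCES: Dict[str, Set[str]] = {
    "R1": {"must_notify_bankruptcy_office"},
    "R2": {"must_request_medical_report"},
    "R3": {"must_request_medical_report", "must_add_risk_supplement"},
}
TASK_PRECONDITIONS: Dict[str, Set[str]] = {
    "handle bankruptcy": {"must_notify_bankruptcy_office"},
    "conclude contract": {"must_request_medical_report", "must_add_risk_supplement"},
    "change contract": {"must_request_medical_report"},
}


def tasks_affected_by(regulation: str) -> List[str]:
    """Office tasks whose preconditions refer to an obligation the regulation may deduce."""
    obligations = REGULATION_DEDUCES[regulation]
    return sorted(task for task, pre in TASK_PRECONDITIONS.items() if pre & obligations)


def regulations_with_single_task() -> List[str]:
    """Regulations that concern exactly one office task."""
    return sorted(r for r in REGULATION_DEDUCES if len(tasks_affected_by(r)) == 1)
```

Note that neither function evaluates a regulation on an office task instance; both only inspect the representations themselves, which is what makes them meta-inferences.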

The examples above illustrate that all three formalisations of company regulations are needed to evaluate all the possible queries. They also show that for the evaluation of one query more than one representation may be needed, for example, if a user specifies a regulation in terms that have to be evaluated against the regulation texts, and then, once the intended regulation is found, looks for regulations that are exceptions to it, which requires evaluation against the content representation. It must remain completely hidden from the user against which representation a query is evaluated, so that she needs to know neither to which representational form to pose the query nor all the query languages required. Instead, she always makes use of one and the same query interface. Consequently, there must be links between all the representational forms of regulations, as indicated in the figure ``The Architecture of an Organizational Memory''.
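
A uniform query interface of this kind might look, in a very reduced form, like the sketch below. The three dictionaries stand in for the three representations and are linked only by shared regulation identifiers; the function query, its parameters, and the sample regulations are assumptions made for illustration and do not describe an existing query system.

```python
from typing import Dict, List, Optional, Sequence, Set

# Three hypothetical representations of the same regulations, linked by identifier.
TEXTS: Dict[str, str] = {
    "R1": "Payments to a bankrupt client are suspended.",
    "R2": "Payments already approved by the bankruptcy office may proceed.",
}
CHARACTERIZATION: Dict[str, Set[str]] = {
    "R1": {"Bankruptcy", "Payment"},
    "R2": {"Bankruptcy", "Payment"},
}
EXCEPTION_OF: Dict[str, str] = {"R2": "R1"}  # content representation: R2 is an exception to R1


def query(words: Sequence[str] = (),
          concepts: Sequence[str] = (),
          exceptions_to: Optional[str] = None) -> List[str]:
    """Single entry point; the caller never names the representation to be searched."""
    hits = set(TEXTS)
    if words:  # evaluated on the regulation texts
        hits &= {r for r, t in TEXTS.items()
                 if all(w.lower() in t.lower() for w in words)}
    if concepts:  # evaluated on the content characterization
        hits &= {r for r, c in CHARACTERIZATION.items() if set(concepts) <= c}
    if exceptions_to is not None:  # evaluated on the content representation
        hits &= {r for r, base in EXCEPTION_OF.items() if base == exceptions_to}
    return sorted(hits)


# A regulation is first found via its text, then its exceptions via the formal representation.
found = query(words=["suspended"])
exceptions = query(exceptions_to=found[0])
```

The two-step usage at the end corresponds to the example in the text: the second query only works because the identifier found in the texts also identifies the regulation in the content representation.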

We suggest establishing these links by exploiting a certain feature currently under development for HLL: With respect to law and company regulations it maintains a one-to-one correspondence between the knowledge represented and its original natural language formulation. This correspondence is on the level of subsections when law is concerned, and on the paragraph level for regulation texts. As HLL provides special constructs for certain kinds of complex natural language phrases (like ``... pursuant to ...'', ``... except of ...'', ``... analogously to ...'') the one-to-one correspondence is in these cases on the sentence level or even below. Having a one-to-one correspondence on the sentence level or below is in general not possible as a certain piece of knowledge may be described in more than just one sentence.

The one-to-one correspondence not only helps the knowledge engineer but is a necessary prerequisite for generating explanations that use phrases of the original text so that the user can more readily see the correspondence of a restriction she encounters in the office task with a certain law or regulation. An ordinary, first-order representation of law and regulations does not allow such kinds of explanations because a first-order representation usually atomizes the statements to be represented so that no correspondence can be seen any more.

HLL's property to link a regulation representation with its textual counterpart can be exploited to keep track of the dependencies between a content representation and a natural language text which describes (part of) the same knowledge. In this way, later changes to the text or the content representation cannot be done by a knowledge engineer without his taking notice of the dependency. Of course, it can by no means be ensured that the text and the formal representation are consistent with each other because that would require a degree of text understanding abilities currently far from being feasible.
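
One simple way to make such a dependency visible is to store, with each formal representation, a fingerprint of the text unit it was derived from. The sketch below is our own illustration and not a feature of HLL: the class LinkedRepresentation, its fields, and the use of a SHA-256 hash are assumptions. In line with the remark above, it only detects that the text has changed since the formal representation was written; it cannot check that the two are consistent.

```python
import hashlib
from dataclasses import dataclass


def fingerprint(text: str) -> str:
    """Stable fingerprint of a text unit (e.g. a law subsection or regulation paragraph)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class LinkedRepresentation:
    source_id: str         # identifier of the linked text unit (hypothetical scheme)
    formal: str            # the formal representation, kept opaque in this sketch
    text_fingerprint: str  # fingerprint of the text the formal part was derived from

    def needs_review(self, current_text: str) -> bool:
        """True if the linked text changed after the formal representation was written."""
        return fingerprint(current_text) != self.text_fingerprint


original_text = "Payments to a bankrupt client are suspended."
link = LinkedRepresentation(
    source_id="regulation-1/paragraph-2",
    formal="bankrupt(c) -> blocked(payments(c))",
    text_fingerprint=fingerprint(original_text),
)
```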



Conclusions

We have outlined an approach to creating an OM by integrating the knowledge bases of existing application systems as well as those to be built in the future. To support the integration we advocate employing a high-level representation language HLL which is used to represent the knowledge in all the knowledge bases. HLL has to be on the conceptual level (according to [Brachman 79]) because only in this way can it abstract from the lower-level inferential commitments made to achieve efficiency. Although HLL would be quite a comprehensive language, for a certain application only certain constructs are needed. The relationship of the various constructs is to be given by a formal ontology so that a mapping between constructs is possible (where meaningful). In this way, we can even support different conceptualizations of the world in different knowledge bases while still using the same representation language.

An HLL representation is compiled into the actual representation formalisms used in the knowledge bases which are usually quite different for the sake of efficiency. The compilation can be different for the various knowledge-based systems.

We have also addressed the need for having the same pieces of knowledge in more than just one representation in the OM. The representations differ in the degree of formalisation, ranging from natural language text to deep, first-order representations. These representations are needed to answer the various kinds of queries that may occur. A query interface to the OM must hide from the user what kind of query is evaluated on what kind of representation. To enable the query system to pick out the representation which is the proper one for the current query, the textual and the more formalized representations must be linked to each other. We suggest that this, too, is done with the support of an extension of HLL which maintains a one-to-one correspondence between pieces of a natural language text and a formal representation of that text.

The first version of HLL is currently being implemented for a system to support office work we call EULE2. Further extensions, especially concerning the support of certain complex natural language phrases, are planned. The ideas of using HLL for building an OM are still preliminary and need further elaboration. This will be done in the context of a new project which has the aim to integrate quite different knowledge sources so that they can be queried via a single user interface. This integration effort is intended to become the starting point for Swiss Life's OM.


Acknowledgements

I am grateful to my colleagues Jörg-Uwe Kietz and Martin Staudt for their helpful comments on an earlier version of this paper. The reviewers' many comments also helped considerably to improve the paper.


References

Abecker et al. 97
A. Abecker, A. Bernardi, K. Hinkelmann, O. Kühn, M. Sintek: Towards a Well-Founded Technology for Organizational Memories. In B.R. Gaines, R. Uthurusamy (eds): Artificial Intelligence in Knowledge Management. Papers from the 1997 AAAI Spring Symposium. Menlo Park: AAAI Press, 1997.

Brachman 79
R. Brachman: On the Epistemological Status of Semantic Networks. In: N.V. Findler (ed): Associative Networks. Academic Press, 1979, pp.3-50.

Cole et al. 97
K. Cole, O. Fischer, P. Saltzman: Just-in-Time Knowledge Delivery. In: Communications of the ACM, Vol.40, No.7, 1997, pp.49-53.

Guarino et al. 94
N. Guarino, M. Carrara, P. Giaretta: Formalizing Ontological Commitments. In: Proc. 12th National Conf. on Artificial Intelligence, 1994, pp.560-567.

Guarino 95
N. Guarino: Formal Ontology, Conceptual Analysis and Knowledge Representation. In: Int. Journal of Human and Computer Studies, Vol.43, No.5/6, 1995. (Special Issue on the Role of Formal Ontology in the Information Technology)

Gruber 92
T.R. Gruber: A Translation Approach to Portable Ontology Specifications. In: Knowledge Acquisition, Vol.5, 1992, pp.199-220.

McCarthy/Hayes 69
J. McCarthy, P.J. Hayes: Some Philosophical Problems from the Standpoint of Artificial Intelligence. In: B. Meltzer, D. Michie (eds): Machine Intelligence 4, 1969, pp.463-502.

Nissen et al. 96
H.W. Nissen, M.A. Jeusfeld, M. Jarke, G.V. Zemanek, H. Huber: Managing Multiple Requirements Perspectives with Metamodels. In: IEEE Software, March 1996, pp.37-48.

Reimer 85
U. Reimer: A Representation Construct for Roles. In: Data & Knowledge Engineering, Vol.1, No.3, 1985, pp.233-251.

Reimer/Hahn 97
U. Reimer, U. Hahn: Text Summarization Based on Condensation Operators of a Terminological Logic. In: Proc. ACL/EACL'97 Workshop on Intelligent Scalable Text Summarization, Madrid, July 7-12, 1997.

Reimer et al. 97
U. Reimer, A. Margelisch, B. Novotny: Making Knowledge-Based Systems more Manageable: A Hybrid Integration Approach to Knowledge about Actions and their Legality. In: R. Pareschi, B. Fronhöfer (eds): Representing and Managing Dynamic Knowledge. Kluwer, 1997. (to appear).

Woods/Schmolze 92
W.A. Woods, J.G. Schmolze: The KL-ONE Family. In: Computers and Mathematics with Applications, Vol.23, Nos.2-5, 1992, pp.133-177.

The LaTeX2HTML translation of this document was initiated by Andreas Abecker on Sun Jul 20 22:23:39 MET DST 1997.

Footnote (on the formal representation of the ontology underlying HLL): In fact, as [Guarino 95] suggests, the definition of the ontology underlying a representation language adds an additional, ontological level to the ones suggested by [Brachman 79]. It is situated between the epistemological and the conceptual level. In our case, the primitives used by HLL to make the representation of rights, obligations, etc. possible are formally defined on that ontological level, while the result of their application, namely the fundamental concepts of rights, obligations, etc., are on the conceptual level.
