In: EKAW '97, 10th European Workshop on Knowledge Acquisition, Modeling, and Management, Sant Feliu de Guíxols, Catalonia, Spain, October 15-18, 1997. LNAI 1319, Springer Verlag.

Information Tuning with KARAT:
Capitalizing on Existing Documents

Bidjan Tschaitschian, Andreas Abecker, and Franz Schmalhofer

German Research Center for Artificial Intelligence (DFKI) GmbH
P.O. Box 2080, D-67608 Kaiserslautern, Germany
Phone: ++49 631 205-3484; Fax: ++49 631 205-3210; E-Mail: tschaits@dfki.uni-kl.de

Abstract: Organizations store their information in electronic or paper documents. This information is severely underutilized in the daily work of most organizations. Because there are no effective means to access the documents, employees do not find relevant information, or are not even aware of its existence. We describe Information Tuning - a first step towards knowledge management in enterprises. Information Tuning capitalizes on existing documents and enables better exploitation of the contained knowledge by adding the background, context, and meta information which is necessary for making possible beneficial utilization, sharing, and reuse. We present an Information Tuning method which evolved from model-based knowledge acquisition from texts, and illustrate the method with application examples. Information Tuning is supported by the KARAT tool which applies techniques from text analysis and hypertext technology.

1 Introduction

Enterprise Knowledge Management (cf. [47],[2]) and Organizational Memories (or, better, Organizational Memory Information Systems - OMIS), which form the core of its support ([42],[39]), are emerging terms in management as well as in the information sciences. Many successful research prototypes developed by the AI community focus on capturing and disseminating tacit enterprise knowledge [5] which is manifested, e.g., in sophisticated, complex design decisions. For instance, remarkable results have been achieved in capturing design rationale in product development. Here, well-known representation and reasoning methods from knowledge-based systems, case-based reasoning, or issue-based information systems can be successfully applied (see, e.g., [35],[21]).

On the other hand, for industrial practice, we see an enormous application potential in somewhat complementary efforts: in numerous applications, making explicit and formalizing highly sophisticated tacit knowledge is not necessarily the first and most important step; on the contrary, there are already considerable knowledge corpora embodied in electronic and paper archives: in product and project documentation, technical manuals, lessons-learned archives, best practice databases, personal memos, etc. But those existing knowledge assets are rarely used in everyday work. There are no effective means to access the documents, so employees do not find relevant information or are not aware of its existence. Even if they know the knowledge sources, they cannot assess their importance and validity in the current situation because they do not know the context in which the documents were created (cf. [29],[20],[25]).

We suggest preparing existing sources for better exploitation and beneficial use of the knowledge assets by Information Tuning. With this method, significant added value can already be achieved with sparse formal knowledge. In contrast to the traditional knowledge-based approach, we do not encode the whole knowledge in a formal representation, but primarily rely on the existing sources as they are. Formal structures are used for organizing, annotating, and linking together these sources, thus enabling their better utilization. This sparse formalization can be compared with parts of the ontological engineering in conventional KBS development, but without filling the concrete application knowledge bases. Furthermore, our formal models serve the sole purpose of structuring the domain in a way that eases later retrieval and reuse. Thus, they do not have to be as "deep, exhaustive, and consistent" as if they were the basis for automated problem solving.

Figure 1 sketches an application scenario we found in a large German software company in the context of building up a Call-Center (cf. [44]). The project goal was to improve dissemination and utilization of the company's product knowledge, which formerly had been documented in voluminous paper folders or had depended on individual experience. The example illustrates several observations we made frequently in industrial practice: Typically, the ideal information flow involves many people, with rather different views, performing many tasks in the enterprise; significant parts of the knowledge involved are rapidly changing, e.g. due to new product features; other parts are very informal by nature, e.g. individual experiences. Building, maintaining, and sharing formal representations of such knowledge is often not possible in an economic way. However, a human operator could immediately process the original informal document if she were provided with it in the right situation, possibly equipped with some meta information (e.g. about author, date, creation context of the document etc.). On the other hand, in an enterprise there often also exists knowledge which is essentially (semi-)formal by nature, e.g., business process models, software architecture diagrams, catalogues about compatibility relations, error documentation forms, error hierarchies, and so on. Such knowledge can often be prepared very easily to serve for organizing and structuring the more informal parts.

Roughly speaking, we propose to fill a communication gap rather than a processing gap for knowledge. The objects to be communicated are existing knowledge sources (texts, documents, ...); the prerequisites for effective communication are formal organization structures, formal annotations, and links between knowledge items. Information Tuning is a pragmatic approach to determining and providing these prerequisites.

We will discuss the task and the method in Section 2. Section 3 describes the KARAT tool for performing Information Tuning in practice. Since the main ("technical") activity is associating and linking together elements of formal models and parts of documents, it is natural to employ hypertext technology for comfortable handling. Since the main ("semantic") effort is to map knowledge items onto the appropriate formal categories, automatic text categorization techniques are employed for generating suggestions. In Section 4, we present some examples of how Information Tuning can work in practice. After some words about implementation issues in Section 5, we conclude with some remarks on related work, limitations, and future work in Section 6.

2 Information Tuning: Task and Method

We propose to accompany information units by the meta-knowledge necessary for:

Our approach has been developed for tuning text-based information collections, i.e. setting up a more intelligent OMIS from a corpus of available documents. However, it can also be used for inserting new information into an existing OMIS. The method can be subdivided into three major phases (see Figure 2): In the preparation phase a set of different models is established. Together with initially available text documents, these models form the input to the analysis phase which is decomposed as follows: first of all, separate information units are extracted from the initial texts. These text segments (e.g. whole sentences) are then decontextualized (reformulated) and recontextualized (explained, associated, and classified). This enables a subsequent efficient and effective validation and prioritization of information units. The information units together with the source documents and models are stored in the information repository. Please note that the method does not prescribe a fixed sequence of steps as might be suggested by the figure. Instead, it describes actions that can be done in several cycles, maybe partly in parallel, partly building on results from prior steps, until an information repository of the desired quality (in terms of the needed usefulness and usability) is reached. The integration phase comprises storing in the information repository and the various aspects of usage. We discuss the separate phases in some more detail:

Preparation: the set of structuring models has to be established. Basically, one can define arbitrary models which are suited for easing access and utilization of knowledge items. For instance, a view model could reflect the different roles and competencies within an organization, task models might be represented as business process models [31], and domain models could provide meaningful decompositions of different areas or departments within the organization. When surveying the OMIS literature for organizing principles for corporate knowledge competencies, one finds that the most prominent approach (by van Heijst et al. [42]) proposes to characterize knowledge items according to the dimensions task, role, and domain. Other authors (see [16]) embed these ideas into the world of business processes and product models.

Analysis: the analysis phase consists of the following five steps:

  1. Extraction: the user singles out relevant information units from the initially available text documents. This step may be seen as a first review procedure, i.e. only relevant information should be extracted from the documents. A link is generated that associates the new information unit with its source text. Thus, each information unit can be traced back to its origin and vice versa.
  2. Decontextualization: the extracted information units (usually whole sentences) are reformulated for better understanding and unambiguity.
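  3. Recontextualization consists of three substeps: the information unit is explained in natural language, associated with related information units, and classified according to the different models.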
  4. Validation: information units may be reviewed to detect inconsistencies and identify gaps. The structuring of the information units can now be utilized to retrieve small, manageable subsets of possibly related information and prove consistency and completeness (manually).
  5. Prioritizing: the relevance of an information unit may be judged as "rule", "recommendation", or "optional". Such judgements are often easier to establish when the respective information unit is compared to possibly related information. Again, the multi-criteria structuring of the information units allows related information to be retrieved.

Integration: the information units together with the source documents and models are stored in an information repository. This allows for a very flexible utilization of the available information. Several criteria can be combined to retrieve relevant information. Currently, these criteria comprise searching for information units

When a user needs specific information in his/her current context of work, he/she would select the respective class in his/her task model and retrieve the stored information.
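The multi-criteria retrieval described above can be illustrated with a minimal Python sketch. It is not the KARAT implementation (which, according to Section 5, uses Smalltalk and the GemStone database); the class `InformationUnit`, the function `retrieve`, and all sample data are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class InformationUnit:
    text: str                   # decontextualized formulation
    source: str                 # identifier of the source document (hypothetical)
    # model name -> set of category names chosen during recontextualization
    categories: dict = field(default_factory=dict)
    priority: str = "optional"  # "rule", "recommendation", or "optional"


def retrieve(units, **criteria):
    """Return the units that satisfy every given model=category criterion."""
    return [
        u for u in units
        if all(cat in u.categories.get(model, set())
               for model, cat in criteria.items())
    ]


# Invented sample repository.
units = [
    InformationUnit("Reset the modem before reinstalling the driver.",
                    "manual-07",
                    {"task": {"troubleshooting"}, "domain": {"hardware"}},
                    "rule"),
    InformationUnit("Customers on version 2 may request a free upgrade.",
                    "memo-12",
                    {"task": {"sales"}, "domain": {"licensing"}},
                    "recommendation"),
]

hits = retrieve(units, task="troubleshooting", domain="hardware")
print([u.source for u in hits])
```

A user working on a troubleshooting task would select that class in the task model; combining it with a domain category narrows the result further, and each hit still carries the identifier of its source document.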

3 Tool Support for Information Tuning

The KARAT tool for Information Tuning builds mainly on two technologies:

  1. The basic technology for realizing comfortable user support, establishment, visualization and handling of complex relationships is Hypertext. It is the basic technology for "knowledge editing" at the "syntactic level".
  2. (Semi-automatic) support for establishing formal models as well as classification and retrieval of information units is given by adapting and integrating techniques and tools from Information Retrieval and Document Analysis and Understanding. This supports the more "semantic level" of Information Tuning.

3.1 Text Analysis Techniques in KARAT

Text analysis and information retrieval techniques are able to automatically index [18], classify [15] [43] [14], and search [22] natural language texts. Such techniques can be beneficially employed in several phases of the KARAT method: First of all, the user should be supported in the adaptation and enrichment of the different models during the preparation phase. Secondly, fast and sophisticated search mechanisms are needed to enable a quick and thorough exploration of the available text documents. Finally, text analysis techniques may provide suggestions for the classification of information units according to the different models. In the following, we take a closer look at the embedded text analysis techniques with respect to the phases of the KARAT method.

Preparation Phase: In the construction of the different models, information retrieval techniques automatically extract a word-list of relevant terms from the collection of initially available text documents. To this end, the texts are analyzed by a morphological component, i.e. the inflected word forms occurring in the text are reduced to their respective stems. Afterwards, a stop word reduction utilizes the part-of-speech information to delete irrelevant words. The (most frequent) remaining terms may be used as suggestions for the categories of the models. After the completion of the models, the same automatic indexing techniques are utilized to find keywords (also called weighted index terms or descriptors) for the categories of the models. For this purpose, relevant text sequences are selected from the original text documents for each model category, i.e. only paragraphs including the respective category's name are considered. Again, a morphological analysis with stop-word reduction is performed on these text segments. Afterwards, an indexing component performs a frequency analysis of all remaining word stems [30] and ranks the terms according to relative or absolute frequency. Finally, the user may modify the resulting list of index terms (e.g. add or delete words) and the remaining keywords are attached to the respective model category. These keywords are employed later in the analysis phase to provide suggestions for the classification of the information units.
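The indexing pipeline of the preparation phase (paragraph selection by category name, stemming, stop-word reduction, frequency ranking) can be sketched as follows. This is a simplified illustration under strong assumptions: the crude English suffix stripper stands in for a real morphological component such as MORPHIC-PLUS, the stop-word list is a toy list rather than part-of-speech based, and the function names and sample paragraphs are hypothetical.

```python
import re
from collections import Counter

# Toy stop-word list; the paper's stop-word reduction uses part-of-speech
# information instead.
STOP_WORDS = {"the", "a", "of", "and", "is", "to", "in", "for", "on"}


def stem(word):
    """Crude suffix stripping; a stand-in for real morphological analysis."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def index_terms(paragraphs, category, top_n=5):
    """Rank word stems by absolute frequency, using only those paragraphs
    that mention the category name."""
    counts = Counter()
    for paragraph in paragraphs:
        if category.lower() not in paragraph.lower():
            continue
        tokens = re.findall(r"[a-z]+", paragraph.lower())
        counts.update(stem(t) for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_n)


# Invented sample paragraphs.
paragraphs = [
    "The printer driver crashes on printing.",
    "Printer drivers need testing.",
    "Billing is monthly.",
]
keywords = index_terms(paragraphs, "printer", top_n=2)
print(keywords)
```

In the method described above, the resulting list would then be reviewed by the user before the keywords are attached to the model category.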

Analysis Phase: At present, parsing techniques for deep text understanding are too inefficient and error-prone when applied to complex real world problems. However, shallow text analysis techniques may very well be integrated into the information extraction and recontextualization steps for search and classification tasks.

Extraction: In identifying relevant information units in the initial texts, the user needs support to browse the documents and to locate identical, similar, or related text sequences. To this end, a search engine for single words and morphological word stems is provided. This enables the retrieval of all occurrences of different word forms and word compositions derived from one single word stem.

However, for the detection of typical formulations and text phrases a pattern matcher is required that matches all occurrences of so-called text patterns in the current text. Text patterns may be defined which involve arbitrary nestings of conjunction, disjunction, negation, skip (up to n words) and optionality operators [14].
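To make the operators mentioned above concrete, the following Python sketch builds text patterns from small combinator functions. It is an illustration only, not the pattern matcher of [14]; the function names are hypothetical, and the exact semantics chosen here are assumptions: a pattern maps a token position to the set of positions where a match may end, negation succeeds without consuming a token, and conjunction requires all sub-patterns to end at the same position.

```python
def word(w):
    """Match exactly one token equal to w."""
    return lambda toks, i: {i + 1} if i < len(toks) and toks[i] == w else set()


def seq(*parts):
    """Match the sub-patterns one after another."""
    def match(toks, i):
        positions = {i}
        for p in parts:
            positions = {j for pos in positions for j in p(toks, pos)}
        return positions
    return match


def either(*alts):
    """Disjunction: any alternative may match."""
    return lambda toks, i: set().union(*(a(toks, i) for a in alts))


def both(*parts):
    """Conjunction (assumed semantics): all sub-patterns match and end
    at a common position."""
    def match(toks, i):
        results = [p(toks, i) for p in parts]
        return set.intersection(*results) if results else set()
    return match


def not_(p):
    """Negation: succeed, consuming nothing, if p does not match here."""
    return lambda toks, i: set() if p(toks, i) else {i}


def optional(p):
    """Optionality: match p or nothing."""
    return lambda toks, i: {i} | p(toks, i)


def skip(n):
    """Skip between 0 and n arbitrary words."""
    return lambda toks, i: set(range(i, min(i + n, len(toks)) + 1))


def find_all(pattern, text):
    """Return the token positions at which the pattern matches."""
    toks = text.lower().split()
    return [i for i in range(len(toks)) if pattern(toks, i)]


limit = seq(word("must"), optional(word("not")), skip(2),
            either(word("exceed"), word("support")))
print(find_all(limit, "The system must not under load exceed 5 seconds"))

positive = seq(word("must"), not_(word("not")))
print(find_all(positive, "it must not fail but must work"))
```

Because every operator returns an ordinary function, arbitrary nestings are possible without a separate pattern language.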

In addition to these word-based techniques, we also employ a kind of similarity-based search to retrieve text segments which appear to be relevant in a user-defined context. Such a context would be defined by the selection of one or more model categories. Based on the keyword lists for the specific model categories, respective paragraphs of the available text documents can be identified and suggested.

Recontextualization: The central task of the recontextualization step is the classification of the extracted information units according to the previously established models. Here, the problem is distinct from the one of finding similar information in the extraction phase, but the solution is quite similar. In the first case, the model categories (a context) are given and appropriate text segments have to be found while in the second case the information unit is given and respective model categories have to be suggested.
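The symmetry between the two cases can be sketched with a single keyword-overlap score that is used in both directions. This is a deliberately simplified illustration and not the INFOCLAS2 implementation; the weights, the threshold, the function names, and the sample data are all assumptions, and a realistic version would compare word stems rather than surface forms.

```python
import re


def overlap_score(text, keywords):
    """Sum the weights of the keywords that occur in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return sum(weight for word, weight in keywords.items() if word in tokens)


def suggest_categories(unit_text, category_keywords, threshold=1.0):
    """Information unit given: suggest model categories, best first."""
    scores = {c: overlap_score(unit_text, kw)
              for c, kw in category_keywords.items()}
    return sorted((c for c, s in scores.items() if s >= threshold),
                  key=lambda c: -scores[c])


def suggest_paragraphs(paragraphs, selected, category_keywords, threshold=1.0):
    """Categories (a context) given: suggest matching paragraphs."""
    merged = {}
    for category in selected:
        for word, weight in category_keywords[category].items():
            merged[word] = max(weight, merged.get(word, 0.0))
    return [p for p in paragraphs if overlap_score(p, merged) >= threshold]


# Invented keyword lists for two categories of a requirements taxonomy.
category_keywords = {
    "performance": {"response": 1.0, "seconds": 0.5, "throughput": 1.0},
    "interface": {"screen": 1.0, "button": 1.0},
}
unit = "The response time must stay below two seconds."
paragraphs = ["Each screen shows a help button.",
              "Throughput is measured nightly."]

print(suggest_categories(unit, category_keywords))
print(suggest_paragraphs(paragraphs, ["interface"], category_keywords))
```

In both directions the output is only a suggestion; the final decision remains with the user.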

To classify the information units according to the different models, we utilize two distinct text categorization approaches:

3.2 Hypertext Techniques in KARAT

A variety of different relations between documents, text segments, and models are established following our Information Tuning method. In particular, an information unit is associated with its source text, its natural language explanation, the chosen categories from the different models, and closely related information units (e.g. follow-up information). Hypertext techniques are ideally suited to organize the involved data and their relations. They are applied to construct nodes from the respective texts, model categories, and information units. These nodes are then connected with links according to the existing relations. The resulting hypernet provides means to

Figure 3 illustrates some applications of hypertext taken from an application of the KARAT tool in requirements engineering (see Section 4.2.2):

* In the Preparation Phase, the chosen category identifiers are linked to the respective occurrences in the word-list of relevant terms and the source texts. Furthermore, each keyword list is attached to the corresponding model category.

* In the Analysis Phase, during information extraction, information units retrieved for a certain word or text pattern are automatically linked to the respective text segments in the source text with a "has-source" link; several search results are temporarily linked to each other as a hypertrail to guide the user from one retrieved text segment to the next or previous one. Within recontextualization, the user may provide a natural language explanation, links to related information, and a classification of information units according to the different models. For related information such as follow-ups, hypertext provides means to present them graphically as a subnet with links of just one specific type. For text classification, model categories suggested by the text analysis tools may be graphically presented through highlighting the respective nodes in the models.
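The link structures listed above can be sketched as a small typed-link store. This is an illustration only and does not reflect the Hypertext Abstract Machine used in the implementation; apart from "has-source", which is taken from the text, the link type names, node identifiers, and function names are hypothetical.

```python
from collections import defaultdict


class Hypernet:
    """Nodes are plain identifiers; links are typed and directed."""

    def __init__(self):
        # (source node, link type) -> set of target nodes
        self.links = defaultdict(set)

    def link(self, src, link_type, dst):
        self.links[(src, link_type)].add(dst)

    def targets(self, src, link_type):
        """Follow all links of one type from a node."""
        return sorted(self.links.get((src, link_type), set()))

    def subnet(self, link_type):
        """All (source, target) pairs connected by one specific link type."""
        return sorted((s, d)
                      for (s, t), ds in self.links.items() if t == link_type
                      for d in ds)


def hypertrail(net, hits):
    """Chain search hits with temporary 'next' links."""
    for current, following in zip(hits, hits[1:]):
        net.link(current, "next", following)


net = Hypernet()
net.link("unit-1", "has-source", "doc-A#p3")
net.link("unit-1", "classified-as", "taxonomy/performance")
net.link("unit-1", "follow-up", "unit-2")
net.link("unit-2", "follow-up", "unit-3")
hypertrail(net, ["doc-A#p3", "doc-A#p9", "doc-B#p1"])

print(net.targets("unit-1", "has-source"))
print(net.subnet("follow-up"))
```

Selecting a subnet by link type corresponds to presenting, for example, only the follow-up relations between information units.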

4 Information Tuning Sample Applications

To give a better impression of possible tasks and benefits of Information Tuning, we briefly describe two different applications currently being developed in our research group. We present the two examples in limited detail in order to show the variety of problems tackled; readers interested in more detailed information are referred to the respective publications.

4.1 Information Tuning for Intelligent Fault Recording

The problem. A modern coal mine employs a newly-developed, highly complex machine in order to reach the highest possible productivity in black-coal mining. It is of vital importance to maximize the technical availability of this machine as any disruption will cause a significant loss in coal production. Consequently, the diagnosis and repair of occurring faults should be performed as fast as possible. Due to the complexity of the machine and the auxiliary units this is a very demanding task (cf. [3]).

Aggravating circumstances are the shift-work (resulting in communication problems between up to seven different shifts working on a problem), insufficient experience with the newly-developed machine, and the continuous loss of expertise due to staff fluctuation and the ongoing reduction of the workforce. Heuristics-based and model-based approaches to diagnosis were not feasible due to the insufficient experience with this machine and the high complexity of the system, respectively. The lack of experience also led to ill-defined and incomplete cases, hindering the use of CBR technology.

Information Sources. There exist databases and paper-bound fault records documented by numerous people in ordinary text, all of them using different levels of detail and vocabulary. Since the diagnosis process, repair actions, and possibly other faults are closely interwoven, the clear definition of "cases" is problematic. As an additional (up to now rather insufficiently used) information source, there exists technical background information, e.g. wiring diagrams, data tables, assembly instructions.

The models. We found a detailed machine model comprising both the component hierarchy and the functional relations between components to be a good starting point for organizing observations and activities. Further structures could be imposed on the fault records using well-known concepts from knowledge-based diagnosis, e.g. a fault hierarchy.

The goal. The system shall provide a comfortable environment for unambiguously documenting diagnosis experiences in a well-structured manner. The recorded entries are still natural language notes; however, the embedding of formal structures allows efficient retrieval and flexible processing. The formal models also link together recorded experiences and supporting (hypertext) background knowledge. Flexible information structuring makes it possible to take into account new tasks or relevant context factors that were not known a priori. Multi-criteria selection and adequate presentation of information units enable, e.g., identification of weak points in the machine, training of new personnel on old cases, or exploration of optimal diagnosis and repair strategies.

Contributions. On the one hand, KARAT could be used to support the initialization of the system's knowledge-base which is constructed by analyzing the previously used databases and paper-bound fault logs. On the other hand, the KARAT functionality and tool served as a blueprint for the design and implementation of the specially tailored software. The system is scheduled to become operational during 1997.

4.2 Information Tuning for Requirements Engineering

The problem. A high-quality software requirements specification (SRS) is an essential precondition for the development of a successful software system. High-quality in requirements engineering refers to criteria like consistency, completeness, correctness, clarity, structure, unambiguity, minimality, traceability, and maintainability [41].

Information Sources. In the problem analysis and requirements gathering phases, requirements are captured and stored in a number of informal natural language documents which stem from informal technical notes, notes from meetings or phone calls, statements of work, requests for proposal, interviews with customers or users, manuals etc. [46]. Often, requirements in these initial texts are poorly structured, contradictory, incomplete, wrong or unnecessary, ambiguous, hard to follow or validate, etc. In order to achieve the quality needed and expected of a well-written SRS, this information has to be carefully reviewed, revised, and organized (cf. Figure 4).

The models. In the preparation phase, one or more domain models, a requirements taxonomy, and a view model are established. These models are employed later on to organize requirements according to multiple criteria. A specific model (which here replaces the general task models in arbitrary knowledge management applications) is the requirements taxonomy. It describes problem-independent categories of requirements, such as performance, interface, functionality, etc. (e.g. IEEE standard 1002-1987 [17]). The view model comprises the different roles of all stakeholders involved in the software development process. In addition to roles, a view model may also comprise the individual stakeholders. Three examples of the different kinds of models are shown in Figure 5.

The goal. Because the main objective of the whole SRS establishment process is to "tune the information" contained in initial requirements texts, all KARAT steps contribute in some way to this goal. We sketch some benefits by linking the respective activities and the concerned quality criteria: The extraction step implicitly includes a validation of requirements with respect to correctness and minimality. Obviously irrelevant information (e.g. design decisions or implementation details) is already filtered out at this stage. Keeping a link from the extracted text to the source is important to support requirements traceability.

Since extracted text segments may not be clear without the context of the source text, a reformulation (decontextualization) of the requirement units is required to improve clarity and unambiguity.

In the recontextualization step, the requirement units are further explained, associated, and classified by the requirements engineer:

Validation: The previous steps facilitate tests for consistency and completeness in the validation step. On the one hand, the requirements engineer now starts with a relevant set of requirements without major errors within single units. On the other hand, he can work on small sets of closely related requirements utilizing their model-based, multi-criteria organization. Besides this, he is able to prepare special review documents for specific persons (e.g. a collection of all "user" and "interface" requirements). Thus, the reviewers are not overburdened by the whole set of requirements but have to check only the relevant requirements according to their role in the project.

Prioritization: Finally, the relevance of requirements has to be estimated. Independent of the chosen priority ordering we claim that more reliable priorities may be found when (small) sets of closely related requirements are examined at once. Consequently, the structure of the requirements is utilized again. Obviously, the prioritization step may be combined with the validation step.

In the integration phase, the set of (now, hopefully) sound requirements may be automatically compiled into a standard SRS scheme as well as an organization-specific one (instead of building an information repository), as long as the scheme is reflected in the selected models. From the set of sound requirements, structured text documents can be flexibly built according to pre-selected criteria. For instance, a document may be structured with respect to the requirements taxonomy at the top level and according to the domain models at the next level. Within the lowest structuring level, requirements should be sorted according to their priorities. Additionally, different versions of such an SRS may be easily created for different persons. Managers and users probably prefer an SRS without any developer-specific requirements, whereas developers in charge of different parts of the system design would prefer an SRS restricted to their own requirements.
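The document structure just described (taxonomy at the top level, domain model below, priorities within the lowest level, optional filtering by stakeholder role) can be sketched in a few lines of Python. The sketch is illustrative only: the record layout, the function `compile_srs`, the sample requirements, and the ordering "rule" before "recommendation" before "optional" are assumptions, not part of the KARAT tool.

```python
from itertools import groupby

# Assumed ordering of the three priority levels.
PRIORITY_ORDER = {"rule": 0, "recommendation": 1, "optional": 2}


def compile_srs(requirements, roles=None):
    """Build an indented SRS outline; if roles is given, keep only the
    requirements whose view belongs to one of these roles."""
    selected = [r for r in requirements
                if roles is None or r["view"] in roles]
    selected.sort(key=lambda r: (r["taxonomy"], r["domain"],
                                 PRIORITY_ORDER[r["priority"]]))
    lines = []
    for taxonomy, by_taxonomy in groupby(selected, key=lambda r: r["taxonomy"]):
        lines.append(taxonomy)
        for domain, by_domain in groupby(by_taxonomy, key=lambda r: r["domain"]):
            lines.append("  " + domain)
            lines.extend(f"    [{r['priority']}] {r['text']}" for r in by_domain)
    return "\n".join(lines)


# Invented sample requirements.
requirements = [
    {"text": "Respond within 2 s.", "taxonomy": "performance",
     "domain": "billing", "priority": "recommendation", "view": "user"},
    {"text": "Handle 100 calls per hour.", "taxonomy": "performance",
     "domain": "billing", "priority": "rule", "view": "developer"},
    {"text": "Show invoice totals.", "taxonomy": "functionality",
     "domain": "billing", "priority": "rule", "view": "user"},
]

print(compile_srs(requirements))
print(compile_srs(requirements, roles={"user"}))
```

The same filter can produce the role-specific review documents mentioned for the validation step, e.g. a collection of all requirements relevant to users.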

Contributions. Since the KARAT tool (Knowledge-based Assistant for Requirements Analysis at Telekom) was originally developed for software requirements engineering in a leading German telecommunications company, we have the most application experience in this area. The KARAT system prototype is currently used in first field tests in the software development department of our customer company.

5 Implementation

Figure 6 shows some of the KARAT user interface components, namely the document browser, the information browser, and the model editor. We use Netscape Navigator enhanced with a JAVA applet as document browser. Information browser and model editor are implemented in ParcPlace Digitalk VisualWave Smalltalk. Documents, models, and information units are stored and managed in the object-oriented GemStone database. The Hypertext Abstract Machine (HAM) is employed for hypertext management tasks, such as establishing and updating links between information units and the different models. Two text analysis tools implemented in C and C++ [9] are currently coupled to the KARAT kernel. MORPHIC-PLUS is a tool for the inflectional morphological analysis of the German language [24]. With its lexicon size of about 53 000 word stems, it reaches a good word coverage. The INFOCLAS2 system includes an automatic indexing component with different weighting functions as well as a word-based text classification component [15].

6 Summary

A core component of systematically accomplished Knowledge Management in enterprises is an Organizational Memory Information System which captures, preserves, updates, disseminates, retrieves, and actively reminds of enterprise-critical information. Our experiences strongly suggest that (among others) the need for deeply interwoven handling of data, formal knowledge, and documents, a still unchanged predominance of informal knowledge representations and text-based documents, and the need for minimal up-front knowledge engineering for system development are key properties of feasible solutions in industrial practice [20]. Regarding these practical constraints, we presented a pragmatic approach to preparing existing text-based information collections for a better systematic exploitation. The approach is heavily influenced by prior work in knowledge acquisition from texts and cooperative knowledge evolution (cf., e.g., [28],[32],[33]), but its ultimate goal is not the full formalization of some corpus of knowledge, but adding that portion of formal and meta knowledge which is necessary for successful sharing and reuse.

The model-based classification of information units with respect to multiple criteria is a core issue of our approach. The underlying knowledge acquisition method takes informal documents as input and guides a user in the Information Tuning process. Active assistance in this process of extraction, structurization, and utilization of information from texts is given by the KARAT tool which employs text analysis techniques for information extraction and text categorization in combination with hypertext techniques for a flexible, user-centered handling of text documents, models, and information units. The degree to which steps in the Information Tuning process can be automated is highly application specific. At present, we do not plan to fully automate subtasks since this would not be feasible in the applications we have investigated in depth. However, the tool can provide useful suggestions, and even without such suggestions the comfortable interface with its browsing, linking, and information retrieval facilities already makes a strong contribution to KARAT's success in the applications.

The Information Tuning idea fits very well into recent discussions and research in application-oriented Knowledge Management, where the balance between formality and informality as well as the role of existing documents are important topics (see, e.g., [35],[36],[37]). In this discussion, we promote a "shallow formalization", a point of view which - compared with deep understanding approaches - has also shown convincing practical results in the areas of document analysis and information retrieval. This shallow approach also eases the model-building process in the preparation phase of Information Tuning. Of course, the quality of the models determines the quality of the overall results. However, an ontological analysis of the application has to be done for every knowledge-based system, though for other approaches possibly in more depth (what we must do has much in common with what van Heijst et al. call a missing methodology for ontology building at the macro level [42]). Our applications showed that one can already achieve good results with rather simple, straightforward models. Another interesting point is the minimization of up-front knowledge engineering for the preparation phase through building on existing information sources. The use of business process models seems a promising approach for embedding Knowledge Management into existing workflow technology.

Our approach taps into a growing flow of interest on meta knowledge for information finding and use. So, we will clarify whether helpful advice for designing and filling the structuring models could also come from fundamental considerations about meta knowledge as e.g. done by Guha with his Meta Content Format (MCF [13]), or from areas like Digital Libraries [6].

Future work on further developing the Information Tuning method could investigate the question of how much "semantics" can be given to models and links and how those can be exploited (as, e.g., done in issue-based information systems [35]). A modest form could be to allow the user to formulate constraints over models, e.g. that after Information Tuning each information unit associated with a certain concept in a certain model must be linked (with a certain link type) to some information unit associated with a certain concept in some other model. However, we regard it as an important property of our approach that we aim at reasonably good results with little effort instead of extremely good results at high cost. In this trade-off, it is not easy to decide how much expressive power should be provided at the cost of increased modeling effort and more complex system handling.
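A constraint of the modest form mentioned above could be checked along the following lines. Since the paper names this only as possible future work, the sketch is purely hypothetical: the data layout, the function `constraint_violations`, the link type "repaired-by", and the sample units are all invented for illustration.

```python
def constraint_violations(units, links, source, link_type, target):
    """Return the ids of units classified under `source` that have no
    `link_type` link to any unit classified under `target`.

    source and target are (model, concept) pairs;
    links is a collection of (from_id, link_type, to_id) triples.
    """
    by_id = {u["id"]: u for u in units}

    def in_concept(unit, model_concept):
        model, concept = model_concept
        return concept in unit["categories"].get(model, set())

    violations = []
    for unit in units:
        if not in_concept(unit, source):
            continue
        satisfied = any(
            src == unit["id"] and kind == link_type
            and dst in by_id and in_concept(by_id[dst], target)
            for src, kind, dst in links
        )
        if not satisfied:
            violations.append(unit["id"])
    return violations


# Invented sample data: every recorded sensor failure should be linked
# to a unit describing a repair.
units = [
    {"id": "u1", "categories": {"fault": {"sensor failure"}}},
    {"id": "u2", "categories": {"fault": {"sensor failure"}}},
    {"id": "u3", "categories": {"task": {"repair"}}},
]
links = [("u1", "repaired-by", "u3")]

missing = constraint_violations(units, links,
                                ("fault", "sensor failure"),
                                "repaired-by",
                                ("task", "repair"))
print(missing)
```

Even such a simple check adds modeling effort, which is the trade-off discussed above.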

Another topic of our future work is the role of CSCW: in the acquisition of organization-specific information from available text documents, several people may be able to contribute beneficially. The KARAT tool allows for collaboration between different people in both the acquisition and the utilization of information. All data are stored and managed in an object-oriented database. Documents, models, and information units may be read by all personnel of a company. Changes to a specific information unit may only be made by a project leader or the creator of that information unit. Nevertheless, other people should have the opportunity to comment on each information unit. Therefore, a simple messaging component and a more specific information annotation component have been integrated into the KARAT tool. With the annotation component, the user can change (a copy of) an information unit and send it as a suggestion to the creator or project leader. Furthermore, the KARAT tool gives the user direct access to text- and voice-based talk tools. The CSCW functionality enables efficient and effective communication between several users of the system.
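The access rules described above (everyone may read, only the creator or a project leader may change, everyone else may suggest a changed copy) can be illustrated with a small sketch. This is an assumption-laden illustration and not the KARAT implementation: the class, the function names, the user names, and the sample requirement text are all hypothetical, and the object-oriented database is replaced by plain in-memory objects.

```python
from dataclasses import dataclass, field


@dataclass
class InformationUnit:
    """A hypothetical information unit with its creator and pending suggestions."""
    text: str
    creator: str
    suggestions: list = field(default_factory=list)  # (user, proposed text) pairs


def may_change(user, unit, project_leaders):
    """Only the creator or a project leader may change a unit directly."""
    return user == unit.creator or user in project_leaders


def change(user, unit, new_text, project_leaders):
    """Change the unit in place, or refuse if the user lacks the right."""
    if not may_change(user, unit, project_leaders):
        raise PermissionError(f"{user} may only suggest changes to this unit")
    unit.text = new_text


def suggest(user, unit, new_text):
    """Any user may edit a copy and send it as a suggestion; the original stays untouched."""
    unit.suggestions.append((user, new_text))


# Hypothetical usage: carol suggests, bob (a project leader) changes directly.
unit = InformationUnit("The pump shall start within 2 s.", creator="alice")
leaders = {"bob"}
suggest("carol", unit, "The pump shall start within 2 seconds.")
change("bob", unit, "The pump shall start within two seconds.", leaders)
```

Keeping suggestions separate from the unit's text means the creator or project leader remains the only party who decides what is actually stored.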

7 References

[1] A. Abecker, A. Bernardi, K. Hinkelmann, O. Kühn, and M. Sintek. Towards a Well-Founded Technology for Organizational Memories. In: [12], 1997.

[2] A. Abecker, St. Decker, K. Hinkelmann, and U. Reimer. Knowledge-Based Systems for Knowledge Management in Enterprises. Workshop at the 21st Annual German Conf. on AI (KI-97), Freiburg, Germany, September 1997.

[3] A. Bernardi, M. Sintek, and A. Abecker. Combining Artificial Intelligence, Database Technology, and Hypermedia for Intelligent Fault Recording. Submitted. April 1997.

[4] G. Bruno. Model-Based Software Engineering. Chapman & Hall, 1995.

[5] Choo Chun Wei. Information Management for the Intelligent Organization: Roles and Implications for the Information Professions. 1995.

[6] D. Clay, S. Geffner, J. Gottsegen, B. Gritton, and T. Smith. A General Framework for Constructing Conceptual Models of Metadata in Digital Libraries. In: First IEEE Metadata Conference, Silver Spring, Maryland, USA. April 1996.

[7] A. M. Davis. Software Requirements, Objects, Functions and States. Englewood Cliffs, NJ: Prentice Hall, 1993.

[8] H. S. Delugach. Analyzing Multiple Views of Software Requirements. In Nagle, Gerholz, and Eklund (Eds.), Conceptual Structures - Current Research and Practice, Ellis Horwood Limited, Chichester, England, 1992.

[9] A. Dengel, R. Bleisinger, F. Fein, R. Hoch, F. Hönes, and M. Malburg. OfficeMAID -- A System for Office Mail Analysis, Interpretation and Delivery. Proc. of First International Workshop on Document Analysis Systems (DAS'94), pages 253-275, Kaiserslautern, Germany, October 18-20 1994.

[10] M. Dorfman and R. Thayer (Eds.). Standards, Guidelines, and Examples of System and Software Requirements Engineering. Washington, D.C.: IEEE Computer Science Press, 1990.

[11] A. P. Gabb and D. E. Henderson. Navy Specification Study Report 3: Requirements and Specification (DSTO-TR-0192). Salisbury, South Australia: DSTO Electronics and Surveillance Research Laboratory, 1995.

[12] B. Gaines, M. A. Musen, et al. (eds.). AAAI Spring Symposium Artificial Intelligence in Knowledge Management. Stanford University, March, 1997.

[13] R.V. Guha. Towards a theory of meta-content.
http://mcf.research.apple.com/mc.html

[14] P. J. Hayes, P. M. Andersen, I. B. Nirenburg, and L. M. Schmandt. TCS: A Shell for Content-Based Text Categorization. Proc. of 6th Conference on AI Applications, pages 320-326, Santa Barbara, CA, 1990.

[15] R. Hoch. Using IR Techniques for Text Classification in Document Analysis. Proc. of 17th International Conference on Research and Development in Information Retrieval (SIGIR'94), pages 31-40, Dublin City, Ireland, July 3-6 1994.

[16] J. Hofer-Alfeis and S. Klabunde. Approaches to Managing the Lessons Learned Cycle. In: [47]. 1996.

[17] IEEE Standards Collection: Software Engineering. IEEE, 1993.

[18] P. S. Jacobs. Text-Based Intelligent Systems: Current Research and Practice in Information Retrieval. Lawrence Erlbaum, Hillsdale, 1992.

[19] Knowledge Systems Laboratory, Institute for Information Technology, National Research Council Canada. FuzzyCLIPS Version 6.02A User's Guide. 1994.

[20] O. Kühn and A. Abecker. Corporate Memories for Knowledge Management in Industrial Practice: Prospects and Challenges. Journal of Universal Computer Science. Springer Verlag, 1997. To appear.

[21] O. Kühn and B. Höfling. Conserving Corporate Knowledge for Crankshaft Design. In: Seventh International Conference on Industrial & Engineering Applications of Artificial Intelligence & Expert Systems (IEA/AIE'94), Gordon and Breach Science Publishers. Also as DFKI RR-94-08. 1994.

[22] J. Liang and J. D. Palmer. A Pattern Matching and Clustering Based Approach for Supporting Requirements Transformation. Proc. of the First International Conference on Requirements Engineering (ICRE'94), April 1994.

[23] D. Lukose. Knowledge Management Using MODEL-ECS. In: [12], 1997.

[24] O. Lutzy. Morphic-Plus: Ein morphologisches Analyseprogramm für die deutsche Flexionsmorphologie und Komposita-Analyse. DFKI Document D-95-07 (in German).

[25] Marble Associates Inc. Leveraging Knowledge through a Corporate Memory Infrastructure. April 1994.

[26] J. A. McDermid. Software Engineer's Reference Book. Oxford: Butterworth Heinemann Ltd., 1991.

[27] J. A. McDermid, A. Vickers, and B. Whittle. Requirements Elicitation and Analysis: Goals, Problems and Approaches. Workshop on Requirements Elicitation for Software-based Systems (RESS), Keele, England, July 12-14 1994.

[28] J.-U. Möller. Knowledge Acquisition from texts. Proc. of the European Knowledge Acquisition Workshop (EKAW'88), Gesellschaft für Mathematik und Datenverarbeitung mbH, Sankt Augustin, Germany, 1988.

[29] K. Romhardt. Processes of Knowledge Preservation: Away from a Technology Dominated Approach. In: [2]. 1997. To appear.

[30] G. Salton and M. J. McGill. Introduction to Modern Information Retrieval. New York: McGraw Hill, 1983.

[31] A.-W. Scheer. Architektur integrierter Informationssysteme, Grundlagen der Unternehmensmodellierung. 2nd edition, Springer Verlag, 1992.

[32] F. Schmalhofer and B. Tschaitschian. Cooperative Knowledge Evolution for Complex Domains. In: G. Tecuci and Y. Kodratoff (eds). Machine Learning and Knowledge Acquisition - Integrated Approaches. Academic Press, 1995.

[33] G. Schmidt. Modellbasierte, interaktive Wissensakquisition und Dokumentation von Domänenwissen (MIKADO), DISKI Vol. 90, infix Verlag, 1995.

[34] M. L. G. Shaw and B. Gaines. Knowledge and Requirements Engineering. Proc. of the 10th Banff Knowledge Acquisition for Knowledge-Based Systems Workshop, Banff, Alberta, Canada, 1995.

[35] S. B. Shum. Representing Hard-to-Formalise, Contextualised, Multidisciplinary, Organisational Knowledge. In: [12], 1997.

[36] S. B. Shum. Balancing Formality with Informality: User-Centred Requirements for Knowledge Management Technologies. In: [12], 1997.

[37] D. Skuce. Hybrid KM: Integrating Documents, Knowledge Bases, and the Web. In: [12], 1997.

[38] I. Sommerville. Software Engineering. Wokingham, England: Addison Wesley, 1992.

[39] E. W. Stein and V. Zwass. Actualizing Organizational Memory With Information Technology. Information Systems Research Vol. 6, No. 2: 85-117, 1995.

[40] B. Tschaitschian, I. John, and C. Wenzel. Integrating Knowledge Acquisition and Text Analysis for Requirements Engineering. Internal Report. DFKI, 1996.

[41] B. Tschaitschian, C. Wenzel, and I. John. Tuning the quality of informal software requirements with KARAT. In: E. Dubois, L. Opdahl, and K. Pohl (eds.). REFSQ'97: Third Int. Workshop on Requirements Engineering: Foundation for Software Quality. Held at CAiSE*97, Barcelona, 1997.

[42] G. van Heijst, R. van der Spek, and E. Kruizinga. Organizing Corporate Memories. Tenth Knowledge Acquisition for Knowledge-Based Systems Workshop KAW'96. November 1996.

[43] C. Wenzel and R. Hoch. Text Categorization of Scanned Documents Applying a Rule-based Approach. Proc. of the Fourth Annual Symposium on Document Analysis and Information Retrieval (SDAIR'95), pages 333-346, 1995.

[44] St. Wess. Intelligent Systems for Customer Support: Case-Based Reasoning in Help-Desk and Call-Center Applications. In: [2]. 1997. To appear.

[45] B. J. Wielinga, A. T. Schreiber, and J. A. Breuker. KADS: A Modelling Approach to Knowledge Engineering. Knowledge Acquisition, 4(1), 1992.

[46] D. P. Wood, M. G. Christel, and S. M. Stevens. A Multimedia Approach to Requirements Capture and Modeling. Proc. of the First International Conference on Requirements Engineering (ICRE'94), pages 53-56, Colorado Springs, CO, April 18-22 1994.

[47] M. Wolf and U. Reimer (eds). PAKM-96: First Int. Conference on Practical Aspects of Knowledge Management. Basel, Switzerland, October 1996.