
My Contribution to Research

Activities

Semantic Search

My research focuses on semantic search, the topic of my PhD, and is motivated by the way the World Wide Web is evolving.

Especially with the advent of Linked Data, the Semantic Web is finally gaining momentum, while at the same time more and more Web 2.0 applications are being deployed. The two trends are increasingly interwoven and form the basis of the next-generation Web, Web 3.0, in which conventional websites and data sources (e.g., databases, XML, HTML, plain text, multimedia) should be integrated and linked as well. The result is a plethora of information in various representation forms, which can be mapped to an information pool composed of a knowledge base (in RDF/S) and a text index. Searching such data effectively requires an engine that can explore linked content with varying degrees of formality. For comprehensive search support, all kinds of queries should be allowed: free-text queries, formal queries, and queries that combine free-text and formal parts. Likewise, various kinds of answers should be delivered (e.g., facts, documents, and documents with relevant facts). We therefore propose a search method that combines a triple-based fact retrieval approach with a semantic document retrieval approach based on spreading activation.
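To make the document retrieval side more concrete, here is a minimal sketch of spreading activation over a small weighted graph. It is only an illustration of the general technique, not the implementation described in the publications: the function name `spread_activation`, the parameters `decay`, `threshold` and `max_steps`, and the toy node names are all assumptions made for this example.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, threshold=0.05, max_steps=3):
    """Propagate activation from seed nodes along weighted edges.

    graph: dict mapping a node to a list of (neighbor, weight) pairs.
    seeds: dict mapping a node to its initial activation.
    All names and default values here are hypothetical.
    """
    activation = defaultdict(float)
    activation.update(seeds)
    frontier = dict(seeds)

    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, []):
                # Each hop attenuates the energy by the edge weight and decay.
                pulse = energy * weight * decay
                if pulse >= threshold:
                    next_frontier[neighbor] += pulse
        if not next_frontier:
            break
        for node, pulse in next_frontier.items():
            activation[node] += pulse
        frontier = next_frontier

    return dict(activation)


# Toy information pool: two concepts linked to two documents (invented data).
graph = {
    "concept:SemanticSearch": [("doc:1", 1.0), ("concept:RDF", 0.5)],
    "concept:RDF": [("doc:2", 1.0)],
}
result = spread_activation(graph, {"concept:SemanticSearch": 1.0})

# Rank documents by the activation they received.
ranked_docs = sorted(
    (node for node in result if node.startswith("doc:")),
    key=lambda node: result[node],
    reverse=True,
)
print(ranked_docs)
```

In this sketch a query would first be mapped to seed concepts, and the documents that accumulate the most activation would be returned first. How seeds are chosen and how edge weights are set is left open here.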

For more information see publications.

Semantic Web, Web 2.0, Web 3.0

In my opinion, the Semantic Web and Web 2.0 together form the foundation of the next generation of the World Wide Web, the so-called Web 3.0. The Semantic Web ('Web of Data', 'Linked Data') is based on a formal description (RDF, RDFS, OWL) of resources, i.e., data and services, which allows them to be uniquely identified and defines the relations between them. Such a formal description is also machine-readable and machine-interpretable, which enables the development of learning methods. Web 1.0 and legacy IT content, i.e., linked documents (HTML, XHTML), databases, plain-text files, and multimedia files, can be integrated and linked using the same formal descriptions. Web 2.0 has a less technical focus and describes a social phenomenon of activity on the Web. It is about linked people and linked social services: social media sharing platforms, mostly with folksonomies (Flickr, del.icio.us), blogs (Technorati), and wikis (Wikipedia). The ideas of the Semantic Web and Web 2.0 are increasingly interwoven, and Social Semantic Web applications are being developed. Examples of such applications are semantic wikis (Semantic MediaWiki, Kaukolu), semantic blogs (SemBlog), social semantic networks (PeopleAggregator), and social semantic information spaces (NEPOMUK).
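As a small illustration of what such a formal description looks like, the sketch below stores statements as subject-predicate-object triples and answers simple triple patterns. It uses plain Python tuples rather than an RDF library, and everything under the `http://example.org/` namespace, as well as the helper `match`, is invented for this example. The `rdf:type` and `foaf:knows` URIs are the standard ones.

```python
# Hypothetical namespace for the example resources.
EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Each resource is identified by a URI; each statement is one triple.
triples = [
    (EX + "alice", RDF_TYPE, EX + "Person"),
    (EX + "bob", RDF_TYPE, EX + "Person"),
    (EX + "alice", FOAF_KNOWS, EX + "bob"),
    (EX + "alice", EX + "authorOf", EX + "doc42"),
]


def match(store, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Who does alice know?" expressed as the pattern (alice, knows, ?x).
known_by_alice = [o for _, _, o in match(triples, s=EX + "alice", p=FOAF_KNOWS)]

# "Which resources are persons?" expressed as (?x, type, Person).
persons = [s for s, _, _ in match(triples, p=RDF_TYPE, o=EX + "Person")]

print(known_by_alice)
print(persons)
```

A real application would use an RDF store and a query language such as SPARQL; the point here is only that uniquely identified resources and typed relations make such pattern queries possible.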

Links

Semantic Desktop

The Semantic Desktop is a means for personal knowledge management. It brings the Semantic Web to desktop computers through the consistent application of Semantic Web standards such as the Resource Description Framework (RDF) and RDF Schema (RDFS). Documents, e-mails, contacts, and multimedia files are identified by URIs across application borders. Users can annotate, classify, and relate these resources, and can represent abstract concepts such as projects, events, and groups, expressing their own view in a Personal Information Model (PIMO). On a full-featured Semantic Desktop, many data sources are integrated, and a personal categorization system of topics, projects, people, events, etc. is established. The text and metadata of all documents are indexed and categorized. Together with the collected metadata about these concepts, a critical amount of information is available to the user, both browsable and searchable.

For more information visit

Machine Learning Algorithms

Machine Learning (ML) is a subfield of artificial intelligence. It deals with the development of algorithms that allow the computer to learn from data. I'm interested in machine learning algorithms, their characteristics, applications and limits.

Links

Information Retrieval

Information retrieval (IR) deals with the representation and organization of, and access to, information items such as text, sound, images, and data held in collections such as documents, databases, metadata, and hypertext. I became acquainted with information retrieval while working on my diploma thesis, in which I developed, implemented, and evaluated four methods for supervised word sense disambiguation, one of the most challenging tasks in information retrieval. The approaches are based on statistical analysis methods such as Singular Value Decomposition and co-occurrence matrices, and on machine learning algorithms such as the Naive Bayes classifier and K-Means clustering.
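To give an idea of how a Naive Bayes classifier can be applied to supervised word sense disambiguation, here is a minimal sketch. It is not the implementation from the diploma thesis: the function names `train` and `classify`, the add-one smoothing, and the tiny training set for the ambiguous word "bank" are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count senses and context words from (sense, context_words) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, context in examples:
        sense_counts[sense] += 1
        for word in context:
            word_counts[sense][word] += 1
            vocab.add(word)
    return sense_counts, word_counts, vocab


def classify(model, context):
    """Return the sense with the highest log-probability for the context."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in context:
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Invented sense-tagged contexts for the ambiguous word "bank".
examples = [
    ("finance", ["money", "loan", "account"]),
    ("finance", ["deposit", "money", "interest"]),
    ("river", ["water", "shore", "fishing"]),
    ("river", ["river", "water", "mud"]),
]
model = train(examples)

print(classify(model, ["money", "deposit"]))
print(classify(model, ["water", "mud"]))
```

The context words play the role of features: each sense is scored by how likely it is to produce the words surrounding the ambiguous term, under the usual Naive Bayes assumption that those words are independent given the sense.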

Links