Erik Wilde's Publications


The publications are grouped into written publications and presentations. Written publications are grouped by category; you may jump directly to the sections listing books, theses, book chapters, journal papers, standardization activities, conference papers and posters, workshop papers, technical reports, magazine articles, newspaper articles, or online articles. Presentations are likewise grouped by category; you may jump directly to the sections listing university courses, invited talks, tutorials, talks, or professional courses.

Publications available online are sometimes offered in several formats. PostScript files are for printing and for viewing with PostScript previewers such as ghostview. PDF files are documents in Adobe's Portable Document Format. You need Adobe's Acrobat Reader (or some other PDF viewer) to view these files.

Additional information is available in the curriculum vitae and on the home page of Erik Wilde.


Written Publications

Books, Theses, and Book Chapters Journal Papers Standardization Reviewed Conference Papers & Posters and Workshop Papers Technical Reports Magazine, Newspaper, and Online Articles
2008 IJWBC 4(1) OIR 32(3) CACM 51(7) CACM 51(10) ACM Queue 6(6) RFC 5147 draft-wilde-sms-uri WSW2008 LocWeb 2008 EXPONWIRELESS 2008 SCC 2008 IRI 2008 BCS 2008 TIPUGG 2008 HCIR 2008 iRep 2008-016 iRep 2008-025 iRep 2008-026
2007 VDF JoDI 8(3) WWW20071 WWW20072 WWW20073 XTech 2007 SCC 2007 IRI 2007 DocEng 2007 BXML 2007 iRep 2007-001 eCH-0036 eCH-0050 iRep 2007-014 iRep 2007-015 xml.com3
2006 WBC 2006 WWW20061 WWW20062 WWW20063 WWW20064 ELPUB 2006 GMW06 TIKrep 242 TIKrep 244 TIKrep 245 TIKrep 257 eCH-0033 TIKrep 265 eCH-0035
2005 WWW20051 WWW20052 HT 2005 BXML 2005 ECDL 2005 IAWTIC 20051 IAWTIC 20052 TIKrep 212 TIKrep 213 TIKrep 224 eCH-0018 XML & WS 2005(2) iX 18(7) XML & WS 2005(4) iX 18(10)
2004 PHIC1 PHIC2 ISTL 41 XML Europe 2004 ICETE 20041 ICETE 20042 ECOWS'04 XSW 2004 TIKrep 190 TIKrep 194 XML & WS 2004(1) xml.com2 XML & WS 2004(2) XML & WS 2004(3) iX 17(7) XML & WS 2004(4) D-Lib 10(9)
2003 IEEE IC 7(5) XML Europe 2003 WWW20031 WWW20032 WWW20033 IUC24 SINN03 TIKrep 160 TIKrep 166 TIKrep 172 W3C BXML iX 16(2) XML & WS 2003(5) xml.com1 XML & WS 2003(6)
2002 XLink WWW2002 XML 2002 TIKrep 124 TIKrep 125 TIKrep 134 TIKrep 143 TIKrep 148 iX 15(7) iX 15(8)
2001 WWW101 WWW102 Open Publish 2001 TIKrep 102 iX 14(3) iX 14(7)
2000 HICSS-33 NZZ Australian IT iX 13(6)
1999 WWW (german)
1998 WWW
1997 Ph.D. thesis ETT 8(4)
1996 COST 237 ECMAST 96 TIKrep 15 TIKrep 19
1995 TCCC 95 ULPAA 95
1994 TIKrep 18 TIKrep2 TIKrep3
1993 MCAT 93 ZBF 224Z1 ZBF 224Z2 ZBF 224Z3
1992 TIKrep1
1991 Diploma thesis

Books

Theses

Book Chapters

Journal Papers

  • Erik Wilde and Robert J. Glushko, XML Fever, ACM Queue, 6(6):46–53, October 2008. (available as abstract and HTML)
    Abstract: The Extensible Markup Language (XML), which just celebrated its 10th birthday, is one of the big success stories of the Web. Apart from basic Web technologies (URIs, HTTP, and HTML) and the advanced scripting driving the Web 2.0 wave, XML is by far the most successful and ubiquitous Web technology. With great power, however, comes great responsibility, so while XML's success is well earned as the first truly universal standard for structured data, it must now deal with numerous problems that have grown up around it. These are not entirely the fault of XML itself, but instead can be attributed to exaggerated claims and ideas of what XML is and what it can do.
  • Erik Wilde and Robert J. Glushko, Document Design Matters, Communications of the ACM, 51(10):43–49, October 2008. (available as abstract and HTML)
    Abstract: The classical approach to the data aspect of system design distinguishes conceptual, logical, and physical models. Models of each type or level are governed by metamodels that specify the kinds of concepts and constraints that can be used by each model; in most cases metamodels are accompanied by languages for describing models. For example, in database design, conceptual models usually conform to the Entity-Relationship (ER) metamodel (or some extension of it), the logical model maps ER models to relational tables and introduces normalization, and the physical model handles implementation issues such as possible denormalizations in the context of a particular database schema language. In this modeling methodology, there is a single hierarchy of models that rests on the assumption that one data model spans all modeling levels and applies to all the applications in some domain. The one true model approach assumes homogeneity, but this does not work very well for the Web. The Web as a constantly growing ecosystem of heterogeneous data and services has challenged a number of practices and theories about the design of IT landscapes. Instead of being governed by one true model used by everyone, the underlying assumption of top-down design, Web data and services evolve in an uncoordinated fashion. As a result, a fundamental challenge with Web data and services is matching and mapping local and often partial models that not only are different models of the same application domain, but also differ, implicitly or explicitly, in their associated metamodels.
  • Erik Wilde and Robert J. Glushko, XML Fever, Communications of the ACM, 51(7):40–46, July 2008. (available as abstract and HTML)
    Abstract: The Extensible Markup Language (XML), which just celebrated its 10th birthday, is one of the big success stories of the Web. Apart from basic Web technologies (URIs, HTTP, and HTML) and the advanced scripting driving the Web 2.0 wave, XML is by far the most successful and ubiquitous Web technology. With great power, however, comes great responsibility, so while XML's success is well earned as the first truly universal standard for structured data, it must now deal with numerous problems that have grown up around it. These are not entirely the fault of XML itself, but instead can be attributed to exaggerated claims and ideas of what XML is and what it can do.
  • Erik Wilde, Deconstructing Blogs, Online Information Review, 32(3):401–414, 2008. (available as abstract)
    Abstract: Purpose: A growing amount of information available on the Web can be classified as "contextual information", putting already existing information into a new context rather than creating isolated new information resources. Blogs are a typical and popular example of this category. By looking at blogs from a more context-oriented view, it is possible to deconstruct them into structures which are more contextual than just focused on the content, facilitating flexible reuse of these structures.
    Design/Methodology/Approach: We look at the underlying structures of blogs and blog posts, representing them as multi-ended links. This alternative representation of blogs and blog posts allows us to represent them as reusable information structures. This paper presents blogs as a popular content type, but the approach of restructuring Web 2.0 content can be extended to other classes of information, as long as they can be regarded as being mainly contextual.
    Findings: By deconstructing blogs and blog posts into their essential properties, we can show how there is a simple and universal representation for blogs. This representation allows the reuse of blog information across specific blog or blogging platforms, and can even go beyond blogs by representing other Web content which provides context.
    Originality/Value: The approach presented in this paper is a novel approach of mapping a popular Web content type to a simple and universal representation. The value of such a unified representation lies in exposing the structural similarities among blogs and blog posts, and making them available for reuse.
  • Erik Wilde, Sai Anand, Thierry Bücheler, Max Jörg, Nick Nabholz and Petra Zimmermann, Collaboration Support for Bibliographic Data, International Journal of Web Based Communities, 4(1):98–109, January 2008. (available as abstract)
    Abstract: In many collaborative research settings, electronic bibliographic repositories (bibliographies) are used to aggregate information about related work among researchers. These bibliographies allow for group bibliography collection, individual tracking of each user's library, and personal annotation capabilities within each user's library. However, most tools used for managing bibliographic data do not support collaboration. Given the collaborative nature of the research group, this information should be shareable between researchers within the group and potentially across larger organizational units (for example, research institutes). By using ShaRef, users can share bibliographic information and collaborate, publish and export data using a variety of output channels. ShaRef's goal is to make sharing of and collaboration with bibliographic information easier than it is today.
  • Erik Wilde, Personalization of Shared Data: The ShaRef Approach, Journal of Digital Information, 8(3), 2007. (available as abstract)
    Abstract: Personalization of services often has to cope with the conflicting goals of allowing cooperation and sharing, which require common data formats and services, and supporting individual use cases, which require as much personalization as possible. In this paper we present the ShaRef approach to personalization and sharing, which on the one hand allows users to cooperatively work with bibliographic references, and on the other hand supports the usage of this information in personalized and diverse ways. The goal of this approach is to foster as much cooperation as possible, while simultaneously supporting users with individualized ways of reusing the cooperatively managed data. This way of building applications combines the beneficial aspects of information sharing and personalization. Using this approach, applications are better suited to become building blocks in information infrastructures that are built by users in unpredictable ways.
  • Erik Wilde, References as Knowledge Management, Issues in Science & Technology Librarianship, No. 41, Fall 2004. (available as abstract)
    Abstract: Management of bibliographic and Web references for many researchers is the closest thing to knowledge management they will ever do. This article describes ShaRef, a new approach to reference management that focuses on the user and enhances traditional reference management approaches with collaboration features and lightweight knowledge management. While this is primarily targeted at providing individual users and user groups with a better tool, it also creates a new and interesting link to libraries, because of the features that enable users to go from their own references directly to the library through the use of OpenURL. Thus, a new task for libraries is to adjust to this new type of users, who are using new technologies to access a library.
  • Erik Wilde, XML Technologies Dissected, IEEE Internet Computing, 7(5):74–78, September/October 2003. (available as abstract)
    Abstract: XML technologies are very popular, and one of the most important reasons for this is the availability of tools and technologies for working with XML, eliminating the need to build XML processing from scratch. However, XML technologies are built on top of inherent (and not always well-defined) information models, and this may cause problems because (1) the information models of some tools may not support the required "view" of XML, or (2) there is no appropriate data model to work with the information model in question. In this article, we approach this question from the systematic side, and describe the most prominent XML technologies with regard to their information and data models.
  • Erik Wilde and Bernhard Plattner, Transport-Independent Group and Session Management for Group Communication Platforms, European Transactions on Telecommunications, 8(4):409–421, July 1997. (available as abstract, PostScript, and PDF)
    Abstract: With more and more computers gradually changing from isolated, personal tools to networked workstations, group communications is an area of research which has received much attention recently. This paper focuses on a model and the architecture of a system which supports group communications by providing group and session management functionality. The system architecture is related to DNS or X.500, but avoids their complexity by focusing on group and session management and adding functionality where necessary. New functionality is needed for the dynamics of group communications (members of a connection may change over the lifetime of the connection) and increased complexity of relations which may be established between objects. A model is described which defines six object types which represent the relevant objects. Users and groups represent real world users and their relations. Sessions and flows describe ongoing group communications. Flow templates and certificates provide mechanisms for management and security issues. The architecture presented in this paper can be used for group and session management support within different group communications platforms. A description of the implementation as well as implementation results are given in the last section.

Standardization Activities

  • Erik Wilde and Antti Vähä-Sipilä, URI Scheme for GSM Short Message Service, Internet Draft draft-wilde-sms-uri-16, August 2008. (available as abstract, ASCII, and HTML)
    Abstract: This memo specifies the Uniform Resource Identifier (URI) scheme "sms" for specifying one or more recipients for an SMS message. SMS messages are two-way paging messages that can be sent from and received by a mobile phone or a suitably equipped networked device.
  • Erik Wilde and Martin Dürst, URI Fragment Identifiers for the text/plain Media Type, Internet RFC 5147, April 2008. (available as abstract, ASCII, and HTML)
    Abstract: This memo defines URI fragment identifiers for text/plain MIME entities. These fragment identifiers make it possible to refer to parts of a text/plain MIME entity, either identified by character position or range, or by line position or range. Fragment identifiers may also contain information for integrity checks to make them more robust.
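
The character and line addressing described in the RFC 5147 abstract above can be illustrated with a minimal sketch. It assumes the fragment forms char=N, char=N,M, line=N, and line=N,M, with positions counted from 0 as gaps between characters or lines. The function name resolve_text_fragment is hypothetical and not taken from the RFC; integrity checks such as ;length= are accepted but not verified, and line splitting relies on Python's str.splitlines, which recognizes more line terminators than CR, LF, and CRLF.

```python
import re

# Simplified pattern for the two addressing schemes named in the abstract.
# A trailing integrity part (";length=...", ";md5=...") is accepted but
# ignored in this sketch.
_FRAGMENT = re.compile(r"^(char|line)=(\d*)(?:,(\d*))?(?:;.*)?$")


def resolve_text_fragment(text: str, fragment: str) -> str:
    """Return the part of `text` selected by a char= or line= fragment.

    Hypothetical helper for illustration only. A single position selects
    an empty string, because it identifies a point rather than a range.
    """
    match = _FRAGMENT.match(fragment)
    if match is None:
        raise ValueError(f"unsupported fragment: {fragment!r}")
    scheme, start_s, end_s = match.groups()

    # Work on characters or on lines (line terminators are kept).
    if scheme == "char":
        units = list(text)
    else:
        units = text.splitlines(keepends=True)

    is_range = end_s is not None
    if not start_s and not end_s:
        # "char=" and "char=," carry no position at all.
        raise ValueError(f"fragment has no position: {fragment!r}")

    start = int(start_s) if start_s else 0
    if is_range:
        end = int(end_s) if end_s else len(units)
    else:
        end = start

    # Clamp positions that point beyond the end of the text.
    start = min(start, len(units))
    end = min(end, len(units))
    if end < start:
        raise ValueError(f"range end precedes start: {fragment!r}")
    return "".join(units[start:end])
```

For example, under these assumptions resolve_text_fragment("a\nb\nc\n", "line=1,2") selects the second line, and resolve_text_fragment("hello world", "char=0,5") selects the first five characters.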

Reviewed Conference Papers

  • Erik Wilde and Martin Gaedke, Web Engineering Revisited, 2008 British Computer Society (BCS) Conference on Visions of Computer Science, London, UK, September 2008. (available as abstract and PDF)
    Abstract: We propose Web Engineering 2.0 to not focus anymore on how to engineer for the Web, but how to engineer the Web. Web Engineering has become one of the core disciplines for building Web-oriented applications. This paper proposes to reposition Web engineering to be more specific to what the Web is, by which we mean not only an interface technology, but an information system, into which Web-oriented applications have to be embedded. More traditional Web applications often are just user interfaces to data silos, whereas the last years have shown that well-designed Web-oriented applications can essentially start with no data, and derive all their value from being open and attracting users on a large scale. Such an approach to Web engineering not only leads to a more disciplined way of engineering the Web, it also allows computer science to better integrate the special properties of the Web, most importantly the loosely coupled nature of the Web, and the importance of the social systems driving the Web.
  • Erik Wilde and Yiming Liu, Lightweight Linked Data, 2008 IEEE International Conference on Information Reuse and Integration (IRI 2008), Las Vegas, Nevada, July 2008. (available as abstract and PDF)
    Abstract: Much of the Web's success rests with its role in enabling information reuse and integration across various boundaries. Hyperlinked Web resources represent a rich information tapestry of content and context, instrumental in effective knowledge sharing and further knowledge development. However, the Web's simple linking model has become increasingly inadequate for effective content discovery and reuse. At the same time, rigorous but heavyweight solutions such as the Semantic Web have yet to garner critical mass in adoption. This paper analyzes the relative strengths and shortcomings of existing linked data approaches. It proposes a novel, lightweight architecture for the modeling, aggregation, retrieval, management, and sharing of contextual information for Web resources, based on established standards and designed to encourage more efficient and robust information reuse on the Web.
  • Eric Kansa and Erik Wilde, Tourism, Peer Production, and Location-Based Service Design, 2008 IEEE International Conference on Services Computing (SCC 2008), Honolulu, Hawaii, July 2008. (available as abstract and PDF)
    Abstract: This paper describes characteristics of information and service design by exploring the needs and motivations of tourists. Tourists are expected to be important and demanding users of location-based services. They will need customized means to filter their experience of destinations, as well as ways to meaningfully participate in the creation of narratives and histories about different places. Mobile technologies will also allow tourists to be more discriminating in their patronage of different service offerings, especially as they gain greater knowledge of so-called backstage processes. These demanding needs will require choreography between services offered by many different commercial, cultural, educational, and community providers. The paper suggests approaches to deliver tourist location-based services based on low barrier of entry principles of web architecture. The paper concludes with a discussion on how the erosion of backstage/front-stage distinctions in service systems impacts service innovation.
  • Erik Wilde, Philippe Cattin, and Felix Michel, Web-Based Presentations, Berliner XML Tage 2007 (BXML 2007), Berlin, Germany, September 2007. (available as abstract and PDF)
    Abstract: The management and publishing of complex presentations is poorly supported by available presentation software. This makes it hard to publish usable and accessible presentation material, and to reuse that material for continuously evolving events. XSLidy provides an XSLT-based approach to generate presentations out of a mix of general-purpose HTML and a small number of presentation-specific structural elements. Using XSLidy, the management and reuse of complex presentations becomes easier, and the results are more user-friendly in terms of usability and accessibility.
  • Erik Wilde, Declarative Web 2.0, 2007 IEEE International Conference on Information Reuse and Integration (IRI 2007), Las Vegas, Nevada, August 2007. (available as abstract and PDF)
    Abstract: Web 2.0 applications have become popular as drivers of new types of Web content, but they have also introduced a new level of interface design in Web development; they are focusing on richer interfaces, user-generated content, and better interworking of Web-based applications. The current foundations of the Web 2.0, however, are strictly imperative in nature, which makes it difficult to develop applications which are robust, interoperable, and backwards compatible. Using a declarative approach for Web 2.0 applications, this new wave of applications can be built on a more robust foundation which is more in line with the Web's style of using declarative methods whenever possible. We show a path how today's imperative Web 2.0 applications can be regarded as a testbed as well as a first implementation for a revised version of Web 2.0 technologies, which will be based on declarative markup rather than imperative code.
  • Erik Wilde, What are you talking about?, 2007 IEEE International Conference on Services Computing (SCC 2007), Salt Lake City, Utah, July 2007. (available as abstract and PDF)
    Abstract: While services are widely regarded as an important new concept in IT architecture, so far there is no consolidated concept about the exact meaning of the term "service orientation". While there are many problems which are simply problems of certain technical decisions, other areas are more fundamental and lead to different perspectives and eventually implementations of service oriented systems. We argue that the current emphasis of service orientation as a collection of interface descriptions misses the critical point of services, which is that they revolve around resources. With a more resource-centered approach, the investment into a service oriented architecture can be made much more promising, because the resource-centered approach is better suited for the design of loosely coupled systems than the current interface-based approach.
  • Felix Michel and Erik Wilde, Data Model Perspectives for XML Schema, XTech 2007, Paris, France, May 2007. (available as abstract and presentation PDF)
    Abstract: The family of upcoming XML technologies, consisting of XPath 2.0, XSLT 2.0, and XQuery, no longer operates only on the Infoset, but also utilizes schema information. Today, this schema information is added to the Infoset during schema-validation and commonly is referred to as PSVI contributions (PSVI for "Post-Schema-Validation Infoset"). Utilizing schema information is promising, for XML Schema allows relationships between structures to be described in an expressive, semantically relevant way, e.g. through type derivation and substitution groups. This structural information can become valuable meta-data when processing instances that comply with the respective Schema. However, only a small fraction of this schema information is accessible with the aforementioned technologies. There are various reasons for this: Some schema information such as where wildcards can occur is not exposed at all, and other components (e.g. types) are only represented by QNames, lacking any possibilities to further navigate the schema information. Secondly, the PSVI specification remains vague with respect to the data model. And finally, the present data model of XML Schema is not appropriate for some application contexts. The existence of differing data models for XML Schema (e.g. in programming APIs for XML Schema) is evidence for the fact that the abstract data model as defined in the recommendation does not rule out the need for other data model perspectives. In fact, the abstract data model and its incarnations (namely the normative XML syntax) may be good for defining schemas, but it proves to be less appropriate for exploiting the structural information. Features that are convenient for definition (such as named groups and nested model groups) turn out to be problematic for retrieval and navigation, the most important ways of using the structural information.
    We propose an alternative data model perspective that represents the schema information in a way that meets the needs of certain classes of applications better. These applications have in common read-only access to schema information, an instance-driven perspective, the need for schema inspection at runtime, and possibly only a local scope. Our data model uses what we call "occurrences" instead of the "particles" in the normative abstract data model, and it expands what we (deliberately) consider to be notational shorthands (like occurrence constraints and named groups). Furthermore, we index all occurrences (even of the same element), as it is done in "marked expressions" in regular language theory. The structural information is no longer captured by model groups, but by a set of potential next occurrences. This is based on the idea of Brzozowski derivatives and again inspired by the anticipated needs of instance-oriented applications. We present a prototype implementation which is purely based on standard technologies. It is implemented as an XSLT 2.0 function library that reads schemas in the normative XML syntax, constructs the data model from this information, and provides various functions for accessing, navigating, and exploiting the schema information. We show that such functionality is highly beneficial, making applications more powerful, resilient, and easier to develop.
  • Erik Wilde, Structuring Content with XML, 10th International Conference on Electronic Publishing (ELPUB 2006), Bansko, Bulgaria, June 2006. (available as abstract and PDF)
    Abstract: XML as the most successful data representation format makes it easy to start working with structured data because of the simplicity of XML documents and DTDs, and because of the general availability of tools. This paper first describes the origin and features of XML as a markup language. In a second part, the question of how to use the features provided by XML for structuring content is addressed. Data modeling for electronic publishing and document engineering is a research field with many open issues, the most important open question being what to use as the modeling language for XML-based applications. While the paper does not provide a solution to the modeling language question, it provides guidelines for how to design schemas once the model has been defined.
  • Erik Wilde, Sai Anand, Thierry Bücheler, Nick Nabholz, and Petra Zimmermann, Bibliographies as Shared Resources, Web Based Communities 2006 Conference (WBC 2006), San Sebastian, Spain, February 2006. (available as abstract and PDF)
    Abstract: In many research settings, bibliographies are a central resource for collecting information about related work, keeping track of one's own research record, and annotating this information with remarks. By its very nature, this information should be shared between researchers within a research group and maybe in larger organizational units (for example research institutes) as well. However, most tools used for managing bibliographic data do not support collaboration. Using ShaRef, users can share bibliographic information, collaborate, and publish and export data using a variety of output channels. ShaRef's goal is to make sharing of and collaboration with bibliographic information easier than it is today.
  • Erik Wilde, Augmenting XHTML for Help and Documentation, International Conference on Intelligent Agents, Web Technology and Internet Commerce (IAWTIC 2005), Vienna, Austria, November 2005. (available as abstract and PDF)
    Abstract: Providing users with help and other documentation is essential for any software targeted at end users. Authoring help and documentation in a platform-independent way is hard, because different help systems have different conventions for structuring and organizing the documents. The Help System Generator (HSG) presented in this paper provides an easy and platform-independent way of preparing and publishing help and documentation. Using HSG, software creators can easily author, reuse, and publish help and documentation for different platforms.
  • Erik Wilde and Nick Nabholz, Access Control for Shared Resources, International Conference on Intelligent Agents, Web Technology and Internet Commerce (IAWTIC 2005), Vienna, Austria, November 2005. (available as abstract and PDF)
    Abstract: Access control for shared resources is a complex and challenging task, in particular if the access control policy should be able to cope with different kinds of sharing and collaboration. The reason for this is that traditional access control systems often depend on administrators to set up the foundations of the access control mechanism, in most cases users and their group memberships. The access control model presented in this paper approaches this problem by supporting two different kinds of groups, named groups and resource-based groups. Using the implementation of this model in our application allows us to support a wide variety of sharing and collaboration types between the application's users.
  • Erik Wilde, Sai Anand, and Petra Zimmermann, Management and Sharing of Bibliographies, 9th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2005), Vienna, Austria, September 2005. (available as abstract and PDF)
    Abstract: Managing bibliographic data is a requirement for many researchers, and in the group setting within which the majority of research takes place, the managing and sharing of bibliographic data is an important facet of organizing the research work. Managing and sharing bibliographies has to balance different levels of shared access (public catalogs, closed research group bibliographies, and personal bibliographies), and the sharing platform should integrate as seamlessly as possible into diverse environments in terms of operating systems, document processing, and other information management tools. The ShaRef system presented in this paper has been designed to fill the gap between public libraries and personal bibliographies, and provides an open platform for sharing bibliographic data among user groups. Through its simple and flexible data model and system architecture, ShaRef adapts to many settings and requirements, and can be used to increase collaboration and information flow within groups.
  • Erik Wilde, Towards Conceptual Modeling for XML, Berliner XML Tage 2005 (BXML 2005), Berlin, Germany, September 2005. (available as abstract, PDF, and paper presentation)
    Abstract: Today, XML is primarily regarded as a syntax for exchanging structured data, and therefore the question of how to develop well-designed XML models has not been studied extensively. As applications are increasingly penetrated by XML technologies, and because query and programming languages provide native XML support, it would be beneficial to use these features to work with well-designed XML models. In order to better focus on XML-oriented technologies in systems engineering and programming languages, an XML modeling language should be used, which is more focused on modeling and structure than typical XML schema languages. In this paper, we examine the current state of the art in XML schema languages and XML modeling, and present a list of requirements for an XML conceptual modeling language.
  • Erik Wilde and Marcel Baschnagel, Fragment Identifiers for Plain Text Files, Sixteenth ACM Conference on Hypertext and Hypermedia (HT 2005), Salzburg, Austria, September 2005. (available as abstract and PDF)
    Abstract: Hypermedia systems like the Web heavily depend on their ability to link resources. One of the key features of the Web's URIs is their ability to not only specify a resource, but to also identify a subresource within that resource, by using a fragment identifier. Fragment identification enables users to create better hypermedia. We present a proposal for fragment identifiers for plain text files, which makes it possible to identify character or line ranges, or subresources identified by regular expressions. Using these fragment identifiers, it is possible to create more specific hyperlinks, by not only linking to a complete plain text resource, but only the relevant part of it. Along with this proposal, a prototype implementation is described which can be used both as a server-side testbed and as a client-side extension for the Firefox browser.
  • Erik Wilde, Semantically Extensible Schemas for Web Service Evolution, European Conference on Web Services (ECOWS'04), Erfurt, Germany, September 2004. (available as abstract, PDF, and paper presentation)
    Abstract: Web Services are designed for loosely coupled systems, which means that in many cases it is not possible to synchronously upgrade all peers of a Web Service scenario. Instead, Web Service peers should be able to coexist in different versions. Additionally, older software versions often could benefit from upgrades to the service if they were able to understand it. This paper presents a framework for semantically extensible schemas for Web Service evolution. The core idea is to use declarative semantics to describe extensions to a service's vocabulary. These declarative semantics can be used by older software versions to understand the semantics of extensions, thus enabling older software to dynamically adapt to newer versions of the service. As long as declarative semantics are sufficient, older software can benefit from the service's extension.
  • Erik Wilde, Protecting Legacy Applications from Unicode, International Conference on E-Business and Telecommunication Networks (ICETE 2004), Setúbal, Portugal, August 2004. (available as abstract, PDF, and paper presentation)
    Abstract: While XML-based Web Service architectures are successfully turning the Web into an infrastructure for cooperating applications, not all interoperability problems have yet been solved. XML-based data exchange has the ability to carry the full Unicode character repertoire, which is approaching 100'000 characters. Many legacy applications are being Web-Service-enabled rather than being re-built from scratch, and therefore still have the same limitations. A frequently seen limitation is the inability to handle the full Unicode character repertoire. We describe an architectural approach and a schema language to address this issue. The architectural approach proposes to establish validation as basic Web Service functionality, which should be built into a Web Services architecture rather than applications. Based on this vision of modular and infrastructure-based validation, we propose a schema language for character repertoire validation. Lessons learned from the first implementation and possible improvements of the schema language conclude the paper.
  • Erik Wilde and Jacqueline Schwerzmann, When Business Models Go Bad: The Music Industry's Future, International Conference on E-Business and Telecommunication Networks (ICETE 2004), Setúbal, Portugal, August 2004. (available as abstract, PDF, and paper presentation)
    Abstract: The music industry is an interesting example of how business models from the pre-Internet era can get into trouble in the new Internet-based economy. Since 2000, the music industry has suffered declining sales, and very often this is attributed to the advent of the Internet-based peer-to-peer file sharing programs. We argue that this explanation is only one of several possible explanations, and that the general decrease in the economic indicators is a more reasonable way to explain the declining sales. Whatever the reason for the declining sales may be, the question remains what the music industry could and should do to stop the decline in revenue. The current strategy of the music industry is centered around protecting their traditional business model through technical measures and in parallel working towards legally protecting the technical measures. It remains to be seen whether this approach is successful, and whether the resulting landscape of tightly controlled digital content distribution is technically feasible and accepted by the consumers. We argue that the search for new business models is the better way to go, even though it may take some time and effort to identify these business models.
  • Mario Jeckle and Erik Wilde, Identical Principles, Higher Layers: Modeling Web Services as Protocol Stack, XML Europe 2004, Amsterdam, April 2004. (available as abstract, PDF, and HTML)
    Abstract: Web Services and their potential applications are currently under heavy discussion in industry, research, and standardization. As a result of evaluation and experience by early adopters, the technology is expected to mature through the advent of new standards and solutions leveraging Web Service's power. In essence, the efforts undertaken to create and complete a stack of Web Service protocols lead to a new communication architecture and extend the stack of classical network protocols. This evolving architecture could serve as a future-proof infrastructure for businesses to rely on. However, the growth of the Web Service stack with respect to the addition of new layers and expansion of the resulting infrastructure has not been studied in comparison with well-established protocol suites like the ISO/OSI stack or the set of protocols constituting the Internet. Strictly speaking, industry's demand for functionality and services enhancing the basic Web Service protocols such as XML-RPC or SOAP leads to the creation of a full-fledged layered protocol suite on top of the existing ones. Nevertheless, the various standards, specifications, and ideas have neither been consolidated on a common terminological basis, nor been integrated in a single framework of reference. This observation also applies to the established trio of Web Service standards consisting of SOAP, WSDL, and UDDI. According to the specific usage patterns of these specifications, they are not operating on one layer as the well-known triangular relationship graph suggests, but instead they are connected by means of unidirectional usage dependencies. From this point of view, the message patterns (MP) defined by WSDL 2.0 offer services to layers organized on top of WSDL which rely on the service interfaces exposed by SOAP. More precisely, not the interface definition with WSDL but the accompanying MPs act as the transport layer of the service stack.
Based on this and other criteria, SOAP can be categorized as the basic low-level layer of the Web Service infrastructure corresponding to the network-dependent layers of the classical protocol suites. Based on these facts, all of the various efforts relying on the seminal Web Service protocols can be categorized at the various levels layered above the transport layer. This is especially true for specifications dealing with the management of sessions and transactions which are layered directly above the MPs. Also, security standards like XML digital signatures and XML encryption fit well into this by classifying them as part of the presentation layer. Furthermore, within the Web Service environment, application layer mechanisms (e.g. firewalls for content filtering) emerge that are quite analogous to those commonly known from classical network operation. Taking this congruency between established protocol stacks and the Web Service stack one step further, the analogy may serve as a valuable framework for the comparison of different architectural styles in Web Service deployment. Taking the continuing debate weighing services based on representational state transfer (REST) against those based on RPC-style SOAP as an example, both approaches reveal themselves as heterogeneous protocols. The two ideas are neither mutually exclusive nor conflicting. Both protocols can be made interoperable by the use of bridges or gateways arbitrating between the two parties. Our analysis shows that Web Services are a true but as yet incomplete protocol suite deploying classical Internet protocols as basic services by the continued addition of supplemental specifications and standards.
  • Erik Wilde, Towards Federated Referatories, SINN03 Conference on Worldwide Coherent Workforce and Satisfied Users, Oldenburg, Germany, September 2003. (available as abstract, PDF, and paper presentation)
    Abstract: Metadata usage often depends on schemas for metadata, which are important to convey the meaning of the metadata. We propose an architecture where users can extend the schema used by a system for managing referential metadata. Users can plug in new schemas and install custom filters for exporting metadata, so that users are not forced to limit their metadata to a fixed schema. The goal of this architecture is to provide users with a system that helps them manage their referatory, gives them powerful tools to adapt the system to their metadata, and still makes it possible to collect the metadata of several users in a central storage and exploit the common facets of the metadata. Our system is based on a specialized schema language, which has been built on top of the XML schema languages XML Schema and Schematron.
  • Erik Wilde, Validation of Character Repertoires for XML Documents, Twenty-fourth Internationalization and Unicode Conference (IUC24), Atlanta, Georgia, September 2003. (available as abstract, PDF, and paper presentation)
    Abstract: XML is based on Unicode, and therefore XML documents may use the full Unicode character repertoire. However, XML-based applications often use XML interfaces to legacy software which in many cases is not capable of dealing with the full Unicode character repertoire. We therefore propose a schema language for XML which is capable of limiting the character repertoire of XML documents. This schema language, called Character Repertoire Validation for XML (CRVX), has features to permit or disallow character repertoire subsets from certain parts of an XML document, for example only for element and attribute names. CRVX uses information from the Unicode Character Database (UCD) to make character repertoire specification as easy as possible. CRVX is not intended to be the only schema language in an XML application scenario, but it provides useful additional schema-based validation to protect applications from unsupported characters. XML applications typically combine different schema languages before processing XML documents, and CRVX is intended to complement other schema languages such as grammar-based languages (DTD, XML Schema) or rule-based languages (Schematron). CRVX can be implemented in various ways. One simple solution is to use XSLT to transform a CRVX schema into an XSLT program, which is then used to validate XML documents. We briefly describe such an implementation. Other (and more efficient) implementations could be based on DOM or SAX parsers.
  • Erik Wilde and Kilian Stillhard, A Compact XML Schema Syntax, XML Europe 2003, London, UK, May 2003. (available as abstract, HTML, and paper presentation)
    Abstract: The new schema language defined by the W3C, XML Schema, is used in a number of applications, such as Web Services and XQuery, and will probably be used by an increasing number of users in the near future. Currently, XML Schema's data model, the "XML Schema Components", can only be represented in the rather verbose XML syntax defined in the XML Schema specification itself. We propose an alternative non-XML syntax, which is (1) much more compact than the XML syntax, (2) defined by EBNF productions, (3) re-uses well-known syntactic concepts where appropriate, and (4) is easy to implement using standard parser-generating tools. Our approach is comparable to the approach of the RELAX NG schema language, which also supports two alternative syntaxes, an XML-based one, and a more compact non-XML one. We believe that XML Schema could be made easier to use by supporting a compact syntax. Currently, complex schemas are very hard to read due to the large amount of XML markup, and the various tools and GUIs that are on the market differ widely and in all cases support only a subset of the features of XML Schema. We believe that there should be a compact syntax, optimized for human users, which makes it easy to read and write XML Schemas, and which supports the full feature set of XML Schema. Obviously, a non-XML syntax makes it necessary to introduce new tools. However, generating parsers from EBNF productions is rather simple and well-supported by standard tools (such as yacc and JavaCC), and the other direction (i.e., generating non-XML syntax) can be implemented by using XML tools. Our XML Schema Compact Syntax (XSCS) is geared towards human users, by re-using language constructs known from other application areas, such as DTDs and programming languages, and making them available for XML Schema component representation. 
Examples of this re-use of syntactic constructs are DTD-style content models, number ranges ("[a,b]" or "(a,b]" as in standard mathematical notation), and qualifying attributes like "abstract" or "final" known from programming languages ("final abstract type { ... }"). We also believe that graphical representations of complex structures such as schemas are not always suitable because some people prefer textual representations, editing might be faster when using keyboard input instead of using point-and-click operations, and graphical representations (usually) hide some information. We fully integrate the processing of our syntax into the existing pipeline of XML-based tools by creating a parser that generates SAX events or DOM trees from the compact syntax documents. This way, we can use the existing XML Schema validation engines and XML Schema error checking facilities already implemented in validation engines like the Xerces parser. In addition, we have a serialization module to generate compact syntax documents from XML Schema DOM trees. Our overall goal is to improve XML Schema acceptance by providing a syntax that is easier to work with than the XML syntax, and tools to process this syntax.
  • Erik Wilde, Making the Infoset Extensible, XML 2002, Baltimore, Maryland, December 2002. (available as abstract, PDF, HTML, and paper presentation)
    Abstract: The XML Infoset defines the data model of XML, and it is used by a number of other specifications, such as XML Schema, XPath, DOM, and SAX. Currently, the Infoset defines a fixed number of Information Items and their Properties, and the only widely accepted extensions of the Infoset are the Post Schema Validation Infoset (PSVI) contributions of XML Schema. XML Schema demonstrates that extending the Infoset can be very useful, and the PSVI contributions of XML Schema are being used by XPath 2.0 to access type information in a document's Infoset. In this paper, we present an approach to making the Infoset generically extensible by using the well-known Namespace mechanism. Using Namespaces, it is possible to define sets of additional Information Items and Properties which extend the core Infoset (or other Infoset extensions, defining a possibly multi-level hierarchy of Infoset extensions). Basically, a Namespace for an Infoset extension contains a number of Information Items, which may have any number of Properties. It is also possible to define an Infoset extension containing only Properties, extending the Information Items of other Infosets. Further elaborating on this method, many of the XML technologies currently using the Infoset could be extended to support the Infoset extensions by importing Infoset extensions using the extension's Namespace name. To illustrate these concepts, we give an example by defining the XML Linking Language (XLink), the XML vocabulary for hyperlinking information, in terms of Infoset extensions. We show how the proposed ways of supporting Infoset extensions in XML technologies such as XPath, DOM, and CSS could pave the way to better support (and hopefully faster adoption) of XLink than we see today. XLink serves as one example, but the proposed extensions and techniques are not limited to this particular technology.
The content of this paper is work in progress, contributing to the ongoing debate on how to deal with different XML vocabularies and their usage in other XML technologies. We believe that making the Infoset extensible would provide a robust and flexible way of making the data model of XML-based data more versatile, and creating an accepted way of making the data available through standard interfaces such as DOM and XPath.
  • David Lowe and Erik Wilde, Improving Web Linking Using XLink, Open Publish 2001, Sydney, July 2001. (available as abstract and PDF)
    Abstract: Although the Web has continuously grown and evolved since its introduction in 1989, the technical foundations have remained relatively unchanged. Of the basic technologies, URLs and HTTP have remained stable for some time now, and only HTML has changed more frequently. However, the introduction of XML has heralded a substantial change in the way in which content can be managed. One of the most significant of these changes is with respect to the greatly enhanced model for linking functionality that is enabled by the emerging XLink and XPointer standards. These standards have the capacity to fundamentally change the way in which we utilise the Web, especially with respect to the way in which users interact with information. In this paper, we will discuss some of the richer linking functionality that XLink and XPointer enable, particularly with respect to aspects such as content transclusion, multiple source and destination links, generic linking, and the use of linkbases to add links into content over which the author has no control. The discussions will be illustrated with example XLink code fragments, and will emphasise the particular uses to which these linking concepts can be put.
  • Erik Wilde and David Lowe, From Content-centered Publishing to a Link-based View of Information Resources, 33rd Hawaii International Conference on System Sciences (HICSS-33), Maui, Hawaii, January 2000. (available as abstract, PostScript, and PDF)
    Abstract: Influenced by the linking model which is implicit in HTML, today's publishing model on the Web is content-centered, with the emphasis of publishing on content rather than links. With the growing amount of information available on the Web, and the more powerful hypermedia architectures made possible by new Web technologies, putting the content into context will become increasingly important. In this paper, a new way of structuring publishing systems for information providers is presented in an attempt to shift the emphasis in Web-based publishing from content to an improved balance between content and links. After a description of the architecture of a link-based publishing system, a strategy for implementing such a system is described. Finally, a number of challenges associated with such a fundamental transition in the publishing model are described, in the technical as well as in the organizational domain.
  • Erik Wilde, Murali Nanduri, and Bernhard Plattner, A Transport-Independent Component for a Group and Session Management Service in Group Communications Platforms, European Conference on Multimedia Applications, Services and Techniques (ECMAST 96), Louvain-la-Neuve, Belgium, May 1996. (available as abstract, PostScript, and PDF)
    Abstract: Group communications is an area of research which has received a lot of attention recently. This paper focuses on a model and the architecture of a system which supports group communications by providing group and session management functionality. This system is an extension of directory services which are used with unicast communications. New functionality is needed for the dynamics of group communications (members of a connection may change over the lifetime of the connection) and increased complexity of relations. A model is described which defines six object types which represent the relevant objects. Users and groups represent real world users and their relations. Sessions and flows describe ongoing group communications. Flow templates and certificates provide mechanisms for management and security issues. The architecture presented in this paper is transport-independent, i.e., it can be used within different group communication platforms. A short sketch of the implementation is given in the last section.
  • Erik Wilde, Group Management and Communication Support for Collaborative Applications, Conference on Upper Layer Protocols, Architectures and Applications (ULPAA 95), Sydney, December 1995. (available as abstract, PostScript, and PDF)
    Abstract: In this paper, an architecture for communication support for collaborative applications is described. The motivation for the design of this architecture is the observation that generic support for group communications is an area which has not received much attention until now. The design is based on two components, a Group Management System (GMS) and Group Communication Support (GCS). The GMS is responsible for managing the name space of the support platform. Users and groups are the two entities of the name space, and two different relationships between them (membership and manager) can be established. This way it is possible to reflect the structure of collaborative workers inside the GMS. The GCS component is responsible for establishing connections between collaborative applications using the GMS/GCS and for hiding the details of the multicast transport infrastructure from the application. It is possible to bind users and groups to specific applications and multicast transport services. This way any group can be used by different applications using different transport services. The main advantages of GMS/GCS are reduced implementation costs, a shared name space of users and groups, and a simple interface to different multicast transport services.

Reviewed Conference Posters

  • Erik Wilde and Philippe Cattin, Presenting in HTML, ACM Symposium on Document Engineering (DocEng 2007), Winnipeg, Manitoba, August 2007. (available as abstract and PDF)
    Abstract: The management and publishing of complex presentations is poorly supported by available presentation software. This makes it hard to publish usable and accessible presentation material, and to reuse that material for continuously evolving events. XSLidy provides an XSLT-based approach to generate presentations out of a mix of HTML and structural elements. Using XSLidy, the management and reuse of complex presentations becomes easier, and the results are more user-friendly in terms of usability and accessibility.
  • Erik Wilde and Felix Michel, XML-Based XML Schema Access, 16th International World Wide Web Conference (WWW2007), Banff, Alberta, May 2007. (available as abstract and PDF)
    Abstract: XML Schema's abstract data model consists of components, which are the structures that eventually define a schema as a whole. XML Schema's XML syntax, on the other hand, is not a direct representation of the schema components, and it proves to be surprisingly hard to derive a schema's components from the XML syntax. The Schema Component XML Syntax (SCX) is a representation which attempts to map schema components as faithfully as possible to XML structures. SCX serves as the starting point for applications which need access to schema components and want to do so using standardized and widely available XML technologies.
  • Erik Wilde and Felix Michel, SPath: A Path Language for XML Schema, 16th International World Wide Web Conference (WWW2007), Banff, Alberta, May 2007. (available as abstract and PDF)
    Abstract: XML is increasingly being used as a typed data format, and therefore it becomes more important to gain access to the type system; very often this is an XML Schema. The XML Schema Path Language (SPath) presented in this paper provides access to XML Schema components by extending the well-known XPath language to also include the domain of XML Schemas. Using SPath, XML developers gain access to XML Schemas and thus can more easily develop software which is type- or schema-aware, and thus more robust.
  • Felix Michel and Erik Wilde, Extensible Schema Documentation with XSLT 2.0, 16th International World Wide Web Conference (WWW2007), Banff, Alberta, May 2007. (available as abstract and PDF)
    Abstract: XML Schema documents are defined using an XML syntax, which means that the idea of generating schema documentation through standard XML technologies is intriguing. We present X2Doc, a framework for generating schema documentation solely through XSLT. The framework uses SCX, an XML syntax for XML Schema components, as intermediate format and produces XML-based output formats. Using a modular set of XSLT stylesheets, X2Doc is highly configurable and carefully crafted towards extensibility. This proves especially useful for composite schemas, where additional schema information like Schematron rules is embedded into XML Schemas.
  • Erik Wilde, Modulare und Offene Komponenten zur Wissensverwaltung, 11. Europäische Jahrestagung der Gesellschaft für Medien in der Wissenschaft (GMW06), Zürich, Switzerland, September 2006. (available as abstract and PDF)
    Abstract: Knowledge transfer depends to a significant degree not only on teaching facts and methods, but also, indispensably, on placing them within the framework defined by the subject area. An ICT strategy for organizations that impart knowledge should take this broad focus of knowledge transfer into account and, through strategic objectives, prevent the emergence of closed, isolated solutions that are detrimental to the goal of conveying interconnected knowledge. Given suitable strategic and technical conditions, tools can today be developed on the basis of existing technologies which, thanks to their modular and open design, are ideally suited for use in the constantly changing ICT environment of a university. Using a tool for managing literature references as an example, we explain how an open ICT strategy can be implemented in the form of technical solutions.
  • Erik Wilde, Tables and Trees Don't Mix (very well), 15th International World Wide Web Conference (WWW2006), Edinburgh, UK, May 2006. (available as abstract, PDF, and HTML)
    Abstract: There are principal differences between the relational model and XML's tree model. This causes problems in all cases where information from these two worlds has to be brought together. Using a few rules for mapping the incompatible aspects of the two models, it becomes easier to process data in systems which need to work with relational and tree data. The most important requirement for a good mapping is that the conceptual model is available and can thus be used for making mapping decisions.
  • Kaspar Giger and Erik Wilde, XPath Filename Expansion in a Unix Shell, 15th International World Wide Web Conference (WWW2006), Edinburgh, UK, May 2006. (available as abstract, PDF, and HTML)
    Abstract: Locating files based on file system structure, file properties, and maybe even file contents is a core task of the user interface of operating systems. By adapting XPath's power to the environment of a Unix shell, it is possible to greatly increase the expressive power of the command line language. We present a concept for integrating an XPath view of the file system into a shell, which can be used to find files based on file attributes and contents in a very flexible way. The syntax of the command line language is backwards compatible with traditional shells, and the new XPath-based expressions can be easily mastered with a little bit of XPath knowledge.
  • Erik Wilde, Structuring Namespace Descriptions, 15th International World Wide Web Conference (WWW2006), Edinburgh, UK, May 2006. (available as abstract, PDF, and HTML)
    Abstract: Namespaces are a central building block of XML technologies today; they provide the identification mechanism for many XML-related vocabularies. Despite their ubiquity, there is no established mechanism for describing namespaces, and in particular for describing the dependencies of namespaces. We propose a simple model for describing namespaces and their dependencies. Using these descriptions, it is possible to compile directories of namespaces providing searchable and browsable namespace descriptions.
  • Erik Wilde, Merging Trees: File System and Content Integration, 15th International World Wide Web Conference (WWW2006), Edinburgh, UK, May 2006. (available as abstract, PDF, and HTML)
    Abstract: XML is the predominant format for representing structured information inside documents, but it stops at the level of files. This makes it hard to use XML-oriented tools to process information which is scattered over multiple documents within a file system. File System XML (FSX) and its content integration provide a unified view of file system structure and content. FSX's adaptors map file contents to XML, which means that any file format can be integrated with an XML view in the integrated view of the file system.
  • Erik Wilde, Describing Namespaces with GRDDL, 14th International World Wide Web Conference (WWW2005), Chiba, Japan, May 2005. (available as abstract and PDF)
    Abstract: Describing XML Namespaces is an open issue for many users of XML technologies, and even though namespaces are one of the foundations of XML, there is no generally accepted and widely used format for namespace descriptions. We present a framework for describing namespaces based on GRDDL using a controlled vocabulary. Using this framework, namespace descriptions can be easily generated, harvested and published in human- or machine-readable form.
  • Sai Anand and Erik Wilde, Mapping XML Instances, 14th International World Wide Web Conference (WWW2005), Chiba, Japan, May 2005. (available as abstract and PDF)
    Abstract: For XML-based applications in general and B2B applications in particular, mapping between differently structured XML documents, to enable exchange of data, is a basic problem. A generic solution to the problem is of interest and desirable in both an academic and a practical sense. We present a case study of the problem that arises in an XML-based project, which involves mapping of different XML schemas to each other. We describe our approach to solving the problem, its advantages and limitations. We also compare and contrast our approach with previously known approaches and commercially available software solutions.
  • Erik Wilde, Character Repertoire Validation for XML Documents, Twelfth International World Wide Web Conference (WWW2003), Budapest, Hungary, May 2003. (available as abstract, PDF, and HTML)
    Abstract: XML documents may contain a large diversity of characters. The Character Repertoire Validation for XML (CRVX) language is a simple schema language for specifying character repertoire constraints. These constraints can be specific for syntax- and/or context-based parts of an XML document. The constraints are based on the character classes introduced by XML Schema's regular expressions.
  • Erik Wilde and Kilian Stillhard, Making XML Schema Easier to Read and Write, Twelfth International World Wide Web Conference (WWW2003), Budapest, Hungary, May 2003. (available as abstract, PDF, and HTML)
    Abstract: XML Schema is a rather complex schema language, partly because of its inherent complexity, and partly because of its XML syntax. In an effort to reduce the syntactic verboseness and complexity of XML Schema, we designed the XML Schema Compact Syntax (XSCS), a non-XML syntax for XML Schema. XSCS is designed for human users, and transformations from and to XML Schema XML syntax are implemented using Java-based tools.
  • Erik Wilde, Martin Waldburger, and Beat Krähenmann, Conference Time-Table Management, Twelfth International World Wide Web Conference (WWW2003), Budapest, Hungary, May 2003. (available as abstract, PDF, and HTML)
    Abstract: Conference time-tables provide information that is indispensable for all attendees. Since there are a lot of reusable data structures and tasks, we have designed the Conference Time-Table Management (CTTM) system, which is intended to be used as a reusable component in a large diversity of conference Web sites. CTTM features a flexible concept for time-tables and provides users with personalization and notification services.
  • Erik Wilde, Linkbase Access Protocol Design, Eleventh International World Wide Web Conference (WWW2002), Honolulu, Hawaii, May 2002. (available as abstract and PDF)
    Abstract: XML itself does not support hypermedia, but the XLink standard has been defined to make XML usable for hypermedia. One of XLink's most interesting features is its support for external links and linkbases, which makes it possible to create links between resources without having to change the resources. In order to use these links, user agents must access linkbases and query them for relevant links, and we present our approach to create a protocol for linkbase access.
  • Marcel Dasen and Erik Wilde, Keeping Web Indices up-to-date, Tenth International World Wide Web Conference (WWW10), Hong Kong, May 2001. (available as abstract and PDF)
    Abstract: Search engines play a crucial role in the Web. Without search engines, large parts of the Web would become inaccessible for the majority of users. Search engines can make new and smaller sites accessible at low cost. Without them, other media, such as television, would be needed to advertise the existence of a new site on the Web, and only large commercial sites can follow this path. The Web would be in danger of becoming dominated by a few well-known sites. A crucial problem of search engines is to keep their index up-to-date. Especially if the index grows, the effort needed to update the index increases, since Web documents are dynamic and thus already stored data becomes obsolete. There have been various attempts to monitor the evolution of the Web. However, we believe that prior work over-estimates the rate of change because it relies on an inadequate change model. Our change model has been adapted from the information retrieval field to distinguish index-relevant changes from irrelevant modifications in Web documents, e.g. simple spelling corrections or dynamic advertisement links. We have monitored multiple smaller collections of documents over a time period of six months to measure how the documents change.
  • Luca Previtali, Brenno Lurati, and Erik Wilde, BibTeXML: An XML Representation of BibTeX, Tenth International World Wide Web Conference (WWW10), Hong Kong, May 2001. (available as abstract and PDF)
    Abstract: BibTeXML is an XML representation of BibTeX data. It can be used to represent bibliographic data in XML. The advantage of BibTeXML over BibTeX's native syntax is that it can be easily managed using standard XML tools (in particular, XSLT style sheets), while native BibTeX data can only be manipulated using specialized tools.

Reviewed Workshop Papers

  • Erik Wilde, Site Metadata on the Web, Second Workshop on Human-Computer Interaction and Information Retrieval (HCIR 2008), Redmond, Washington, October 2008. (available as abstract and PDF)
    Abstract: The navigation structure of Web sites can be regarded as metadata that can be used for interesting applications in User Interface (UI) design and Human-Computer Interaction (HCI), as well as for Information Retrieval (IR) tasks. However, there currently is no established format for site metadata, which makes it hard for Web sites to publish their structure in a machine-readable way, which could then be used by HCI and/or IR applications. We propose a model and a format for site metadata that is built on top of an existing format and thus could be deployed with little overhead by publishers as well as consumers. Making site metadata available as machine-readable data can be used for improving user interfaces (informing user agents about the context of the page they are displaying) and better information retrieval (allowing search engines to use sitemap information for better ranking and display of the results).
  • Bernt Wahl and Erik Wilde, Mapping the World ... One Neighborhood at a Time, First International Workshop on Trends in Pervasive and Ubiquitous Geotechnology and Geoinformation, Park City, Utah, September 2008. (available as PDF)
  • Erik Wilde, Location Management for Mobile Devices, 3rd IEEE Workshop on Advanced Experimental Activities on Wireless Networks & Systems (EXPONWIRELESS 2008), Newport Beach, California, June 2008. (available as abstract and PDF)
    Abstract: Location-awareness, in the form of location information about clients and location-based services provided by servers, is becoming increasingly important for networked communications in general, and wireless and mobile devices in particular. The current fragmented landscape of location concepts and location-awareness, however, is not suitable for handling location information on a Web scale. Providing users with mechanisms which allow them to control how they want to expose their location information, and thus allow control over how to share location information with others and services, is a crucial step for better location management for mobile devices. This paper presents a concept for representing location vocabularies and for matching and mapping them, and shows how these vocabularies can be used to support better privacy for users of location-based services and better location sharing between users and services. The concept is based on a language for describing place name vocabularies, which we call Place Markup Language (PlaceML), and on various ways in which these vocabularies can be used in a location-aware infrastructure of networked devices.
  • Erik Wilde and Martin Kofahl, The Locative Web, First International Workshop on Location and the Web (LocWeb 2008), Beijing, China, April 2008. (available as abstract, PDF, and paper presentation)
    Abstract: The concept of location has become very popular in many applications on the Web, in particular for those which aim at connecting the real world with resources on the Web. However, the Web as it is today has no overall location concept, which means that applications have to introduce their own location concepts and have done so in incompatible ways. By turning the Web into a location-aware Web, which we call the Locative Web, location-oriented applications get better support for their location concepts on the Web, and the Web becomes an information system where location-related information can be more easily shared across different applications and application areas. We describe a location concept for the Web supporting different location types, its embedding into some of the Web's core technologies, and prototype implementations of these concepts in location-enabled Web components.
  • Erik Wilde, The Plain Web, Web Science Workshop (WSW2008) at WWW2008, Beijing, China, April 2008. (available as abstract, PDF, and paper presentation)
    Abstract: The Web has become a very popular starting point for many innovations targeting infrastructure, services, and applications. One of the challenges of today's vast Web landscape is to monitor ongoing developments, put them into context, and assess their chances of success. One of the main virtues of a more scientific approach towards the Web landscape would be a clear differentiation between approaches which build on top of the infrastructure of the Web, with little embedding into the landscape itself, and those that are intended to blend into the Web, becoming a part of the Web itself. One of the main challenges in this area is to understand and classify new developments, and a better understanding of various dimensions of Web technology design would make it easier to assess the chances of success of any given development. This paper presents a preliminary classification and argues how those factors influence the chances of success.
  • Erik Wilde, Metaschema Layering for XML, Workshop on XML Technologies for the Semantic Web (XSW 2004), Berlin, Germany, October 2004. (available as abstract, PDF, and paper presentation)
    Abstract: The Extensible Markup Language (XML) is based on the concept of schema languages, which are used for validation of XML documents. In most cases, the metamodeling view of XML-based applications is rather simple, with XML documents being instances of some schema, which in turn is based on some schema language. In this paper, a metaschema layering approach for XML is presented, which is demonstrated in the context of various application scenarios. This approach is based on two generalizations of the standard XML schema language usage scenario: (1) it is assumed that one or more schema languages are acceptable as foundations for an XML scenario, but these schema languages should be customized by restricting, extending, or combining them; (2) for applications requiring application-specific schema languages, these schema languages can be implemented by reusing existing schema languages, thus introducing an additional metaschema layer. Metaschema layering can be used in a variety of application areas, and this paper shows some possible applications and mentions some more possibilities. XML is increasingly entering the modeling domain, since it is gradually moving from an exchange format for structured data into the applications as their inherent model. XML modeling still is in its infancy, and the metaschema layering approach presented in this paper is one contribution toward leveraging the most important of XML's features, which is the reuse of existing concepts and implementations.
  • Erik Wilde, Pascal Freiburghaus, Daniel Koller, and Bernhard Plattner, A Group and Session Management System for Distributed Multimedia Applications, Third COST 237 Workshop on Multimedia Telecommunications and Applications, Barcelona, Spain, November 1996. (available as abstract, PostScript, and PDF)
    Abstract: Distributed multimedia applications are very demanding with respect to the support they require from the underlying group communication platform. In this paper, an approach is described which aims at providing group communication platform designers with a component which can be used for powerful group and session management functionality. This component, which can be integrated into group communication platforms, is part of a system called the group and session management system (GMS). The GMS model consists of GMS user agents, which are the components to be integrated into group communication platforms, and GMS system agents, which are distributed directory agents providing the distributed database which the user agents access. Communication between these two types of agents is defined in two protocols, the GMS access protocol between user agents and system agents, and the GMS system protocol between system agents. GMS also defines a number of objects and relations which can be used to manage users, groups, and sessions on a very abstract level, thus providing both group communication platform designers and programmers of distributed multimedia applications with a high-level description of group communications. This enables a truly integrated approach for collaborative applications, where all applications, even when using different group communication platforms, can share the same database about users, groups, and sessions. The paper also contains a short description of the ongoing implementation of GMS's components.
  • Daniel Bauer, Erik Wilde, and Bernhard Plattner, Design Considerations for a Multicast Communication Framework, Tenth Annual Workshop on Computer Communications (TCCC 95), Eastsound, Washington, September 1995. (available as abstract, PostScript, and PDF)
    Abstract: In the last years, networked multimedia multipoint applications have been developed in conjunction with emerging broadband networks. Experiences have shown that existing transport systems support these applications only insuffici