Web Site Metadata

Erik Wilde, Anuradha Roy

Erik Wilde, Anuradha Roy, Web Site Metadata, Journal of Web Engineering, 9(4):283-301, December 2010.

The currently established formats for how a Web site can publish metadata about a site's pages, the robots.txt file and sitemaps, focus on how to provide information to crawlers about where to not go and where to go on a site. This is sufficient as input for crawlers, but does not allow Web sites to publish richer metadata about their site's structure, such as the navigational structure. This paper looks at the availability of Web site metadata on today's Web in terms of available information resources and quantitative aspects of their contents. Such an analysis of the available Web site metadata not only makes it easier to understand what data is available today; it also serves as the foundation for investigating what kind of information retrieval processes could be driven by that data, and what additional data could be provided by Web sites if they had richer data formats to publish metadata.


Bibliography Navigation: Reference List; Author Index; Title Index; Keyword Index

Generated by sharef2html on 2011-04-15, 02:00:41.