| Citation |
From Legacy Documents to XML: A Conversion Framework, pp. 92-103, 9th European Conference on Digital Libraries, Andreas Rauber, Stavros Christodoulakis, A. Min Tjoa (Eds.), Lecture Notes in Computer Science, Vol. 3652, Springer-Verlag, Vienna, Austria, September 2005.
|
|---|---|
| Descriptions |
Abstract:
We present an integrated framework for converting documents from legacy formats to XML. We describe the LegDoC project, which aims to automate the conversion of layout annotations in layout-oriented formats such as PDF, PS and HTML into semantic-oriented annotations. A toolkit of components covers complementary techniques, combining logical document analysis and semantic annotation with machine learning methods. We use a real conversion project as a running example to illustrate the techniques implemented in the project. |