<?xml version="1.0" encoding="UTF-8"?>
<!-- $Id: ecourts2008.xml 903 2008-12-09 21:38:04Z dret $ -->
<?hotspot layout-path="hotspot/hotspot/layout" ?>
<?hotspot kilauea-path="hotspot/kilauea" ?>
<?hotspot layout="iSchool" ?>
<hotspot xmlns="http://dret.net/xmlns/hotspot/1" xmlns:hotspot="http://dret.net/xmlns/hotspot/1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://dret.net/xmlns/hotspot/1 hotspot/hotspot/schemas/hotspot.xsd">
	<configuration>
		<link subsections="yes" bookmarks="yes" versions="ecourts2008.xml" home="./" help="quick" contents="./" glossary="http://dret.net/glossary/" author="http://dret.net/netdret/"/>
		<paths img="img" listing="src"/>
		<outline count-text=" [*]" count-depth="all"/>
		<hyperlink extra=""/>
		<extension file="html" link=""/>
		<counter separator=":&#160;"/>
	</configuration>
	<license uri="http://creativecommons.org/licenses/by/3.0/" short="CC 3.0">
		<div class="license">
			<p><a rel="license" title="view full text of license" href="http://creativecommons.org/licenses/by/3.0/"><img alt="Creative Commons License" src="hotspot/hotspot/layout/iSchool/iSchool/somerights20.png" border="0" height="31" width="88"/></a></p>
			<p><a class="outlink" rel="license" title="view full text of license" href="http://creativecommons.org/licenses/by/3.0/">This work is licensed under a CC<br/>Attribution 3.0 Unported License</a></p>
		</div>
	</license>
    <title><a href="http://www.e-courts.org/sites/S71/index.php?p=754">E-Courts 2008</a>, Las Vegas</title>
    <author short="E. Wilde" affiliation="UC Berkeley ISchool"><a href="http://dret.net/netdret/" title="dret.net">Erik Wilde</a></author>
    <affiliation short="UC Berkeley ISchool"><a href="http://www.berkeley.edu/" title="University of California, Berkeley">UC Berkeley</a> <a href="http://ischool.berkeley.edu/" title="ISchool">School of Information</a></affiliation>
    <copyright>2008 Erik Wilde</copyright>
    <presentation id="index">
        <title>Electronic Document Technology Standards and Signatures</title>
		<date>December 9, 2008</date>
        <toc class="abstract">PDF, PDF/A, OOXML, OpenDocument. What is the alphabet soup? In recent years technologists have been attempting to make electronic documents more transportable across systems and displays as well as improving their usability. This session will explain these various document formats and how your court can use the technology to improve data capture, display, and information security.</toc>
		<slide>
			<title>Abstract</title>
			<p class="abstract"><toc class="abstract"/></p>
		</slide>
		<part id="about">
			<title>About this Presentation</title>
			<part id="about-dret">
				<title>About Me</title>
				<slide>
					<title>About Me</title>
					<ul>
						<li>Computer Science at <a href="http://www.tu-berlin.de/eng/">Technical University of Berlin (TUB)</a> (88-91)</li>
						<li>Ph.D. at <a href="http://www.ethz.ch/index_EN">ETH Zürich</a> (92-97)</li>
						<li>Post-Doc at <a href="http://www.icsi.berkeley.edu/" title="International Computer Science Institute">ICSI</a>, Berkeley (97/98)</li>
						<li>Various activities back in Switzerland (98-06)</li>
						<ul>
							<li>teaching at <a href="http://www.ethz.ch/index_EN">ETH Zürich</a> and <a href="http://www.fhnw.ch/">FHNW</a></li>
							<li>working as independent consultant (training, courses, consulting)</li>
							<li>research in <a href="http://dret.net/projects/">various XML-related areas</a></li>
						</ul>
						<li>Professor at the <a href="http://ischool.berkeley.edu/">School of Information</a> (since Fall 2006)</li>
						<ul>
							<li>Technical Director of the <link href="about-isd">Information and Service Design (ISD) program</link></li>
						</ul>
					</ul>
				</slide>
			</part>
			<part id="about-isd">
				<title>About ISD</title>
				<slide id="isd">
					<title>Information and Service Design (ISD)</title>
					<ul>
						<li>Part of <a href="http://www.berkeley.edu/">UC Berkeley</a>'s <a href="http://ischool.berkeley.edu/">School of Information</a></li>
						<li>Connecting our students with real-world scenarios and projects</li>
						<ul>
							<li><q>Building Stuff That Actually Works</q></li>
							<li>getting involved in project management and associated challenges</li>
							<li>understanding the real-world challenges of information modeling</li>
						</ul>
						<li>Focus on open information systems and open information access</li>
						<ul>
							<li><q>usability</q> and <q>accessibility</q> should become terms beyond the UI realm</li>
						</ul>
						<li>Example areas of ISD interest</li>
						<ul>
							<li>e-Books beyond <q>iTunes for books</q>: open formats, flexible reuse</li>
							<li>open data for field researchers: sharing information as simply as possible</li>
							<li>location on the Web: how to turn the Web into a location-aware system</li>
						</ul>
					</ul>
				</slide>
				<slide>
					<title>Information-Intensive Applications</title>
					<ul>
						<li>Traditional enterprise IT solutions have limits</li>
						<ul>
							<li>built for long life-cycles of deployed system architectures</li>
							<li>built for integration of existing systems into a unified landscape</li>
						</ul>
						<li>Many enterprise IT solutions cannot keep up very well</li>
						<ul>
							<li>by definition, they never completely fail</li>
							<li>they dictate the shape and direction of information flows</li>
						</ul>
						<li>The Web is by far the biggest information system that ever existed</li>
						<ul>
							<li>built around an astonishingly primitive data model</li>
							<li>the simplicity is not a deficiency, it is a feature</li>
							<li>everybody can cooperate as long as there is minimal agreement</li>
							<li>the Web's architectural principle can be reused for enterprise IT</li>
						</ul>
					</ul>
				</slide>
				<slide>
					<title>Project: Environmental Data</title>
					<ul>
						<li>Government agencies collect and manage a lot of environmental data</li>
						<ul>
							<li>some of it is accessible in historical or current archives</li>
							<li>some of it is continuously produced by sensors</li>
						</ul>
						<li>Large-scale data aggregation presents various challenges</li>
						<ul>
							<li>implementation issues of sensor deployment and management</li>
							<li>organization issues of classifying and grouping sensors</li>
							<li>access issues of retrieving only subsets of the available data</li>
							<li>policy issues of sensitive data and possible access restrictions</li>
						</ul>
						<li>Web architecture presents a proven path for large-scale systems</li>
						<ul>
							<li>built on loose coupling and cooperation rather than integration</li>
							<li>built on a different architecture than traditional enterprise IT</li>
						</ul>
					</ul>
				</slide>
				<slide id="criminal-record">
					<title>Project: Justice and the Criminal Record</title>
					<ul>
						<li>Criminal records are important for background checks</li>
						<ul>
							<li>companies collect this information and resell it</li>
							<li>there is no expiration date for this information</li>
						</ul>
						<li>Criminal record information changes in important ways</li>
						<ul>
							<li>new entry: important for background check (false negative)</li>
							<li>expunged entry: not critical for background check (false positive)</li>
							<li>little business incentive for companies to properly delete entries</li>
						</ul>
						<li>Information accessibility can introduce new challenges</li>
						<ul>
							<li>how to hold people accountable for providing outdated data</li>
							<li>how to create incentives for properly updating data</li>
						</ul>
					</ul>
				</slide>
				<slide>
					<title>Dream Project: Services, not Sites</title>
					<ul>
						<li>Government agencies should provide services, not sites</li>
						<li>Sites are hard to build and hard to maintain</li>
						<ul>
							<li>often built with specific use cases in mind</li>
							<li>technology evolves and sites must be maintained to keep up</li>
						</ul>
						<li>Sites get in the way of services</li>
						<ul>
							<li>often service access is possible only through a site</li>
						</ul>
						<li>Services provide all the necessary information</li>
						<ul>
							<li>exposing what the public has paid for</li>
							<li>not spending public money for building interfaces</li>
						</ul>
						<li>Policy issues around service design and information usage</li>
						<ul>
							<li>information licenses must be developed to avoid <link href="criminal-record">data rot</link></li>
							<li><q href="http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1138083">eat your own dogfood</q> is a good start, but not sufficient (tastes differ)</li>
						</ul>
					</ul>
				</slide>
			</part>
		</part>
		<part id="document-formats">
			<title>Document Standards</title>
			<slide>
				<title>REST</title>
				<ul>
					<li>The Web is built on <em>Representational State Transfer (REST)</em></li>
					<ul>
						<li><em>resources</em> are the <q>units of interest</q> in any REST design</li>
						<li>peers interact by exchanging <em>representations of resources</em></li>
						<li>interactions can only use a <em>small number of predefined verbs</em> (mainly GET, POST, PUT, and DELETE in HTTP)</li>
						<li>state transitions use <em>hypertext as the engine of application state</em></li>
					</ul>
					<li>Documents often are the core part of a RESTful system architecture</li>
					<ul>
						<li>the only absolute core part of REST is <link href="identity">identification</link> (URIs)</li>
						<li>communications are often based on HTTP(-S)</li>
						<li>representations often use HTML or some XML vocabulary</li>
						<li>representations have primacy over functions or interactions</li>
					</ul>
				</ul>
			</slide>
			<slide>
				<title>Document Exchange as Business Interactions</title>
				<ul>
					<li>Traditional enterprise IT is based on integration</li>
					<ul>
						<li>model the complete system as one big distributed program</li>
						<li>implement the system using some distributed programming environment</li>
						<li>programming is based on the abstraction of building one big system</li>
					</ul>
					<li>Web architecture is based on cooperation</li>
					<ul>
						<li>there is no overarching model, there are only local models</li>
						<li>peers can interact by exchanging information about resources</li>
						<li>cooperation is achieved by agreeing on representations of resources</li>
						<li>there should be no assumptions about availability, links can always break</li>
					</ul>
					<li>Names for the debate: <q>REST vs. SOAP</q> or <q>REST vs. WS-*</q></li>
					<li>This is an ongoing debate and will not go away anytime soon</li>
				</ul>
			</slide>
			<part>
				<title>Application-Independent Formats</title>
				<slide>
					<title>History of Document Interchange</title>
					<ul>
						<li>Plain text and structured text</li>
						<ul>
							<li>plain text only needs agreement on a common character set (e.g., ASCII or Unicode)</li>
							<li>first data formats were <em>comma-delimited</em> or <em>tab-delimited</em> structures</li>
							<li><em>SGML (Standard Generalized Markup Language)</em> was the first open document format</li>
							<li><link href="xml">XML (Extensible Markup Language)</link> streamlined SGML to become usable on the Web</li>
						</ul>
						<li><em>Document formats</em> vs. <em>data formats</em></li>
						<ul>
							<li><em>data formats</em> represent database-like structures (e.g., UML or ER)</li>
							<li><em>document formats</em> represent narrative document structures</li>
							<li>many existing document collections use something in the middle</li>
							<li>many applications need something in the middle</li>
						</ul>
					</ul>
				</slide>
				<slide>
					<title>Structured Documents</title>
					<ul>
						<li>Most real-world data is <em>semi-structured</em> or <em>unstructured</em></li>
						<ul>
							<li>documents use titles, paragraphs, lists, and tables</li>
							<li>documents do not mark up person names, place names, …</li>
							<li><em href="http://en.wikipedia.org/wiki/Natural_language_processing">Natural Language Processing (NLP)</em> tries to extract structures</li>
						</ul>
						<li>IT people want structured data, users often don't like forms</li>
						<ul>
							<li>building good UIs is one of the core tasks for acceptance</li>
							<li>badly designed data entry is sabotaged and produces garbage</li>
							<li>provide feedback about the benefits of good data</li>
						</ul>
						<li>XML is a language for building languages, <em href="http://www.tbray.org/ongoing/When/200x/2006/01/09/On-XML-Language-Design">but don't do it</em></li>
						<ul>
							<li>XML does not define any semantics (i.e., it only defines structures)</li>
							<li>XML supports semi-structured data (supporting incremental refinement)</li>
							<li>vocabularies define structure and semantics of <em>XML document types</em></li>
							<li>vocabularies may provide/use <em>modules</em>, thereby allowing flexible reuse</li>
						</ul>
					</ul>
				</slide>
				<slide id="html">
					<title>HTML</title>
					<ul>
						<li>HTML is the standard document format on the Web</li>
						<li><a href="http://microformats.org/">Microformats</a> can be used to improve document semantics</li>
						<ul>
							<li>earlier microformats were not based on a common syntax</li>
							<li><em href="http://en.wikipedia.org/wiki/RDFa">RDFa</em> (October 2008) provides a standardized syntax</li>
						</ul>
						<li>Why HTML often is not even considered as a document format</li>
						<ul>
							<li>designed for logical structures, so rendering depends on clients</li>
							<li>designed for continuous display, so paged content is not a natural fit</li>
							<li>poor print support in regular browsers (problem of CSS and bad browser support)</li>
						</ul>
						<li>Why HTML should be considered as a document format</li>
						<ul>
							<li>focus on content structures rather than rendering</li>
							<li>easy to adapt to a wide variety of clients</li>
							<li>printing problem can be solved with custom print processing</li>
						</ul>
					</ul>
				</slide>
				<slide id="pdf">
					<title>PDF</title>
					<ul>
						<li><em href="http://en.wikipedia.org/wiki/Portable_Document_Format">Portable Document Format (PDF)</em> evolved from a printer language</li>
						<ul>
							<li>based on <em href="http://en.wikipedia.org/wiki/PostScript">PostScript</em>, a page description language for printers</li>
							<li>removed some programming features, added a lot of file format features</li>
						</ul>
						<li><em>Acrobat Reader</em> as a free product made PDF successful</li>
						<ul>
							<li>the <q>give away the reader, charge for the writer</q> strategy</li>
						</ul>
						<li>PDF has become a complex and complicated specification</li>
						<ul>
							<li>successful commercial products add features, which add data format complexity</li>
							<li>backwards compatibility almost always means that no features will be removed</li>
						</ul>
						<li>Microsoft wants a piece of the pie with its <em href="http://en.wikipedia.org/wiki/XML_Paper_Specification">XML Paper Specification (XPS)</em></li>
						<li>PDF 1.7 is the latest version (implemented by Acrobat 9.0)</li>
						<ul>
							<li>published by ISO as <em>ISO 32000-1:2008</em> in July 2008</li>
						</ul>
					</ul>
				</slide>
				<slide id="pdf-data">
					<title>PDF Data</title>
					<ul>
						<li>PDF has evolved into a multimedia container format</li>
						<ul>
							<li>support for various media types such as images, audio, and video</li>
							<li>PDF forms allow interactive forms to be created and filled out</li>
							<li>scripting can be used to further support interactive PDF</li>
							<li>extensions allow 3D models to be embedded into PDF</li>
						</ul>
						<li>Text can also appear in a variety of ways</li>
						<ul>
							<li>embedded images from scanning processes may only show text images</li>
							<li><em href="http://en.wikipedia.org/wiki/Optical_character_recognition">Optical Character Recognition (OCR)</em> may result in poorly recognized characters</li>
							<li>formatting software might include rendered characters (e.g., <q><code>fi</code></q> vs. <q><code>ﬁ</code></q>)</li>
							<li>formatted text might use non-embedded fonts</li>
						</ul>
						<li>Rendering PDF is a challenging task</li>
						<li>Searching PDF might be difficult or impossible</li>
					</ul>
				</slide>
				<slide id="pdf-metadata">
					<title>PDF Metadata</title>
					<ul>
						<li><em>Metadata (data about data)</em> is essential for document management</li>
						<ul>
							<li>it can be managed as an integral part of documents</li>
							<li>it can be managed externally by having <em>metadata records</em></li>
						</ul>
						<li>External metadata allows unified rules for metadata management</li>
						<ul>
							<li>the same metadata can be captured for all resources</li>
							<li>works for resource types with no metadata capabilities (e.g., books)</li>
						</ul>
						<li>Embedded metadata creates self-contained documents</li>
						<ul>
							<li>packaging issues become easier</li>
							<li>flexible embedded metadata formats support user-defined metadata models</li>
						</ul>
						<li>PDF supports various kinds of embedded metadata</li>
						<ul>
							<li>earlier versions had a small set of hardcoded metadata fields</li>
							<li><em href="http://en.wikipedia.org/wiki/Extensible_Metadata_Platform">Extensible Metadata Platform (XMP)</em> for extensible metadata (since PDF 1.4)</li>
						</ul>
					</ul>
				</slide>
				<slide id="pdfx">
					<title>PDF/X</title>
					<ul>
						<li>ISO-standardized PDF profile for pre-print document exchange</li>
						<ul>
							<li>focus on high fidelity rendering of PDF documents</li>
							<li>color spaces must be specified (important for printing)</li>
							<li>all fonts must be embedded</li>
							<li>various boxes must be defined for specifying the print area</li>
						</ul>
						<li>PDF/X is not a good choice for non-production workflows</li>
						<ul>
							<li>often very specific for one publishing workflow</li>
							<li>no constraints that focus on document management properties</li>
						</ul>
					</ul>
				</slide>
				<slide id="pdfa">
					<title>PDF/A</title>
					<ul>
						<li>ISO-standardized PDF profile for archiving PDF documents</li>
						<ul>
							<li>focus on long-term archiving of PDF documents</li>
							<li>color spaces must be specified device-independently (important for reproducible rendering)</li>
							<li>all fonts must be embedded</li>
							<li>audio/video content and scripting are not allowed</li>
						</ul>
						<li>PDF/A is a good choice for archiving workflows</li>
						<ul>
							<li>documents should be verified before accepting them as PDF/A</li>
							<li>minimal amount of metadata must be embedded</li>
						</ul>
						<li>PDF/A-1b only focuses on the visual appearance of a document</li>
						<ul>
							<li>scanned pages can be contained as images only</li>
						</ul>
						<li>PDF/A-1a also focuses on the content of a document</li>
						<ul>
							<li><em href="http://blogs.adobe.com/acrolaw/2006/01/understanding_t_1.html">tagged PDF</em> supports searching and repurposing of document contents</li>
						</ul>
					</ul>
				</slide>
				<slide id="odf">
					<title>OpenDocument (ODF)</title>
					<ul>
						<li>Developed as the native format for <em href="http://www.openoffice.org/">OpenOffice</em></li>
						<li>Standardized by ISO as ISO/IEC 26300:2006</li>
						<li>Main starting point was the need for an open office format</li>
						<ul>
							<li>Microsoft's Office products used undocumented file formats</li>
							<li>document management should be based on documents, not products</li>
						</ul>
						<li>ODF's success forced Microsoft to open the Office file formats</li>
						<ul>
							<li>in 2005, Massachusetts stated that open formats should be used for public data</li>
							<li>in 2007, Massachusetts added <link href="ooxml"/> to the list of approved formats</li>
						</ul>
						<li>Disadvantages of ODF</li>
						<ul>
							<li>not as widely supported (but getting there)</li>
							<li>currently no standardized digital signature format (planned for ODF 1.2)</li>
						</ul>
					</ul>
				</slide>
				<slide id="ooxml">
					<title>OOXML</title>
					<ul>
						<li>Microsoft started OOXML as a response to <link href="odf">ODF</link>'s challenge</li>
						<li>OOXML was blessed by <a href="http://www.ecma-international.org/">ECMA</a> (XPS uses the same strategy)</li>
						<ul>
							<li>ECMA is often used as a simple first step in standardization</li>
							<li>ECMA-approved specs can be fast-tracked in ISO</li>
							<li>Microsoft's tactics caused a lot of controversy among experts</li>
						</ul>
						<li>OOXML is a compressed package of various resources</li>
						<ul>
							<li>the <em>Open Packaging Conventions (OPC)</em> create an archive of all resources</li>
							<li>OOXML is a structured archive with conventions for its contents</li>
						</ul>
						<li>Disadvantages of OOXML</li>
						<ul>
							<li>6,500 pages of file format specification</li>
							<li>many redundancies for historical reasons (e.g., three different table models)</li>
							<li>the document XML format is not easy to process</li>
						</ul>
					</ul>
				</slide>
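				<slide>
					<title>Sketch: OOXML as a Package</title>
					<p>A minimal sketch of the packaging idea, assuming only that an OPC package is stored as a ZIP archive. The part names follow common OOXML conventions, but the XML bodies are invented placeholders and the result is not a valid office document.</p>
					<pre><![CDATA[
```python
import io
import zipfile

# Hypothetical, heavily simplified package contents. The part names follow
# common OOXML conventions, but the XML bodies are placeholders and this is
# NOT a valid word processing document.
PARTS = {
    "[Content_Types].xml": "<Types/>",
    "_rels/.rels": "<Relationships/>",
    "word/document.xml": "<document>Hello, court</document>",
}


def build_package(parts):
    """Write the given parts into an in-memory ZIP archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, body in parts.items():
            archive.writestr(name, body)
    return buffer.getvalue()


def list_parts(package_bytes):
    """Return the names of all parts stored in the package, sorted."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return sorted(archive.namelist())


package = build_package(PARTS)
print(list_parts(package))
```
]]></pre>
					<p>Because the container is an ordinary archive, standard tools can list and extract the parts; the hard part is interpreting the XML inside them.</p>
				</slide>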
			</part>
			<part>
				<title>Application-Specific Formats</title>
				<slide>
					<title>Why Use XML?</title>
					<ul>
						<li>Because you want to share data</li>
						<ul>
							<li>share it in a format which is widely used and easy to use</li>
							<li>enable others to use it on various platforms with existing tools</li>
						</ul>
						<li>Because you want to share data cheaply</li>
						<ul>
							<li>it is easier to use XML than to invent something new</li>
							<li>it is even easier to use an existing XML schema than to invent a new one</li>
						</ul>
						<li>Because you want to share data openly</li>
						<ul>
							<li>if you invent new formats, people must process them</li>
							<li>avoid applying the <q>security through obscurity</q> principle inadvertently</li>
							<li>application-specific processing should be deferred to higher layers</li>
						</ul>
					</ul>
				</slide>
				<slide>
					<title>Is XML Self-Describing?</title>
					<ul>
						<li>XML is often said to be <q>self-describing</q></li>
						<ul>
							<li>many people think this is the same as <q>self-explanatory</q></li>
							<li>the catch is what exactly it is you refer to by <q>describing</q></li>
						</ul>
						<li>Database data cannot live without a database</li>
						<ul>
							<li>database data is simply content, the structure is provided by a DBMS</li>
							<li>XML documents have their structure encoded within them</li>
							<li>compared to database data, XML in fact is <q>self-describing</q></li>
						</ul>
						<li>What is the gap between <q>self-describing</q> and <q>self-explanatory</q>?</li>
						<ul>
							<li>it is impossible to find out how the document could be modified</li>
							<li>there are no semantics associated with either structure or content</li>
							<li>so <q>self-describing</q> means you can guess a lot, but you may be wrong</li>
						</ul>
					</ul>
				</slide>
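				<slide>
					<title>Sketch: Self-Describing, not Self-Explanatory</title>
					<p>A small sketch of the difference, using invented sample data. A bare row of values (as it might be exported from a database table) carries no names, while the XML document carries its element names with it. Neither tells a reader what <code>count</code> actually counts.</p>
					<pre><![CDATA[
```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names and values are invented for illustration.
CSV_ROW = "2008-12-09,Las Vegas,42\n"
XML_DOC = (
    "<event><date>2008-12-09</date>"
    "<city>Las Vegas</city><count>42</count></event>"
)


def describe_csv(text):
    """Return the bare values of the first row; no names travel with the data."""
    return next(csv.reader(io.StringIO(text)))


def describe_xml(text):
    """Return (element name, text) pairs recovered from the document itself."""
    root = ET.fromstring(text)
    return [(child.tag, child.text) for child in root]


print(describe_csv(CSV_ROW))
print(describe_xml(XML_DOC))
```
]]></pre>
					<p>The structure can be recovered without a schema or a DBMS, but the meaning of each field still has to be documented somewhere else.</p>
				</slide>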
				<slide id="xml">
					<title>XML is Syntax</title>
					<p>XML documents can use a wide array of characters. These characters are defined by <a href="http://www.unicode.org/">Unicode</a>, which currently (Version 5.0) defines more than 100,000 characters (#100,000 added in 2005).</p>
					<listing src="japanese1.xml"/>
					<listing src="japanese2.xml"/>
				</slide>
				<slide>
					<title>XML is Character-Based</title>
					<ul>
						<li>XML is <em>not</em> a binary format, it is <link href="xml">based on Unicode</link></li>
						<ul>
							<li><q>binary structures</q> cannot (or rather should not) be described using XML</li>
						</ul>
						<li>Multimedia formats often are binary</li>
						<ul>
							<li>image formats such as GIF, JPEG, and PNG</li>
							<li>audio formats such as MP3 and AAC</li>
							<li>video formats such as MPEG4 and H.264</li>
						</ul>
						<li>But: multimedia also uses many XML formats</li>
						<ul>
							<li>vector graphics formats such as <em>Scalable Vector Graphics (SVG)</em></li>
							<li><em>Synchronized Multimedia Integration Language (SMIL)</em> for describing presentations</li>
						</ul>
					</ul>
				</slide>
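				<slide>
					<title>Sketch: Binary Data in a Character-Based Format</title>
					<p>One common workaround when binary data has to travel inside XML is to encode it as text, for example with Base64. The sketch below assumes an invented <code>attachment</code> element and uses the 8-byte PNG file signature as a stand-in for real binary content.</p>
					<pre><![CDATA[
```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical payload: the 8-byte PNG file signature stands in for real binary data.
BINARY = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])


def wrap_binary(payload):
    """Encode bytes as Base64 text inside an (invented) <attachment> element."""
    element = ET.Element("attachment", {"encoding": "base64"})
    element.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(element, encoding="unicode")


def unwrap_binary(xml_text):
    """Parse the element and decode its text content back into the original bytes."""
    element = ET.fromstring(xml_text)
    return base64.b64decode(element.text)


xml_text = wrap_binary(BINARY)
print(xml_text)
print(unwrap_binary(xml_text) == BINARY)
```
]]></pre>
					<p>The round trip works, but the encoded text is about one third larger than the original bytes and is opaque to XML tools, which is why binary structures are usually better kept outside the XML.</p>
				</slide>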
				<slide>
					<title>XML is a Syntax for Trees</title>
					<ul>
						<li>Not all data is easily represented by trees</li>
						<ul>
							<li>overlapping markup (multiple <q>views</q> of the same content)</li>
							<li>graph-like structures which are less constrained than trees</li>
						</ul>
						<li>What is it that you have in your tree?</li>
						<ul>
							<li>XML encodes a structure purely on the syntactic level</li>
							<li>what the structures <u>mean</u> is in no way described by XML</li>
							<li>XML structures must be accompanied by semantic descriptions</li>
						</ul>
					</ul>
				</slide>
				<slide>
					<title>XML Usages</title>
					<ul>
						<li>XML can be used in different ways</li>
						<ul>
							<li>people should be able to use your XML directly using standard tools</li>
							<li>if they <em>absolutely need</em> a set of special tools, something is wrong</li>
						</ul>
						<li>XML is hip, so everybody wants to use it</li>
						<ul>
							<li>many things have been created ad-hoc and without much planning</li>
							<li>if you start something which is XML-based, use XML responsibly</li>
							<li>if you have to use some <q>bad XML</q>, complain about it</li>
						</ul>
						<li>Finding the balance can be hard</li>
						<ul>
							<li>XML is great for prototyping and experiments</li>
							<li>once you decide to redesign your XML, it may be too late</li>
							<li><em>XML documents</em> may be short-lived, <em>XML schemas</em> are definitely not</li>
						</ul>
					</ul>
				</slide>
			</part>
		</part>
		<part>
			<title>Document Security</title>
			<slide id="identity">
				<title>Identity</title>
				<ul>
					<li>Identity is the central hub of IT security</li>
					<ul>
						<li><em>identity</em> is established by associating digital identities with real entities</li>
						<li>identities can be <em>grouped</em> and they can have <em>assigned roles</em></li>
						<li><em>authentication</em> is the process of verifying an identity claim</li>
						<li><em>access control</em> can be based on <em>identities</em>, <em>groups</em>, or <em>roles</em></li>
						<li><em>authorization</em> is the process of providing access to a controlled resource</li>
					</ul>
					<li><em>Authentication</em> is one of the tough problems of IT security</li>
					<ul>
						<li><em>usernames</em> and <em>passwords</em> are commonly used</li>
						<li>additional cues (smartcards, images, biometrics) may be used</li>
						<li><em>security questions</em> often are a bad idea for establishing identity</li>
					</ul>
					<li>Almost all IT security revolves around some <q>digital identity</q></li>
					<ul>
						<li>users find many ways around inconvenient security implementations</li>
					</ul>
				</ul>
			</slide>
			<part id="one-way-function">
				<title>One-Way Function</title>
				<slide>
					<title>Essence of Data</title>
					<ul>
						<li>Hashes (or <em>message digests</em>) are a well-known principle in computer science</li>
						<ul>
							<li>fast to compute (the goal is to make data handling more efficient)</li>
							<li>few collisions (there are always collisions because of the smaller size)</li>
							<li><em>checksums</em> and <em>Cyclic Redundancy Check (CRC)</em> are popular hashes</li>
						</ul>
						<li>One-way functions are cryptographically secure hashes</li>
						<ul>
							<li>not just for detecting errors, but also for preventing tampering</li>
							<li>often referred to as <em>cryptographic hash</em> or <em>digital fingerprint</em></li>
						</ul>
						<li>One-way functions must satisfy some additional criteria</li>
						<ul>
							<li>it must be very hard to find an input producing a given output</li>
							<li>it must be very hard to find two inputs producing the same output (<q>collision</q>)</li>
						</ul>
					</ul>
				</slide>
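				<slide>
					<title>Sketch: Checksum vs. Cryptographic Hash</title>
					<p>A minimal sketch comparing a simple checksum with a cryptographic hash, assuming CRC-32 and SHA-256 as representatives of the two families. The two sample sentences are invented and differ in a single character.</p>
					<pre><![CDATA[
```python
import hashlib
import zlib

# Hypothetical sample documents that differ in one character.
ORIGINAL = b"The court orders payment of 100 dollars."
TAMPERED = b"The court orders payment of 900 dollars."


def checksum(data):
    """CRC-32: fast and good at catching accidental errors, not tamper-resistant."""
    return format(zlib.crc32(data), "08x")


def fingerprint(data):
    """SHA-256: a cryptographic hash designed to resist deliberate manipulation."""
    return hashlib.sha256(data).hexdigest()


for label, data in (("original", ORIGINAL), ("tampered", TAMPERED)):
    print(label, checksum(data), fingerprint(data))
```
]]></pre>
					<p>Both values change when the data changes. The difference is that an attacker can easily craft data matching a given CRC-32 value, while doing the same for SHA-256 is considered infeasible.</p>
				</slide>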
				<slide>
					<title>Reducing Data</title>
					<img style="width : 90% ; margin : 2% ; " src="hash.gif" title="Hash"/>
				</slide>
			</part>
			<part id="digital-signature">
				<title>Digital Signature</title>
				<slide>
					<title>Encrypted Fingerprints</title>
					<ul>
						<li>Hashes are used to check data integrity</li>
						<li><link href="one-way-function"/>s are used to check data integrity securely</li>
						<ul>
							<li>it is not feasible to construct data that matches a given hash</li>
						</ul>
						<li>Signed hashes can be used to ensure data authenticity</li>
						<ul>
							<li>if the hash is signed, it cannot be changed without detection</li>
							<li>if the data is changed, its hash will not match the signed hash</li>
						</ul>
						<li>Digital signatures work as long as the hash can be securely signed</li>
						<ul>
							<li>there must be a trusted <link href="identity"/> for verifying the hash signature</li>
						</ul>
					</ul>
				</slide>
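				<slide>
					<title>Sketch: Signing and Verifying a Hash</title>
					<p>The idea of a signed hash can be sketched with textbook RSA. This is an illustration under strong assumptions: the key is built from two well-known Mersenne primes and is trivially breakable, no padding is used, and the sample case text is invented. Real deployments rely on vetted cryptographic libraries and certified keys.</p>
					<pre><![CDATA[
```python
import hashlib

# Toy key material, for illustration only. Both primes are well-known
# Mersenne primes, which makes this key trivially breakable.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest_as_int(data):
    """Hash the data with SHA-256 and reduce it to a number below the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data):
    """Sign the hash of the data with the private exponent."""
    return pow(digest_as_int(data), D, N)


def verify(data, signature):
    """Recover the signed hash with the public exponent and compare to a fresh hash."""
    return pow(signature, E, N) == digest_as_int(data)


# Hypothetical document text.
order = b"Case 08-123: motion granted."
signature = sign(order)
print(verify(order, signature))
print(verify(b"Case 08-123: motion denied.", signature))
```
]]></pre>
					<p>Only the holder of the private exponent can produce the signature, while anyone with the public values can check it. Whether the public key really belongs to the claimed signer is a separate question of identity and trust.</p>
				</slide>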
				<slide id="certificate">
					<title>Certificate</title>
					<ul>
						<li>Certificates are digitally signed statements, issued by a trusted party, that bind a public key to an identity</li>
						<ul>
							<li>most digital signatures are created with certified public keys</li>
							<li>this means the digital signature is created based on a digitally signed key</li>
						</ul>
						<li>Who can you trust on the Web?</li>
						<ul>
							<li>trust can only start to grow based on initial trust in something</li>
							<li>many systems come with pre-installed trust (<em>root certificates</em>)</li>
							<li>certificates from other issuers will cause <a href="https://katapultmedia.com/">browsers to complain</a></li>
						</ul>
						<li>Certificates (like domain names) are a very easy way to make money</li>
						<ul>
							<li>in theory there are different levels of certificates with different levels of identity checking</li>
							<li>in practice most sites choose the cheapest one that does not produce an error message</li>
						</ul>
					</ul>
				</slide>
				<slide>
					<title>Creating a Digital Signature</title>
					<img style="height : 70% ; margin : 2% ; " src="signature-sign.jpg" href="http://en.wikipedia.org/wiki/Digital_signature"/>
				</slide>
				<slide>
					<title>Verifying a Digital Signature</title>
					<img style="height : 70% ; margin : 2% ; " src="signature-verify.jpg" href="http://en.wikipedia.org/wiki/Digital_signature"/>
				</slide>
			</part>
		</part>
		<slide>
			<title>Conclusions</title>
			<ul>
				<li>IT architecture has two major design phases</li>
				<ol>
					<li>modeling of information structures and business processes</li>
					<li>exposing required functionality through an interface/implementation</li>
				</ol>
				<li>Document formats are essential for information models</li>
				<ul>
					<li>build your own model and use existing formats as guidance</li>
					<li>provide implementations of the model by mapping it to (existing) formats</li>
				</ul>
				<li>Information models are the very core of many activities</li>
				<ul>
					<li><q>Getting the Job Done</q> requires good understanding of the job</li>
					<li>short-term hacks are sufficient for activities with a short-term horizon</li>
					<li>thorough analysis and understanding are required for longevity</li>
				</ul>
			</ul>
		</slide>
    </presentation>
 </hotspot>
