Internationalization (I18N) & Localization (L10N)

Web-Based Publishing (INFO 290-19)

Erik Wilde, UC Berkeley School of Information
2007-03-22
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.

Abstract

Many publishing environments need to support multiple languages. Often this requirement surfaces only in the later stages of product development or of building a publishing solution, where it can force major design changes and drive up costs. Internationalization (I18N) is the practice of designing systems so that they can adapt to different locales. Localization (L10N) is the activity of identifying, defining, and encoding specific locales on top of internationalized software.

Blinkbase I18N & L10N

  <arc xlink:type="arc" xlink:from="permalink" xlink:to="blog" xlink:title="Blog Home"/>
  <arc xlink:type="arc" xlink:from="blog" xlink:to="author" xlink:title="Blog Author"/>
  <arc xlink:type="arc" xlink:from="author" xlink:to="blog" xlink:title="Authored Blogs"/>
  <arc xlink:type="arc" xlink:from="permalink" xlink:to="blog">
   <title xlink:type="title" xml:lang="en">Blog Home</title>
   <title xlink:type="title" xml:lang="de">Blog Webseite</title>
  </arc>
  <arc xlink:type="arc" xlink:from="blog" xlink:to="author">
   <title xlink:type="title" xml:lang="en">Blog Author</title>
   <title xlink:type="title" xml:lang="de">Autor des Blogs</title>
  </arc>
  <arc xlink:type="arc" xlink:from="author" xlink:to="blog">
   <title xlink:type="title" xml:lang="en">Authored Blogs</title>
   <title xlink:type="title" xml:lang="de">Geschriebene Blogs</title>
  </arc>

What is Language?

Beyond Language

Directionality and Screen Layout

Right-to-Left Layout for Outlook Web Access

Outline (Internationalization (I18N))

  1. Internationalization (I18N)
  2. Localization (L10N)
  3. Language Identification in Resources
  4. URIs for Multilingual Resources
  5. Conclusions

Definition

Internationalization is the design and development of a product, application or document content that enables easy localization for target audiences that vary in culture, region, or language.

I18N Tasks

  1. UI elements (windows, menus) must be modified to accept translated text
  2. Static text must be made configurable
  3. Icons and graphics must be changed to be more culturally appropriate
  4. Sound files that contain spoken language must be re-recorded
  5. Online help must be translated
  6. Dynamic text (dates, times) must be formatted using the locale
  7. Text handling code must calculate word breaks using the locale
  8. Tabular data must be sortable using the locale
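
Task 2 above (making static text configurable) can be sketched as a lookup in a message catalog instead of hard-coded strings. This is a minimal illustration, not a description of any particular framework: the catalog layout, the keys such as blog.home, and the function names are all assumptions made for this example; the English and German strings are the Blinkbase titles shown earlier.

```python
# Hypothetical message catalog; keys and layout are illustrative only.
CATALOG = {
    "en": {"blog.home": "Blog Home", "blog.author": "Blog Author"},
    "de": {"blog.home": "Blog Webseite", "blog.author": "Autor des Blogs"},
}
DEFAULT_LANG = "en"  # assumed fallback language


def candidates(lang):
    """Yield the tag and then progressively shorter prefixes (de-CH, de)."""
    parts = lang.split("-")
    while parts:
        yield "-".join(parts)
        parts.pop()


def translate(key, lang):
    """Look up key for lang, falling back to shorter tags, then the default."""
    for tag in list(candidates(lang)) + [DEFAULT_LANG]:
        text = CATALOG.get(tag, {}).get(key)
        if text is not None:
            return text
    return key  # last resort: show the key so the gap is visible


print(translate("blog.home", "de-CH"))
```

Because the code only ever refers to keys, adding a language later means adding a catalog entry rather than touching the program logic.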

Localization (L10N)

Definition

Localization refers to the adaptation of a product, application or document content to meet the language, cultural and other requirements of a specific target market (a locale).

L10N Tasks

  1. Create translations for all interface elements
  2. Translate all static texts
  3. If necessary, create localized icons and graphics
  4. Any spoken text must be recorded in the target language
  5. Make sure that the localized product uses the localized online help
  6. Data types (dates, times, numbers) must be formatted according to the locale
  7. If necessary, dictionaries and other language tools must be integrated
  8. Sorting functions in the code must respect the locale
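
Task 6 above can be illustrated with date formatting. The sketch below hard-codes three common numeric date conventions in a hypothetical table (DATE_PATTERNS); a real product would more likely take such patterns from a locale database, and the locale tags and function name are assumptions for this example.

```python
from datetime import date

# Hypothetical per-locale patterns, hard-coded only for illustration.
DATE_PATTERNS = {
    "en-US": "{m:02d}/{d:02d}/{y:04d}",  # month/day/year
    "de-DE": "{d:02d}.{m:02d}.{y:04d}",  # day.month.year
    "sv-SE": "{y:04d}-{m:02d}-{d:02d}",  # year-month-day
}


def format_date(value, locale_tag):
    """Format a date using the pattern registered for locale_tag."""
    pattern = DATE_PATTERNS[locale_tag]
    return pattern.format(y=value.year, m=value.month, d=value.day)


lecture = date(2007, 3, 22)
for tag in DATE_PATTERNS:
    print(tag, format_date(lecture, tag))
```

The same date value yields three different strings; the internationalized code stays unchanged and only the locale data differs.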

Language Identification in Resources

Language Codes

ISO 639-2 Code List

dum|||Dutch, Middle (ca.1050-1350)|néerlandais moyen (ca. 1050-1350)
dut|nld|nl|Dutch; Flemish|néerlandais; flamand
dyu|||Dyula|dioula
dzo||dz|Dzongkha|dzongkha
efi|||Efik|efik
egy|||Egyptian (Ancient)|égyptien
eka|||Ekajuk|ekajuk
elx|||Elamite|élamite
eng||en|English|anglais
enm|||English, Middle (1100-1500)|anglais moyen (1100-1500)
epo||eo|Esperanto|espéranto
est||et|Estonian|estonien
ewe||ee|Ewe|éwé
ewo|||Ewondo|éwondo
fan|||Fang|fang
fao||fo|Faroese|féroïen
fat|||Fanti|fanti
fij||fj|Fijian|fidjien
fil|||Filipino; Pilipino|filipino; pilipino
fin||fi|Finnish|finnois
fiu|||Finno-Ugrian (Other)|finno-ougriennes, autres langues
fon|||Fon|fon
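
Each line of the listing above has five pipe-separated fields: the three-letter bibliographic code, the three-letter terminologic code (empty when identical to the bibliographic one), the two-letter code (empty when none exists), the English name, and the French name. A minimal sketch of reading that format follows; the class and function names are made up for this example, and the sample lines are copied from the listing.

```python
from typing import NamedTuple


class LanguageCode(NamedTuple):
    alpha3_b: str  # three-letter bibliographic code
    alpha3_t: str  # three-letter terminologic code, empty if same as alpha3_b
    alpha2: str    # two-letter code, empty if none is assigned
    english: str   # English name
    french: str    # French name


def parse_line(line):
    """Split one pipe-delimited line into its five fields."""
    fields = line.rstrip("\n").split("|")
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    return LanguageCode(*fields)


SAMPLE = """\
dut|nld|nl|Dutch; Flemish|néerlandais; flamand
eng||en|English|anglais
enm|||English, Middle (1100-1500)|anglais moyen (1100-1500)
"""

codes = [parse_line(line) for line in SAMPLE.splitlines()]
# Index by two-letter code, skipping languages that have none.
by_alpha2 = {c.alpha2: c for c in codes if c.alpha2}
print(by_alpha2["nl"].alpha3_t, by_alpha2["en"].english)
```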

IANA Language Subtag Registry

%%
Type: region
Subtag: UA
Description: Ukraine
Added: 2005-10-16
%%
Type: region
Subtag: UG
Description: Uganda
Added: 2005-10-16
%%
Type: region
Subtag: UM
Description: United States Minor Outlying Islands
Added: 2005-10-16
%%
Type: region
Subtag: US
Description: United States
Added: 2005-10-16
%%
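
The registry excerpt above is a sequence of records separated by %% lines, each record consisting of "Name: value" fields. The following sketch reads only that simple shape. It deliberately ignores two things the full registry also uses, folded continuation lines and repeated fields within one record, so it is a simplification rather than a complete parser; the function name is an assumption.

```python
def parse_registry(text):
    """Parse %%-separated records of 'Name: value' lines into dicts.

    Simplified: no folded lines, and a repeated field overwrites the
    earlier value.
    """
    records = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if line == "%%":
            if current:
                records.append(current)
            current = {}
        elif line:
            name, _, value = line.partition(":")
            current[name.strip()] = value.strip()
    if current:
        records.append(current)
    return records


SAMPLE = """\
%%
Type: region
Subtag: UA
Description: Ukraine
Added: 2005-10-16
%%
Type: region
Subtag: US
Description: United States
Added: 2005-10-16
%%
"""

records = parse_registry(SAMPLE)
regions = {r["Subtag"]: r["Description"] for r in records if r["Type"] == "region"}
print(regions)
```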

Language Identification in XML

  <arc xlink:type="arc" xlink:from="permalink" xlink:to="blog">
   <title xlink:type="title" xml:lang="en">Blog Home</title>
   <title xlink:type="title" xml:lang="de">Blog Webseite</title>
  </arc>
  <arc xlink:type="arc" xlink:from="blog" xlink:to="author">
   <title xlink:type="title" xml:lang="en">Blog Author</title>
   <title xlink:type="title" xml:lang="de">Autor des Blogs</title>
  </arc>
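
A consumer of this markup can pick the title whose xml:lang matches the reader's language. The sketch below uses Python's standard xml.etree.ElementTree. The wrapper element links is hypothetical and exists only to declare the xlink namespace, the fallback to English is an assumption, and the sketch checks only the xml:lang attribute on each title itself (it does not implement inheritance of xml:lang from ancestor elements).

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
# The xml: prefix is predefined and always maps to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# <links> is a hypothetical wrapper that supplies the namespace declaration.
DOC = """\
<links xmlns:xlink="http://www.w3.org/1999/xlink">
  <arc xlink:type="arc" xlink:from="permalink" xlink:to="blog">
   <title xlink:type="title" xml:lang="en">Blog Home</title>
   <title xlink:type="title" xml:lang="de">Blog Webseite</title>
  </arc>
  <arc xlink:type="arc" xlink:from="blog" xlink:to="author">
   <title xlink:type="title" xml:lang="en">Blog Author</title>
   <title xlink:type="title" xml:lang="de">Autor des Blogs</title>
  </arc>
</links>
"""


def arc_title(arc, lang, default="en"):
    """Return the arc title for lang, or the default-language title."""
    titles = {t.get(XML_LANG): t.text for t in arc.findall("title")}
    return titles.get(lang, titles.get(default))


root = ET.fromstring(DOC)
arcs = root.findall("arc")
for arc in arcs:
    link = f"{arc.get(f'{{{XLINK}}}from')} -> {arc.get(f'{{{XLINK}}}to')}"
    print(link, ":", arc_title(arc, "de"))
```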

URIs for Multilingual Resources

Naming Language Variants

Variant Naming Variations

DNS Domains

http://en.example.com/foo/bar

Constructed Paths

http://example.com/en/foo/bar

Query Strings

http://example.com/foo/bar?lang=en

DNS TLDs

http://example.us/foo/bar

Cookies

http://example.com/foo/bar

Content Negotiation

http://example.com/foo/bar
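
With content negotiation the URI stays the same and the server chooses a variant from the Accept-Language request header. The sketch below is a simplified illustration of that choice, not a complete implementation of the HTTP matching rules: the function names, the fallback from a tag such as de-CH to its primary subtag de, and the default language are assumptions for this example.

```python
def parse_accept_language(header):
    """Return (language range, q) pairs, highest q first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, *params = [p.strip() for p in item.split(";")]
        q = 1.0  # a missing q parameter means full preference
        for param in params:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((tag.lower(), q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def negotiate(header, available, default="en"):
    """Pick one language from available for the given header value."""
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        if tag == "*":
            return default
        # Try the full tag first, then only its primary subtag.
        for candidate in (tag, tag.split("-")[0]):
            if candidate in available:
                return candidate
    return default


print(negotiate("de-CH, de;q=0.9, en;q=0.8", {"en", "de"}))
```

The price of the clean URI is that the language variant cannot be bookmarked or linked to directly, which is why the other naming schemes in this section exist.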

Path Segment Name

http://example.com/foo/bar.en

URI Sub-Delimiter Comma

http://example.com/foo/bar,en

URI Sub-Delimiter Semicolon

http://example.com/foo/bar;lang=en
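
To compare the naming schemes side by side, the following sketch tries to recover the language from each of the example URIs above. It is only an illustration: the set of known languages and the function name are assumptions, and the order of the checks is arbitrary. It returns None for the cookie and content negotiation cases, where the language is not part of the URI, and for the TLD case, because a country domain such as .us identifies a region rather than a language.

```python
from urllib.parse import urlsplit, parse_qs

KNOWN = {"en", "de"}  # hypothetical set of supported languages


def language_from_uri(uri, known=KNOWN):
    """Return (scheme name, language) or None if the URI carries no language."""
    parts = urlsplit(uri)
    labels = (parts.hostname or "").split(".")
    segments = [s for s in parts.path.split("/") if s]
    last = segments[-1] if segments else ""

    # Query string: /foo/bar?lang=en
    query_lang = parse_qs(parts.query).get("lang", [""])[0]
    if query_lang in known:
        return ("query string", query_lang)
    # Sub-delimiter semicolon: /foo/bar;lang=en
    if ";lang=" in last:
        value = last.split(";lang=", 1)[1]
        if value in known:
            return ("semicolon", value)
    # Sub-delimiter comma: /foo/bar,en
    if "," in last:
        value = last.rsplit(",", 1)[1]
        if value in known:
            return ("comma", value)
    # Path segment name: /foo/bar.en
    if "." in last:
        value = last.rsplit(".", 1)[1]
        if value in known:
            return ("path segment name", value)
    # DNS domain: en.example.com
    if labels[0] in known:
        return ("DNS domain", labels[0])
    # Constructed path: /en/foo/bar
    if segments and segments[0] in known:
        return ("constructed path", segments[0])
    return None


for uri in ("http://en.example.com/foo/bar", "http://example.com/foo/bar,en"):
    print(uri, language_from_uri(uri))
```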

Now What?

Conclusions

Babelification