#Standards used in Language Technology and Linguistics
##Language-related ISO standards
##Language and Language Family Identification
- ISO 639-1
- ISO 639-2
- ISO 639-3
- ISO 639-4
- ISO 639-5
- ISO 639-6
Language tags as defined by the Internet Engineering Task Force (IETF)
-
BCP 47: Best Current Practice 47, which includes RFC 5646
-
RFC 5646, which superseded RFC 4646, which superseded RFC 3066. (Therefore all standards which depend on any of these 3 IETF standards now use ISO 639-3.)
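As an illustration of how a language tag is put together, the following is a minimal sketch that splits a tag into its language, script, and region subtags. It assumes a deliberately simplified pattern: the full RFC 5646 grammar also allows extended language, variant, extension, and private-use subtags, none of which are handled here. The function name `split_tag` is hypothetical, and the sketch does not check subtags against the registry.

```python
import re

# Simplified pattern (an assumption of this sketch): language, then an
# optional 4-letter script, then an optional 2-letter or 3-digit region.
_SIMPLE_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


def split_tag(tag):
    """Split a simple language tag into subtags, or return None if it
    does not fit the simplified pattern above."""
    match = _SIMPLE_TAG.fullmatch(tag)
    if match is None:
        return None
    script = match.group("script")
    region = match.group("region")
    return {
        # Conventional casing: lowercase language, title-case script,
        # uppercase region. Tags themselves are case-insensitive.
        "language": match.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
    }


print(split_tag("zh-hant-tw"))
print(split_tag("es-419"))
```

Tags that use subtags outside this simplified pattern, such as private-use sequences, are reported as `None` rather than being partially parsed.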
##Character Encoding
- Unicode
- UTF-8
- UTF-16
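To make the relationship between Unicode and its encoding forms concrete, here is a small sketch that counts how many bytes the same text occupies in UTF-8 and in UTF-16. The helper name `encoded_sizes` is hypothetical; the little-endian variant `utf-16-le` is used so that no byte order mark is added to the count.

```python
def encoded_sizes(text):
    """Return the number of code points in `text` and the number of
    bytes it occupies when encoded as UTF-8 and as UTF-16."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # "utf-16-le" avoids the 2-byte byte order mark that the plain
        # "utf-16" codec would prepend.
        "utf16_bytes": len(text.encode("utf-16-le")),
    }


# ASCII letter, a Devanagari letter (U+0928), and an emoji (U+1F600).
for sample in ["a", "\u0928", "\U0001F600"]:
    print(repr(sample), encoded_sizes(sample))
```

The ASCII letter takes one byte in UTF-8 and two in UTF-16, the Devanagari letter takes three and two, and the character outside the Basic Multilingual Plane takes four bytes in both.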
##Script Identification Standards
- ISO 15924: codes for the names of scripts. More on the Unicode website.
- ISO 15919: a standard for transliteration of Indic scripts into Roman script. More on Wikipedia.
##Metadata Standards
- OLAC 1.1 (OLAC: the Open Language Archives Community)
- Dublin Core Metadata Initiative: the DCMI Metadata Term for language (http://purl.org/dc/elements/1.1/language), via IETF's RFC 4646 (now superseded by RFC 5646)
- MARC library codes.
- MODS (Metadata Object Description Schema) library codes: Incorporates IETF's RFC 3066 (now superseded by RFC 5646).
- DOAP (Description of a Project): metadata for application profiles. See the DCMI discussion, GitHub, and the paper: Severiens, Thomas & Greenberg, Jane. 2007. The DCMI Tools application profile. 2007 Proc. Int’l Conf. on Dublin Core and Metadata Applications. http://dcpapers.dublincore.org/pubs/article/view/874
- Unicode's CLDR (Common Locale Data Repository): Uses several hundred codes from ISO 639-3 that are not included in ISO 639-2.
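As a rough sketch of how the Dublin Core language term listed above can carry an IETF language tag, the following builds a tiny XML record with Python's standard library. The wrapper element `record` and the function name `make_record` are hypothetical and chosen only for illustration; real application profiles (OLAC, MODS, and so on) define their own container elements and may constrain the allowed values.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set referenced in the list above.
DC_NS = "http://purl.org/dc/elements/1.1/"


def make_record(title, language_tag):
    """Serialize a minimal record with dc:title and dc:language.

    `record` is a hypothetical wrapper element, not part of Dublin Core.
    """
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")
    ET.SubElement(record, f"{{{DC_NS}}}title").text = title
    ET.SubElement(record, f"{{{DC_NS}}}language").text = language_tag
    return ET.tostring(record, encoding="unicode")


print(make_record("Sample word list", "hi-Latn"))
```

Here the value of `dc:language` is a plain language tag string; validating it against the language subtag registry is left out of the sketch.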
##Text Markup Formats
###Documents
###Corpora
- NLP Annotation Format (NAF), formerly known as KAF (Knowledge/Kyoto Annotation Format)
- NLP Interchange Format (NIF)
###Lexicons
- Lexical Markup Framework (LMF): ISO 24613, a specification for the representation of machine-readable dictionaries.