Either form works; pick whichever suits your needs. See W3C, "Language information and text direction":
Briefly, language codes consist of a primary code and a possibly empty series of subcodes:
language-code = primary-code ( "-" subcode )*
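The grammar above can be sketched in a few lines of Python. This is only an illustration of the "primary code plus optional subcodes" shape; the function name `split_language_code` is made up for this example, and it does not check that the codes are actually registered anywhere.

```python
def split_language_code(code: str) -> tuple[str, list[str]]:
    """Split a language code into (primary-code, [subcode, ...]).

    Follows the grammar: language-code = primary-code ( "-" subcode )*
    Only the shape is checked, not whether the codes are registered.
    """
    primary, *subcodes = code.split("-")
    if not primary or any(not sub for sub in subcodes):
        raise ValueError(f"empty component in language code: {code!r}")
    return primary, subcodes


if __name__ == "__main__":
    for sample in ["en", "en-US", "x-klingon"]:
        print(sample, "->", split_language_code(sample))
```

For example, `split_language_code("en-US")` returns `("en", ["US"])`, and a bare `"en"` returns an empty subcode list.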
Here are some sample language codes:
"en": English
"en-US": the U.S. version of English.
"en-cockney": the Cockney version of English.
"i-navajo": the Navajo language spoken by some Native Americans.
"x-klingon": The primary tag "x" indicates an experimental language tag
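"Actually, it is a razor": a long, sourced investigation to settle this question.
Since the question is about HTML for web pages, start with the organization that standardizes HTML, namely the W3C (http://w3.org). Then read the official documents: for the latest HTML5, "3 Semantics, structure, and APIs of HTML documents"; for HTML 4.01, "Basic HTML data types".
Excerpt from the official HTML5 specification: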
The lang attribute (in no namespace) specifies the primary language for the element's contents and for any of the element's attributes that contain text. Its value must be a valid BCP 47 language tag, or the empty string. Setting the attribute to the empty string indicates that the primary language is unknown. [BCP47]
Excerpt from the official HTML 4.01 specification:
The value of attributes whose type is a language code ( %LanguageCode in the DTD) refers to a language code as specified by [RFC1766], section 2. For information on specifying language codes in HTML, please consult the section on language codes.
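These days it is enough to discuss HTML5, so let's look at what "BCP 47" is. The W3C has an explanatory article, "Choosing a language tag", which covers in detail how to choose a language tag; read it through carefully and the question answers itself. An excerpt: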
Language tag syntax is defined by the IETF's BCP 47. In the past it was necessary to consult lists of codes in various ISO standards to find the right subtags, but now you only need to look in the IANA Language Subtag Registry. We will describe the new registry below.
BCP stands for "Best Current Practice", and is a persistent name for a series of RFCs whose numbers change as they are updated. The latest RFC describing language tag syntax is RFC 5646, Tags for the Identification of Languages, and it obsoletes the older RFCs 4646, 3066 and 1766.
language = 2*3ALPHA            ; shortest ISO 639 code
           ["-" extlang]       ; sometimes followed by
                               ; extended language subtags
         / 4ALPHA              ; or reserved for future use
         / 5*8ALPHA            ; or registered language subtag
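As a rough illustration, the `language` production above can be turned into a regular expression. This is a sketch, not a full BCP 47 validator: the `extlang` rule (`3ALPHA *2("-" 3ALPHA)`) is taken from RFC 5646 rather than from the excerpt, the name `matches_language_production` is invented for the example, and nothing here checks the IANA registry.

```python
import re

# language = 2*3ALPHA ["-" extlang] / 4ALPHA / 5*8ALPHA
# extlang  = 3ALPHA *2("-" 3ALPHA)   (from RFC 5646, not quoted above)
LANGUAGE_RE = re.compile(
    r"(?:[A-Za-z]{2,3}(?:-[A-Za-z]{3}){0,3}"  # ISO 639 code, up to 3 extlangs
    r"|[A-Za-z]{4}"                            # reserved for future use
    r"|[A-Za-z]{5,8})"                         # registered language subtag
)


def matches_language_production(text: str) -> bool:
    """Return True if text matches the `language` production syntactically."""
    return LANGUAGE_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["zh", "cmn", "zh-yue", "zh-CN", "z"]:
        print(candidate, matches_language_production(candidate))
```

Note that `zh-CN` does not match here: `CN` is a region subtag, which belongs to a later part of the full `langtag` grammar, not to `language` itself.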
%%
Type: language
Subtag: zh
Description: Chinese
Added: 2005-10-16
Scope: macrolanguage
%%
Type: language
Subtag: cmn
Description: Mandarin Chinese
Added: 2009-07-29
Macrolanguage: zh
%%
Type: language
Subtag: yue
Description: Yue Chinese
Description: Cantonese
Added: 2009-07-29
Macrolanguage: zh
%%
Type: language
Subtag: nan
Description: Min Nan Chinese
Added: 2009-07-29
Macrolanguage: zh
%%
Type: language
Subtag: lzh
Description: Literary Chinese
Added: 2009-07-29
Macrolanguage: zh
%%
Type: script
Subtag: Hans
Description: Han (Simplified variant)
Added: 2005-10-16
%%
Type: script
Subtag: Hant
Description: Han (Traditional variant)
Added: 2005-10-16
%%
Type: script
Subtag: Latn
Description: Latin
Added: 2005-10-16
%%
Type: region
Subtag: CN
Description: China
Added: 2005-10-16
%%
Type: region
Subtag: HK
Description: Hong Kong
Added: 2005-10-16
%%
Type: region
Subtag: TW
Description: Taiwan, Province of China
Added: 2005-10-16
%%
Type: redundant
Tag: zh-Hans-TW
Description: Taiwan Chinese in simplified script
Added: 2005-04-11
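The records above use a simple "field: value" layout with `%%` between records, so they are easy to read programmatically. Below is a minimal sketch of a parser; `parse_registry` and `SAMPLE` are names made up for this example, and it deliberately ignores details of the real registry file such as folded continuation lines.

```python
def parse_registry(text: str) -> list[dict[str, list[str]]]:
    """Parse '%%'-separated records into dicts of field name -> list of values.

    Values are kept as lists because a field such as Description may repeat.
    Simplification: folded (continuation) lines are not handled.
    """
    records: list[dict[str, list[str]]] = []
    current: dict[str, list[str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line == "%%":               # record separator
            if current:
                records.append(current)
            current = {}
        elif line:
            name, _, value = line.partition(":")
            current.setdefault(name.strip(), []).append(value.strip())
    if current:
        records.append(current)
    return records


SAMPLE = """\
%%
Type: language
Subtag: yue
Description: Yue Chinese
Description: Cantonese
Added: 2009-07-29
Macrolanguage: zh
%%
Type: script
Subtag: Hant
Description: Han (Traditional variant)
Added: 2005-10-16
"""

if __name__ == "__main__":
    for record in parse_registry(SAMPLE):
        print(record["Type"][0], record["Subtag"][0], record["Description"])
```

With records in this form, finding every language whose `Macrolanguage` is `zh` is a one-line filter over the list.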
For example, the macro language Chinese ("zh") encompasses a number of languages. For compatibility reasons, each of these languages has both a primary and extended language subtag in the registry. A few selected examples of these include Gan Chinese ("gan"), Cantonese Chinese ("yue"), and Mandarin Chinese ("cmn"). Each is encompassed by the macro language "zh" (Chinese). Therefore, they each have the prefix "zh" in their registry records. Thus, Gan Chinese is represented with tags beginning "zh-gan" or "gan", Cantonese with tags beginning either "yue" or "zh-yue", and Mandarin Chinese with "zh-cmn" or "cmn". The language subtag "zh" can still be used without an extended language subtag to label a resource as some unspecified variety of Chinese, while the primary language subtag ("gan", "yue", "cmn") is preferred to using the extended language form ("zh-gan", "zh-yue", "zh-cmn").
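The "primary subtag is preferred over the extended form" rule in the passage above can be sketched as a small rewrite step. The set `ZH_EXTLANGS` below is a hand-copied, hypothetical subset containing only the three examples named in the quote; a real tool would load the prefixes from the IANA registry instead. The function name is also invented for illustration.

```python
# Hypothetical subset: only the examples named in the quoted passage.
ZH_EXTLANGS = {"gan", "yue", "cmn"}


def prefer_primary_subtag(tag: str) -> str:
    """Rewrite 'zh-<extlang>...' to '<extlang>...' for the known subset.

    Tags that do not start with 'zh-' plus a known extlang are returned as-is.
    """
    parts = tag.split("-")
    if (
        len(parts) >= 2
        and parts[0].lower() == "zh"
        and parts[1].lower() in ZH_EXTLANGS
    ):
        return "-".join([parts[1].lower()] + parts[2:])
    return tag


if __name__ == "__main__":
    for sample in ["zh-yue", "zh-cmn-Hans-CN", "zh-CN", "zh"]:
        print(sample, "->", prefer_primary_subtag(sample))
```

Plain `zh` and `zh-CN` pass through unchanged, which matches the quote: `zh` alone remains valid for "some unspecified variety of Chinese".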
The W3C explains this in "Choosing a language tag":
As we recommended for the collection subtags mentioned above, in most cases you should try to use the more specific subtags, but there are a small number of important exceptions. These are situations where you should continue using a macrolanguage subtag for reasons of backward compatibility.
For example, although BCP 47 explains that zh (the macrolanguage subtag for Chinese) doesn't actually specify which of the many, sometimes mutually unintelligible, dialects of Chinese is actually meant by this subtag, in practice convention overwhelmingly associates the macrolanguage subtag with the predominant language among the encompassed subtags - in this case, cmn (Mandarin Chinese). If your application identified Mandarin Chinese in the past using the language tag zh-CN (Chinese as used in Mainland China), or even just zh, you can continue to use zh in this way. Using cmn or cmn-CN may cause serious compatibility problems if the software or users expect a tag such as zh.
If, on the other hand, you are using zh to refer to another Chinese dialect such as Hakka, you should use the language subtag hak instead.
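One way to see where the compatibility problem comes from: a lot of software matches language tags by prefix. The sketch below assumes matching in the style of RFC 4647 "basic filtering" (not mentioned in the excerpt; it is used here only as a plausible model of what existing software does), and the function name `basic_filter` is made up for the example.

```python
def basic_filter(language_range: str, tag: str) -> bool:
    """Prefix match in the style of RFC 4647 basic filtering.

    A range matches a tag if they are equal (case-insensitively), if the
    range is a prefix of the tag followed by '-', or if the range is '*'.
    """
    wanted, actual = language_range.lower(), tag.lower()
    return wanted == "*" or actual == wanted or actual.startswith(wanted + "-")


if __name__ == "__main__":
    # Software asking for "zh" finds content tagged zh or zh-CN ...
    print(basic_filter("zh", "zh-CN"))   # True
    # ... but not content tagged with the more specific cmn-CN.
    print(basic_filter("zh", "cmn-CN"))  # False
```

Under this model, content retagged from `zh-CN` to `cmn-CN` silently stops matching a request for `zh`, which is the kind of breakage the W3C passage warns about.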
Many of these registered tags were made redundant by the advent of either RFC 4646 or this document. A redundant tag is a grandfathered registration whose individual subtags appear with the same semantic meaning in the registry. For example, the tag "zh-Hant" (Traditional Chinese) can now be composed from the subtags "zh" (Chinese) and "Hant" (Han script traditional variant). These redundant tags are maintained in the registry as records of type "redundant", mostly as a matter of historical curiosity.
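To make "composed from the subtags" concrete, here is a small sketch that builds a tag from a language, an optional script and an optional region, and splits one back apart. It is a simplification under stated assumptions: the helper names are invented, only these three subtag kinds are handled (no extlang, variants or extensions), and subtags are recognized by shape alone (4 letters for script, 2 letters or 3 digits for region) without consulting the registry.

```python
from typing import Optional


def compose_tag(language: str, script: Optional[str] = None,
                region: Optional[str] = None) -> str:
    """Join subtags using the conventional casing (zh, Hant, TW)."""
    parts = [language.lower()]
    if script:
        parts.append(script.title())
    if region:
        parts.append(region.upper())
    return "-".join(parts)


def decompose_tag(tag: str) -> dict:
    """Split a simple language[-script][-region] tag by subtag shape."""
    first, *rest = tag.split("-")
    result = {"language": first.lower(), "script": None, "region": None}
    for part in rest:
        if not part.isascii():
            continue
        if len(part) == 4 and part.isalpha():
            result["script"] = part.title()
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            result["region"] = part.upper()
    return result


if __name__ == "__main__":
    print(compose_tag("zh", script="Hant"))
    print(decompose_tag("zh-Hans-TW"))
```

So `zh-Hant` needs no registration of its own: `compose_tag("zh", script="Hant")` gives `zh-Hant` purely from the individual `zh` and `Hant` records.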