Linking to translations

The other day I worked on some front-end code that takes users to a version of the website in a different language. Two attributes can make such links more accessible: lang and hreflang. What’s the difference?

Strings in another language

The lang attribute exposes what language a page is in when it is added to the root element. For example, a page in English will likely have this as its root element:

<html lang="en">

You can also use lang on any element in a document to signify that a section, paragraph or word is in a different language. This is useful when you link to a translation: if the link text is in a different language than the rest of the document, you can expose that.

A link with link text in Chinese, as spoken in Taiwan, one of my favourite countries:

<a href="/zh-TW">
  中文
</a>

The same link, now with the language exposed:

<a href="/zh-TW" lang="zh-TW">
  中文
</a>

That’s all. Note: this only means that the word within this tag is in a different language. It says nothing about the page we’re linking to.

Words marked up in an HTML element with a lang attribute get read out with a voice in that language in Safari with VoiceOver and in some versions of other screenreaders, depending on settings and voice availability.
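The same idea applies outside of links. Here is a minimal sketch of lang on a single word inside an otherwise English paragraph. The sentence and the French word are made up for illustration:

```html
<!-- The page is assumed to be <html lang="en">. -->
<p>
  She greeted us with a cheerful
  <!-- Only this word is marked as French. -->
  <span lang="fr">bonjour</span>
  and went on in English.
</p>
```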

Links to pages in another language

If the page you’re linking to is in a different language, you can use hreflang to signify that (hreflang in the spec).

For example:

<a href="/zh-TW" hreflang="zh-TW">
  Chinese
</a>

If you link to a page in a different language and the text you use is also in a different language, both attributes can be used at the same time.

<a href="/zh-TW" hreflang="zh-TW" lang="zh-TW">
  中文
</a>

To be very honest, I don’t know of an actual system that uses hreflang to detect the language of a linked page. Implementing the attribute doesn’t hurt, though: if more of us do it, tools can more reliably take advantage of it. The hreflang attribute is also used by people who want to improve their search engine rankings. In that case it is set per page, exposed via a link element or as an HTTP header.
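For the per page variant, a rough sketch of what the link element could look like in the head of a page. The example.com URLs are placeholders I made up, and which alternates you list depends on your site:

```html
<head>
  <!-- Each alternate version of this page gets its own link element. -->
  <link rel="alternate" hreflang="en" href="https://example.com/en">
  <link rel="alternate" hreflang="zh-TW" href="https://example.com/zh-TW">

  <!--
    The HTTP header form carries the same information, roughly:
    Link: <https://example.com/zh-TW>; rel="alternate"; hreflang="zh-TW"
  -->
</head>
```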

Language tags: BCP 47

The format of the language tags used in lang and hreflang is specified in BCP 47.

Language tags consist of one or more subtags, separated by hyphens (-). The first subtag is the primary language, for example en for English and zh for Chinese (zhōngwén means Chinese writing). Most primary language subtags are two characters long; some are longer.
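To make the subtag structure a bit more concrete, here are a few tags with one, two and three subtags. The region and script subtags go slightly beyond what this post covers, and the element contents are just sample text:

```html
<!-- One subtag: primary language only. -->
<p lang="en">Hello</p>

<!-- Two subtags: primary language plus region (English as used in Great Britain). -->
<p lang="en-GB">Colour</p>

<!-- Three subtags: primary language, script (Traditional Chinese), region (Taiwan). -->
<p lang="zh-Hant-TW">中文</p>
```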

There is a list of language tags on the Library of Congress website.

Summary

So, to sum up:

  • lang on any HTML element describes what language is used in that element
  • hreflang on a link (<a>) describes the language of the page you’re linking to
  • both lang and hreflang take values in the format described in BCP 47
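Put together, a small language switcher could look something like this. It is only a sketch: the /en and /nl paths and the set of languages are assumptions, not something from a real site:

```html
<nav aria-label="Languages">
  <ul>
    <!-- lang describes the link text, hreflang describes the linked page. -->
    <li><a href="/en" hreflang="en" lang="en">English</a></li>
    <li><a href="/nl" hreflang="nl" lang="nl">Nederlands</a></li>
    <li><a href="/zh-TW" hreflang="zh-TW" lang="zh-TW">中文</a></li>
  </ul>
</nav>
```

On a page that is already in English, the lang="en" on the first link is redundant but harmless.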

Happy internationalising!

Thanks Marcus for reminding me of hreflang!
