The first line of the head element of an HTML document is usually
<meta charset="UTF-8" />
What does it do?
Unicode is a list of characters in which each character is assigned a number (its code point). The encoding UTF-8 determines how these numbers are stored as bytes in computer files and memory.
Example 1: The letter z is listed as number 122. It is stored as a single byte:
01111010
The first 128 characters (basic Latin) are encoded identically in UTF-8 and in other ASCII-compatible encodings such as ISO-8859-1 and Windows-1252, so the letter z would be safe without a charset specification.
Example 2: The character é is number 233 in the Unicode list. It is stored as two bytes:
11000011 10101001
Example 3: The Kannada letter ಊ has number 3210. It’s stored as
11100000 10110010 10001010
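The byte patterns in the three examples can be checked with a short Python sketch. The helper name `utf8_bits` is made up for this illustration. The characters are written as escapes (`\u00e9` for é, `\u0c8a` for ಊ) so the result does not depend on how the script file itself is saved.

```python
def utf8_bits(char: str) -> str:
    """Return the UTF-8 bytes of `char` as space-separated 8-bit binary strings."""
    return " ".join(f"{byte:08b}" for byte in char.encode("utf-8"))


for char in ("z", "\u00e9", "\u0c8a"):
    # ord() gives the Unicode number (code point) of the character
    print(ord(char), utf8_bits(char))
```

The higher the Unicode number, the more bytes UTF-8 needs: one for z, two for é and three for ಊ.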
HTML-codes and entities
HTML codes (numeric character references) always refer to Unicode numbers, independent of the character set used. So if you type
&#3210; you’ll get ಊ even if you omit the charset meta tag or choose another one, like ISO-8859-1.
For é you may also use the HTML code
&#233; or the HTML entity &eacute;.
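To see that such references resolve to Unicode characters without any charset being involved, here is a small sketch using `html.unescape` from Python's standard library, which decodes numeric and named character references. The list `references` and the dictionary `decoded` are names chosen for this illustration only.

```python
from html import unescape

# Decimal reference, hexadecimal reference and named entity
references = ["&#3210;", "&#233;", "&#xE9;", "&eacute;"]

# Map each reference to the character it stands for
decoded = {ref: unescape(ref) for ref in references}

for ref, char in decoded.items():
    # ord() shows the Unicode number behind the decoded character
    print(ref, "->", char, ord(char))
```

The last three references all decode to the same character, number 233.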
Is UTF-8 the default?
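It’s sometimes stated that UTF-8 is the default character encoding for HTML5. It isn’t, at least not in the sense that it becomes active if you don’t specify it. Without a declaration the browser has to guess, and it typically falls back to a legacy encoding such as Windows-1252. So make sure the tag is always present as the first child of <head>.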
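What goes wrong without the tag can be simulated in Python. This is only a sketch: it assumes the page was saved as UTF-8 and that the reader falls back to Windows-1252 (`cp1252` in Python), which is one common fallback but not the only possible one. The function name `misread` is made up for this example.

```python
def misread(text: str, actual: str = "utf-8", assumed: str = "cp1252") -> str:
    """Encode `text` with the real encoding, then decode it with the wrongly assumed one."""
    return text.encode(actual).decode(assumed)


print(misread("\u00e9"))  # é shows up as two characters: Ã©
print(misread("z"))       # z survives, because it is in the first 128 characters
```

Each byte of a multi-byte UTF-8 character gets interpreted as a separate character, which produces the familiar garbled text.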