What Is HTML Entity Encoding?

Understand what HTML entities are, which characters must be escaped, when entities are still necessary in a UTF-8 world, and how to avoid XSS.

5 min read Updated Jun 2026

Quick Answer

HTML entities let you display characters the browser would otherwise treat as markup, like <, >, and &. The entity &lt; renders as <; &amp; renders as &.

To HTML-encode text, replace reserved characters with their entity equivalents. To decode (unescape) HTML, do the reverse. Escaping user content before inserting it into a page also helps prevent XSS. Entities come in two forms: named references like &lt;, and numeric references like &#60;.
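As a minimal sketch of that round trip, here is how it might look with the `html` module from Python's standard library. The sample string is made up for illustration.

```python
import html

# Illustrative input containing all of the reserved characters.
raw = 'Tom & Jerry <b>"best"</b>'

# Encode: replace reserved characters with entity references.
# quote=True (the default) also escapes double and single quotes.
encoded = html.escape(raw)
print(encoded)

# Decode: turn entity references back into the original characters.
decoded = html.unescape(encoded)
print(decoded == raw)
```

Most languages and template engines ship an equivalent pair of functions, so you rarely need to write the replacement table by hand.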

What Are HTML Entities?

HTML has structure. The < character opens a tag. The > closes it. The & starts a character reference. If you want to display any of those characters as content on the page (not as syntax), you need a way to say "I mean the literal character, not the markup."

HTML entities solve this. They're escape sequences: &lt; for <, &gt; for >, &amp; for &. The browser reads the entity, recognizes it as a character reference, and renders the actual character on screen.

Entities come in two forms. Named references use a human-readable name: &copy; for ©, &mdash; for the em dash (—), &nbsp; for a non-breaking space. Numeric references use the Unicode code point: either decimal (&#169;) or hexadecimal (&#xA9;), and work for any character, including ones that have no named form.
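To see that the three spellings really are interchangeable, you can decode each one and compare. The sketch below uses Python's standard `html` and `html.entities` modules; the variable names are only for illustration.

```python
import html
from html.entities import name2codepoint

# Three spellings of the same character, U+00A9 COPYRIGHT SIGN.
forms = ["&copy;", "&#169;", "&#xA9;"]
decoded = [html.unescape(form) for form in forms]

# Build numeric references from a code point.
cp = ord("©")
decimal_ref = f"&#{cp};"    # decimal form
hex_ref = f"&#x{cp:X};"     # hexadecimal form

# Look up the code point behind a named reference.
named_cp = name2codepoint["copy"]

print(decoded, decimal_ref, hex_ref, named_cp)
```

Numeric references can be built this way for any code point, while named references only exist for the characters that appear in the entity table.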

Why They Exist

HTML parsers treat angle brackets and ampersands as syntax. Write <b>Hello</b> and you get the word Hello in bold. If you actually want the text <b>Hello</b> displayed on screen, the parser needs a different signal.

That's the core problem entities solve, and it predates the modern web. In the early days, pages were often written in ASCII or Latin-1. Characters outside ASCII (copyright signs, accented letters, mathematical symbols, currency signs) couldn't always be typed or reliably stored in source files. Named entities like &copy;, &eacute;, and &pound; gave authors a portable way to include them.

Today UTF-8 is nearly universal on the web, so most of those named entities are unnecessary: you can paste the character directly into your HTML. But four reserved characters still require escaping regardless of encoding, because they carry structural meaning in the parser: <, >, &, and, inside double-quoted attribute values, ".
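The difference is easy to see in code. In this sketch (Python standard library, with a made-up sample string), escaping for a UTF-8 page touches only the reserved characters, while an ASCII-only target additionally needs numeric references for everything else.

```python
import html

# Illustrative text mixing non-ASCII symbols with reserved characters.
text = "© Acme, 5 € <Fish & Chips>"

# UTF-8 page: only the reserved characters are escaped;
# © and € pass through unchanged.
utf8_safe = html.escape(text)

# ASCII-only target: the "xmlcharrefreplace" error handler swaps each
# non-ASCII character for a decimal numeric reference.
ascii_safe = utf8_safe.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(utf8_safe)
print(ascii_safe)
```

The second form is mostly useful when output has to pass through a system that cannot be trusted to preserve UTF-8.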

When to Use HTML Entities

HTML entities show up in a handful of specific situations. For example:

  • Showing code examples: To display <div class="container"> on a web page, write it as &lt;div class=&quot;container&quot;&gt;. Without escaping, the browser renders the div, not its source.
  • User-generated content: Comments, usernames, forum posts, and search terms must be HTML-escaped before being placed in the page. A username like <script>alert(1)</script> should render as text, not execute.
  • Typography: &mdash; (—) and &ndash; (–) for proper dashes instead of double hyphens. &ldquo; and &rdquo; for curly quotes.
  • Non-breaking spaces: &nbsp; between a number and its unit keeps them on the same line: 10&nbsp;kg, §&nbsp;4.2.
  • Symbols: &copy;, &reg;, &trade; for © ® ™, &euro; for €, &deg; for °.
  • Math notation: &ne; for ≠, &asymp; for ≈, &le; and &ge; for ≤ and ≥.
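The first two cases, showing code and handling user-generated content, come down to the same operation. Here is a minimal sketch using Python's standard `html` module. The `render_comment` helper and its markup are hypothetical, invented for this example, and it assumes the values land in HTML text content rather than inside a script block or URL, which need different escaping.

```python
import html


def render_comment(username: str, comment: str) -> str:
    """Hypothetical helper: build an HTML fragment from untrusted input."""
    safe_user = html.escape(username)    # escapes <, >, &, " and '
    safe_comment = html.escape(comment)
    return f'<p class="comment"><strong>{safe_user}</strong>: {safe_comment}</p>'


# A hostile username and a comment that quotes some markup.
fragment = render_comment(
    "<script>alert(1)</script>",
    'Use <div class="container">',
)
print(fragment)

# Decoding shows what &nbsp; stands for: U+00A0 NO-BREAK SPACE.
spaced = html.unescape("10&nbsp;kg")
print(repr(spaced))
```

In a real application a template engine with auto-escaping usually does this for you; calling the escape function by hand is the fallback when you build markup as strings.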

Common Mistakes
