Which characters must I escape in HTML?

In HTML text you must escape & and <. The characters >, " and ' are escaped defensively so the same output is safe inside quoted attributes too. Those five characters (& < > " ') are the only ones this tool escapes by default.

Does encoding HTML entities prevent XSS?

Escaping & and < on output stops text from being parsed as markup, which neutralises stored and reflected XSS in an HTML text context. It is not a complete defence on its own: attribute values, URLs, and inline scripts each need their own escaping in addition to this.

What is the difference between decimal and hexadecimal entities?

They are two spellings of the same code point. Decimal uses &#169; and hexadecimal uses &#xA9;, and both resolve to U+00A9, the copyright sign ©. The decoder accepts either form, including an uppercase &#X…; prefix.

Why does &#39; show instead of &apos;?

&apos; is defined by XML and was not part of HTML 4, so it can fail to render in older or non-conforming parsers. The numeric reference &#39; means the same apostrophe and works everywhere, so the encoder uses it.

Why do I see &amp;amp; in my output?

That is double encoding. Because & is itself escaped to &amp;, running encode on already-escaped text turns &lt; into &amp;lt;. Decode the string once and find where the extra escaping pass was added in your pipeline.

HTML Entity Encoder / Decoder

What it does & when you need it

You need HTML entity encoding whenever text has to appear literally inside an HTML document instead of being interpreted as markup. Paste a code snippet, an error message, or a chunk of user input into a page unescaped and the browser will happily read a stray <script> as a real element. Encoding rewrites the dangerous characters as entities so they render as text; decoding does the reverse, turning &lt;, &eacute;, or &#169; back into the characters they stand for. Both directions run entirely in your browser, so pasted markup and customer data never leave your machine.

Reach for it when you are escaping content before dropping it into a template, reading a scraped page whose text is full of &amp; and &nbsp;, preparing a code sample for a blog post, or debugging why a & in a query string turned into &amp;amp; somewhere in your pipeline.

How to use

Pick Encode or Decode with the toggle in the toolbar.
Paste your text into the left plain text / html buffer, press Sample to load a realistic example, or Upload an .html file.
In Encode mode, tick Encode non-ASCII if you also want every character above code point 127 (accented letters, symbols, emoji) turned into numeric references — useful for ASCII-only channels.
The result updates as you type in the right buffer. Press Ctrl/Cmd + Enter (or the Copy result button) to copy it, and use Clear to reset the input.
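The Encode non-ASCII option described above can be approximated in a few lines of Python. This is only a sketch: the function name encode_non_ascii is hypothetical, and the choice of decimal rather than hexadecimal references is an assumption, since the page does not say which form the tool emits.

```python
def encode_non_ascii(text: str) -> str:
    """Replace every character above code point 127 with a numeric reference.

    Assumption: decimal references (&#N;) are used; hexadecimal would be
    equally valid.
    """
    return "".join(
        ch if ord(ch) < 128 else f"&#{ord(ch)};"
        for ch in text
    )


# ASCII passes through untouched; accented letters and symbols are rewritten.
print(encode_non_ascii("caf\u00e9 \u00a9 2024"))
```

Note that this step leaves &, <, and > alone; in practice it would run after the five basic escapes, not instead of them.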

Things worth knowing

Only five characters need escaping. Although HTML defines thousands of named entities, the characters you actually need to escape in HTML text are just &, <, >, ", and the apostrophe, which this tool writes as &#39;. It deliberately emits &#39; rather than &apos;: &apos; is defined by XML and was not part of HTML 4, so it can fail to render in older or non-conforming parsers, whereas the numeric reference is universally safe.
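As a rough illustration of the five-character rule, a minimal encoder might look like the following. The names ESCAPES and encode_entities are hypothetical and are not taken from the tool's source; the mapping simply mirrors the escapes listed above, including &#39; for the apostrophe.

```python
# Hypothetical mapping that mirrors the five escapes described in the text.
ESCAPES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",  # numeric form instead of &apos;
}


def encode_entities(text: str) -> str:
    # One pass over the input characters, so an "&" produced by an
    # earlier replacement is never escaped a second time.
    return "".join(ESCAPES.get(ch, ch) for ch in text)


print(encode_entities('<a href="x">Tom & Jerry\'s</a>'))
```

Python's standard html.escape does much the same job, but it writes the apostrophe as &#x27;, the hexadecimal spelling of the same code point.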

This is your first line of defence against XSS. Cross-site scripting — stored or reflected — happens when attacker-controlled text is interpolated into a page and the browser executes it as markup. Escaping < and & on output, in the correct context, is what neutralises it. The key word is output: encode at the moment you insert data into HTML, not when you store it, and remember that attribute values, URLs, and inline scripts each need their own escaping rules on top of this. For query strings specifically, reach for the URL Encoder rather than HTML entities.
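To make the "encode on output, per context" point concrete, here is a small sketch using only the Python standard library. The variable names and the /search URL are made up for illustration, and a real application would normally rely on its template engine's auto-escaping rather than hand-built strings.

```python
import html
from urllib.parse import quote

# Untrusted text, stored exactly as received (no escaping at storage time).
user_input = '<script>alert("x")</script>'

# HTML text context: entity-escape at the moment of insertion.
text_node = "<p>" + html.escape(user_input) + "</p>"

# Quoted attribute context: quotes must be escaped as well (quote=True).
attribute = '<input value="' + html.escape(user_input, quote=True) + '">'

# URL query context: percent-encoding, not HTML entities.
link = '<a href="/search?q=' + quote(user_input, safe="") + '">search</a>'

print(text_node)
print(attribute)
print(link)
```

The same stored value is escaped three different ways depending on where it lands, which is why escaping at storage time tends to go wrong.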

Numeric references come in two flavours for the same character. A code point can be written in decimal, like &#169;, or in hexadecimal, like &#xA9;. Both resolve to U+00A9, the copyright sign ©. The decoder here accepts either form (and an uppercase &#X…;), so you can throw mixed content at it and get consistent output.
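A decoder for both numeric forms can be sketched with one regular expression. The function name decode_numeric and the decision to leave out-of-range values untouched are assumptions made for this example, not a description of how the tool is implemented.

```python
import re

# Matches &#169; (decimal) as well as &#xA9; and &#XA9; (hexadecimal).
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric(text: str) -> str:
    def _replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if code_point > 0x10FFFF:
            # Outside the Unicode range: leave the reference as written.
            return match.group(0)
        return chr(code_point)

    return NUMERIC_REF.sub(_replace, text)


print(decode_numeric("&#169; &#xA9; &#XA9;"))
```

All three spellings decode to the same copyright sign, so mixed input produces uniform output.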

Named entities are convenient but optional. Friendly names such as &nbsp;, &mdash;, and &euro; are easier to read, but the full HTML named-character list runs past 2,000 entries, and not every tool carries the whole table. Numeric references sidestep the lookup table entirely: any Unicode code point can be written as &#N; and it will always decode, which is why encoders lean on them for anything outside the common set. If you prefer to inspect characters as \u-style escapes instead, the Unicode Escape tool covers that representation, and the Base64 Encode / Decode tool handles the binary-to-text case entirely differently.
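The trade-off between named and numeric references can be seen with Python's standard library, which ships its own copy of the HTML named-character table. This is an illustration of the idea only and says nothing about which table the tool itself uses.

```python
import html
from html.entities import html5

# Named references require a lookup table.
decoded = html.unescape("&nbsp;&mdash;&euro;")
print([f"U+{ord(ch):04X}" for ch in decoded])

# The standard library's table has well over 2,000 entries.
print(len(html5) > 2000)

# A numeric reference can be built from the code point alone, no table needed.
euro_ref = f"&#{ord(decoded[2])};"
print(euro_ref, html.unescape(euro_ref) == decoded[2])
```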

A last gotcha: encoding is not idempotent. Because & is itself escaped to &amp;, running Encode twice turns < into &amp;lt;. If you see doubled entities in production output, something in the chain is escaping already-escaped text. Decode once and check where the extra pass came from.
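Double encoding is easy to reproduce with Python's html module. The helper decode_passes below is a hypothetical diagnostic written for this example: it counts how many decode passes are needed before the text stops changing, which hints at how many times it was encoded.

```python
import html

once = html.escape("<b>")   # "&lt;b&gt;"
twice = html.escape(once)   # "&amp;lt;b&amp;gt;"
print(once)
print(twice)


def decode_passes(text: str, limit: int = 5) -> int:
    """Count decode passes until the text is stable (rough heuristic)."""
    passes = 0
    while passes < limit:
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded
        passes += 1
    return passes


print(decode_passes(twice))
```

Treat the count as a hint only: text that legitimately discusses entities, such as a tutorial containing a literal &amp;, will also report extra passes.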
