How to use Line Sorter & Deduplicator
What it does & when you need it
You have a pile of lines — a list of email addresses, a .gitignore, exported
tags, a column pasted out of a spreadsheet — and you need them in order, with the
duplicates gone. This tool sorts the lines of whatever you paste and, optionally,
strips out repeats and blank rows. It handles the fiddly parts a naive sort gets
wrong: numbers that should read as numbers, mixed casing, accented characters,
and stray whitespace.
It runs entirely in your browser. Nothing you paste is uploaded, which matters when the list is a set of customer emails, internal hostnames, or API keys. The sort uses the locale-aware comparison built into your browser's JavaScript runtime, so the result is predictable rather than a surprise from raw byte ordering.
Reach for it when you're cleaning a word list before a diff, de-duplicating a CSV column, alphabetising enum values before committing, or turning a messy paste into something you can scan.
How to use
- Paste your lines into the input buffer, or press Sample to load an example. You can also Upload a .txt, .csv, or .log file.
- Pick a direction with the A → Z / Z → A toggle, then enable the options you need: Remove duplicates, Remove blank lines, Ignore case, Numeric, and Trim.
- The sorted buffer updates as you type. Press Sort (or Ctrl/Cmd+Enter) to re-run explicitly, and Copy to grab the result. The status bar reports the line count and how many lines were removed.
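If you would rather script the same cleanup, the options above can be approximated in a few lines of JavaScript. This is only a sketch: the function name cleanLines, its option names, and the order of the steps are assumptions made for illustration, not the tool's actual source.

```javascript
// Hypothetical helper: the option names mirror the toggles described above,
// but the tool's real implementation and step order are not documented here.
function cleanLines(text, { trim = false, removeBlank = false, descending = false } = {}) {
  let lines = text.split(/\r?\n/);
  // Trim: drop leading and trailing whitespace on every line.
  if (trim) lines = lines.map((line) => line.trim());
  // Remove blank lines: treat whitespace-only lines as blank too (an assumption).
  if (removeBlank) lines = lines.filter((line) => line.trim() !== "");
  // A → Z with a locale-aware comparison; "en" is pinned to keep the demo repeatable.
  lines.sort((a, b) => a.localeCompare(b, "en"));
  // Z → A is the same order reversed.
  if (descending) lines.reverse();
  return lines;
}

const pasted = "  pear\n\napple  \nmango";
console.log(cleanLines(pasted, { trim: true, removeBlank: true }));
// [ 'apple', 'mango', 'pear' ]
console.log(cleanLines(pasted, { trim: true, removeBlank: true, descending: true }));
// [ 'pear', 'mango', 'apple' ]
```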
Things worth knowing
Lexicographic order is not numeric order. By default the sort compares lines
character by character, so "10" lands before "9" — the first character 1
is "smaller" than 9, and the comparison never gets far enough to notice that
ten is the larger number. That's correct for words but wrong for anything with
embedded numbers, like item2, item9, item10 or version strings. Turn on
Numeric and the comparison switches to natural ordering (built on
localeCompare with { numeric: true }), so runs of digits are compared by
value instead of one glyph at a time.
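To see the difference for yourself, here is a minimal sketch you can paste into a browser console. It contrasts JavaScript's default sort with localeCompare and { numeric: true }. The "en" locale is pinned only to keep the demo repeatable; it is an assumption, not something the tool is known to do.

```javascript
const items = ["item10", "item9", "item2"];

// Default sort compares character by character, so "1" lands before "2" and "9".
const lexicographic = [...items].sort();
console.log(lexicographic); // [ 'item10', 'item2', 'item9' ]

// Natural ordering: runs of digits are compared by their numeric value.
const natural = [...items].sort((a, b) =>
  a.localeCompare(b, "en", { numeric: true })
);
console.log(natural); // [ 'item2', 'item9', 'item10' ]
```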
Deduplication happens after the sort, and keeps the first occurrence. Order matters here: sorting first and then removing duplicates can give a different result from removing duplicates and then sorting, because "first occurrence" is defined relative to whatever order the lines are already in. This tool always sorts before it dedups, so the survivor of each duplicate group is the first one in sorted order — deterministic and easy to reason about. If you only want to collapse repeats without reordering, use a dedicated whitespace and line cleaner instead.
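As a rough illustration of that order of operations, the sketch below sorts first and then keeps only the first occurrence of each line. The helper name sortThenDedup and the pinned "en" locale are assumptions for the example; the tool's own code may look quite different.

```javascript
// Hypothetical helper illustrating sort-then-dedup; not the tool's source.
function sortThenDedup(lines) {
  // Step 1: sort a copy so the caller's array is left untouched.
  const sorted = [...lines].sort((a, b) => a.localeCompare(b, "en"));
  // Step 2: keep a line only the first time it appears in sorted order.
  const seen = new Set();
  return sorted.filter((line) => {
    if (seen.has(line)) return false;
    seen.add(line);
    return true;
  });
}

console.log(sortThenDedup(["pear", "apple", "pear", "mango", "apple"]));
// [ 'apple', 'mango', 'pear' ]
```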
Ignore case groups variants without rewriting them. With Ignore case on,
Apple and apple sort next to each other, but the output still shows each line
exactly as you typed it — the option changes the comparison, not the text. Two
lines that differ only in case are therefore not treated as duplicates, so both
survive de-duplication. If you actually want to normalise casing across the list,
run it through the case converter first.
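One plausible way to get this behaviour, sketched below, is to relax only the comparison and leave the duplicate check exact. The sensitivity: "accent" option and the Set-based duplicate check are assumptions chosen for illustration; the tool may implement Ignore case differently.

```javascript
const lines = ["banana", "apple", "Banana", "Apple"];

// A plain sort puts every uppercase line ahead of every lowercase one.
console.log([...lines].sort()); // [ 'Apple', 'Banana', 'apple', 'banana' ]

// sensitivity: "accent" makes the comparison treat "a" and "A" as equal
// while still telling "e" and "é" apart.
const ignoreCase = (a, b) => a.localeCompare(b, "en", { sensitivity: "accent" });
const sorted = [...lines].sort(ignoreCase);

// The duplicate check is an exact string match, so lines that differ
// only in case both survive and keep their original spelling.
const deduped = [...new Set(sorted)];

// Array.prototype.sort is stable in modern engines, so lines that compare
// as equal keep their input order.
console.log(deduped); // [ 'apple', 'Apple', 'banana', 'Banana' ]
```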
Comparison is locale-aware, so accents sort sensibly. A raw code-point sort
orders by Unicode value, which puts z (U+007A) before an accented é
(U+00E9) and pushes accented words toward the bottom of the list. The
localeCompare collation used here places é next to e, where a reader
expects it, so apple, égal, zebra come out in that order rather than with
égal marooned at the end.
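The same three words make a quick console experiment. This sketch compares JavaScript's default sort (which orders by UTF-16 code unit) with localeCompare. The "en" locale is pinned here as an assumption so the demo behaves the same everywhere, and the accented letter is written as an escape so it is unambiguously U+00E9.

```javascript
// "\u00e9gal" is "égal" with a precomposed é (U+00E9).
const words = ["zebra", "\u00e9gal", "apple"];

// Default sort: é (U+00E9) ranks above z (U+007A), so it ends up last.
const byCodeUnit = [...words].sort();
console.log(byCodeUnit); // [ 'apple', 'zebra', 'égal' ]

// Locale-aware collation places é alongside e.
const byLocale = [...words].sort((a, b) => a.localeCompare(b, "en"));
console.log(byLocale); // [ 'apple', 'égal', 'zebra' ]
```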
Once your list is clean, you might count what's left, or line up two versions in a diff checker to see exactly what changed.