BOM Remover - Strip Byte Order Mark from Text

What is a byte order mark (BOM)?

A byte order mark (BOM) is the Unicode character U+FEFF placed at the very start of a file or stream. At the start of a UTF-8 file it signals the encoding to some editors. In other contexts, such as JSON APIs, Unix scripts, and CSV imports, that same invisible character can cause an immediate failure. Excel and Windows Notepad sometimes save BOM-prefixed UTF-8 without warning.

When you paste such content into a web form or codebase, the BOM travels as U+FEFF (also called ZWNBSP). Removing it is often the fix for "invalid JSON" errors where the payload looks perfect on screen.

The byte order mark was designed for UTF-16 where byte order matters. UTF-8 has no endianness ambiguity, yet U+FEFF persists as an optional file prefix because Microsoft tools popularized BOM-prefixed UTF-8 saves. The character is invisible in Notepad and Word, so authors do not realize their "plain text" file carries a three-byte preamble.
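To make the "three-byte preamble" concrete, here is a small Python sketch that serializes U+FEFF in UTF-8 and in both UTF-16 byte orders. The variable names are ours, chosen for illustration only.

```python
import codecs

# U+FEFF serialized in three encodings. The two UTF-16 forms differ only
# in byte order, which is the ambiguity the mark was designed to resolve.
bom = "\ufeff"
utf8_bom = bom.encode("utf-8")
utf16_le_bom = bom.encode("utf-16-le")
utf16_be_bom = bom.encode("utf-16-be")

print(utf8_bom.hex(" "))      # ef bb bf
print(utf16_le_bom.hex(" "))  # ff fe
print(utf16_be_bom.hex(" "))  # fe ff

# The standard library exposes the same byte sequences as constants.
assert utf8_bom == codecs.BOM_UTF8
assert utf16_le_bom == codecs.BOM_UTF16_LE
assert utf16_be_bom == codecs.BOM_UTF16_BE
```

UTF-8 has only one possible serialization, so the EF BB BF prefix carries no byte-order information; it only hints that the file is UTF-8.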

U+FEFF doubles as the zero-width no-break space when it appears mid-string — same code point, different semantic role. Whether at file start or between words, the byte pattern is identical and equally invisible in most editors. This remover handles both positions.
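The two positions can be handled separately in code. The sketch below is not this tool's implementation; it is a minimal Python illustration with hypothetical function names, showing one helper that removes only a leading mark and another that removes every occurrence.

```python
BOM = "\ufeff"  # the same code point serves as BOM and as ZWNBSP

def strip_leading_bom(text: str) -> str:
    """Remove a single U+FEFF only when it is the first character."""
    return text[1:] if text.startswith(BOM) else text

def strip_all_feff(text: str) -> str:
    """Remove U+FEFF everywhere, including mid-string ZWNBSP uses."""
    return text.replace(BOM, "")

sample = "\ufeffhello\ufeffworld"
print(repr(strip_leading_bom(sample)))  # 'hello\ufeffworld'
print(repr(strip_all_feff(sample)))     # 'helloworld'
```

Printing with `repr` is what makes the leftover mid-string character visible; a plain `print` would show the two results as identical.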

Cloud storage and email attachments preserve BOM bytes faithfully. Downloading a config file from a colleague's OneDrive or opening a UTF-8 CSV from a Windows Excel export are the two most common ways BOM enters a Mac or Linux developer's workflow.

BOM symptoms in development and data

JSON.parse throws on the first key. Shell scripts report command not found on line 1. CSV importers create a blank first column header. Git diffs show an entire file changed when only an invisible prefix was added.
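The JSON symptom is easy to reproduce. The sketch below uses Python's `json` module as a stand-in for JavaScript's JSON.parse; the payload is an invented example, and the exact error wording differs between parsers.

```python
import json

# On screen this looks identical to clean JSON.
payload = '\ufeff{"name": "example"}'

error = None
try:
    json.loads(payload)
except json.JSONDecodeError as exc:
    error = exc.msg
    print("parse failed:", error)

# Dropping the leading U+FEFF makes the same payload parse.
clean = json.loads(payload.lstrip("\ufeff"))
print(clean["name"])  # example
```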

Paste the suspect content here before re-uploading to S3, a CMS, or a CI fixture. Compare hex before and after if you need audit evidence.
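If you want that hex evidence without leaving a script, a few lines of Python are enough. This is a minimal sketch; `hex_preview` is a hypothetical helper name and the CSV content is made up for the example.

```python
def hex_preview(text: str, limit: int = 8) -> str:
    """Return the first `limit` UTF-8 bytes of `text` as spaced hex."""
    return text.encode("utf-8")[:limit].hex(" ")

before = "\ufeffid,name\n1,Ada\n"
after = before[1:] if before.startswith("\ufeff") else before

print(hex_preview(before))  # ef bb bf 69 64 2c 6e 61
print(hex_preview(after))   # 69 64 2c 6e 61 6d 65 0a
```

The leading `ef bb bf` in the first line is the BOM; everything after it is unchanged in the cleaned version.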

Environment-variable files (.env) with a leading BOM break Docker Compose and many dotenv parsers silently — the first variable name includes an invisible prefix and never matches what your application expects. Stripping before deploy is a one-time fix with permanent payoff.
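The mismatch is easier to see with a toy parser. The function below is a deliberately naive stand-in for a dotenv loader, written for illustration; real parsers differ in details, and some strip the BOM themselves. The variable names and values are invented.

```python
def parse_env(text: str) -> dict:
    """Naive KEY=VALUE parser (hypothetical, for illustration only)."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()  # strip() does not remove U+FEFF: it is not whitespace
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        pairs[key.strip()] = value.strip()
    return pairs

raw = "\ufeffDATABASE_URL=postgres://localhost/app\nDEBUG=1\n"

broken = parse_env(raw)
print("DATABASE_URL" in broken)  # False: the stored key is '\ufeffDATABASE_URL'

fixed = parse_env(raw.lstrip("\ufeff"))
print("DATABASE_URL" in fixed)   # True
```

Only the first variable is affected, which is why the rest of the file appears to load normally.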

REST API gateways that validate requests strictly may reject a BOM-prefixed JSON body from a legacy client while your local tests pass, because you edited the fixture in VS Code without a BOM. Paste production payloads here to reconcile the difference.

Static site generators reading Markdown front matter fail cryptically when the opening --- delimiter is preceded by U+FEFF. The YAML parser sees garbage before the delimiter and rejects the entire page build. Cleaning the paste before saving fixes the build without touching visible content.

Database migration scripts executed via psql or mysql CLI interpret a BOM on line one as part of the first SQL keyword, producing syntax errors that point at "CREATE" even though CREATE looks fine when you open the file.

BOM removal vs. full Unicode cleaning

This page targets users who diagnosed BOM specifically. The remover still clears other invisible watermark characters in the same pass when you copy cleaned output — helpful when BOM and zero-width spaces coexist.

If you are unsure whether the problem is BOM or another invisible character, paste anyway — the tool reports what it finds. Many contaminated files contain U+FEFF at the start plus U+200B scattered through the body from a subsequent AI edit pass.

After BOM removal, save explicitly as UTF-8 without BOM in your editor if you control the file format. VS Code, Sublime Text, and most IDEs offer "UTF-8" versus "UTF-8 with BOM" in the status bar encoding selector.
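When the file lives on disk and an editor is not handy, the same re-save can be scripted. The sketch below assumes the file is valid UTF-8; `resave_without_bom` is a hypothetical helper name, not part of any library.

```python
from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"

def resave_without_bom(path: Path) -> bool:
    """Rewrite `path` as UTF-8 without a BOM. Returns True if one was removed."""
    raw = path.read_bytes()
    had_bom = raw.startswith(UTF8_BOM)
    if had_bom:
        # "utf-8-sig" drops one leading BOM on decode; plain "utf-8" writes none.
        text = raw.decode("utf-8-sig")
        path.write_bytes(text.encode("utf-8"))
    return had_bom
```

Reading and writing bytes rather than text keeps line endings exactly as they were, so the only change on disk is the missing three-byte prefix.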

For teams standardizing on BOM-free UTF-8, add a pre-commit hook that rejects files starting with EF BB BF. This page is the manual counterpart for one-off pastes and third-party files you cannot reconfigure at the source.
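A hook of that kind can be very small. The sketch below is one possible shape, assuming the hook runner passes the staged file paths as arguments; how you wire it into your Git or pre-commit setup is up to you, and the function names are illustrative.

```python
import sys

UTF8_BOM = b"\xef\xbb\xbf"

def has_utf8_bom(path: str) -> bool:
    """True when the file's first three bytes are EF BB BF."""
    with open(path, "rb") as handle:
        return handle.read(3) == UTF8_BOM

def main(paths: list) -> int:
    """Report offending files on stderr; return 1 if any were found, else 0."""
    offenders = [p for p in paths if has_utf8_bom(p)]
    for p in offenders:
        print(f"{p}: starts with UTF-8 BOM (EF BB BF)", file=sys.stderr)
    return 1 if offenders else 0
```

As a script entry point you would call `sys.exit(main(sys.argv[1:]))`, so that a non-zero exit status blocks the commit.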

How to remove a BOM from text

Removing a BOM and other invisible characters from pasted text takes less than a minute:

Copy your AI-generated text. Copy the text you want to clean from your document, AI chat, or clipboard.
Paste into the checker. Paste the text into the input box on this page.
Run the check. Click Check for watermarks. The tool scans for invisible Unicode characters and hidden formatting markers in seconds.
Copy the cleaned output. Review the detection report, then copy the cleaned, watermark-free version of your text.

BOM and related invisible markers we remove

AI systems can hide two broadly different kinds of signal in their output. Our checker is specifically built to detect and remove the first kind — invisible Unicode characters. The second kind, statistical watermarks, requires rewriting to neutralise.

Invisible Unicode watermarks

These are real characters inserted between visible letters that don't render on screen. They travel with copy-paste, get carried into Word documents, Google Docs and CMS fields, and can fingerprint text back to the model that produced it. The checker scans for:

Zero-width space (U+200B)
Zero-width non-joiner (U+200C) and zero-width joiner (U+200D)
Word joiner (U+2060)
Soft hyphen (U+00AD)
Variation selectors (U+FE00 - U+FE0F)
Left-to-right and right-to-left marks (U+200E / U+200F)
Byte order mark / ZWNBSP (U+FEFF)
Other non-printing formatting characters commonly used as covert channels
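For readers who want to see what such a scan looks like, here is a minimal Python sketch covering the code points named in the list above. It is an illustration under our own assumptions, not the checker's actual implementation, and it ignores the open-ended "other" category.

```python
# Code points named in the list above.
TARGETS = {0x200B, 0x200C, 0x200D, 0x2060, 0x00AD, 0x200E, 0x200F, 0xFEFF}
TARGETS |= set(range(0xFE00, 0xFE10))  # variation selectors U+FE00 to U+FE0F

def find_invisibles(text: str) -> list:
    """Return (index, 'U+XXXX') for every targeted character in `text`."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ord(ch) in TARGETS]

def remove_invisibles(text: str) -> str:
    """Return `text` with every targeted character dropped."""
    return "".join(ch for ch in text if ord(ch) not in TARGETS)

sample = "\ufeffzero\u200bwidth\u00adtest"
print(find_invisibles(sample))    # [(0, 'U+FEFF'), (5, 'U+200B'), (11, 'U+00AD')]
print(remove_invisibles(sample))  # zerowidthtest
```

Blanket removal like this is blunt: zero-width joiners and variation selectors also have legitimate uses, for example in emoji sequences, so review the report before accepting the cleaned text for content that relies on them.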

Statistical (cryptographic) watermarks

These are patterns in which words the model chooses. They are imperceptible in any one sentence and only emerge over many words. A Unicode scan cannot remove them — to neutralise a statistical watermark you typically need to lightly rewrite the text. Our guide to natural AI writing techniques covers how to do this without losing meaning.

Frequently asked questions

Is UTF-8 BOM always wrong?

Some Windows tools expect it. Many web APIs and Unix pipelines reject it. Removal is correct when BOM causes parse or import errors.

Will removing BOM change encoding?

Your text remains UTF-8 without the leading BOM code point. Visible characters are unchanged.

Can BOM appear mid-document?

Yes, though rarely. A true BOM belongs only at the start of a file, but the same code point can appear elsewhere as a zero-width no-break space (ZWNBSP). This tool removes watermark-class uses throughout the paste.

Does this fix Excel CSV imports?

Often yes, when the first column header includes an invisible BOM prefix.
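If you read the export in code rather than through an import wizard, the same fix is a codec choice. The sketch below uses an invented two-column sample and Python's built-in `utf-8-sig` codec, which drops one leading BOM while decoding.

```python
import csv
import io

# Hypothetical bytes of a UTF-8 CSV saved with a BOM.
excel_export = b"\xef\xbb\xbfid,name\r\n1,Ada\r\n"

# Decoding as plain UTF-8 keeps U+FEFF glued to the first header.
naive = csv.DictReader(io.StringIO(excel_export.decode("utf-8"), newline=""))
print(naive.fieldnames)  # ['\ufeffid', 'name']

# Decoding as utf-8-sig yields the header you expected.
clean = csv.DictReader(io.StringIO(excel_export.decode("utf-8-sig"), newline=""))
print(clean.fieldnames)  # ['id', 'name']
```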

Is this watermark checker free?

Yes. You can scan up to 500 words without an account. Sign in for longer documents, full cleaned text, and a character-level breakdown of every hidden marker removed.

Is my text stored when I use the checker?

We process your text only to return a detection report and cleaned output. We do not retain the content of your pasted text for any other purpose.
