If your lexer canonicalizes or at least warns on non-printable codepoints in string and identifier tokens, this whole attack evaporates at compile time. I wish the JS/TS toolchain would adopt Rust-style mixed-script linting so invisible Unicode gets a big fat warning instead of a silent eval().
We fixed this class of bugs at a fintech I was at by adding a pre-commit hook that grep-fails on the zero-width and bidi ranges (U+200B through U+200F, U+FEFF, etc.). It took 10 minutes to wire up and has saved us way more than that in head-scratching merge reviews.
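For anyone who wants to try this, here is a minimal sketch of that kind of check in Python. It is not the hook we actually ran. The exact character set (I added the bidi embedding/override and isolate ranges to fill in the "etc."), the `find_hidden` name, and the output format are all my assumptions.

```python
#!/usr/bin/env python3
"""Fail if any given file contains zero-width or bidi control characters."""
import re
import sys

# Assumed character set, adjust to taste:
#   U+200B-U+200F  zero-width space/joiners and LRM/RLM marks
#   U+202A-U+202E  bidi embeddings and overrides
#   U+2066-U+2069  bidi isolates
#   U+FEFF         BOM / zero-width no-break space
SUSPICIOUS = re.compile("[\u200b-\u200f\u202a-\u202e\u2066-\u2069\ufeff]")


def find_hidden(text):
    """Return (line, column, 'U+XXXX') for each suspicious character in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SUSPICIOUS.finditer(line):
            codepoint = f"U+{ord(match.group()):04X}"
            hits.append((lineno, match.start() + 1, codepoint))
    return hits


def main(paths):
    """Scan each path; return 1 if anything suspicious was found, else 0."""
    failed = False
    for path in paths:
        # errors="replace" keeps the scan going on files that are not valid UTF-8.
        with open(path, encoding="utf-8", errors="replace") as fh:
            text = fh.read()
        for lineno, col, codepoint in find_hidden(text):
            print(f"{path}:{lineno}:{col}: hidden character {codepoint}")
            failed = True
    return 1 if failed else 0


# Only act when file paths were passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

Call it from your pre-commit hook with the staged file names as arguments; a non-zero exit status blocks the commit. Note that this also flags a leading UTF-8 BOM, which you may or may not want, and that legitimate uses (ZWJ in emoji sequences, RTL text in translations) will need an allowlist.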
rustc has nacked zero-width and bidi goblins since 1.56.1, while the Node crowd still worships eval() and hopes eslint will do pen-testing for them.