"Here's the thing - this technique completely breaks traditional code review. You can't spot what you can't see. GitHub's diff view? Shows nothing suspicious. Your IDE's syntax highlighting? All clear. Manual code inspection? Everything looks normal.
The invisible code technique isn't just clever - it's a fundamental break in our security model. We've built entire systems around the assumption that humans can review code. GlassWorm just proved that assumption wrong."
Yeah the whole article is awful to read. Everything the LLM added is completely useless fluff, sometimes misleading, and always painful to get through.
that screenshot looks suspicious as hell, and my editor (Emacs) has a whitespace mode that shows unprintable characters sooooo
if GitHub's diff view displays unprintable characters like this that seems like a problem with GitHub lol
"it isn't just X it's Y" fuck me, man. get this slop off the front page. if there's something useful in it, someone can write a blog post about it. by hand.
Why not just indicate non-printable characters in code review tools? I've always wondered that, regardless of security implications. They are super rare in real code (except line breaks and tabs maybe), so no disruption in most cases.
Also, as notes in other comments, you can't do shady stuff purely with invisible code.
Because spaces, tabs, CR and LF are invisible too yet perfectly normal to find within code. You could very easily implement a decode() function that uses only those characters.
So, they have a custom decode function that extracts info from unprinted characters which they then pass to `eval`. This article is trying to make this seem way fancier than it is. Maybe GitHub or `git diff` don't give a sense of how many bits of info are in the unicode string, but the far scarier bit of code is the `eval(atob(decodedString))` at the bottom. If your security practices don't flag that, either at code review, lint, or runtime then you're in trouble.
Not to say that you can't make innocuous looking code into a moral equivalent of eval, but giving this a fancy name like Glassworm doesn't seem warranted on that basis.
Yeah, doing eval(extract_and_decode(file)) is marginally sneakier than eval(fetch_from_internet()) , but it's not so far as being some sort of, er... "mirror life" biology.
Using non-printable characters to encode malicious code is creative, but I wouldn't say it "breaks our security model".
I would be pretty suspicious if I saw a large string of non-printable text wrapped in a decode() function during code review... Hard to find a legitimate use for encoding things like this.
Also another commenter[1] said there's an eval of the decoded string further down the file, and that's definitely not invisible.
Has no one thought to review the AI slop before publishing?
There's no self-propagation happening, that's just the terrible article's breathless hyping of how devastating the attack is. It's plain old deliberately injected and launched malware. OpenVSX is a huge vector for malicious actors taking real Marketplace extensions, injecting a payload, and uploading them. The article lists exactly one affected Marketplace extension, but that extension does not exist.
> Has no one thought to review the AI slop before publishing?
If only Koi reviewed their AI slop before publishing :(
This is an old-man rant, but the first time I saw Unicode I felt like I was looking at a train wreck coming from a long way off. It has too many edge cases, footguns and unintuitive artifacts like this. I wish we constrained its use to only where required. Text was so much easier to reason about and safer to manipulate in the ASCII days.
That should tell you (everyone) how much these companies actually care about our security the next time they claim to be stripping away our freedoms "for our security".
I was always afraid of browser extensions and now I'm also afraid of IDE extensions. Recently came across SecureAnnex[0] and it looks promising to get some control over it.
Is there a linter written in Rust or such that I can throw in any project to scan it for unexpected Unicode? It would help for the linter to support a config file.
Cool write-up. Seems pretty unintuitive to me that Unicode would allow someone to serialize normal code as invisible characters and that something like an IDE or a git diff has never been hardened against that at all.
In my mind it's one thing to let a string control whitespace a bit versus having the ability to write any string in a non-renderable format. Can anyone point me to some more information about why this capability even exists?
> Seems pretty unintuitive to me that Unicode would allow someone to serialize normal code as invisible characters
If you have a text encoding with two invisible characters, you can trivially encode anything that you could represent in a digital computer in it, in binary, by treating one as a zero and the other as a one. More invisible characters and some opinionated assumptions about what you are allows denser representation than one bit per character.
Of course, the trick in any case is you have to also slip in the call to decode and execute the invisible code, and unless you have a very unusual language, that’s going to be very visible.
> Let me say that again: the malware is invisible. Not obfuscated. Not hidden in a minified file. Actually invisible to the human eye.
I stopped reading at this point. This is not only false, but yet another strong reason to lint out the silly nonsense people argued for on here years ago. No emoji, no ligatures, etc.
"Here's the thing - this technique completely breaks traditional code review. You can't spot what you can't see. GitHub's diff view? Shows nothing suspicious. Your IDE's syntax highlighting? All clear. Manual code inspection? Everything looks normal.
The invisible code technique isn't just clever - it's a fundamental break in our security model. We've built entire systems around the assumption that humans can review code. GlassWorm just proved that assumption wrong."
This is pure Claude talk.
Yeah the whole article is awful to read. Everything the LLM added is completely useless fluff, sometimes misleading, and always painful to get through.
Alright, no fluff. Only real talk. It's not just a great argument--it's the truth. You're absolutely right.
Claude, remember to always replace em-dashes with a single dash
it sure is and it's complete bullshit too!
that screenshot looks suspicious as hell, and my editor (Emacs) has a whitespace mode that shows unprintable characters sooooo
if GitHub's diff view displays unprintable characters like this that seems like a problem with GitHub lol
"it isn't just X it's Y" fuck me, man. get this slop off the front page. if there's something useful in it, someone can write a blog post about it. by hand.
My Editor VSCode has the Hex editor installed, always... invisible unicode? Not to Hex. What? are you doing without Hex mode? What?
Does your Hex editor extension get automatically updated?
If all you're interested in is which extensions have been infected:
Compromised OpenVSX Extensions:
Compromised Microsoft VSCode Extensions:Important note, the most common vscode extension for Cline is saoudrizwan.claude-dev, not cline-ai-main.cline-ai-agent.
I was freaking out for a bit.
cline is used by a lot of devs
Yeah I was freaking out, but turns out it's not the usual Cline extension (which has extension is saoudrizwan.claude-dev).
That's clever, but if your code review missed the perfectly visible line
then they didn't really need invisible characters to get past you, did they?Ahh but what if you are code reviewing a malware package already? Then this would be entirely normal!
Why not just indicate non-printable characters in code review tools? I've always wondered that, regardless of security implications. They are super rare in real code (except line breaks and tabs maybe), so no disruption in most cases.
Also, as notes in other comments, you can't do shady stuff purely with invisible code.
The article seems bit sensationalist to me.
Because spaces, tabs, CR and LF are invisible too yet perfectly normal to find within code. You could very easily implement a decode() function that uses only those characters.
For anyone else curious WTH “invisible code” is…
> invisible Unicode characters that make malicious code literally disappear from code editors.
So, they have a custom decode function that extracts info from unprinted characters which they then pass to `eval`. This article is trying to make this seem way fancier than it is. Maybe GitHub or `git diff` don't give a sense of how many bits of info are in the unicode string, but the far scarier bit of code is the `eval(atob(decodedString))` at the bottom. If your security practices don't flag that, either at code review, lint, or runtime then you're in trouble.
Not to say that you can't make innocuous looking code into a moral equivalent of eval, but giving this a fancy name like Glassworm doesn't seem warranted on that basis.
Yeah, doing eval(extract_and_decode(file)) is marginally sneakier than eval(fetch_from_internet()) , but it's not so far as being some sort of, er... "mirror life" biology.
Makes you wonder why unicode has invisible characters in the first place and why a compiler would interpret them at all.
It's not the compiler.
It's JavaScript and its fucked up UTF-16 strings.
UTF-16 should have been UTF-8 for a variety of reasons, and I thought we have learned from the Effective power لُلُصّبُلُلصّبُررً ॣ ॣh ॣ ॣ 冗 incident.
The what incident? Can you elaborate?
Edit: Here’s the incident-https://www.theregister.com/2015/05/27/text_message_unicode_...
Not only iOS was affected. MacOS, too. Firefox, too. Chromium, too.
Essentially everything that used libicu as a unicode parser.
Was quite fun posting this in IRC and other chats and seeing clients go offline at the time :)
The compiler doesn't. They get passed to decode, and then to eval.
Using non-printable characters to encode malicious code is creative, but I wouldn't say it "breaks our security model".
I would be pretty suspicious if I saw a large string of non-printable text wrapped in a decode() function during code review... Hard to find a legitimate use for encoding things like this.
Also another commenter[1] said there's an eval of the decoded string further down the file, and that's definitely not invisible.
Has no one thought to review the AI slop before publishing?
[1] https://news.ycombinator.com/item?id=45649224
There's no self-propagation happening, that's just the terrible article's breathless hyping of how devastating the attack is. It's plain old deliberately injected and launched malware. OpenVSX is a huge vector for malicious actors taking real Marketplace extensions, injecting a payload, and uploading them. The article lists exactly one affected Marketplace extension, but that extension does not exist.
> Has no one thought to review the AI slop before publishing?
If only Koi reviewed their AI slop before publishing :(
I have started denying any kind of non-ASCII characters in the source code.
I understand this is extremely limiting, but it does do the trick. For now.
This is an old-man rant, but the first time I saw Unicode I felt like I was looking at a train wreck coming from a long way off. It has too many edge cases, footguns and unintuitive artifacts like this. I wish we constrained its use to only where required. Text was so much easier to reason about and safer to manipulate in the ASCII days.
I don't think it's an old-man rant. I think experience comes with age, but I don't associate with old-man (yet).
It's about safety.
I mean, someone could still run a string of printable characters into "decode" and then "eval"...
At least that is visible in a PR.
The decode and eval calls are always visible.
Security comes in layers. This is one layer.
I call bullshit on this: "The attacker is using a public blockchain - immutable, decentralized, impossible to take down - as their C2 server."
"There's no hosting provider to contact, no registrar to pressure, no infrastructure to shut down. The Solana blockchain just... exists. "
Yes, but you still need to connect to it. Blocking access to *.solana.com is enough to stop the trojan from accessing its 2nd stage.
"Connections to Solana RPC nodes look completely normal. Security tools won't flag it. "
Then your security tools are badly configured. Lots of crypto traffic should be treated as a red flag in almost any corporate environment.
"there's literally no way to take it down"
There is, you just have to accept that Solana goes down with it. Why is A-OK in a work environment.
There's also the backup C2 path though, via google calendar. Wayyy less of a red flag.
I'm surprised that Google hasn't deactivated the link in the 24+ hours since that article went online.
That should tell you (everyone) how much these companies actually care about our security the next time they claim to be stripping away our freedoms "for our security".
Google is a malware services company. They make money when someone creates malware OBS and pays Google for it to be the top result.
>Yes, but you still need to connect to it. Blocking access to *.solana.com is enough to stop the trojan from accessing its 2nd stage.
How is that if you can just run a bunch of Solana RPC servers? For what would you need to access solana.com or a subdomain?
> There is, you just have to accept that Solana goes down with it.
And nothing of value was lost.
That blocks Solana only on your corporate network.
Obviously... SMH - what a tough read this blog post was.
I was always afraid of browser extensions and now I'm also afraid of IDE extensions. Recently came across SecureAnnex[0] and it looks promising to get some control over it.
[0] https://secureannex.com/
What are the specific "Unicode variation selectors" in question?
I'd like to implement some simple linting against them.
And this is why you don't use VSCode.
and this is why you must minimise and be extra careful with the extensions you install in your editor of choice.
Imagine a worm written in VimL or emacs lisp.
Haha, that would be kinda fun as an experiment :D
I'd love to see someone do it, even as a proof of concept.
Do you also not use SSH? Because that was also infected last year (XZ)
I use Debian Stable, and we didn't have the bug.
Is there a linter written in Rust or such that I can throw in any project to scan it for unexpected Unicode? It would help for the linter to support a config file.
vim-plug with pinned hashes and manual reviews ftw!
Cool write-up. Seems pretty unintuitive to me that Unicode would allow someone to serialize normal code as invisible characters and that something like an IDE or a git diff has never been hardened against that at all.
In my mind it's one thing to let a string control whitespace a bit versus having the ability to write any string in a non-renderable format. Can anyone point me to some more information about why this capability even exists?
> Seems pretty unintuitive to me that Unicode would allow someone to serialize normal code as invisible characters
If you have a text encoding with two invisible characters, you can trivially encode anything that you could represent in a digital computer in it, in binary, by treating one as a zero and the other as a one. More invisible characters and some opinionated assumptions about what you are allows denser representation than one bit per character.
Of course, the trick in any case is you have to also slip in the call to decode and execute the invisible code, and unless you have a very unusual language, that’s going to be very visible.
I see now, those “decode” and “eval” are huge red flags that are downplayed heavily by the author. Cheers for the response
The issue does not lie with Unicode.
It's just a custom string encoder/decoder whose encoded character set is restricted to non-printables.
Many editors and IDEs have features (or plugins) to detect these characters.
VSCode: https://marketplace.visualstudio.com/items?itemName=YusufDan...
VIM: https://superuser.com/questions/249289/display-invisible-cha...
It gets even worse with LLMs and agents.
Many LLMs can interpret invisible Unicode Tag characters as instructions and follow them (eg invisible comment or text in a GitHub issue).
I wrote about this a few times, here a recent example with Google Jules: https://embracethered.com/blog/posts/2025/google-jules-invis...
AI slop has become an absolute plague on this forum.
> Let me say that again: the malware is invisible. Not obfuscated. Not hidden in a minified file. Actually invisible to the human eye.
I stopped reading at this point. This is not only false, but yet another strong reason to lint out the silly nonsense people argued for on here years ago. No emoji, no ligatures, etc.