LLMs are excellent at text transformation. It's their core strength and I don't see it being used enough.
Should the title say ChatGPT or gpt-4 (the model) instead of OpenAI (the company)?
I left my Kleenex next to the Xerox.
Better Hoover it up!
All jokes aside, I've never heard anyone call vacuuming "hoovering". I wonder if that's an older usage?
I've also never heard anyone call photocopying "xeroxing". I'm guessing maybe it's an age thing.
It depends on the region. In certain countries Gillette is used for any shaving razor.
growing up in India over the past 4 decades .. 'Xerox' was/is the default and most common word used for photocopying ... only recently have I started using/hearing the term 'photocopy'.
every town and every street had "XEROX shops" where people went to get various documents photocopied for INR 1 per page for example
Most photocopy centers are still called XEROX Shops -- and their boards say that in big bold text: https://www.google.com/search?q=xerox+shop+india&udm=2
It doesn't matter if they use Canon, HP, or other brands of machines.
It was the fashion at the time, even if the hoover did keep bumping the onion.
(This is actually really interesting, I had no idea that 'hoover' was specifically a U.K. thing that didn't make it to the U.S.)
In the UK it's very common
It might just be a regionalism, it's not uncommon that such genericization only applies to specific dialects (Like calling all sodas coke)
Everyone I know from the UK says "hoovering" 100% of the time instead of vacuuming.
I have, but only as an idiom, never literally. E.g. "Microsoft just keeps hoovering up companies", but the literal act of vacuuming is only called vacuuming.
More common in the UK
I got hurt doing it so applied some Bandaids.
Don't say Velcro!
https://www.youtube.com/results?search_query=don't+say+velcr...
(Content warning: profanity. This search page is SFW, but the videos it links to may not be.)
There is a certain justice in the use of OpenAI as a name for their product, given that OpenAI has turned the generic technical GPT name into a brand.
GPT is not a brand. A court ruling turned down that notion. It's a technology.
That only means it’s not a legally recognised brand, but it is a brand nonetheless if people associate the two (and they do). A bit like the way people associate tissue paper with Kleenex, or photocopies with Xerox, or git with GitHub.
I wonder if OpenAI will stick with the GPT acronym, given that most people don't know what it's an acronym for and it's a bit of a mouthful.
The generative pretrained transformer was invented by OpenAI, and it seems reasonable for a company to use the name it gave to its invention in its branding.
Of course, they didn't invent Generative pretraining (GP) or transformers (T), but AFAIK they were the first to publicly combine them.
more likely to get downvotes that way, potentially even downweighted
I agree, this would make the title more accurate.
It would have been a better title, yes.
I think it should not say the name of the company, but either ChatGPT or GPT-x.
Anyone working on decompiler LLMs? Seems like we could render all code open source.
Training data would be easy to make in this case: compile tons of free GitHub code with various compilers and train on inverting the compilation. This is a case where synthetic training data is appropriate and quite easy to generate.
You could train the decompiler to just invert compilation and then use existing larger code LLMs to do things like add comments.
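For concreteness, a minimal sketch of that data-generation step in Node (assuming gcc and objdump are on PATH; the corpus directory, flags, and output layout are placeholders, not a recommendation):

const { execSync } = require("child_process");
const fs = require("fs");
const path = require("path");

// Compile one C file, disassemble the object file, and pair the
// disassembly with the original source as a training example.
function makePair(cFile) {
  const objFile = cFile.replace(/\.c$/, ".o");
  execSync(`gcc -O2 -c ${cFile} -o ${objFile}`);   // vary compilers/flags for coverage
  const disassembly = execSync(`objdump -d ${objFile}`).toString();
  return { input: disassembly, target: fs.readFileSync(cFile, "utf8") };
}

const pairs = fs.readdirSync("./corpus")
  .filter((f) => f.endsWith(".c"))
  .map((f) => makePair(path.join("./corpus", f)));

// One JSON object per line, the usual fine-tuning layout.
fs.writeFileSync("decompile-pairs.jsonl", pairs.map((p) => JSON.stringify(p)).join("\n"));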
The potential implications of this are huge. Not just open sourcing, but imagine easily decompiling and modifying proprietary apps to fix bugs or add features. This could be a huge unlock, especially for long dead programs.
For legal reasons I bet this will become blocked behavior in major models.
I've never seen a law forbidding decompiling programs. But some programs forbid decompiling the application in their license agreement. Further, you still don't have any rights to the resulting source code. It depends on the license...
A mere decompilation or general reverse engineering should be fine in many if not most jurisdictions [1]. But it is a whole different matter to make use of any results from doing so.
Using an LLM (or any technique) to decompile proprietary code is not clean room design. Declaring the results "open source" is deception and theft, which undermines the free open source software movement.
Only if you use the decompiled code. But if one team uses decompiled code to write up a spec, then another team writes an implementation based on that spec, then that could be considered clean room design. In this case, the decompiler would merely be a tool for reverse engineering.
It is true that at least some jurisdictions do also explicitly allow for reverse engineering to achieve interoperability, but I don't know if such provision is widespread.
Unminifying isn't decompiling.
It's just renaming variables and functions and inserting line breaks.
No but it’s a baby brother of the same problem. Compiling is a much more complex transform but ultimately it is just a code transform.
It is true that compilation and minification are both code transformations (it's a correct reduction [1]), but this doesn't seem a very useful observation in this discussion. In the end, everything you do to something is an operation. But that's not very workable.
In practice, compilation is often (not always, agreed!) from a language A to a lower level language B such that the runtime for language A can't run language B or vice-versa, if language A has a runtime at all. Minification is always from language A to the same language A.
The implication is that in practice, deminification is not the same exercise as decompilation. You can even want to run a deminification phase after a decompilation phase, using two separate tools, because one tool will be good at translating back, and the other will be good at pretty printing.
Minifying includes way more tricks than shorter variable names and removing white-space
Seems like we could render all code open source.
I agree. I think "AI generating/understanding source code" is a huge red herring. If AI was any good at understanding code, it would just build (or fix) the binary.
And I believe that's how it will turn out: when we really have AI programmers, they will not bother with human-readable code, but will code everything in machine code (and if they are tasked with maintaining an existing system, they will understand it in its entirety, across the SW and HW stack). It's kinda like how diffusion models that generate images don't actually bother with learning drawing techniques.
Why wouldn't AIs benefit from using abstractions? At the very least it saves tokens. Fewer tokens means less time spent solving a problem, which means more problem solving throughput. That is true for machines and people alike.
If anything I expect AI-written programs in the not so distant future to be incomprehensible because they're too short. Something like reading an APL program.
I agree, they might create abstractions, but I doubt they're going to reuse the same abstractions as human programming languages.
Seems like we could render all code open source
Unfortunately not really. Having the source is a first step, but you also need the rights to use it (read, modify, execute, redistribute the modifications), and only the authors of the code can grant these rights.
Doesn't it count as 'clean room' reverse engineering - or alternatively, we could develop an LLM that's trained on the outputs and side-effects of any given function, and learns to reproduce the source code from that.
Or, going back to the original idea, while the source code produced in such a way might be illegal, it's very likely 'clean' enough to train an LLM on it to be able to help in reproducing such an application.
IANAL but if your only source for your LLM is that code, I would assume the code it produces would be at high risk of being counterfeit.
I would guess clean room would still require having someone reading the LLM-decompiled code, write a spec, and have someone else write the code.
But this is definitely a good question, especially given the recent court verdicts. If you can launder open source licensed code, why not proprietary binaries? Although I don't think the situation is the same. I wouldn't expect how you decompile the code to matter.
I think there's actually some potential here, considering LLMs are already very good at translating text between human languages. I don't think LLMs on their own would be very good, but a specially trained AI model perhaps, such as those trained for protein folding. I think what an LLM could do best is generate better decompiled code, giving better names to symbols, and generating code in a style a human is more likely to write.
I usually crap on things like chatgpt for being unreliable and hallucinating a lot. But in this particular case, decompilers already usually generate inaccurate code, and it takes a lot of work to fix the decompiled code to make it correct (I speak from experience). So introducing AI here may not be such a huge stretch. Just don't expect an AI/LLM to generate perfectly correct decompiled code and we're good (wishful thinking).
It can’t really compensate for missing variable and function names, not to mention comments.
Anyone working on decompiler LLMs?
Here is an LLM for x86 to C decompilation: https://github.com/albertan017/LLM4Decompile
There was a paper about this at CGO earlier this year [1]. Correctness is a problem that is hard to solve, though; 50% accuracy might not be enough for serious use cases, especially given that the relation to the original input for manual intervention is hard to preserve.
Seems like we could render all code open source.
That's not how copyright and licensing works.
You could already break the law and open yourself up to lawsuits and prosecution by stealing intellectual property and violating its owners rights before there were LLMs. They just make it more convenient, not less illegal.
JS minification is fairly mechanical and comparably simple, so the inversion should be relatively easy. It would be of course tedious enough to be manually done in general, but transformations themselves are fairly limited so it is possible to read them only with some notes to track mangled identifiers.
A more general unminification or unobfuscation still seems to be an open problem. I wrote handful of programs that are intentionally obfuscated in the past and ChatGPT couldn't understand them even at the surface level in my experience. For example, a gist for my 160-byte-long Brainfuck interpreter in C had some comment trying to use GPT-4 to explain the code [1], but the "clarified version" bore zero similarity with the original code...
[1] https://gist.github.com/lifthrasiir/596667#gistcomment-47512...
JS minification is fairly mechanical and comparably simple, so the inversion should be relatively easy.
Just because a task is simple doesn't mean its inverse need be. Examples:
- multiplication / prime factorization
- deriving / integrating
- remembering the past / predicting the future
Code unobfuscation is clearly one of those difficult inverse problems, as it can be easily exacerbated by any of the following problems:
- bugs
- unused or irrelevant routines
- incorrect implementations that incidentally give the right results
In that sense, it would be fortunate if chatGPT could give decent results at unobfuscating code, as there is no a priori expectation that it should be able to do so. It's good that you've also checked chatGPT's code unobfuscation capabilities on a more difficult problem, but I think you've only discovered an upper limit. I wouldn't consider the example in the OP to be trivial.
Of course, it is not generalizable! In my experience though, most minifiers do only the following:
- Whitespace removal, which is trivially invertible.
- Comment removal, which we never expect to recover via unminification.
- Renaming to shorter names, which is tedious to track but still mechanical. And most minifiers have little understanding of underlying types anyway, so they are usually very conservative and rarely reuse the same mangled identifier for multiple uses. (Google Closure Compiler is a significant counterexample here, but it is also known to be much slower.)
- Constant folding and inlining, which is annoying but can be still tracked. Again, most minifiers are limited in their reasoning to do extensive constant folding and inlining.
- Language-specific transformations, like turning `a; b; c;` into `a, b, c;` and `if (a) b;` into `a && b;` whenever possible. They will be hard to understand if you don't know in advance, but there aren't too many of them anyway.
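A toy before/after example of those transformations (hand-minified for illustration, not the output of any particular tool):

// Before:
function totalPrice(items) {
  const TAX_RATE = 0.2;                 // will be folded/inlined away
  let total = 0;
  for (const item of items) {
    if (item.inStock) total += item.price * (1 + TAX_RATE);
  }
  return total;
}

// After: comments and whitespace removed, identifiers shortened,
// the constant folded into 1.2, and `if (a) b;` turned into `a && b;`:
function t(n){let r=0;for(const e of n)e.inStock&&(r+=e.price*1.2);return r}

Note that property names (inStock, price) survive minification, which is part of why such output stays readable.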
As a result, minified code still remains comparably human-readable with some note taking and perseverance. And since these transformations are mostly local, I would expect LLMs can pick them up by their own as well.
(But why? Because I do inspect such programs fairly regularly, for example for comments like https://news.ycombinator.com/item?id=39066262)
I feel you’re downplaying the obfuscatory power of name-mangling. Reversing that (giving everything meaningful names) is surely a difficult problem?
I would say the actual difficulty greatly varies. It is generally easy if you have a good guess about what the code would actually do. It would be much harder if you have nothing to guess, but usually you should have something to start with. Much like debugging, you need a detective mindset to be good at reverse engineering, and name mangling is a relatively easy obstacle to handle in this scale.
Let me give some concrete example from my old comment [1]. The full code in question was as follows, with only whitespaces added:
function smb(){
var a,b,c,d,e,h,l;
return t(function(m){
a=new aj;
b=document.createElement("ytd-player");
try{
document.body.prepend(b)
}catch(p){
return m.return(4)
}
c=function(){
b.parentElement&&b.parentElement.removeChild(b)
};
0<b.getElementsByTagName("div").length?
d=b.getElementsByTagName("div")[0]:
(d=document.createElement("div"),b.appendChild(d));
e=document.createElement("div");
d.appendChild(e);
h=document.createElement("video");
l=new Blob([new Uint8Array([/* snip */])],{type:"video/webm"});
h.src=lc(Mia(l));
h.ontimeupdate=function(){
c();
a.resolve(0)
};
e.appendChild(h);
h.classList.add("html5-main-video");
setTimeout(function(){
e.classList.add("ad-interrupting")
},200);
setTimeout(function(){
c();
a.resolve(1)
},5E3);
return m.return(a.promise)
})
}
Many local variables should be easy to reconstruct: b -> player, c -> removePlayer, d -> playerDiv1, e -> playerDiv2, h -> playerVideo, l -> blob (we don't know which blob it is yet though). We still don't know about non-local names including t, aj, lc, Mia and m, but we are reasonably sure that it builds some DOM tree that looks like `<ytd-player><div></div><div class="ad-interrupting"><video class="html5-main-video"></div></ytd-player>`. We can also infer that `removePlayer` would be some sort of a cleanup function, as it gets eventually called in any possible control flow visible here.
Given that `a.resolve` is the final function to be executed, even later than `removePlayer`, it will be some sort of "returning" function. You will need some information about how async functions are desugared to fully understand that (and also `m.return`), but such information is not strictly necessary here. In fact, you can safely ignore `lc` and `Mia` because it eventually sets `playerVideo.src` and we are not that interested in the exact contents here. (Actually, you will fall into a rabbit hole if you are going to dissect `Mia`. Better to assume first and verify later.)
And from there you can conclude that this function constructs a certain DOM tree, sets some class after 200 ms, and then "returns" 0 if the video "ticks" or 1 on timeout, giving my initial hypothesis. I then hardened my hypothesis by looking at the blob itself, which turned out to be a 3-second-long placeholder video and fits with the supposed timeout of 5 seconds. If it were something else, then I would look further to see what I might have missed.
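For illustration, here is roughly how the function reads with those guessed names substituted. The names are my guesses from the analysis above, not the original identifiers, and the non-local names (t, aj, lc, Mia, m) are left as-is since their definitions aren't visible here:

function smb(){
  var result, player, removePlayer, playerDiv1, playerDiv2, playerVideo, blob;
  return t(function(m){
    result = new aj;                                  // a: some deferred/promise-like object (per the analysis above)
    player = document.createElement("ytd-player");
    try {
      document.body.prepend(player)
    } catch(p) {
      return m.return(4)
    }
    removePlayer = function(){
      player.parentElement && player.parentElement.removeChild(player)
    };
    0 < player.getElementsByTagName("div").length ?
      playerDiv1 = player.getElementsByTagName("div")[0] :
      (playerDiv1 = document.createElement("div"), player.appendChild(playerDiv1));
    playerDiv2 = document.createElement("div");
    playerDiv1.appendChild(playerDiv2);
    playerVideo = document.createElement("video");
    blob = new Blob([new Uint8Array([/* snip */])], {type: "video/webm"});
    playerVideo.src = lc(Mia(blob));
    playerVideo.ontimeupdate = function(){
      removePlayer();
      result.resolve(0)                               // the video "ticked"
    };
    playerDiv2.appendChild(playerVideo);
    playerVideo.classList.add("html5-main-video");
    setTimeout(function(){
      playerDiv2.classList.add("ad-interrupting")
    }, 200);
    setTimeout(function(){
      removePlayer();
      result.resolve(1)                               // timed out after 5 seconds
    }, 5E3);
    return m.return(result.promise)
  })
}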
How do we end up with you pasting large blocks of code and detailed step-by-step explanations of what it does, in response to someone noting that just because process A is simple, it doesn't mean inverting A is simple?
This thread is incredibly distracting, at least 4 screenfuls to get through.
I'm really tired of the motte/bailey comments on HN on AI, where the motte is "meh the AI is useless, amateurish answer thats easy to beat" and bailey is "but it didn't name a couple global variables '''correctly'''." It verges on trolling at this point, and is at best self-absorbed and making the rest of us deal with it.
I believe the person you're responding to is saying that it's hard to do automated / programmatically. Yes a human can decode this trivial example without too much effort, but doing it via API in a fraction of the time and effort with a customizable amount of commentary/explanation is preferable in my opinion.
This is, IMO, the better way to approach this problem. Minification applies rules to transform code, if we know the rules, we can reverse the process (but can't recover any lost information directly).
A nice, constrained, way to use a LLM here to enhance this solution is to ask it some variation of "what should this function be named?" and feed the output to a rename refactoring function.
You could do the same for variables, or be more holistic and ask it to rename variables and add comments (but risk the LLM changing what the code does).
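A minimal sketch of that constrained use, assuming the OpenAI Node SDK and Babel for the AST work (this is not HumanifyJS's actual implementation, just the shape of the idea):

const OpenAI = require("openai");
const { parse } = require("@babel/parser");
const traverse = require("@babel/traverse").default;
const generate = require("@babel/generator").default;

const client = new OpenAI();  // reads OPENAI_API_KEY from the environment

// Ask the model only for a name; the actual rename happens on the AST,
// so the code's behavior cannot change.
async function suggestName(functionSource) {
  const res = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{
      role: "user",
      content: "Suggest a descriptive camelCase name for this JavaScript function. Reply with the name only.\n\n" + functionSource,
    }],
  });
  return res.choices[0].message.content.trim();
}

async function renameFunctions(code) {
  const ast = parse(code);
  const fnPaths = [];
  traverse(ast, {
    FunctionDeclaration(path) {
      if (path.node.id) fnPaths.push(path);  // collect first; LLM calls are async
    },
  });
  for (const fnPath of fnPaths) {
    const newName = await suggestName(generate(fnPath.node).code);
    fnPath.scope.rename(fnPath.node.id.name, newName);  // scope-aware rename
  }
  return generate(ast).code;
}

The same loop works for variable declarations; the key point is that the LLM only ever supplies names, never code.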
JSNice[1] is an academic project that did a pretty good job of this in the 2010s and they give some pointers on how it is accomplished[2].
[1]: http://jsnice.org/
As a result, minified code still remains comparably human-readable with some note taking and perseverance.
At least some of the time, simply taking it and reformatting to be unfolded and on multiple lines is useful enough to be readable/debuggable. FIXING that bug is likely more complex, because you have to find where it is in the original code, which, to my eyes, isn't always easy to spot.
As a point of order Code Minification != Code Obfuscation.
Minification does tend to obfuscate as a side effect, but it is not the goal, so reversing minification becomes much easier. Obfuscation on the other hand can minify code, but crucially that isn't the place it starts from. As the goal is different between minification and obfuscation, reversing them takes different effort, and I'd much rather attempt to reverse minification than obfuscation.
I'd also readily believe there are hundreds/thousands of examples online of reverse code minification (or: here is code X, here is code X _after_ minification) that LLMs have ingested in their training data.
Yeah, having run some state of the art obfuscated code through ChatGPT, it still fails miserably. Even what was state of the art 20 years ago it can't make heads or tails of.
JS minification is fairly mechanical and comparably simple, so the inversion should be relatively easy.
This is stated as if it's a truism, but I can't understand how you can actually believe this. Converting `let userSignedInTimestamp = new Date()` to `let x = new Date()` is trivial, but going the other way probably requires reading and understanding the rest of the surrounding code to see in what contexts `x` is being used. Also, the rest of the code is also minified, making this even more challenging. Even if you do all that right, it's at best still a lossy conversion, since the name of the variable could capture characteristics that aren't explicitly outlined in the code at all.
Because of how trivial that step is, it's likely pretty easy to just take lots of code and minify it. Then you have the training data you need to learn to generate full code from minified code. If your goal is to generate additional useful training data for your LLM, it could make sense to actually do that.
I suspect, but definitely do not know, that all the coding aspects of llms work something like this. It’s such a fundamentally different problem from a paragraph, which should never be the same as any other paragraph. Seems to me that coding is a bit more like the game of go, where an absolute score can be used to guide learning. Seed the system with lots and lots of leetcode examples from reality, and then train it to write tests, and now you have a closed loop that can train itself.
If you're able to generate minified code from all the code you can find on the internet, you end up with a very large training set. Of course in some scenarios you won't know what the original variable names were, but you would expect to be able to get something very usable out of it. These things, where you can deterministically generate new and useful training data, you would expect to be used.
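A sketch of how such pairs could be generated, assuming the terser package (directory and file names are placeholders):

const fs = require("fs/promises");
const path = require("path");
const { minify } = require("terser");

// Turn a folder of ordinary JS files into (minified, original) pairs.
async function buildPairs(srcDir, outFile) {
  const pairs = [];
  for (const name of await fs.readdir(srcDir)) {
    if (!name.endsWith(".js")) continue;
    const original = await fs.readFile(path.join(srcDir, name), "utf8");
    const { code: minified } = await minify(original, { mangle: true, compress: true });
    pairs.push({ input: minified, target: original });
  }
  // One JSON object per line, ready for fine-tuning.
  await fs.writeFile(outFile, pairs.map((p) => JSON.stringify(p)).join("\n"));
}

buildPairs("./corpus", "./unminify-pairs.jsonl").catch(console.error);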
And I can’t understand why any reasonably intelligent human feels the need to be this abrasive. You could educate but instead you had to be condescending.
Converting a picture from color to black and white is a fairly simple task. Getting back the original in color is not easy. This is of course due to data lost in the process.
Minification works in the same way. A lot of information needed for understanding the code is lost. Getting back that information can be a very demanding task.
But it is not much different from reading through badly documented code without any comments or meaningful names. In fact, much of the code that gets minified is not that bad, and thus it is often possible to infer the original code just from its structure. It is still not a trivial task, but I think my comment never implied that.
Random try (the first one) with Claude 3.5 Sonnet: https://claude.site/artifacts/246c1b1a-3088-447a-a526-b1e716...
I'm not on a PC so it's not tested.
That's much better in that most of the original code remains present and the comments are not that far off, but its understanding of the global variables is utterly wrong (to be expected though, as many of them serve multiple purposes).
Yep, I've tried to use LLMs to disassemble and decompile binaries (giving them the hex bytes as plaintext), they do OK on trivial/artificial cases but quickly fail after that.
Author of HumanifyJS here! I've created specifically a LLM based tool for this, which uses LLMs on AST level to guarantee that the code keeps working after the unminification step:
More tools should be built on ASTs, great work!
I'm still waiting for the AST level version control tbh
Unison supposedly has an AST-aware version control system: https://www.unison-lang.org/
Wow this looks so cool.
content-addressed too, I think!
Smalltalk envy source control
Does it work with huge files? I'm talking about something like 50k lines.
Edit: I'm currently trying it with a mere 1.2k JS file (openai mode); it's only 70% done after 20 minutes. Even if it works theoretically with a 50k LOC file, I don't think you should try.
It has this in the README
Large files may take some time to process and use a lot of tokens if you use ChatGPT. For a rough estimate, the tool takes about 2 tokens per character to process a file:
echo "$((2 * $(wc -c < yourscript.min.js)))" > So for refrence: a minified bootstrap.min.js would take about $0.5 to un-minify using ChatGPT.
Using humanify local is of course free, but may take more time, be less accurate and not possible with your existing hardware.
This only talks about the cost.
I'm more concerned about whether it can actually deobfuscate such a large file (context) and generate useful results.
how do you make an LLM work on the AST level? do you just feed a normal LLM a text representation of the AST, or do you make an LLM where the basic data structure is an AST node rather than a character string (human-language word)?
It looks like they're running `webcrack` to deobfuscate/unminify and then asking the LLM for better variable names.
The frontier models can all work with both source code and ASTs as a result of their standard training.
Knowing this raises the question: which is better to feed an LLM, source code or ASTs?
The answer is really it depends on the use case, there are tradeoffs. For example keeping comments intact possibly gives the model hints to reason better. On the other side, it can be argued that a pure AST has less noise for the model to be confused by.
There are other tradeoffs as well. For example, any analysis relating to coding styles would require the full source code.
Would it be difficult to add a 'rename from scratch' feature? I mean a feature that takes normal code (as opposed to minified code) and (1) scrubs all the user's meaningful names, (2) chooses names based on the algorithm and remaining names (ie: the built-in names).
Sometimes when I refactor, I do this manually with an LLM. It is useful in at least two ways: it can reveal better (more canonical) terminology for names (eg: 'antiparallel_line' instead of 'parallel_line_opposite_direction'), and it can also reveal names that could be generalized (eg: 'find_instance_in_list' instead of 'find_animal_instance_in_animals').
Yes, I think you could use HumanifyJS for that. The way it works is that:
1. I ask the LLM to describe the meaning of the variable in the surrounding code
2. Given just the description, I ask the LLM to come up with the best possible variable name
You can check the source code for the actual prompts:
https://github.com/jehna/humanify/blob/eeff3f8b4f76d40adb116...
Is it possible to add a mode that doesn't depend on API access (e.g. copy and paste this prompt to get your answer)? Or do you make roundtrips?
There is a fully local mode that does not use ChatGPT at all – everything happens on your local machine.
API access is needed for the ChatGPT mode, as there are many round trips and it uses advanced API-only tricks to force the LLM output.
What kind of question does it ask the LLM? Giving it a whole function and asking "What should we rename <variable 1>?" repeatedly until everything has been renamed?
Asking it to do it on the whole thing, then parsing the output and checking that the AST still matches?
Looks useful! I will update the article to link to this tool. Thanks for sharing!
Finally someone else using ASTs while working with LLMs and modifying code! This is such an under-utilized area. I am also doing this with good results: https://codeplusequalsai.com/static/blog/prompting_llms_to_m...
Came here to say Humanify is awesome both as a specific tool and in my opinion a really great way to think about how to get the most from inherently high-temperature activities like modern decoder nucleus sampling.
+1
Thanks for your tool. Have you been able to quantify the gap between your local model and chatgpt in terms of ‘unminification performance’?
This post basically says that I don't need to document my code anymore. No more comments, they can be generated automatically. Hurray!
Unfortunately the comments that could be generated are exactly the ones that should never be written. You want the comment to explain why, the information missing from the code.
This is something I always disagreed with. In my experience, I'd rather read a short comment explaining the purpose of a block of code than try to decipher it. Yes, code "should speak for itself", but reading a comment is almost always faster than reading blocks of code. And then there is also documentation (if you include it in what you define as comments). I'd much rather go through a website, with a search function, examples, and descriptions, made with some docgen tool, than have to go through a library or programming language source code every time I need to remember how to do X, or whether object B has implemented function Y...
It's just a rule of thumb, like anything else. In most code, "why" is the hard part; I see that you are incrementing that account by a penny from out of the blue, but why? When you are in code where "what" is the hard part, like an implementation of a book algorithm or some tricky performance optimization, then by all means comment that.
Really all this rule amounts to is
// Increment by a penny
accountValue += 1
is a pointless comment, please don't do that. Schools had a way of accidentally teaching that by too-rigidly requiring "commented code", in situations where there wasn't much else to say, or situations where the students themselves didn't necessarily have a strong sense of "why". Any comment that isn't just literally "this is what the next line does" is probably useful to somebody at some point.
I do agree that documenting the why is way more important than the how/what. But having a short comment to summarize a block of code like:
// Parse the filename and remove the extension
let fext_re = Regex::new(r"(.\*)\.(.+)$").unwrap();
let page_cap = fext_re.captures(fname).unwrap();
let page_base_filename = page_cap.get(1).unwrap().as_str();
Is still useful. Instead of having to read the next few lines of code, I already know what they are supposed to do and expect. It makes discovery, later down the line, easier.
This would be entirely self-documenting by replacing that with a function named after what it does; then the comment isn't necessary.
To boot, a unit test could be written that would reveal the bug in the regular expression that makes it only work with filenames that have an asterisk before the extension. Unless you intended that (unlikely), in which case the comment is wrong/not comprehensive and misdirects the reader.
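For example, a sketch of that refactor in JavaScript (using the intended `(.*)` rather than the snippet's `(.\*)`):

// The name carries the comment's information, so the call site needs none.
function baseNameWithoutExtension(filename) {
  const match = /(.*)\.(.+)$/.exec(filename);
  return match ? match[1] : filename;
}

const pageBaseFilename = baseNameWithoutExtension(fname);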
If you didn't name your variables "fext_re" or "page_cap" you wouldn't need that comment to explain what the code does.
You can put these comments into the name of a function, getting rid of the redundancy and having them read by whoever would just be reading the code not to be distracted by the comments.
reading a comment is almost always faster than reading blocks of code
Not to a competent programmer when reading well-written code.
This also means that you read what the code does, rather than what a comment says the code does. Otherwise you will be blind to bugs. Any experienced developer will tell you that code very often doesn't do what the original programmer thought it did.
Not to a competent programmer when reading well-written code.
No, literally reading one line about what the next 4 lines do is mechanically faster. It does not matter whether you are good or bad; it is about simple reading speed.
This also means that you read what the code does, rather than what a comment says the code does. Otherwise you will be blind to bugs. Any experienced developer will tell you that code very often doesn't do what the original programmer thought it did.
I am an experienced developer. I have worked on several "legacy" projects, and started many from 0.
1. It does not make you blind to anything, it is just a way to learn/direct yourself in the code base faster.
2. Knowing what the original developer wanted is often as useful as knowing what the code actually does. More info is better than no info.
Even outdated comments can be useful.
For me, this type of thinking, that comments are unnecessary, that competent people can just read the code, etc., is actually a sign of a younger dev who never had to work on a long-lived codebase.
For me, this type of thinking, that comments are unnecessary, that competent people can just read the code, etc., is actually a sign of a younger dev who never had to work on a long-lived codebase.
It sounds like you're conflating "helpful comments that explain why" with "no comments are needed ever because read the code", and we're talking past each other.
Goes to show that it can be hard to have a meaningful conversation via text. Maybe we should add support for audio comment in code!
But the purpose is the Why; forced comments tend to tell you What the code does, which is better explained by the code itself.
A comment that is incorrect can do a lot of damage, and comments tend to drift out of sync with implementation details over time.
For forced comments I fully agree, especially for a function or class when the name already says what's on the tin.
I suspect you work at OpenAI and you're afraid that you will run out of training data.
what if the comment doesn't match what you intended to write?
Maybe it's better for the comment to match the code you wrote, not the code you intended to write.
// This is an integer containing the accountId
int accountId;
We truly live in the future
Okay, but if the unminified code doesn't match the minified code (as noted at the end "it looks like LLM response overlooked a few implementation details"), that massively diminishes its usefulness — especially since in a lot of cases you can't trivially run the code and look for differences like the article does.
[ed.: looks like this was an encoding problem, cf. thread below. I'm still a little concerned about correctness though.]
This refers to the fact that ChatGPT generated version is missing some characters that are used in the original example. Namely, [looks like HN does not allow me to paste unicode characters, but I am referring to the block characters] can be seen in their version, but cannot be seen in the ChatGPT generated version. However, it very well might be that it is simply because I didn't include all the necessary context.
Discrediting the entire output because of a few missing characters would be very pedantic.
Otherwise, the output is identical as far as I can tell by looking at it.
It's because the author miscopy-pasted the original code: those "â–‘â–’â–“â–ˆ" at the end of the O5 string are supposed to be the block characters. E.g. "â–‘" in Windows-1252 [0] is 0xE2 0x96 0x91 which, in UTF-8, is exactly the encoding for U+2591 LIGHT SHADE [1].
[0] https://en.wikipedia.org/wiki/Windows-1252#Character_set
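A quick way to reproduce that mojibake (a sketch in Node; the iconv-lite package is an assumption, any CP1252 decoder would do):

const iconv = require("iconv-lite");

const original = "░▒▓█";                            // the block characters
const utf8Bytes = Buffer.from(original, "utf8");     // E2 96 91, E2 96 92, ...
const misread = iconv.decode(utf8Bytes, "win1252");  // decode UTF-8 bytes as Windows-1252
console.log(misread);                                // "â–‘â–’â–“â–ˆ"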
Possible that this is the mistake.
However, I don't think I miscopied the original code.
https://reactive.network/assets/index-8b4ef4ac.js
If you look for `oahkbdpqwmZO0QLCJUYXzcvunxrjft` in the output, you should see that those characters appear exactly like that. Maybe an issue with encoding of the script file?
Most definitely; if I use "View >> Repair Text Encoding" in Firefox, it shows the block characters. But I have to admit, it's strange that Firefox does not choose UTF-8 by default in this case.
Yes, turns out I was the one who made the mistake.
I updated the article to reflect the mistake.
Update (2024-08-29): Initially, I thought that the LLM didn’t replicate the logic accurately because the output was missing a few characters visible in the original component (e.g., ). However, a user on HN forum pointed out that it was likely a copy-paste error.
Upon further investigation, I discovered that the original code contains different characters than what I pasted into ChatGPT. This appears to be an encoding issue, as I was able to get the correct characters after downloading the script. After updating the code to use the correct characters, the output is now identical to the original component.
I apologize, GPT-4, for mistakenly accusing you of making mistakes.
If no character set is specified, plain text content is assumed to be 1252. This probably extends to application/javascript as well but I'd have to check to be sure.
The web pre-dates utf-8, although not by much. Ken Thompson introduced utf-8 at winter Usenix in 1993 and CERN released the web in April, but it would be several more years before utf-8 became common. The early web was ISO 8859-1 by default. But people were pretty lazy about specifying character sets back then (still are actually) and Microsoft started sending or assuming their 1252 character set where 8859-1 was required by the spec. Eventually the spec was changed to match de facto behavior. I guess the assumption was that if you're too stupid or lazy to say what character set you're using, then it's probably 1252. (Today the assumption would be that it's probably utf-8). I'm not sure what the specs say today, but I think html is assumed to be in utf-8, and everything else is assumed to be 1252 (if the character set is not explicitly declared).
It does seem that the unminified code is very close to the original. In some cases ChatGPT even did its own refactoring in addition to the unminification:
// ORIGINAL:
j.useEffect(() => {
function r() {
n({ height: window.innerHeight, width: window.innerWidth });
}
if (typeof window < "u") return n({ height: window.innerHeight, width: window.innerWidth }), window.addEventListener("resize", r), () => window.removeEventListener("resize", r);
}, []),
// UNMINIFIED:
useEffect(() => {
const handleResize = () => {
setSize({ height: window.innerHeight, width: window.innerWidth });
};
// Initial size setting
handleResize();
window.addEventListener('resize', handleResize);
return () => {
window.removeEventListener('resize', handleResize);
};
}, []);
Note that the original code doesn't call `handleResize` immediately, but has its contents inlined instead. (Probably the minifier did the actual inlining.) The only real difference here is a missing `if (typeof window < "u")` condition.
The condition is a constant so it can be safely removed.
Only in the web environment. In fact the condition itself is true only when it runs in a web browser and not in a web worker.
You need to use another tool to do the actual renames, like HumanifyJS does:
He also told it to reimplement from JavaScript to TypeScript.
I would guess if he just told it to rename the variables and method first, it would have been closer to the original.
This is an example of superior intellectual performance to humans.
There’s no denying it. This task is intellectual. Does not involve rote memorization. There are not tons and tons of data pairs on the web of minified code and unminified code for LLMs to learn from.
The LLM understands what it is unminifying, and it is in general superior to humans in this regard. But only in this specific subject.
I'm bullish on AI, but I'm not convinced this is an example of what you're describing.
The challenge of understanding minified code for a human comes from opaque variable names, awkward loops, minimal whitespacing, etc. These aren't things that a computer has trouble with: it's why we minify in the first place. Attention, as a scheme, should do great with it.
I'd also say there is tons of minified/non-minified code out there. That's the goal of a map file. Given that OpenAI has specifically invested in web browsing and software development, I wouldn't be surprised if part of their training involved minified/unminified data.
Minification and unminification are heuristic processes, not algorithmic ones. It is akin to decompiling code or reverse engineering. It's a step beyond your typical AI you see in a calculator.
I don’t claim expertise in AI or understanding intelligence, but could we also say that a pocket calculator really understands arithmetic and has superior intellectual performance compared to humans?
I think I’d agree with your statement, in the same sense that a chess simulator or AlphaGo are superior to human intellect for their specific problem spaces.
LLMs are very good at a surprisingly broad array of semi-structured-text-to-semi-structured-text transformations, particularly within the manifold of text that is widely available on the internet.
It just so happens that lots of code is widely available on the internet, so LLMs tend to outperform on coding tasks. There’s also lots of marketing copy, general “encyclopedic” knowledge, news, human commentary, and entertainment artifacts (scripts, lyrics, etc). LLMs traverse those spaces handily as well. The capabilities of AI ultimately boil down to their underlying dataset and its quality.
Yeah, ok. Now count the number of Rs in this word.
This is just transforming text.
There are not tons and tons of data pairs on the web of minified code and unminified code for LLMs to learn from.
Are you sure about this? These can be easily generated from existing JS to use as a training set, not to mention the enormous amount of non-minified JS which is already used to train it.
Does not involve rote memorization. There are not tons and tons of data pairs on the web of minified code and unminified code for LLMs to learn from.
GPT-4 has consumed more code than your entire lineage ever will and understands the inherent patterns between code and minified versions. Recognizing the abstract shape of code sans variable names and mapping in some human readable variable names from a similar pattern you've consumed from the vast internet doesn't seem farfetched.
Here's a hint, STOP MINIFYING CODE! gzip over transport is enough.
Not exactly, because you still have to pay for every distinct identifier present in your code. Also, many minifiers do constant folding and inlining and remove comments, all of which strip redundant or unused information that gzip would otherwise merely compress.
I don’t think they’re saying that minifying provides no additional space savings, but rather that those additional savings are small and not worth the tradeoffs.
Not even that is true in my knowledge. For example, a particular benchmark [1] demonstrates that many popular libraries benefit substantially from minification even after gzip compression, with the savings ranging from 35% to 75%. Sure, a small library would be fine without any minification or even compression, but otherwise minification is clearly beneficial.
[1] https://github.com/privatenumber/minification-benchmarks
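If you want to check this for your own bundle, a quick sketch using Node's built-in zlib (file names are placeholders):

const fs = require("fs");
const zlib = require("zlib");

// Gzipped size of a file on disk, in bytes.
const gzippedSize = (file) => zlib.gzipSync(fs.readFileSync(file)).length;

console.log("gzip only     :", gzippedSize("bundle.js"), "bytes");
console.log("minify + gzip :", gzippedSize("bundle.min.js"), "bytes");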
I think you have to look at this in the context of an entire bundle or project, and then you have to weigh it against the download speeds you’re generally expecting for the users of your site or app.
I agree that as a blanket statement “gzip is enough” is not technically correct, but I think it’s largely correct in spirit, in that people tend to reach for minification by default, without really thinking about what they’re gaining.
If minifying saves you 200 KB overall, for example, and you expect your average user to have a 200 Mbps connection, you’re saving a grand total of 8 ms on page load, which is an imperceptible difference on its own. In exchange, you’re getting worse debugging, and worse error reporting.
I wonder if 200 KB is small enough that TCP slow start will be the constraint on a download over a new connection, rather than the bandwidth.
I think probably not, if the assets are coming from the same place, since the connection will be reused in most modern situations. Maybe if you’re loading the JS from a CDN though, and there are no other large resources, or those resources come from a different server.
Minification would indeed be useless under that set of assumptions, but the real world is much more variable and you need a comfortable margin. For example, mobile devices rarely sustain that much bandwidth all the time.
Comprehensively speaking, minification is only a small step in building a performant website or web application. You have way more things to do; for example, choosing a correct image compression format and method would have much more impact in general. But not everyone can be expected to understand them in depth, so we have best practices. Minification therefore qualifies as a good best practice, even though it is just a single one out of many others.
LLMs are trained to predict the next text. But examples like these look like they have also 'learned patterns'. If rot13 is applied to this minified code, will an LLM still find meaning in it? If it still could, it's more than just next tokens. Need to try it.
edit: ChatGPT found out that it's rot13 and couldn't explain the code directly without deobfuscating it first.
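For reference, the rot13 transform used here only touches letters and leaves digits, punctuation and whitespace alone (a tiny sketch):

const rot13 = (s) =>
  s.replace(/[a-z]/gi, (c) =>
    String.fromCharCode(
      ((c.charCodeAt(0) - (c <= "Z" ? 65 : 97) + 13) % 26) + (c <= "Z" ? 65 : 97)
    )
  );

console.log(rot13("function smb(){var a,b,c;}"));  // "shapgvba fzo(){ine n,o,p;}"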
Claude 3.5 Sonnet can natively speak double base64 encoded English. And I do mean it - you can double b64 encode something, send to it, and it'll respond as if it was normal English. Obviously base64 is a simpler transformation than rot13, but no GPT models can deal with double b64.
Obviously base64 is a simpler transformation than rot13
Is it? It’s probably more obscuring from an LLM’s perspective, assuming the LLM has seen enough rot13 text during training. Spaces and punctuation are untouched by rot13, unlike base64, which means that word and sentence boundaries will still be denoted by tokens that denote those boundaries in plaintext.
It appears that OpenAI's GPT-4 model can speak base64 as well. I jumped to your comment to see if anyone else had tried it after reading the OP. Double b64 I didn't try, but that is interesting.
$ ask4 ' what does dGhhdCBpcyBxdWl0ZSBpbnRlcmVzdGluZw== decode to? ' > A "dGhhdCBpcyBxdWl0ZSBpbnRlcmVzdGluZw==" is a Base64 encoded string. When decoded, It translates to "that is quite interesting" in English.
I asked Claude 3.5 Sonnet a question in Italian in rot13 and it replied in Italian in rot13, there are a few typos but it's perfectly understandable.
I tried with GPT-4o and it also responded in rot13, the response was on topic, but quite non-sensical, like GPT-2 or lower level.
However I can confirm that Claude was able to identify that it's rot13 and also respond properly.
An interesting use-case of this capability is refactoring, which, for me, ChatGPT has been unmistakably good at. It's amazing how I can throw garbage code I wrote at ChatGPT, ask it to refactor, and get clean code that I can use without worrying if it's going to work or not, because in 99% of cases it works without breaking anything.
What language(s), out of interest?
I use node.js, but I think it will work for anything. I recommend trying small chunks first vs dumping your whole file.
I've had pretty good results dumping entire files in to Sonnet3.5.
For example, "Here's my app.js file, please add an endpoint for one user to block another. Feel free to suggest schema changes. Please show me the full app.js with these changes implemented"
The model seems to be great at figuring out frameworks and databases just by seeing the contents of a full app.js file.
I do find this type of prompt works much better with Sonnet3.5 than GPT4o.
No, most other languages absolutely don’t work as well as JS, simply because there’s been less training material available. It’s useless with Rust, for example (hell, I’d be totally impressed if it has any idea how to appease the borrow checker!)
It is good at unminifying and "minifying" as well.
I have been doing the Leetcode thing recently, and even became a subscriber to Leetcode.
What I have been doing is I go through the Grind 75 list (Blind 75 successor list), look for the best big O time and space editorial answer, which often has a Java example, and then go to ChatGPT (I subscribe) or Perplexity (don't subscribe to Pro - yet) and say "convert this to Kotlin", which is the language I know best. Jetbrains IDE or Android Studio is capable of doing this, but Perplexity and ChatGPT are usually capable of doing this as well.
Then I say "make this code more compact". Usually I give it some constraints too - keep the big O space and time complexity the same or lower it, keep the function signature of the assigned function the same, and keep the return explicit, make sure no Kotlin non-null assertions crop up. Sometimes I continually have it run these instructions on each version of the iterated code.
I usually test that the code compiles and returns the correct answers for examples after each iteration of compacting. I also copy answers from one to the other - Perplexity to ChatGPT and then back to Perplexity. The code does not always compile, or give the right answers for the examples. Sometimes I overcompact it - what is clear in four lines becomes too confusing in three compacted lines. I'm not looking for the most compact answer, but a clear answer that is as compact as possible.
One question asked about Strings and then later said, what if this was Unicode? So now for String manipulation questions I say assume the String is Unicode, and then at the end say show the answer for ASCII or Unicode. Sometimes the big O time is tricky - it is time O(m+n) say, but since m is always equal to or less than n in the program, it is actually O(n), and both Perplexity and ChatGPT can miss that until it is explained.
People bemoan Leetcode as a waste of time, but I am wasting even less time with it, as ChatGPT and Perplexity are helping give me the code I will be demonstrating in interviews. The common advice I have heard from everywhere is don't waste time trying to figure out the answers myself - just look at the given answers, learn them, and then look for patterns (like binary search problems, which are usually similar), so that is what I am doing.
Initially I was a ChatGPT and Perplexity skeptic for early versions of those sites, in terms of programming, as they stumbled more, but these self-contained examples and procedures they seem well-suited for. Not that they don't hallucinate or give programs that don't compile, or give the wrong answers sometimes, but it saves me time ultimately.
This does seem to be a smart use of the tools available to skip the grind and get to the point of the leetcode questions.
However, I wonder about this: What will you do in a live interview situation? Will you pull up ChatGPT?
I have been told by people working in $200k+/$300k+ SWE jobs to look up at the answers and just be able to regurgitate something along the lines of the Grind 75 answers as a first step.
As a next step - even within these 75 questions, Grind 75's eighth answer and fourteenth answer are answered essentially the same way, as are other questions in there. So the next step would be to see these patterns (binary search, priority queues, sliding window, backtracking) and how to answer them, and then be able to solve them in slightly novel problems (in the more complex questions I understand one might run into more than one of these patterns).
You are doing it right. Pattern matching and lightning fast regurgitation are what is needed. There isn't enough time to "solve and implement".
This is a good way to do it IMO. Though I would say you don't want to just memorize answers; you want to fully understand them. Also, paying for LeetCode premium is very helpful since their official solutions are easy to understand and explain how you might arrive at these solutions yourself.
I recognized this a few months back when I wanted to see the algorithm that a website used to do a calculation. I just put the minified JS in ChatGPT and figured it out pretty easily. Let's take this a few steps out. What happens when a LLM can clone a whole SAAS app? Let's say I wanted to clone HubSpot. If an LLM can interact with a browser and figure out how a UI works and take code hints from un-minified code I think we could see all SAAS apps be commoditized. The backend would be proprietary, but it could figure out API formats and suggest a backend architecture.
All this makes me think AI's are going to be a strong deflationary force in the future.
I was with you until:
If an LLM can interact with a browser and figure out how a UI works and take code hints from un-minified code I think we could see all SAAS apps be commoditized. The backend would be proprietary, but it could figure out API formats and suggest a backend architecture.
whoooha! that's a lot of probing and testing of the SAAS that would be required in order to see how it behaved. SAAS aren't algorithms, they operate over data that's unseen on the front end as well...
All this makes me think AI's are going to be a strong deflationary force in the future.
I don't get this. I've literally never worked anywhere which had enough software engineers, we've been going on about software crisis for about 50 years and things are arguably worse than ever. The gap between the demand for good software (in the sense that allocating capital to producing it would be sensible) and the fulfillment of that demand is bigger than ever. We just don't have the mechanisms to make this work and to make it work at an economically viable level.
Then we get AI to help us and everyone thinks that the economy will shrink?
You wouldn't necessarily need to do much probing - consider that the documentation would provide numerous hints to the agent as to what each endpoint was actually doing.
Honestly, the value in most business software isn't the actual technology. It's the customer base and data held by the platforms.
Someone could already easily clone HubSpot relatively cheaply even if they hired developers, but that doesn't mean it will be anywhere near successful.
This might be fun:
Train on java compiled to class files. Then go from class back to java.
Or even:
Train on java compiled to class files, and have separate models that train from Clojure to class files and Scala to class files. Then see if you can find some crufty (but important) old java project and go: crufty java -> class -> Clojure (or Scala).
If you could do the same with source -> machine instructions, maybe COBOL to C++! or whatever.
I agree, it is fun!
LLM source recovery from binaries is a thing. The amazing part is that they are pretty good at adding back meaningful variable names to the generated source code.
This is something you don't need AI for, there are many decompilers out there already as well.
AI cannot even lint properly right now and you want it to decompile? Good luck. There's too much hype going on; do people really think this is possible this year?
In the end, always remember it's just autocomplete; it's pretty terrible at translations that are not natural language to natural language. I worked on a natural language to SQL tool and it was impossible to make it consistently generate valid SQL for Postgres, and I'm talking about natural language to SQL, not virtual machine instructions...
I think there are already decompilers and code analyzers at NSA like this. For 10 years or so.
I apologize, GPT-4, for mistakenly accusing you of making mistakes.
I am testing large language models against a ground truth data set we created internally. Quite often when there is a mismatch, I realize the ground truth dataset is wrong, and I feel exactly like the author did.
Apologizing to a program seems rather silly though. Do you apologize to your compiler when you have a typo in your code, and have to make it do all that work again?
If the compiler could listen and update its functions based on the tone of what I said to it, yes I probably would.
I use LLMs to assist with reverse engineering all the time right now. From minified, to binary, alongside Ghidra, its very helpful.
Can you provide more details? I'm curious about the performance and limitations of these models.
Like all LLMs, you greatly benefit from prior experience or you risk just falling for hallucinations, which is a limitation of a non-deterministic black box and degrades performance relative to the task. I've commented in other threads: LLMs are great at amplifying my output in an area I already have domain knowledge in. I think this is why people fail to realize any gains or give up; they think it will unlock areas they don't fully understand themselves. Blind leading the blind problem.
That's an interesting finding so far!
The provided code is quite complex, but I'll break it down into a more understandable format, explaining its different parts and their functionalities.
Reading the above statement generated by ChatGPT, I asked myself: will we live to see the day when these LLMs can take a large binary executable as input, read it, analyze it, understand it, and then reply with the above statement?
I followed up asking to "implement equivalent code in TypeScript and make it human readable" and got the following response.. To my surprise, the response is not only good enough, but it is also very readable.
What if that day comes and we can ask these LLMs to rewrite the binary code in [almost] any programming language we want? This would be exciting, yet scary, to just think about!
You should give it a try and report back! One easy way would be to take an open-source Android app, compile the APK, then decompile it and feed the bytecode to an LLM and ask it to write the java/kotlin equivalent and compare the source and LLM decoded one.
Hopefully it can help do this on emscripten files too and help adblockers decipher obfuscated code for that purpose.
Likewise for css class names
I can imagine that finetuning a model for this task could be very successful. Time for another AI startup.
Yet another surprising side effect of LLMs.
If the training data's included both unminified and minified libs, then is it such a stretch?
Is it though? The developer tabs have an unminify button which yields similar results. JavaScript minification is not hard in any way and the guessing of variable names is not that hard given such a simple code example.
Is there any reason why it’s ‘OpenAI’ in the title rather than ‘ChatGPT’?
It's the title of the blog post.
Don't know if this will apply directly here, but --
As someone who is "not a developer", I use the following process to help me:
1. I set up StyleGuide rules for the AI, telling it how to write out my files/scripts:
- Always provide full path, description of function, invocation examples, and version number.
- Frequently have it summarize and explain the project, project logic, and a particular file's functions.
- Have it create a README.MD for the file/project
- Tell it to give me mermaid diagrams and swimlane diagrams for the logic/code/project/process
- Close prompts with "Review, Explain, Propose, Confirm, Execute" <-- This has it review the code/problem/prompt, explain what it understands, propose what it's been asked to provide, confirm that it's correct (or I add more detail here), then execute and create the artifacts.
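If you drive the same process through an API instead of the chat UI, the rules can be pinned as a system prompt. A minimal sketch (assuming the OpenAI Node SDK; the prompt text and model are just examples, not my exact setup):

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // assumes OPENAI_API_KEY is set in the environment

// Condensed version of the style-guide rules above, sent with every request.
const STYLE_GUIDE = [
  "Always provide the full file path, a description of each function, invocation examples, and a version number.",
  "Summarize the project, its logic, and the current file's functions before proposing changes.",
  "Maintain a README.md and mermaid/swimlane diagrams for the project.",
  "Workflow for every request: Review, Explain, Propose, Confirm, Execute.",
].join("\n");

async function ask(question: string): Promise<string> {
  const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: STYLE_GUIDE },
      { role: "user", content: question },
    ],
  });
  return response.choices[0].message.content ?? "";
}
```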
I do this because Claude and ChatGPT are FN malevolent in their ignoring of project files/context - and they hallucinate as soon as their context window/memory fills up.
Further, they very frequently "forget" to refer to the uploaded project context files or the artifacts they themselves have proposed and written, etc.
But asking for a README with the code, mermaid diagrams, and the logic is helpful to keep me on track.
Agents like Aider or Plandex wrap that up nicely. They do the automatic review and have a very verbose description of the edit format. If you do that often manually, it may be worth testing their prepackaged approach.
Please let this one have knock-on effects on reverse engineering.
See Binary Ninja's Sidekick plugin
Had tweeted about this some time back. Found a component which was open source earlier, then removed, with only minified JS provided. Give the JS to Claude and get the original component back. It even gave good class names to the component and good function names.
Actually, this opens up a bigger question. What if I like an open source project but don't like its license? I could just prompt an AI with the open source code and ask it to rewrite it, or write it in some other language. I'd have to look up the rules on whether this is allowed or would be considered copying, and how a judge would prove it.
Most likely you would be found guilty, because intent matters. It is easy to check that the generated code is very similar to the original code, and you surely had a reason to bypass the original license. The exact legal reasoning would vary, but any reasonable lawyer would recommend against it.
In the historic Google v. Oracle suit, the only actual code that was claimed to be copied was a trivial `rangeCheck` function, but Google's intent and other circumstances like the identical code structure and documentation made it much more complicated, and the final decision completely bypassed the copyrightability of APIs possibly for this reason.
The site the post mentions for the original code (https://reactive.network/hackathon) is an accessibility nightmare.
The garbled text is included in the tree as relevant, pronounceable, and constantly changing text. Here's Chrome's accessibility tree: https://imgur.com/a/V1589Jr
(I'd love if a screen reader user could upload some audio of how awful this sounds, by the by)
Please use `aria-hidden="true"` for stuff like this, it just removes the element from the accessibility tree. I've also emailed Reactive a link to this thread.
Here is a decent intro to ARIA things: https://www.smashingmagazine.com/2022/09/wai-aria-guide/
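If the garbled text is injected client-side anyway, keeping it out of the tree is a one-liner (the class name here is hypothetical, just to illustrate the attribute):

```typescript
// Mark the decorative, constantly changing text as irrelevant to assistive technology.
document.querySelectorAll(".garbled-decoration").forEach(element => {
  element.setAttribute("aria-hidden", "true");
});
```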
* takes out soap box and stands on it *
We should go back to uncompiled JavaScript code, our democracy depends on it.
Or learn to read minified JS code, which is actually not that difficult! (But you do have to take notes to track identifiers.)
Most expensive unminify software in history
A human would be even more expensive.
LLMs are good at modeling and transforming text, news at 11. AI proponent hypes AI. I could go on, but I shouldn't have been this sarcastic to start with
You should be. I am facepalming at the topic and every single comment in here. It's so full of holes, the Swiss dairy industry went out of business.
it is also pretty good at decompiling - try feeding it the output of https://godbolt.org/
I recently learned this too, just a few months ago. Ended up making a frontend so I could do it automatically: https://decompiler.zeroday.engineering/
You can do this on minified code with beautifiers like js-beautify, for example. It's not clear why we need to make this an LLM task when we have existing simple scripts to do it?
Beautifiers will restore whitespace, but they won’t rename variables by inferring their semantic meaning.
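A small illustration (the renamed identifiers are hypothetical, the kind of names a model might pick):

```typescript
// Minified input:
//   function f(t,n){return t.filter(e=>e.active).map(e=>e.name).slice(0,n)}
//
// js-beautify / devtools pretty-print only restores layout:
//   function f(t, n) {
//       return t.filter(e => e.active).map(e => e.name).slice(0, n);
//   }
//
// An LLM can additionally infer intent and rename:
interface User {
  name: string;
  active: boolean;
}

function firstActiveUserNames(users: User[], limit: number): string[] {
  return users
    .filter(user => user.active)
    .map(user => user.name)
    .slice(0, limit);
}
```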
I find LLMs good at these kinds of tasks, like converting CSV to JSON for example (although you have to remind it not to be lazy and do the whole file).
I can see some ways to use this and still easily check that the LLM is not hallucinating parts of it: ask the LLM to unminify (or deobfuscate) some component, then have it write unit tests, then check by hand that the unit tests are meaningful and don't miss anything in the unminified code, then run the tests against the original minified version to confirm the LLM's work. Maybe set up some mutation testing if it is relevant.
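A minimal sketch of that final check (module and function names are hypothetical), using node:assert so the same human-reviewed cases run against both the original minified build and the LLM's rewrite:

```typescript
import assert from "node:assert/strict";
// Hypothetical: both files export the same function under test.
import { formatPrice as minified } from "./vendor.min.js";    // original minified build
import { formatPrice as readable } from "./formatPrice.js";   // LLM's readable rewrite

// Cases reviewed by a human first, then treated as ground truth for both versions.
const cases: Array<[number, string]> = [
  [0, "$0.00"],
  [19.999, "$20.00"],
  [-5, "-$5.00"],
];

for (const [input, expected] of cases) {
  assert.equal(minified(input), expected);
  assert.equal(readable(input), expected);
}
console.log("minified and unminified versions agree on all cases");
```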
I have to ask the obvious question: how do you know the unminified code is semantically equivalent to the minified code? If someone knows how to verify LLM code transformations for semantic fidelity then I'd like to know because I think that would qualify as a major breakthrough for programming languages and semantics.
And shockingly shit at writing articles that don't sound like essays.
Looks like the end is here for security via obscurity.
It is also shockingly good at converting/extracting data to CSV or JSON, but not JSONL. Even the less capable model, `gpt-4o-mini`, can "reliably" parse database schemas in various formats into CSV with the structure:
```csv
table_name,column_name,data_type
table_name,column_name1,data_type
table_name,column_name2,data_type
...
```
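For instance (a hypothetical example of my own), a schema line like `CREATE TABLE users (id INT PRIMARY KEY, email TEXT);` comes back as:

```csv
users,id,INT
users,email,TEXT
```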
I have been running it in production for months[1] as a way to import and optimize database schemas for AI consumption. This performs much better than including the `schema.sql` file in the prompt.
[1]: https://www.sqlai.ai/app/datasources/add/database-schema/ai-...
Are there any serious security implications for this? Of course obfuscation through minification won't work anymore, but I'm not sure if that's really all that serious of an issue at the end of the day.
That's interesting. It's gotten a lot better, I guess. A little over a year ago, I tried to use GPT to assist me in deobfuscating malicious code (someone emailed me asking for help with their WP site, which had been hacked via a custom plugin). I got much further just stepping through the code myself.
After reading through this article, I tried again [0]. It gave me something I could understand, though the code is obfuscated enough to essentially eval unreadable strings (via the Window object), so it's not enough on its own.
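The shape of the trick was roughly this (a reconstructed, hypothetical example, not the actual payload):

```typescript
// Indirect eval through the Window object, so neither "eval" nor the decoded
// source ever appears literally in the file.
const fn = ["e", "v", "a", "l"].join("");      // "eval", assembled at runtime
const scrambled = "YWxlcnQoJ293bmVkJyk=";      // stand-in payload; the real one was far messier
const payload = atob(scrambled);               // decodes to: alert('owned')
(window as any)[fn](payload);
```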
Here was an excerpt of the report I sent to the person:
For what it's worth, I dug through the heavily obfuscated JavaScript code and was able to decipher that it:
- Listens for a page load
- Invokes a facade of calculations which are in theory constant
- Redirects the page to a malicious site (unk or something)
[0] https://chatgpt.com/share/f51fbd50-8df0-49e9-86ef-fc972bca6b...
Have used Claude to reverse engineer some minified shopify javascript code recently. Definitely handy for unpicking things.
LLMs are very good at reading text. LLMs read tokenized text, while humans use their eyes to view words. Another scenario: ChatGPT is good at analyzing C++ template error messages, which are usually long and hard for humans to understand.
This is very close to how I often use LLMs [0]. A first step in deciphering code where I otherwise would need to, to use the authors words, power through reading the code myself.
It has been incredibly liberating to just feed it a spaghetti mess, ask to detangle it in a more readable way and go from there.
As the author also discovered, LLMs will sometimes miss some details, but that is alright as I will be catching those myself.
Another use case is when I understand what the code does, but can't quite wrap my head around why it is done in that specific way. Specifically, where the author of the code is no longer with the company. I will then simply put the method in the LLM chat, explain what it does, and just ask it why some things might be done in a specific way.
Again, it isn't always perfect, but more often than not it comes up with explanations that actually make sense, hold up under scrutiny, and give me new insights. It has actually prevented me once or twice from refactoring something in a way that would have caused me headaches down the line.
[0] ChatGPT, and more recently Open WebUI as a front end to various other models (mostly Claude variants) to see the differences. This also allows for some fun setups, like having different models review each other's answers.
Would have been cool if this had been used in that air con reverse engineering story yesterday.
I noticed while reading the blog entry that the author described using a search engine multiple times and thought, "I would have asked ChatGPT first for that."
I'm sure there's some number greater than zero of developers who are upset because they use minification as a means of obfuscation.
Reminds me of the tool provided in older versions of ColdFusion that would "encrypt" your code. It was a very weak algorithm, and it didn't take long for someone to write a decrypter. Nevertheless, some people didn't like this, because they had been using the tool thinking it was safe for selling their code without giving access to the source. (This was in the late 90s/early 2000s, before open source was the overwhelming default.)
Is this code available in ChatGPT's training data?
Tried hard, couldn't find any similar code.
I've tried using LLMs to deobfuscate libraries like fingerprintjs-pro, to understand what specific heuristics and implementation details they use to detect bots.
They mostly fail. A human reverse engineer will still do better.
punkpeye could also have asked the LLM to replace the cryptic function and variable names with nice ones. I'm hopeful it would have done a good job.
Wow, is OpenAI such great magic to you?
"Usually, I would just powerthrough reading the minimized code..."
Huh? Is this a thing? There are endless online code formatting sites. It takes two seconds. Why would anyone ever do this? I don't get it.
It’s not only their core strength — it’s what transformers were designed to do and, arguably, it’s all they can do. Any other supposed ability to reason or even retain knowledge (rather than simply regurgitate text without ‘understanding’ its intended meaning) is just a side effect of this superhuman ability.
I see your point, but I think there's more to it. It's kind of like saying "all humans can do is perceive and produce sound, any other ability is just a side-effect". We might be focusing too much on their mechanism for "perception" and overlooking other capabilities they've developed.
Sure, but that claim wouldn't be true for humans, right? So it's a non sequitur.
The relevant claim would be: all humans can do is move around in their environments, adapt the world around them through action, observe using adaptive sensory motor systems, grow and adapt their brains and bodies in response to novel and changing environments, abstract sensory motor techniques into symbolic concepts, vocalize this using inherited systems of meaning acquired as very young children in adaption within their environments, etc.
In the case of transformers all they can do is, in fact, sample from a compression of historical texts using a weighted probability metric.
If you project both of these into "problems an office worker has" space, then they can appear similar -- but this projection is an incredibly dumb one, offered as a sales pitch by charlatans looking to pretend that a system which can generate office emails can communicate.
To me, results like the Othello paper make any sort of "stochastic parrot" thinking completely untenable.
https://thegradient.pub/othello/
This result is an argument for the conclusion you are reading it as arguing against.
Abstract functions are fully representable by function approximations in the limit n -> inf; i.e., sampling from a circle becomes a circle as the number of samples goes to infinity.
This makes all "studies" whose aim is to approximate a fully representable abstract mathematical domain irrelevant to the question.
This is just more evidence of the naivety, mendacity, and pseudoscientific basis of ML and its research.
I don't think that's all they can do.
I think they know more than what is explicitly stated in their training sets.
They can generalize knowledge and generalize relationships between the concepts that are in the training sets.
They're currently mediocre at it, but the results we observe from SOTA generative models are not explainable without accepting that they can create an internal model of the world that's more than just a decompression algorithm.
I'm going to step away from LLMs for a moment, but: How are video generator models capable of creating videos with accurate shadows and lighting that is consistent in the entire frame and consistent between frames?
You can't do that simply by taking a weighted average of the sections of videos you've seen in your training set.
You need to create an internal 3D model of the objects in the scene, and their relative positions in space across the length of the video. And no one told the model explicitly how to do that, it learned to do it "on its own".
I think the same principle applies to LLMs.
Compression is understanding. If you have a model which explains shadows, you can compress your video data much better, since you "understand" how shadows work.
This overlooks how they do it. We don't really know. It might be logical reasoning; it might be a very efficient, content-addressable, human-knowledge-in-a-blob-of-numbers lookup table... It doesn't matter, because they work, sometimes scarily well. Dismissing their abilities because they "don't reason" is missing the forest for the trees, in that they'd be capable of reasoning if they were able to run SAT solvers on their output mid-generation.
Dismissing claims that LLMs "reason" because these machines perform no actions similar to reasoning seems pretty motivated. And I don't think "blindly take input from a reasoning capable system" counts as reasoning.
"pretty motivated"? Did you mean biased?
I assume they meant motivated as shorthand for "motivated reasoning" which implies a bias that's motivating them to reason a certain way
Does it? I think Blindsight (the book) had a good commentary on reason being a thing we think is a conscious process but doesn't have to be.
I think most people talking past each other are really discussing whether the GPT is conscious, has a mental model of self, that kind of thing. As long as your definition of reasoning doesn't include consciousness, it clearly does reason (though not well).
Hinton claims they do reason. I am going to go with Hinton on this.
One potential benefit should be that, with the right tooling around it, it should be able to translate your code base to a different language and/or framework more or less at the push of a button. So if a team is wondering whether it would be worth switching a big chunk of the code base from Python to Elixir, they don't have to wonder anymore.
I tried translating a Python script to JavaScript the other day and it was flawless. I would expect it to scale with a bit of hand-holding.
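A rough, hypothetical illustration of the kind of one-to-one translation that tends to go well:

```typescript
// Original Python:
//   def word_counts(path):
//       counts = {}
//       for word in open(path).read().split():
//           counts[word] = counts.get(word, 0) + 1
//       return counts
//
// The model's TypeScript equivalent would look something like:
import { readFileSync } from "node:fs";

function wordCounts(path: string): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const word of readFileSync(path, "utf8").split(/\s+/).filter(Boolean)) {
    counts[word] = (counts[word] ?? 0) + 1;
  }
  return counts;
}
```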
ChatGPT is trained well enough on all things AWS that it can do a decent job translating Python-based SDK code to Node and other languages, and translating between CloudFormation/Terraform/CDK (in various languages).
It does well at writing simple-to-medium-complexity automation scripts around AWS.
If it gets something wrong, I tell it to “verify your answer using the documentation available on the web”
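As a hedged illustration of how mechanical the SDK translation usually is: boto3's `s3.list_buckets()` maps to the AWS SDK for JavaScript v3 along these lines (error handling omitted for brevity):

```typescript
import { S3Client, ListBucketsCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });

// Equivalent of boto3's s3.list_buckets(): returns the bucket names in the account.
async function listBucketNames(): Promise<string[]> {
  const { Buckets } = await s3.send(new ListBucketsCommand({}));
  return (Buckets ?? []).map(bucket => bucket.Name ?? "");
}
```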
It was scary to me how chatting with GPT or Claude would give me information that was a lot clearer than what I could deduce after hours of reading AWS documentation.
Perhaps the true successor to Google search has arrived. One big drawback of Google was that a question couldn't be turned into a full, long conversation.
To that end, LLM chat is the ultimate Socratic learning tool to date.
ChatGPT is phenomenal for trying new techniques/libraries/etc. It's very good at many things. In the past few weeks I've used it to build me a complex 3D model with lighting/etc with Three.JS, rewrote the whole thing into React Three Fiber (also with ChatGPT), for a side project. I've never used Three.JS before and my only knowledge of computer graphics is from a class I took 20 years ago. For work I've used it to write me a CFN template from scratch and help me edit it. I've also used it to try a technique with AST - I've never used ASTs before and the first thing ChatGPT generated was flawless. Actually, most of the stuff I have it generate is flawless or nearly flawless.
It's nothing short of incredible. Each of those tasks would normally have taken me hours and I have working code in actual seconds.
And we are still at the beginning of this. Somewhat like where Google search was in the early 2000s.
As IDE integration grows and there are more and better models that can do this, we will unlock all sorts of productivity benefits.
There is still skepticism about making these work at scale, with regard to both the electricity and the compute required for a larger audience. But if they can get this to work, we might see a new-era tech boom bigger than anything we have seen before.
I see your point but that specific analogy makes me wince. Google search was way better in the 2000s. It has become consistently dumber since then. Usefulness doesn't necessarily increase in a straight line over time.
see projects like https://github.com/joshpxyne/gpt-migrate
I think there's also a YC company recently focusing on the nasty, big migrations with LLM help.
It seems that this kind of application could really change how the tech industry evolves down the line. Maybe we will converge on tech stacks more quickly if everyone can test new ones out "within a week".
The problem is that the use case is limited to situations where you don't care about the risk of hallucinations, or where you can validate the output without already having the data in a useful format. Plus you need to lack the knowledge/skill to do it more quickly using awk/python/perl/whatever.
That's why having good test suites and tools are more important than ever.
I think text transformation is a sufficiently predictable task that one could make a transformer that completely avoids hallucinations. Most LLMs run at a high temperature, which introduces randomness, and therefore hallucinations, into the result.
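For what it's worth, a sketch of what I mean (OpenAI Node SDK; the prompt and data are just examples): pinning temperature to 0 makes a pure text-transformation call as close to deterministic as the API allows, though it does not rule out a wrong answer.

```typescript
import OpenAI from "openai";

const client = new OpenAI();

// Greedy decoding: temperature 0 removes sampling randomness from the transformation,
// not the possibility of the model simply being wrong.
const completion = await client.chat.completions.create({
  model: "gpt-4o-mini",
  temperature: 0,
  messages: [
    { role: "system", content: "Convert the CSV the user provides to a JSON array of objects. Output JSON only." },
    { role: "user", content: "name,age\nAda,36\nGrace,45" },
  ],
});
console.log(completion.choices[0].message.content);
```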
Isn't this already their main use case for business? We use them primarily for extracting structured data from other forms.
Particularly those that are basically linear, that don’t involve major changes in the order of things or a deep consideration of relationships between things.
They can’t sort a list but they can translate languages, for instance, given that a list sorted almost right is wrong but that we will often settle for an almost right translation.