If LLMs can be taught to write very efficient assembly (or LLVM IR), what would it take to build a fully or semi-automatic LLM compiler from high-level languages, or even from pseudo-code or natural language?
The advantages could be monumental:
- arguably much more efficient utilization of resources on every compile target
- compilation is flexible rather than rule-based: an LLM won’t complain about a missing “;” because it can “understand” the intent
- it could rewrite much of the software we have today, working only from the disassembled binaries, to squeeze more out of the hardware
- could it convert an assembly block from ARM to RISC-V, and vice versa?
- potentially, iterative compilation (à la Open Interpreter) could also take runtime issues and exceptions into account, giving “live” assembly code that changes as issues arise
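
To make the iterative idea in the last bullet concrete, here is a minimal sketch of a compile-and-repair loop. It is only an illustration under loud assumptions: Python's built-in `compile()` stands in for a real compiler back end, and `toy_repair` is a hypothetical placeholder for the LLM call (it just adds a missing colon to `def` lines). The names `compile_with_feedback` and `toy_repair` are made up for this example.

```python
from typing import Callable, Optional


def compile_with_feedback(
    source: str,
    repair: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Return a version of `source` that compiles, or None if no round succeeds."""
    candidate = source
    for _ in range(max_rounds):
        try:
            # Built-in compile() stands in for a real compiler here.
            compile(candidate, "<candidate>", "exec")
            return candidate
        except SyntaxError as err:
            # Feed the diagnostic back to the repair step and try again.
            diagnostic = f"line {err.lineno}: {err.msg}"
            candidate = repair(candidate, diagnostic)
    return None


def toy_repair(source: str, diagnostic: str) -> str:
    """Hypothetical stand-in for an LLM: adds a missing ':' to def lines."""
    fixed = []
    for line in source.splitlines():
        stripped = line.rstrip()
        if stripped.lstrip().startswith("def ") and not stripped.endswith(":"):
            stripped += ":"
        fixed.append(stripped)
    return "\n".join(fixed) + "\n"


broken = "def add(a, b)\n    return a + b\n"
fixed = compile_with_feedback(broken, toy_repair)
```

The loop only proves that the repaired code compiles, not that it does what the author intended; a real system would also need tests or some equivalence check against a reference build.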
>> Any projects exploring this?
>> I feel it is an issue of dimensionality (i.e. “context” size), very similar to having a latent space for entire repos. Do you agree?
Frankly, I’d argue LLMs are not the right tool for this. At a fundamental level they don’t fit the job, given hallucination and a host of other factors, and they are also far too resource intensive right now.
Resource optimization at the compile stage isn’t necessarily a priority. You can use a cheap compiler to iterate and an expensive one to do a one-time optimization.
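
As a rough illustration of that cheap-versus-expensive split, the sketch below picks compiler flags for a fast iteration build or a slow release build. The helper `build_command` is hypothetical, and the driver name `cc` is an assumption about the local toolchain; `-O0`, `-g`, `-O3` and `-flto` are standard GCC/Clang options. In the scenario discussed here, the release step is where an expensive optimizer (LLM-based or not) would slot in.

```python
from typing import List


def build_command(source: str, output: str, release: bool = False) -> List[str]:
    """Return the compiler invocation for a cheap or an expensive build."""
    if release:
        # Slow, run once: aggressive optimization plus link-time optimization.
        flags = ["-O3", "-flto"]
    else:
        # Fast, run on every edit: no optimization, keep debug info.
        flags = ["-O0", "-g"]
    # "cc" is assumed to be the system C compiler driver.
    return ["cc", *flags, source, "-o", output]


dev_cmd = build_command("main.c", "main")
release_cmd = build_command("main.c", "main", release=True)
```

The returned list can be passed to `subprocess.run` as is; nothing is executed in the sketch itself.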
Agree on hallucinations… but it’s not a catch-all phrase.
Creativity comes from micro hallucinations :)