All You Need Is Style Transfer
My Hovercraft Is Full of Python
People have recently discovered LLM translation tools that let you translate between more than just the standard set of human languages.
Instead of being limited to boring old languages like “English” and “Swahili”, these tools let you translate from anything you want to anything else. For example, you can set the target language to “LinkedIn”, plug in the Declaration of Independence, and recover such gems as, “after a long series of blockers and non-consensual acquisitions, it’s clear the current ‘King of Great Britain’ brand is a total mismatch for our culture. It’s giving absolute tyranny. 🚩”
Because you can put anything you want in the “to” field, the target language can be anything you imagine. LinkedIn is just one of a million options. You could also pop in a normal English sentence and get it back in the style of Matt Yglesias, Bernie Sanders, Barack Obama, or even George Costanza himself. “My favorite is the Gen Z translator,” writes one user. “I can finally communicate with my teenage daughter!” This is how you learn that government of the people, by the people, for the people, will never fall off.
This phenomenon is known as style transfer. And one surprise of the current AI era is that for some reason, style transfer is something Deep Learning1 does really really well.
This isn’t only possible in text — image tools can do the same thing. Pop in an image, name a style, and watch your dog slowly turn into a Monet. This one has been around for a while: check out this James Gurney blog post all the way back from 2016 marveling at the effect.
And it’s probably no mistake that one of the first practical uses of LLMs was in machine translation tools like Google Translate. Because as you can likely guess by now, Google Translate is also doing a form of style transfer. From the model’s perspective, “¿Quién eres y qué has hecho con mis zapatos?” is the same subject as “Who are you, and what have you done with my shoes?”, just in a different style.
I’m not sure why deep learning is so good at style transfer. Maybe it’s because LLMs are already generating a pastiche of the text that went into them, so it’s easy to ask for your pastiche in a different genre. Maybe it’s because neural networks see all things as existing at certain points in high dimensional spaces, so they find it easy to push a concept far out along any dimension you might choose, even if that dimension is “Golden Gate Bridge”.
Large language models have recently gotten much, much better at coding. In just a few years, models went from “struggling to write basic functions” to “building complete applications with minimal oversight”. They’re getting better fast, and show little sign of stopping. It’s natural to wonder how good they will be six months from now, and to wonder where the ceiling is, if indeed there is a ceiling at all.
There are a lot of software engineers and they’re all paid quite well, so as LLMs get better at writing, debugging, and reviewing code, people worry that all these high-paying programming jobs might disappear, possibly overnight. This could begin with something as simple as jobs lost and industries disrupted, but it also might spiral into tanking the whole economy. This concern is real and worth taking seriously, but there are bigger, more serious questions here that aren’t really about jobs at all.
Most of the time when technology improves, it improves incrementally. Cars get faster, commercial flights get cheaper, computers get smaller. For a long time, LLMs followed the same pattern. Language models got better and better at producing plausible-sounding text, but that was basically all they could do.
In 2016, LLMs would write recipes that started: “4 caam pruce 6 ½ Su ; cer”. So close! By 2021 they could take the string “I decided to call my blog…” and complete it as: “I decided to call my blog Vodka Logic because Vodka is the liquid I would choose for a well-balanced life.” This is a huge improvement, but it still looks like very good autocomplete. It was hard to beat the stochastic parrot accusations.
LLM coding seems different. Code has a serious logical structure. If you want to write code, you have to model the problem you’re trying to solve, catch internal contradictions, and reason about states that don’t yet exist. The bar on functional code is even higher. If the logic or syntax isn’t sound, it just doesn’t work. Coding seems like it requires abilities beyond sophisticated pattern matching. And now that language models have gotten good at coding, it’s getting harder to maintain the feeling that these systems are just very good autocomplete.
This is a new and not entirely comfortable situation to be in. LLMs look less and less like very sophisticated lookup tables and more like they’re starting to do something that resembles genuine thought. Maybe LLM coding is a sign that AGI, “artificial intelligence that can do everything humans can do on a computer” is nearly or already here, and humans are not only out of a job, but possibly on the verge of being obsolete.
But a different possibility is that LLM coding is simply another example of style transfer.
For humans, writing code is a long and tricky process of intricate logical reasoning. You start with a common-sense idea of what you want the program to do, but to actually make the program, you have to hold layers upon layers of rules in your head, work through recursive commands, account for double binds and ambiguities, and so on. If you have a brain that’s made out of meat, this process can be slow and difficult.
In fact, it’s difficult in a really familiar way. It’s difficult in almost exactly the same way as translating a sentence into a foreign language. In both cases, you start with something structured and meaningful, then you go through a complex process to re-render the same meaning in a new structure.
When you translate a sentence into a new language, you keep the meaning of the original, but change the surface features and the grammar. There’s a new format, but it’s fully mapped over the same content. To a language model, the Hungarian sentence „A légpárnás hajóm tele van angolnákkal”, the Japanese sentence “私のホバークラフトは鰻でいっぱいです”, and the English sentence “My hovercraft is full of eels” are all the same idea, just expressed in three different styles.
So it’s easy to imagine that to an LLM, the sentence “check to see if a number is odd” has the same core meaning as this code, with the only difference being the expressed style:
def is_odd(number):
if number % 2 == 1:
return True
else:
return FalseThese two bits of prose look different on the surface, but they’re just different mappings of nearly-identical terms and logic. The only difference between them is that one of them is in the style “English” and one of them is in the style “Python”. This is just style transfer again, the same thing we saw in the LinkedIn translator.
If you find this kind of hard to swallow, it might be just because language feels intuitive while, unless you’ve spent a truly staggering amount of time with it, code does not. But the style transfer / language translation of it all becomes somewhat more apparent when we compare code to code. Translating between programming languages looks a lot like translating between natural languages. Look at the two functions below — aren’t these just style transfer?
function isOdd(number) {
if (number % 2 === 1) {
return true;
} else {
return false;
}
}FUNCTION IsOdd(number)
IF number MOD 2 = 1 THEN
IsOdd = TRUE
ELSE
IsOdd = FALSE
END IF
END FUNCTIONThis might be less surprising when you consider that all LLMs are in some sense descended from Google Translate. The transformer architecture that powers your local chatbot was originally developed to solve machine translation problems, and the 2017 “Attention Is All You Need” paper that kicked off the modern LLM era came straight out of Google’s translation work. It’s all translation.
If LLMs are translation machinery that has been aggressively expanded to everything else, should it be any surprise they’re especially good at the task they were originally built for? If anything, it’s more of a surprise that you can push software designed for translation to the point where it will write you a grocery list or pretend to be your boyfriend.
So of course LLMs can nail a translation/style transfer — that’s what they were built to do, and everything else is an extension of that capability. Coding, making an email seem more business-casual, re-writing a parking ticket in the style of Jane Austen.2 These are all the same operation, just with more or less exotic target “languages”.
If LLM coding is just style transfer from natural language to code, then we shouldn’t find the recent advances in LLM coding all that surprising. The advances would still be very useful, but they wouldn’t represent a categorical step change in what LLMs are able to do. If LLM coding is “the same as translation,” that might still be impressive, but it’s not a new kind of reasoning. We already knew that LLMs are good at style transfer; this would just be evidence of them getting better, or of learning new styles or languages.
In a recent test, LLMs were asked to write code in esoteric programming languages like Brainfuck or the Shakespeare Programming Language, a language “designed to make programs look like Shakespearean plays”. Despite their strange nature, these languages are Turing-complete and use the same fundamental logic as Python or C#. Any coder who understands the logical core of programming should be able to chug along in these languages, even if it’s slow and difficult. But compared to Python or C#, they have many thousand times fewer GitHub repos, and orders of magnitude fewer examples for an LLM to learn from. This is a natural test of the question of understanding vs. translation: the exact same problems, just with radically less training data.
Sure enough, when asked to write code in these esoteric languages, LLMs perform abysmally. If they really had a logical understanding of code, they would be able to apply that logic to any arbitrary new syntax, however strange. If they’re just doing very sophisticated style transfer, then they would need many thousands of examples to be able to render an existing idea in a new language. That seems to be what’s happening.
A human coder should be able to ship functional code in an unknown language in a matter of days or hours. It might be a challenge, but they wouldn’t need millions of hours of experience or hundreds of thousands of examples in a new language. But LLMs do, so they are probably doing something different than the human is.
That said, I don’t know how far this argument can go. When an LLM turns instructions into a single python function, that does seem a lot like translation. But working code more than a few functions long requires some intellectual heavy lifting. Writing a large program involves maintaining coherent state across thousands of lines of code. And recent LLMs sometimes make connections and suggestions that go beyond translation, and seem like genuine flashes of insight.
But when you use English sentences to ask for a function or a piece of code, I think we can understand that process as being a lot like asking the LLM to translate an English argument into Hindi, Greek, or Italian. It’s something that comes naturally to them — as natural as taking their “Golden Gate Bridge” vector and turning it up to 11.
The technique behind AI tools like Large Language Models (LLMs).
“It is with a degree of reluctance—yet with an unwavering commitment to order and propriety—that the undersigned must inform you of a small irregularity in the stationing of your carriage (known in modern parlance as a motor vehicle).”












Very nice explicaton of what Large Language Models can and can't do. Yes, they can reproduce and translate what is in their training corpus but they can't really think, so they can't extend things beyond that corpus. As a techie, I think of it in terms of curve-fitting. A 9 degree polynomial will fit 10 data points exactly, but it gives bizarre results outside of that range.
What this goes to show is that computer programming is a practice, just like playing a musical instrument or painting a picture. But since it is for commercial purposes, creativity is not really valued. I have been known to mumble something along the lines of "If I had a dollar for every sort function I've written, I could buy ...", then sigh and get to work.
Vibe coding is going to throw 80% of all programmers out of work because it was all drudge work anyway, even though it required a lot of skill and the output was pure thought-stuff. But in the end, it is typically no more creative than the guy on the assembly line welding the right front panel onto the next Edsel that arrives at his station.
I once was in the position of staffing the software department of a small start-up. For every person who applied for the job, I just said "Write a bubble sort." I even described what a bubble sort is in a couple of sentences. Then I left them to their own devices for a half-hour - just them and a pad of paper. 80% of the people who came in could not do it. Of the people I hired, one was a young lady, Eve, whose mother was the head librarian in East Orange New Jersey. The room where I would leave the applicants happened to hold a wall full of books that came from our Chief Technical Officer, someone who had retired as an engineer from Bell Labs. Eve had finished the task and had pulled a book of computer algorithms from the shelf to check her work. She was the only person to do this. She felt guilty when I walked in on her, like she was cheating. I was delighted. I hired her on the spot.
Most programmers aren't like that. They take what they have been taught and just plug it in. At best, they will learn a new library of functions, but in the end it is all plug and play. They can be replaced by a robot just as sure as the assembly line welder was. In terms of a career, they are doomed.