Really cool read! I always perk up when I see you've got something new out.
Thank you for reading! An above-average amount of work went into this post, so I appreciate it.
Copernicus is an interesting example. In some ways he actually took a step backward, because he was most interested in preserving uniform speeds for the planets in their orbits, which he thought more elegant than Ptolemy’s system of equants. That preference turned out to be wrong, and it meant that in the end Copernicus’s model wasn’t much more accurate than the alternatives, other than being right about the one big thing. I suppose LLMs fail because they can’t take that kind of step backward to simplify the system.
Right, that’s essentially the claim I’m trying to make: NNs have a bias towards complexity rather than simplicity. Sometimes people claim the opposite, that something about huge data sets or gradient descent biases the model towards simplicity, but I’ve never found such claims convincing. Besides, simplicity is not always the right bias either. As you point out, the Copernican model, the simplest one, was not more accurate than epicycles. The “right” level of simplicity or complexity is contextual.
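To make that concrete, here’s a toy sketch (my own construction, not from the post): treat epicycles as terms of a Fourier series and fit a noisy orbit by least squares. Each added epicycle lowers the training error, but past the true structure the extra terms just memorize the noise, so held-out error gets worse.

```python
# Toy illustration: "epicycles" as a Fourier series fit to a noisy orbit.
# More epicycles always shrink training error, but buy fit, not generalization.
import numpy as np

rng = np.random.default_rng(0)
t_train = np.linspace(0, 2 * np.pi, 30)
t_test = np.linspace(0, 2 * np.pi, 101)
true = lambda t: np.cos(t) + 0.3 * np.cos(2 * t)  # the "real" orbit signal
y_train = true(t_train) + 0.2 * rng.standard_normal(t_train.size)
y_test = true(t_test)

def design(t, k):
    # Columns: 1, cos(t), sin(t), ..., cos(kt), sin(kt) -- k "epicycles".
    cols = [np.ones_like(t)]
    for n in range(1, k + 1):
        cols += [np.cos(n * t), np.sin(n * t)]
    return np.column_stack(cols)

for k in (1, 3, 8, 14):
    w, *_ = np.linalg.lstsq(design(t_train, k), y_train, rcond=None)
    train_err = np.mean((design(t_train, k) @ w - y_train) ** 2)
    test_err = np.mean((design(t_test, k) @ w - y_test) ** 2)
    print(f"epicycles={k:2d}  train MSE={train_err:.4f}  test MSE={test_err:.4f}")
```

With 30 training points, 14 epicycles give 29 parameters, so the fit is near interpolation: training MSE collapses toward zero while test MSE grows. Nothing in the fitting procedure pushes back toward the two-term truth; the pressure toward simplicity has to come from somewhere else.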
great post, and so lucid! thank you for writing it up. i will be pondering it, it’s actually related to something i’m researching atm
Thanks Ulkar! Would be curious to hear more about what you're researching.
for one, i’m looking into what is considered an elegant experiment, and it seems like economy of means and simplicity are among the key features (as opposed to brute-force experiments). so it’s more about how we probe reality than about what our model of reality ends up being, though the two are related.
second, have you seen this paper? https://arxiv.org/abs/2505.24832 it’s about the memorization vs. generalization trade-off in LLMs, which seems related to the epicycles vs. compression/generalization contrast you wrote about here. i’m interested in this in light of using LLMs for literature-based discovery