I know, but can we blame the masses for misunderstanding AI when they are deliberately misinformed that transformers are the universe of AI? I think not!
Web 3(.0) always makes me think of the time around 14 years ago when Mark Zuckerberg publicly lightly roasted my room mate for asking for his predictions on Web 4.0 and 5.0.
Thinking "nearly everyone" has that precise definition of AI seems way more sloppy. Most people haven't even heard of OpenAI and ChatGPT still, but among people who have, they've probably heard stories about AI in science fiction. My definition of AI is any advanced computer processing, generative or otherwise, that's happened since we got enough computing power and RAM to do something about it, aka lately.
>Most people haven't even heard of OpenAI and ChatGPT still
What? I literally don't know a single person anymore who doesn't know what chatGPT is. In this I include several elderly people, a number of older children and a whole bunch of adults with exactly zero tech-related background at all. Far from it being only known to some, unless you're living in a place with essentially no internet access to begin with, chances are most people around you know about chatGPT at least.
For OpenAI, different story, but it's hardly little-known. Let's not grossly understate the basic ability of most people to adapt to technology. This site seems to take that to nearly pathological levels.
not an LLM, in case you're wondering. From the PyTheus paper:
> Starting from a dense or fully connected graph, PyTheus uses gradient descent combined with topological optimization to find minimal graphs corresponding to some target quantum experiment
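Roughly the flavor of what that quote describes, as I understand it: continuous weights on a dense graph are optimized by gradient descent, and edges whose weights shrink toward zero are pruned away until a minimal graph remains. A toy sketch only (not the actual PyTheus code; the loss below is a made-up stand-in for the real fidelity-to-target-state objective):

    import numpy as np

    rng = np.random.default_rng(0)
    n_edges = 28                      # e.g. a fully connected graph on 8 vertices
    w = rng.normal(size=n_edges)      # one continuous weight per edge

    def loss(w):
        # Stand-in objective: in PyTheus this would measure how far the graph's
        # quantum state is from the target state; here we just reward matching
        # an arbitrary sparse target, plus an L1 term that favours few edges.
        target = np.zeros(n_edges)
        target[:5] = 1.0              # pretend only 5 edges are really needed
        return np.sum((w - target) ** 2) + 0.1 * np.sum(np.abs(w))

    def grad(w, eps=1e-6):
        # Numerical gradient, to keep the sketch self-contained.
        g = np.zeros_like(w)
        for i in range(n_edges):
            d = np.zeros_like(w)
            d[i] = eps
            g[i] = (loss(w + d) - loss(w - d)) / (2 * eps)
        return g

    for step in range(2000):          # plain gradient descent on the edge weights
        w -= 0.05 * grad(w)

    # "Topological optimization" in the crudest sense: drop edges whose weights
    # became negligible, leaving a minimal graph.
    kept = np.where(np.abs(w) > 1e-2)[0]
    print(f"{len(kept)} of {n_edges} edges survive:", kept)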
"It added an additional three-kilometer-long ring between the main interferometer and the detector to circulate the light before it exited the interferometer’s arms."
Isn't that a delay line? The benefit being that when the undelayed and delayed signals are mixed, the phase shift you're looking for is amplified.
The article mentions that if students presented these designs, they'd be dismissed as ridiculous. But when AI presents them, they're taken seriously.
I wonder how many times these designs were dismissed because humans who think too far outside the box are dismissed. It seems that students are encouraged NOT to do so, severely limiting how far out they can explore.
Across basically all fields you have to first show that you can think inside the box before you are allowed to bring out-of-the-box ideas. Once you have shown that you mastered the craft and understood the rules you can get creative, but before that creativity is rarely valued. Doesn't matter if you are an academic or an artist, the same rules apply
I'm guessing AI gets the benefit of the doubt here because its ideas will be interesting and publishable no matter the outcome
You can do all the 'proving your chops' and in-the-box thinking in the world and still get ostracized for your creative insight.
* Semmelweis - medicine. Demonstrated textbook obstetric technique at Vienna General Hospital, then produced statistically impeccable data showing that hand-washing slashed puerperal fever mortality. Colleagues drove him out of the profession, and he died in an asylum.
* Barbara McClintock - genetics. Member of the National Academy, meticulous corn geneticist; her discovery of “jumping genes” (transposons) was ignored for 30 years and derided as “mysticism.”
* Georg Cantor - mathematics. Earned a Ph.D. and published dozens of orthodox papers before writing on transfinite numbers; was then declared “a corrupter of youth”. Career was blocked, contributing to a breakdown.
* Douglas Engelbart - computer science. Published conventional reports for years. When he presented the mouse, hypertext, and videoconferencing in “The Mother of All Demos” (1968), ARPA funding was slashed and he was professionally sidelined for the next twenty years.
Then you've got Stravinsky, Van Gogh, Caravaggio, James Joyce; all who displayed perfect 'classical' techniques before doing their own thing.
In economics you've got Joan Robinson and Elinor Ostrom.
And let's not forget Galileo. I'd even put Assange in this list.
So, "following the rules" before attempting to revolutionize your field doesn't seem to actually help all that much. This is a major problem, consistent across many centuries and cultures, which ought to be recognized more.
It's a cost-risk analysis. We have tried letting students do whatever, and most of the time it went nowhere, so we ended up with a more rational system (with many caveats) where experiments are proposed and people with good insight and a sense of whether it might even work approve them before they are run.
AI is going through the wild phase where people are letting it try things; as soon as the limits are understood, a framework of limitations and a rational system built around it will inevitably follow.
The "AI" here is not the same "AI" as claude, Grok or OpenAI. It's just an optimization algorithm that tries different things in parallel until it finds a better solution to inform the next round.
The standard term is Machine Learning (ML). It's not artificial intelligence (AI) because there is no sense of attempting to manipulate concepts (e.g., the classic symbolic conceptual manipulation algorithms, or modern foundation models).
"AI comes up with a bizarre short-form generative video genre that addicts user in seconds - but it works!" I'm guessing we're only a year or two away.
That’s how we become numb to the progress. Like think of this in the context of a decade ago. The news would’ve been amazing.
Imagine these headlines mutating slowly into “all software engineering performed by AI at a certain company” and we will just dismiss it as generic because being employed and programming with keyboards is old-fashioned. Give it twenty years and I bet this is the future.
You're taking intelligently designed specialized optimization algorithms like the one in this article and trying to use their credibility and success to further inflate the hype of general-purpose LLMs that had nothing to do with this discovery.
Are you insane? How is it hype if I said something cautionary and extremely negative? Why don’t you read what I wrote carefully before saying something that lacks awareness.
My commentary is negative and against AI. The commentary is on the genericness of the comment and how repeated inundation of hype has made us numb to hype.
A decade ago it wouldn't have been called AI, and it probably shouldn't be called AI today because it's absurdly misleading. It's a python program that "uses gradient descent combined with topological optimization to find minimal graphs corresponding to some target quantum experiment".
Of course today call something "AI" and suddenly interest, and presumably grant opportunities, increase by a few orders of magnitude.
That’s been called AI for about thirty years as far as I am aware. I’m pretty sure I first ran into it studying AI at uni in the 90s, reading Norvig’s Artificial Intelligence: A Modern Approach. This is just the AI Effect at work.
It is a simple iterative algorithm that goes from one point to the next. It doesn't even have memory of previous steps (caveat, the authors used BFGS which approximates the Hessian with previous gradient iterates, but this is still not AI). There is no finding weights or any such thing.
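For reference, the update that parenthetical alludes to: BFGS maintains an inverse-Hessian approximation built only from differences of successive iterates and gradients (standard textbook form, nothing specific to this paper):

    % with s_k = x_{k+1} - x_k,  y_k = \nabla f(x_{k+1}) - \nabla f(x_k),  \rho_k = 1 / (y_k^\top s_k):
    H_{k+1} = \left(I - \rho_k s_k y_k^\top\right) H_k \left(I - \rho_k y_k s_k^\top\right) + \rho_k s_k s_k^\top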
If every for loop is AI, then we might as well call everything AI. Can you pass me the AI, please?
Gradient descent is used in machine learning, which is a field in AI, to train models (eg. neural networks) on data. You get some data and use gradient descent to pick the parameters (eg. neural network weights) to minimise the error on that training data. You can then use your trained model by putting other data into it and getting its outputs.
The researchers in this article didn't do that. They used gradient descent to choose from a set of experiments. The choice of experiment was the end result and the direct output of the optimisation. Nothing was "learned" or "trained".
Gradient descent and other optimisation tools are used in machine learning, but long predate machine learning and are used in many other fields. Taking "AI" to include "anything that uses gradient descent" would just render an already heavily abused term almost entirely meaningless.
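To make the distinction concrete, a minimal sketch with toy numbers (nothing to do with the researchers' actual setup): in the ML case, gradient descent picks model parameters from training data and the fitted model is then reused on new inputs; in the paper's case, gradient descent directly picks the answer and nothing is reused.

    import numpy as np

    rng = np.random.default_rng(1)

    # ML use of gradient descent: fit parameters on training data, reuse the model.
    x_train = rng.uniform(-1, 1, 100)
    y_train = 3.0 * x_train + 0.5 + rng.normal(0, 0.1, 100)   # toy data
    a, b = 0.0, 0.0                                           # model: y = a*x + b
    for _ in range(2000):
        err = a * x_train + b - y_train
        a -= 0.1 * 2 * np.mean(err * x_train)                 # d(MSE)/da
        b -= 0.1 * 2 * np.mean(err)                           # d(MSE)/db
    print("learned model:", a, b)          # can now be applied to unseen inputs

    # Direct use of gradient descent: the minimiser itself is the end result.
    def design_quality(x):                 # stand-in for the experiment's figure of merit
        return (x - 2.0) ** 2 + 0.1 * np.sin(5 * x)
    x = 0.0
    for _ in range(2000):
        g = (design_quality(x + 1e-6) - design_quality(x - 1e-6)) / 2e-6
        x -= 0.05 * g
    print("chosen design parameter:", x)   # nothing was 'trained' or reused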
Hahah, if you're going to go that route you may as well call all of math "AI", which is probably where we're headed anyhow! Gradient descent is used in training LLM systems, but it's no more "AI" itself than e.g. a quadratic regression is.
Neural networks are all the hype now, but that doesn't mean there was no AI before them. There was; it struggled to solve some problems, and for some of them it found solutions. Today people tend to reject everything that is not a neural net as not "AI": if it is not a neural net, then it is not AI, just general CS. However, AI research generated a ton of algorithms for searching, and while gradient descent (I think) was not invented as part of AI research, AI research adapted the idea to discrete spaces in multiple ways.
OTOH, AI is very much about search in multidimensional spaces; it is so much about it that it would probably make sense to say that gradient descent is an AI tool. Not because it is used to train neural networks, but because the specialty of AI is search in multidimensional spaces. People probably wouldn't agree, just as they don't agree that the Fundamental Theorem of Algebra is not really about algebra (and not fundamental, btw). But the disagreement is not about the deep meaning of the theorem or of gradient descent; it is about tradition and "we always did it this way".
The AI rediscovered an interferometer technique the Russians found decades ago, optimized a graph in an unusual way, and came up with a formula to better fit a dark matter plot.
Ehhhhh, I'll say it's substantive and not just pure hype.
Yes the AI "resurfaced" the work, but it also incorporated the Russian's theory into the practical design. At least enough to say "hey make sure you look at this" - this means the system produced a workable-something w/ X% improvement, or some benefit that the researchers took it seriously and investigated. Obviously, that yielded an actual design with 10-15% improvement and a "wish we had this earlier" statement.
AFAICT the "AI" didn't "pay attention to the work" either. They built a representation of a set of possible experiments, defined an objective function quantifying what they wanted to optimise and used gradient descent to find the best experiment according to that objective function.
If I've understood it right, calling this AI is a stretch and arguably even misleading. Gradient descent is the primary tool of machine learning, but this isn't really using it the way machine learning uses it. It's more just an application of gradient descent to an optimisation problem.
The article and headline make it sound like they asked an LLM to make an experiment and it used some obscure Russian technique to make a really cool one. That isn't true at all. The algorithm they used had no awareness of the Russian research, or of language, or experimental design. It wasn't "trained" in any sense. It was just a gradient descent program. It's the researchers that recognised the Russian technique when analyzing the experiment the optimiser chose.
The discovery itself doesn't seem like the interesting part. If the discovery wasn't in the training data, then it's a sign AI can produce novel scientific research / experiments.
Your exchange has made me wonder. Yes, whatever AI produces is not genuine stuff. But there is something we could call "Shakespeare-ness", and maybe it is quantifiable.
What would a realistic Turing test for "Shakespeare-ness" look like?
Big experts on Shakespeare likely remember (at least vaguely) all his sonnets, so they cannot be part of a blinded study ("Did Shakespeare write this or no?"), because they would realize that they have never seen those particular lines, and answer based on their knowledge.
Maybe asking more general English Lit teachers could work.
Extra Terrible Lines are indeed fun. We've had 9 months of development since then, though; maybe it would make sense to repeat those experiments twice a year.
IIRC Scott Alexander is doing something similar with his "AI draws nontrivial prompts" bet, and the difference to last year's results was striking.
Also, this really needs blinding, otherwise the temptation to show off one's sophistication and subtlety is big. Remember how oenologists consistently fail to distinguish between a USD 20 and a USD 2000 wine bottle when blinded.
AI companies stole massive amounts of information from every book they could get. Do you really believe there's any research they don't have input into their training sets?
These days, it feels like “AI” basically just means neural network-based models—especially large autoregressive ones. Even convolutional neural networks probably don’t count as “real AI” anymore in most people’s eyes. Funny how things change. Not long ago, search algorithms like A* were considered the cutting edge of AI.
"it had no sense of symmetry, beauty, anything. It was just a mess."
Reminds me of the square packing problem, with the absurd-looking solution for packing 17 squares.
It also reminds me of edge cases in software engineering. When I let an LLM write code, I'm often confused by how it starts out, thinking I would have done it more elegantly. However, I quickly notice that the AI handled a few edge cases I would only have caught in testing.
Am I understanding the article correctly that they created a quantum playground, and then set their algorithm to work optimizing the design within the playground's constraints? That's pretty cool, especially for doing graph optimization. I'd be curious to know how compute-intensive it was.
This AI-designed experiment is pretty cool. It seemed kind of weird at first, but since it actually works, it’s worth paying attention to. AI feels more like a powerful tool that helps us think outside the box and come up with fresh ideas. Is AI more of a helper or a creator when it comes to research?
They should stop optimizing their Company Share Option Plans and get back to work!
(It was a gradient descent optimizer, so probably unconstrained optimization rather than a Constraint Satisfaction Optimization Problem, but it might have had constraints.)
Impressive results, I remember reading about AI-generated microstrip RF filters not too long ago, and someone already mentioned evolved antenna systems. We are suffering from a severe case of calling gradient descent AI at the moment, but if it gets more money into actual research instead of LLM slop, I'm all for it.
This is the kind of thing I like to see AI being used for. That said, as is noted in the article, this has not yet led to new physics or any indication of new physics.
The article is misleading and badly written. None of the mentioned works seem to have used language or knowledge based models.
It looks like all the results were driven by optimization algorithms, and yet the writing describes AI 'using' concepts and "tricks". This type of language is entirely inappropriate and misleading when describing these more classical (if advanced) optimization algorithms.
Looking at the paper in the first example, they used an advanced gradient descent based optimization algorithm, yet the article describes "that the AI was probably using some esoteric theoretical principles that Russian physicists had identified decades ago to reduce quantum mechanical noise."
Ridiculous, and highly misleading. There is no conceptual manipulation or intuition being used by the AI algorithm! It's an optimization algorithm searching a human coded space using a human coded simulator.
I agree with your point, but I think it's worth noting that there's a real problem of language today both in popular and scientific communication. On the one hand, in popular understanding, there's the importance of clearly separating the era of "machine learning" as let's say Netflix recommendations from the qualitative leap of modern AI, most obviously LLMs. This article clearly draws on the latter association and really leads to confusion, most glaringly in the remark you note that the AI probably took up some forgotten Russian text etc.
However, scientifically, I think there's a real challenge to clearly delineate from the standpoint of 2025 what all should fall under the concept of AI -- we really lose something if "AI" comes to mean only LLMs. Everyone can agree that numeric methods in general should not be classed as AI, but it's also true that the scientific-intellectual lineage that leads to modern AI is for many decades indistinguishable from what would appear to be simply optimization problems or the history of statistics (see especially the early work of Paul Werbos, where backpropagation is developed almost directly from Bellman's Dynamic Programming [1]). The classical definition would be that AI pursues goals under uncertainty with at least some learned or search‑based policy (paradigmatically, but not exclusively, gradient descent on a loss function), which is correct but perhaps fails to register the qualitative leap achieved in recent years.
Regardless -- and while still affirming that the OP itself makes serious errors -- I think it's hard to find a definition of AI that is not simply "LLMs" under which the methods of the actual paper cited [2] would not fall.
[1] His dissertation was re-published as The Roots of Backpropagation. Especially in the Soviet Union, important not least for Kolmogorov and Vapnik, AI was indistinguishable from an approach to optimization problems. It was only in the West that "AI" was taken to be a question of symbolic reasoning etc., which turned out to be an unsuccessful research trajectory (cf. the "AI winter").
[2] https://arxiv.org/pdf/2312.04258
I would distinguish between:
- methods that were devised with domain knowledge (= numerical methods)
- generic methods that rely on numerical brute forcing to interpolate general behaviour (= AI)
The qualitative leap is that numerical brute forcing is at a stage where it can be applied to useful enough generic models.
There's a fundamental difference between any ML based method and, say, classic optimization. Let's take a simple gradient descent. This solves a very specific (if general) class of problems: min_x f(x) where f is differentiable. Since f is differentiable, someone had the (straightforward) idea of using its gradient to figure out where to go. The gradient is the direction of greatest ascent, so -grad(f) comes as a good guess of where to go to decrease f. But this is local information, only valid at (or rather in the vicinity of) a point. Hence, short of improving the descent direction (which other methods do, like quasi-Newton methods, which allow a "larger vicinity" of descent direction pertinence), the best you can do is iterate along x - h grad(f) at various h and find one that is optimal in some sense. How this is optimal is all worked out by hand: it should provide sufficient decrease, while still giving you some room for progression (not too low a gradient), in the case of the Wolfe-Armijo rules, for example.
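For concreteness, a minimal sketch of that recipe (assuming a standard backtracking line search that enforces the Armijo sufficient-decrease condition; a full Wolfe line search would also check that the slope along the step has flattened enough):

    import numpy as np

    def f(x):   # a classic test function (Rosenbrock), standing in for any smooth objective
        return (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2

    def grad_f(x):
        return np.array([
            -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
            200 * (x[1] - x[0] ** 2),
        ])

    def gradient_descent(x, steps=20000, c1=1e-4):
        for _ in range(steps):
            g = grad_f(x)
            if np.linalg.norm(g) < 1e-8:
                break
            h = 1.0
            # Backtracking: shrink h until the step gives "sufficient decrease"
            # (the Armijo condition) along the descent direction -g.
            while f(x - h * g) > f(x) - c1 * h * (g @ g):
                h *= 0.5
            x = x - h * g
        return x

    # Converges toward the minimum at [1, 1], though slowly: steepest descent
    # struggles in curved valleys, which is exactly why quasi-Newton methods
    # like BFGS (mentioned elsewhere in this thread) exist.
    print(gradient_descent(np.array([-1.2, 1.0])))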
These are all unimportant details, the point is the algorithms are devised by carefully examining the objects at play (here, differentiable functions), and how best to exploit their behaviour. These algorithms are quite specific; some assume the function is twice differentiable, others that it is Lipschitzian and you know the constant, in others you don't know the constant, or the function is convex...
Now in AI, generally speaking, you define a parametric function family (the parameters are called weights) and you fit that family of functions so that it maps inputs to desired outputs (called training). This is really meta-algorithmics, in a sense. No domain knowledge is required to devise an algorithm that solves, say, the heat equation (though it will do so badly) or can reproduce some probability distribution. Under the assumption that your parametric function family is large enough that it can interpolate the behaviour you're looking for, of course. (correct me on this paragraph if I'm wrong)
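In that spirit, a tiny "generic interpolator": a one-hidden-layer network whose weights are fitted by gradient descent to reproduce a function it knows nothing about (sin here, purely as a stand-in for whatever behaviour you want interpolated; hyperparameters are loosely chosen, the point is the procedure, not the numbers):

    import numpy as np

    rng = np.random.default_rng(0)

    # The "behaviour" we want the generic model to interpolate.
    x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
    y = np.sin(x)

    # Parametric function family: y_hat = tanh(x W1 + b1) W2 + b2
    hidden = 32
    W1 = rng.normal(0, 1.0, (1, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, 1)); b2 = np.zeros(1)

    lr = 0.05
    for step in range(10000):
        h = np.tanh(x @ W1 + b1)              # forward pass
        y_hat = h @ W2 + b2
        err = y_hat - y
        dW2 = h.T @ err / len(x)              # hand-written backprop for 0.5*MSE
        db2 = err.mean(axis=0)
        dh = (err @ W2.T) * (1 - h ** 2)
        dW1 = x.T @ dh / len(x)
        db1 = dh.mean(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1        # gradient descent on the weights
        W2 -= lr * dW2; b2 -= lr * db2

    # The error drops steadily as the family is fitted to the data ("training").
    y_hat = np.tanh(x @ W1 + b1) @ W2 + b2
    print("mean squared error:", float(np.mean((y_hat - y) ** 2)))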
To summarize, in my (classic numerics trained) mind, classic numerics is devising methods that apply to specific cases and require knowledge of the objects at play, and AI is devising general interpolators that can fit to varied behaviour given enough CPU (or GPU as it were) time.
So, this article is clearly not describing AI as people usually mean it in academia, at least. I'll bet you $100 the authors of the software they used don't describe it as AI.
I think it's pretty clear that they suspect the mechanism underlying the model's output is the same as the mechanism underlying said theoretical principles, not that the AI was literally manipulating the concepts in some abstract sense.
I don't really get your rabid dismissal. Why does it matter that they are using optimisation models and not LLMs? Nobody in the article is claiming to have used LLMs. In fact the only mention of it is lower down where someone says they hope it will lead to advances in automatic hypothesis generation. Like, fair enough?
They write the "AI was probably using some esoteric theoretical principles." That is a direct quote of the article.
If it were an LLM-based model, this could be a correct statement, and it would suggest a groundbreaking achievement: the AI collated esoteric research, interpreted it correctly, and used that conceptual understanding to suggest a novel experiment. This might sound far-fetched, but we already have LLM-based systems doing similar things... Their written statement is plausible given the current state of hype (and also plausible, though it would be groundbreaking, given the current state of research).
In reality, the statement is incorrect. The models did not 'use' any concepts (and the only way to know that the article is wrong is to actually bother to consult the original paper, which I did).
The distinction matters: they implied something groundbreaking, when the reality is cool, but by no means unprecedented.
Tldr: using concepts is not something classic ML algorithms do. They thus erroneously imply a (groundbreaking) foundation-model-based (or similar) approach. I care because I don't like people being misled.
I think you're taking the statement way too literally. It's very clear to me what they are trying to communicate there - sure, you can read all sorts of things into a sentence like that if you try, but let's assume the best in people when there are unknowns, not the worst?
Again, the authors never said anything about language models. That's entirely on you.
What makes it clear to you that they don't mean what they explicitly write? What are you defending and why?
Philosophical discussions aside, it is entirely possible for current AI to use concepts (but the research they are describing does not employ that kind of AI).
I also think most lay people seeing the term AI are likely to think of something like ChatGPT.
It is a) literally incorrect what they write, and b) highly misleading to a lay person (who will likely think of something like ChatGPT when they read the term AI). Why are you defending their poor writing?
> What makes it clear to you that they don't mean what they explicitly write?
Because that's how language works - it's inherently ambiguous, and we interpret things in the way that makes the most sense to us. Your interpretation makes no sense to me, and requires a whole host of assumptions that aren't present in the article at all (and are otherwise very unlikely, like an AI that can literally work at the level of concepts).
> Why are you defending their poor writing?
I'm defending them because I don't think it's poor writing.
There are two ways to interpret the sentence we are discussing:
A: a grammatically false statement, saying that "the AI used theory", when they mean that "the AI's design can be understood using theory" (or more sloppy "that the design uses the theory").
B: a grammatically valid if contentious statement about an LLM based system (e.g., something like the AI Scientist paper) parsing theory and that being used to create the experiment design.
As I have explained, B is a perfectly valid interpretation, given the current state of the art. It is also the likely interpretation of a lay person, who is mainly exposed to hype and AI systems like chatGPT.
You favor interpretation A (as do I, knowing that interpretation B implies groundbreaking work).
Regardless, they a) introduce needless ambiguity that is likely to mislead a large proportion of readers. And b) if they are not actively misleading, then they have written something grammatically incorrect.
Both findings mean that the article is a sloppy and bad piece of writing.
This particular sentence is also only a particular example of how the article is likely to mislead.
Right - I hate that "AI" is just being used as (at best) a replacement term for ML, and it's very misleading for the public, who are being encouraged to believe that some general-purpose AGI-like capability is behind things like this.
The article is so dumbed down that it's not clear if there is even any ML involved or if this is just an evaluation of combinatorial experimental setups.
> The outputs that the thing was giving us were really not comprehensible by people,
> Adhikari’s team realized that the AI was probably using some esoteric theoretical principles that Russian physicists had identified decades ago to reduce quantum mechanical noise.
I'll chalk this one up to the Russians, not "AI".
This doesn't even seem to be ML, though.
Not-so-many years ago, this kind of work developing optimization algorithms would have been called optimization algorithms, not AI.
> We develop Urania, a highly parallelized hybrid local-global optimization algorithm, sketched in Fig. 2(a). It starts from a pool of thousands of initial conditions of the UIFO, which are either entirely random initializations or augmented with solutions from different frequency ranges. Urania starts 1000 parallel local optimizations that minimize the objective function using an adapted version of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm. BFGS is a highly efficient gradient-descent optimizer that approximates the inverse Hessian matrix. For each local optimization, Urania chooses a target from the pool according to a Boltzmann distribution, which weights better-performing setups in the pool higher and adds a small noise to escape local minima.
https://journals.aps.org/prx/abstract/10.1103/PhysRevX.15.02...
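For intuition, here is a heavily simplified sketch of that kind of pool-based local-global scheme. This is not the authors' code: the objective is a made-up rugged function standing in for the detector figure of merit, the dimensions are tiny, and scipy's stock BFGS plays the role of their adapted optimizer.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)
    DIM = 8                                    # stand-in for the detector parameters

    def objective(x):
        # Placeholder figure of merit: a bowl with many shallow local minima.
        return np.sum((x - 1.0) ** 2) + 0.5 * np.sum(np.sin(3 * x) ** 2)

    # Pool of candidate setups, seeded randomly.
    pool = [rng.uniform(-3, 3, DIM) for _ in range(50)]
    scores = [objective(x) for x in pool]

    for _ in range(200):
        # Boltzmann-weighted choice: better setups are more likely to be picked,
        # but worse ones still get a chance; small noise helps escape local minima.
        s = np.array(scores)
        weights = np.exp(-(s - s.min()) / 2.0)
        i = rng.choice(len(pool), p=weights / weights.sum())
        start = pool[i] + rng.normal(0, 0.1, DIM)

        # Local optimization from that start (BFGS, as in the quoted description).
        res = minimize(objective, start, method="BFGS")

        # Keep the result if it beats the worst member of the pool.
        worst = int(np.argmax(scores))
        if res.fun < scores[worst]:
            pool[worst], scores[worst] = res.x, res.fun

    print("best objective found:", min(scores))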
This irks me to no end. Why not just call it an applied-mathematics algorithm, if the goal is to avoid specific terms, rather than AI? Is grep AI? Is your web browser AI?
> Initially, the AI’s designs seemed outlandish. “The outputs that the thing was giving us were really not comprehensible by people,” Adhikari said. “They were too complicated, and they looked like alien things or AI things. Just nothing that a human being would make, because it had no sense of symmetry, beauty, anything. It was just a mess.”
This description reminds me of NASA’s evolved antenna from a couple of decades ago. It was created by genetic algorithms:
https://en.wikipedia.org/wiki/Evolved_antenna
There was something similar about using evolutionary algorithms to produce the design for a mechanical piece used to link two cables or anchor a bridge’s cable, optimizing for weight and strength.
The design seemed alien and somewhat organic, but I can’t seem to find it now.
“Topology optimization” is probably what you’re thinking of. All current versions of it result in these similar blobby, spider-web, vaguely alien and somewhat organic structures.
Looking at things like bicycles designed this way leaves me suspicious that it doesn’t actually have the power to derive interesting insights about material properties. I suspect future versions may end up starting to look more mechanical as it discovers that, for example, something under tension should be a straight line.
Shape optimization can be done under constraints of manufacturability, which would avoid exotic shapes that can't be worked with.
Why would that be the case, though? It's conceivable that optimal shapes be very different from what our intuitions suggest.
I had it stored in my knowledgebase as "alien design" ;) https://medium.com/intuitionmachine/the-alien-look-of-deep-l...
That evolved antenna looked like something cobbled together by a drunk spider
or anyone trying to get a decent signal on their TV
Reminds me a bit of chess engines that crush the best humans with ease but play moves that human players can identify as "engine moves". In chess the environment is fixed by the rules so I'd assume this deeper understanding of underlying patterns is only amplified in more open environments.
That reminds me of this article:
https://www.damninteresting.com/on-the-origin-of-circuits/
They used genetic algorithms to evolve digital circuits directly on FPGAs. The resulting design exploited things like electromagnetic interference to end up with a circuit much more efficient than a human could've created.
In my mind this brings some interesting consequences for 'AI apocalypse' theories. If the AI understands everything, even an air gap might not be enough to contain it, since it might be able to repurpose some of its hardware for wireless communication in ways that we can't even imagine.
In practice, we'll just let that AI have a direct internet connection, and also give it enough access to push code straight to prod. For good measure.
Don't forget about full control of the terminators.
I don't remember the title, but someone wrote a story where an AI would use the (imperceptible) flickering of a fluorescent lightbulb and a camera to transmit information across such an "air gap".
Not unheard of, there was a paper about doing keylogging through the tiny EM fluctuations from a computer's power supply for example: https://ieeexplore.ieee.org/document/10197022
Or data exfiltration through fan noise (60 bits/min): https://www.sciencedirect.com/science/article/abs/pii/S01674...
Or data transfer between computers using only speakers: https://arxiv.org/abs/1803.03422
The list goes on.
There's a comic out right now positing that a sufficiently intelligent AI with appropriate access could use imperceptible (to us) vibrations from mechanical computing parts like spinning-rust HDDs, etc.
It's a throwaway mechanic in the comic, but it seems plausible.
In certain places the power companies are/were passing time information throughout the whole grid - https://www.nist.gov/publications/time-and-frequency-electri...
That's not a comic, and it's not artificial superintelligence: https://arxiv.org/abs/1606.05915
Whatever AI comes up with by 2030 is going to be much more clever and unexpected.
You don't need an AI to come up with remote sensing or air gap traversal capabilities though.
Note for example TEMPEST surveillance, or using a distant laser to pick up speech in a room based on window vibrations. Air-gap traversal is easily done by exploiting human weaknesses (e.g. curiosity to pick up a USB drive to see what's on it), and was successfully done by Stuxnet.
I remember reading about this, fun times.
The bias is a handicap: the looking for beauty, symmetry, an explanation, a story. It's all goggles upon goggles of warping lenses and funhouse mirrors, hiding and preventing the perception of truth.
Zero is taught routinely to primary schoolers today, but it was a hard thing to come up with for the scholars who struggled to nail down as smooth a concept as the one we know now.
The bias toward familiarity is detrimental to edge research, but on the other hand, if no one smooths the baseline, the most advanced knowledge will remain just that and will never reach its full utility to humans. Finding the proper set of concepts that makes it click can be very complicated. Finding a communicable, simple thought framework that lets others also enjoy it and leverage it to go further can be at least as hard.
[mandatory GA antenna post requirement satisfied]
That evolved antenna is a piece of wire with exactly 6 bends. It's extremely simple, the exact opposite of a hard to understand mess.
This physics experiment:
> Just nothing that a human being would make, because it had no sense of symmetry, beauty, anything. It was just a mess.
NASA describing their antenna:
> It has an unusual organic looking structure, one that expert antenna designers would not likely produce.
— https://ntrs.nasa.gov/citations/20060024675
The parallel seems obvious to me.
Maybe not so much the implications. If our science is defined by symmetry, beauty, anything - and it is, because so much of physics is literally about looking for symmetries of various kinds - why are we ignoring the loud hints from ML solutions that this is a limiting heuristic?
> why are we ignoring the loud hints from ML solutions that this is a limiting heuristic?
This comes up a lot and always strikes me as rather anti-science, even anti-rationality in general. To speed run the typical progression of this argument, someone says alchemy and astrology occasionally "work" too if you're determined to ignore the failures. This point is then shot down by a recap about the success of QM despite Einstein's objections, success of the standard model even with lots of quasi-empiricism etc, etc.
Structurally though.. if you want to claim that the universe is fundamentally weird and unknowable, it's very easy to argue this, because you can always ignore the success of past theory and formalisms by saying that "it was nice while it lasted but we've squeezed all the juice out of that and are in a new regime now". Next you challenge your detractors to go ahead and produce a clean beautiful symmetric theory of everything to prove you wrong. That's just rhetoric though, and arguments from model/information/complexity theory etc about fundamental limits on what's computable and decidable and compressible would be much more satisfying and convincing. When does finding a complicated thing that works actually rule out a simpler model that you've missed? https://en.wikipedia.org/wiki/Minimum_description_length#MDL...
Because you can never be sure with the ML stuff. Perhaps it was one iteration away from finding a solution that was better and also symmetric. Perhaps it is a great, but not optimal, local maximum.
Lacking symmetry, it's extremely hard to understand how the antenna actually works (i.e. why those six bends, as opposed to any other random six bends).
My best guess is that the edges are oriented such that at the tested frequencies they cause constructive interference inside the antenna, thereby boosting the signal. The orientation is weird because that's probably the best way to make it work in all directions; if the edges were in a flat plane, the constructive interference would only work in a single direction.
I mean sure, but how do you figure out what directions and angles to bend it in? I don't know much about signals and radio and stuff, but it feels to me like this could only be achieved through trial and error until the ideal was found, which is what evolutionary algorithms are designed for.
I love the idea of faith based technology, where it just works but nobody is capable of comprehending why or how.
I've had bugs go away when I added `print("working fine until here")` to the preceding line. So if someone told me "this line is needed but I don't know why", I wouldn't even blink.
This is also quite common in medicine...
Praise the Omnissiah!
Go ahead then, explain to us how the exact values of the 6 angles make it work so well
Hexagons for the win!
Referring to this type of optimization program just as “AI” in an age where nearly everyone will misinterpret that to mean “transformer-based language model” seems really sloppy
Referring to this type of optimization as AI in the age where nearly everybody is looking to fund transformer-based language models and nobody is looking to fund this kind of optimization is just common sense though.
You are both right. Because the term "AI" is so vague and can mean so many things, it will be used and abused in various ways.
For me, when someone says, "I'm working on AI", it's almost meaningless. What are you doing, actually?
[dead]
I think it's actually this repo:
https://github.com/artificial-scientist-lab/GWDetectorZoo/
Nothing remotely LLM-ish, but I'm glad they used the term AI here.
How can one article be expected to fix the problem of people sloppily using “AI” when they mean LLM or something like that?
I use "ML" when talking about more traditional/domain specific approaches, since for whatever reason LLMs haven't hijacked that term in the same way. Seems to work well enough to avoid ambiguity.
But I'm not paid by the click, so different incentives.
I like that.
AI for attempts at general intelligence. (Not just LLMs, which already have a name … “LLM”.)
ML for any iterative inductive design of heuristical or approximate relationships, from data.
AI would fall under ML, as the most ambitious/general problems. And it's likely best treated as time- (year-) relative, i.e. a moving target, as the quality of general models continues to improve in breadth and depth.
Generative AI vs artificial neural network is my go-to (though ML is definitely shorter than ANN, lol).
Huge amounts of ML have nothing to do with ANNs, and transformers are ANNs.
I stand corrected! What are your go-tos?
Not the person you're replying to, but there are tons of models that aren't neural networks. Triplebyte used to use random forests [1] to decide whether to pass or fail a candidate given a set of interview scores. There are a bunch of others, too, like naive Bayes [2] or k-nearest-neighbors [3]. These approaches tend to need a much smaller training set and far less compute than neural networks, at the cost of being substantially less complex in their reasoning (but you don't always need complexity). A toy sketch follows the links below.
[1] https://en.wikipedia.org/wiki/Random_forest
[2] https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Trainin...
[3] https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
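Here's the promised toy sketch for the random-forest case, with made-up interview scores (not Triplebyte's actual system), assuming scikit-learn is installed:

```python
from sklearn.ensemble import RandomForestClassifier

# Each row: a candidate's scores on three interview sections; label: 1 = pass, 0 = fail.
X = [[4, 3, 5], [2, 2, 3], [5, 5, 4], [1, 3, 2], [4, 4, 4], [2, 1, 2]]
y = [1, 0, 1, 0, 1, 0]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y)

print(clf.predict([[3, 4, 4]]))   # pass/fail prediction for a new candidate
```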
Just don't use the term AI. It has no well defined meaning and is mostly intended as a marketing term
So, "don't do marketing" is your advice?
Correct, "an editorially independent online publication launched by the Simons Foundation in 2012 to enhance public understanding of science" shouldn't be doing marketing and contributing to the problem.
Just don't use "AI" for anything except LLMs anymore, the same way the crypto scam has taken over the word "crypto".
crypto must now be named cryptography and AI must now be named ML to avoid giving the scammers and hypers good press.
> AI must now be named ML
You just made a lot of 20th century AI researchers cry.
Ocaml users too. And Haskell.
Yep. I dislike it just as much as ceding crypto, but at the end of the day language changes, and clarity matters.
I think image and video generation that aren't based on LLMs can also use the term AI without causing confusion.
By doing its part and using the term correctly.
The real problem is not people using the term incorrectly, it's papers and marketing material using the term incorrectly.
Lets be real here, the people with the money bags don't care either.
This exact kind of sloppy equivocation does seem to be one of the major PR strategies that tries to justify the massive investment in and sloppy rollout of transformer-based language models when large swaths of the public have turned against this (probably even more than is actually warranted)
Yea, I can tolerate it when random business people do it. But scientists/tech people should know better.
While the term is misleading as a title nowadays, I found it refreshing to see it used in the traditional sense.
I know, but can we blame the masses for misunderstanding AI when they are deliberately misinformed that transformers are the universe of AI? I think not!
That's how I feel about Web 3.0...
Web 3(.0) always makes me think of the time around 14 years ago when Mark Zuckerberg publicly lightly roasted my room mate for asking for his predictions on Web 4.0 and 5.0.
Thinking "nearly everyone" has that precise definition of AI seems way more sloppy. Most people haven't even heard of OpenAI and ChatGPT still, but among people who have, they've probably heard stories about AI in science fiction. My definition of AI is any advanced computer processing, generative or otherwise, that's happened since we got enough computing power and RAM to do something about it, aka lately.
Then that definition is at odds with how the field has used it for many decades.
You can have your own definition of words but it makes it harder to communicate.
>Most people haven't even heard of OpenAI and ChatGPT still
What? I literally don't know a single person anymore who doesn't know what chatGPT is. In this I include several elderly people, a number of older children and a whole bunch of adults with exactly zero tech-related background at all. Far from it being only known to some, unless you're living in a place with essentially no internet access to begin with, chances are most people around you know about chatGPT at least.
For OpenAI, different story, but it's hardly little-known. Let's not grossly understate the basic ability of most people to adapt to technology. This site seems to take that to nearly pathological levels.
I'll bet that almost everyone who reads Quanta Magazine knows what they mean by AI.
Absolutely agree.
not an LLM, in case you're wondering. From the PyTheus paper:
> Starting from a dense or fully connected graph, PyTheus uses gradient descent combined with topological optimization to find minimal graphs corresponding to some target quantum experiment
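To make that concrete, here's a toy sketch of that kind of loop, assuming numpy (this is not the actual PyTheus code, and the loss is a placeholder for however the real program scores the quantum experiment a graph encodes): gradient descent on continuous edge weights of a dense graph, followed by pruning of near-zero edges, which is where the "topological" simplification comes in.

```python
# Toy sketch only: gradient descent over edge weights, then prune to a minimal graph.
import numpy as np

rng = np.random.default_rng(0)
n_edges = 15                          # edge count of a small dense graph
w = rng.normal(size=n_edges)          # one continuous weight per edge

target = np.zeros(n_edges)            # placeholder: pretend only edges 0, 3, 7 matter
target[[0, 3, 7]] = 1.0

def grad(w):
    # Gradient of the placeholder loss sum((w - target)**2); the real objective
    # would score the experiment that the weighted graph represents.
    return 2.0 * (w - target)

for _ in range(500):
    w -= 0.05 * grad(w)               # plain gradient descent steps

kept = np.flatnonzero(np.abs(w) > 1e-3)   # drop negligible edges ("topological" pruning)
print("edges kept:", kept)                # -> [0 3 7]
```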
This sounds similar to evolved antennas https://en.wikipedia.org/wiki/Evolved_antenna
There are a few things like that where we can throw AI at a problem and it generates something better, even if we don't know exactly why it's better yet.
"It added an additional three-kilometer-long ring between the main interferometer and the detector to circulate the light before it exited the interferometer’s arms."
Isn't that a delay line? The benefit being that when the undelayed and delayed signals are mixed, the phase shift you're looking for is amplified.
Sounds like ring lasers. Not really an unusual concept to increase sensitivity.
The article mentions that if students presented these designs, they'd be dismissed as ridiculous. But when AI presents them, they're taken seriously.
I wonder how many times these designs were dismissed because humans who think out of the box too much are dismissed. It seems that students are encouraged NOT to do so, severely limiting how far out they can explore.
Across basically all fields you have to first show that you can think inside the box before you are allowed to bring out-of-the-box ideas. Once you have shown that you mastered the craft and understood the rules you can get creative, but before that creativity is rarely valued. Doesn't matter if you are an academic or an artist, the same rules apply
I'm guessing AI gets the benefit of the doubt here because its ideas will be interesting and publishable no matter the outcome
You can do all the 'proving your chops' and in-the-box thinking in the world and still get ostracized for your creative insight.
* Semmelweis - medicine. Demonstrated textbook obstetric technique at Vienna General Hospital, then produced statistically impeccable data showing that hand-washing slashed puerperal fever mortality. Colleagues drove him out of the profession, and he died in an asylum.
* Barbara McClintock - genetics. Member of the National Academy, meticulous corn geneticist; her discovery of “jumping genes” (transposons) was ignored for 30 years and derided as “mysticism.”
* Georg Cantor - mathematics. Earned a Ph.D. and published dozens of orthodox papers before writing on transfinite numbers; was then declared “a corrupter of youth”. Career was blocked, contributing to a breakdown.
* Douglas Engelbart - computer science. Published conventional reports for years. When he presented the mouse, hypertext, and videoconferencing in “The Mother of All Demos” (1968), ARPA funding was slashed and he was professionally sidelined for the next twenty years.
Then you've got Stravinsky, Van Gogh, Caravaggio, James Joyce; all who displayed perfect 'classical' techniques before doing their own thing.
In economics you've got Joan Robinson and Elinor Ostrom.
And let's not forget Galileo. I'd even put Assange in this list.
So, "following the rules" before attempting to revolutionize your field doesn't seem to actually help all that much. This is a major problem, consistent across many centuries and cultures, which ought to be recognized more.
It's a cost-risk analysis. We tried letting students do whatever, and most of the time it went nowhere, so we ended up with a more rational system (with many caveats) where experiments are proposed and people with good insight, and a sense of whether it might even work, approve them before they are run.
AI is going through the wild phase where people are letting it try things; as soon as its limits are understood, the framework of limitations and the rational system built around it will inevitably follow.
[dead]
The "AI" here is not the same "AI" as claude, Grok or OpenAI. It's just an optimization algorithm that tries different things in parallel until it finds a better solution to inform the next round.
> It's just an optimization algorithm that tries different things in parallel until it finds a better solution to inform the next round.
... which is AI. AI existed long before GPTs were invented, back when neural networks were left largely unexplored because the necessary compute power wasn't there.
The standard term is Machine Learning (ML). It's not artificial intelligence (AI) because there is no sense of attempting to manipulate concepts (e.g., the classic symbolic conceptual manipulation algorithms, or modern foundation models).
The meaning of AI has been replaced through sheer popularity. Just like you can't go around calling people "gay" anymore when you mean "jolly, happy."
Feels like we're going to see a lot of headlines like this in the future.
"AI comes up with bizarre ___________________, but it works!"
We've seen this for a while, just not as often: antennas, IC, FPGA design, small mechanical things, ...
"AI comes up with a bizarre short-form generative video genre that addicts user in seconds - but it works!" I'm guessing we're only a year or two away.
Entering the "hold my beer" era of AI creativity
... sometimes.
That’s how we become numb to the progress. Like think of this in the context of a decade ago. The news would’ve been amazing.
Imagine these headlines mutating slowly into "all software engineering performed by AI at a certain company," and we'll just dismiss it as generic because being employed and programming with keyboards is old-fashioned. Give it twenty years and I bet this is the future.
> Like think of this in the context of a decade ago. The news would’ve been amazing.
People have been posing examples of similar "weird non-human design" results throughout here that are more than a decade old.
You're taking intelligently designed specialized optimization algorithms like the one in this article and trying to use their credibility and success to further inflate the hype of general-purpose LLMs that had nothing to do with this discovery.
Are you insane? How is it hype if I said something cautionary and extremely negative? Why don’t you read what I wrote carefully before saying something that lacks awareness.
My commentary is negative and against AI. The commentary is on the genericness of the comment and how repeated inundation of hype has made us numb to hype.
Twenty bucks says it isn't.
A decade ago it wouldn't have been called AI, and it probably shouldn't be called AI today because it's absurdly misleading. It's a python program that "uses gradient descent combined with topological optimization to find minimal graphs corresponding to some target quantum experiment".
Of course, call something "AI" today and suddenly interest, and presumably grant opportunities, increase by a few orders of magnitude.
That’s been called AI for about thirty years as far as I am aware. I’m pretty sure I first ran into it studying AI at uni in the 90s, reading Norvig’s Artificial Intelligence: A Modern Approach. This is just the AI Effect at work.
https://en.wikipedia.org/wiki/AI_effect
Gradient descent is a learning algorithm. This is AI.
Gradient descent is not a learning algorithm.
It is a simple iterative algorithm that goes from one point to the next. It doesn't even have memory of previous steps (caveat, the authors used BFGS which approximates the Hessian with previous gradient iterates, but this is still not AI). There is no finding weights or any such thing.
If every for loop is AI, then we might as well call everything AI. Can you pass me the AI, please?
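For what it's worth, here's roughly what such a run looks like, as a minimal sketch assuming scipy is installed (a standard test function, not the authors' code). Nothing is trained and nothing reusable is learned; the output is just the point the iteration converged to.

```python
from scipy.optimize import minimize, rosen, rosen_der

# BFGS: iterative minimization using gradients (plus an internal curvature estimate
# built from previous gradients). There are no fitted weights and no model left over.
result = minimize(rosen, x0=[1.3, 0.7, 0.8, 1.9, 1.2], jac=rosen_der, method="BFGS")
print(result.x)   # the minimizer found, which is the entire output
```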
Gradient descent is used in machine learning, which is a field in AI, to train models (eg. neural networks) on data. You get some data and use gradient descent to pick the parameters (eg. neural network weights) to minimise the error on that training data. You can then use your trained model by putting other data into it and getting its outputs.
The researchers in this article didn't do that. They used gradient descent to choose from a set of experiments. The choice of experiment was the end result and the direct output of the optimisation. Nothing was "learned" or "trained".
Gradient descent and other optimisation tools are used in machine learning, but long predate machine learning and are used in many other fields. Taking "AI" to include "anything that uses gradient descent" would just render an already heavily abused term almost entirely meaningless.
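A toy contrast, with made-up data and objectives (nothing from the paper), may make the distinction clearer: in the ML case gradient descent fits parameters that are then reused on new inputs, while in the direct-optimisation case the final iterate itself is the deliverable.

```python
import numpy as np

# (a) ML-style use: fit a tiny linear model to data, then reuse it on new inputs.
X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])           # toy data from y = 2x + 1
w, b = 0.0, 0.0
for _ in range(5000):
    pred = w * X + b
    w -= 0.01 * np.mean(2 * (pred - y) * X)  # gradient of the mean squared error
    b -= 0.01 * np.mean(2 * (pred - y))
print("trained model on a new input:", w * 10 + b)   # ~21; the model is reusable

# (b) Direct optimization: descend on a design objective; the final point is the answer.
x = np.array([5.0, -3.0])
for _ in range(500):
    x -= 0.1 * 2 * x                          # gradient of the toy objective ||x||^2
print("optimized design:", x)                 # ~[0, 0]; nothing reusable was learned
```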
Hahah, if you're going to go that route you may as well call all of math "AI", which is probably where we're headed anyhow! Gradient descent is used in training LLM systems, but it's no more "AI" itself than e.g. a quadratic regression is.
Neural networks are the hype now, but that doesn't mean there was no AI before them. There was; it struggled with some problems, and for some of them it found solutions. Today people tend to reject everything that is not a neural net as not "AI": if it is not a neural net, then it is not AI, just general CS. But AI research generated a ton of search algorithms, and while gradient descent (I think) was not invented as part of AI research, AI research adapted the idea to discrete spaces in multiple ways.
OTOH, AI is very much about search in multidimensional spaces, so much so that it would probably make sense to call gradient descent an AI tool. Not because it is used to train neural networks, but because search in multidimensional spaces is AI's specialty. People probably wouldn't agree, just as they don't agree that the Fundamental Theorem of Algebra isn't really about algebra (and isn't fundamental, btw). But the disagreement isn't about the deep meaning of the theorem or of gradient descent; it's about tradition and "we always did it this way".
More hype than substance unfortunately.
The AI rediscovered an interferometer technique the Russians found decades ago, optimized a graph in an unusual way, and came up with a formula that better fits a dark matter plot.
Ehhhhh, I'll say it's substantive and not just pure hype.
Yes the AI "resurfaced" the work, but it also incorporated the Russian's theory into the practical design. At least enough to say "hey make sure you look at this" - this means the system produced a workable-something w/ X% improvement, or some benefit that the researchers took it seriously and investigated. Obviously, that yielded an actual design with 10-15% improvement and a "wish we had this earlier" statement.
No one was paying attention to the work before.
AFAICT the "AI" didn't "pay attention to the work" either. They built a representation of a set of possible experiments, defined an objective function quantifying what they wanted to optimise and used gradient descent to find the best experiment according to that objective function.
If I've understood it right, calling this AI is a stretch and arguably even misleading. Gradient descent is the primary tool of machine learning, but this isn't really using it the way machine learning uses it. It's more just an application of gradient descent to an optimisation problem.
The article and headline make it sound like they asked an LLM to make an experiment and it used some obscure Russian technique to make a really cool one. That isn't true at all. The algorithm they used had no awareness of the Russian research, or of language, or experimental design. It wasn't "trained" in any sense. It was just a gradient descent program. It's the researchers that recognised the Russian technique when analyzing the experiment the optimiser chose.
The discovering itself doesn’t seem like the interesting part. If the discovery wasn’t in the training data then it’s a sign AI can produce novel scientific research / experiments.
It's not that kind of AI. We know that these algorithms can produce novel solutions. See https://arxiv.org/abs/2312.04258, specifically "Urania".
This is monkeys and typewriters.
It's like seeing things in clouds or tea leaves.
If the "monkeys with typewriters" produces a Shakespear sonnet faster than he is reincarnated, it's a useful resource.
At least, that's the thinking.
It is 100% impossible for AIs to create a Shakespeare sonnet. They can create a pastiche of a sonnet, which is completely different.
It can’t be a Shakespeare sonnet if Shakespeare didn’t write it
Yes. Just like any pair of glasses the sun decides to wear would be sunglasses.
Your exchange has made me wonder. Yes, whatever AI produces is not genuine stuff. But there is something we could call "Shakespeare-ness", and maybe it is quantifiable.
What would a realistic Turing test for "Shakespeare-ness" look like?
Big experts on Shakespeare likely remember (at least vaguely) all his sonnets, so they cannot be part of a blinded study ("Did Shakespeare write this or no?"), because they would realize that they have never seen those particular lines, and answer based on their knowledge.
Maybe asking more general English Lit teachers could work.
https://garymarcus.substack.com/p/on-hype-and-the-unbearable...
Extra Terrible Lines are indeed fun. We've had 9 months of development since then, though; maybe it would make sense to repeat those experiments twice a year.
IIRC Scott Alexander is doing something similar with his "AI draws nontrivial prompts" bet, and the difference to last year's results was striking.
Also, this really needs blinding, otherwise the temptation to show off one's sophistication and subtlety is big. Remember how oenologists consistently fail to distinguish between a USD 20 and a USD 2000 wine bottle when blinded.
That’s the looooong game on both counts.
'tis the patient plot on either side, where time doth weave its cunning, deep and wide.
AI companies stole massive amounts of information from every book they could get. Do you really believe there's any research they don't have input into their training sets?
These days, it feels like “AI” basically just means neural network-based models—especially large autoregressive ones. Even convolutional neural networks probably don’t count as “real AI” anymore in most people’s eyes. Funny how things change. Not long ago, search algorithms like A* were considered the cutting edge of AI.
Feels like we're entering a new kind of scientific method. Not sure if that's thrilling or terrifying, but definitely fascinating
"it had no sense of symmetry, beauty, anything. It was just a mess."
Reminds me of the square packing problem, with the absurdly looking solution for packing the 17 squares.
It also reminds me of edge cases in software engineering. When I let an LLM write code, I'm often confused by how it starts out, thinking I would have done it more elegantly. However, I quickly notice that the AI handled a few edge cases I would only have caught in testing.
Guess we should take a hint!
Am I understanding the article correctly that they created a quantum playground and then set their algorithm to work optimizing the design within the playground's constraints? That's pretty cool, especially for doing graph optimization. I'd be curious to know how compute-intensive it was.
This AI-designed experiment is pretty cool. It seemed kind of weird at first, but since it actually works, it’s worth paying attention to. AI feels more like a powerful tool that helps us think outside the box and come up with fresh ideas. Is AI more of a helper or a creator when it comes to research?
AFAICT "The AI" (which is never actually described in the article) is a CSOP solver.
They should stop optimizing their Company Share Option Plans and get back to work!
(It was a gradient descent optimizer, so probably unconstrained optimization rather than a Constraint Satisfaction Optimization Problem, but it might have had constraints.)
Impressive results, I remember reading about AI-generated microstrip RF filters not too long ago, and someone already mentioned evolved antenna systems. We are suffering from a severe case of calling gradient descent AI at the moment, but if it gets more money into actual research instead of LLM slop, I'm all for it.
> We are suffering from a severe case of calling gradient descent AI at the moment,
We’ve been doing that for decades, it’s just more recently that it’s come with so much more funding.
I still call computers "adding machines." Total fad devices.
This is not "AI", it's non-linear optimization...
We all do math down here.
Non-linear optimization is classic AI, ie. searching through logic and symbolic computation.
"Modern" AI is just fuzzy logic, connecting massive probabilities to find patterns.
This is the kind of thing I like to see AI being used for. That said, as is noted in the article, this has not yet led to new physics or any indication of new physics.