Munich 1991: The Roots of the Current AI Boom (people.idsia.ch)

189 points by tosh 3 days ago | 80 comments

• HarHarVeryFunny 6 hours ago

The current AI boom has more to do with NVIDIA, and the popularity of computer gaming giving us GPU compute, than who was using neural networks back in the 1990s.

More specifically, it was really AlexNet, the 2012 ImageNet entry, running on two NVIDIA GTX 580s, that highlighted the practicality and utility of running large scale neural nets on affordable hardware. CUDA had been released in 2006, but cuDNN (the CUDA library for neural nets) didn't come out until 2014 - after AlexNet had already kickstarted the demand.

What followed from AlexNet was a few years of intense competition on the ImageNet benchmark, and larger and deeper neural nets (CNNs), which gave rise to a lot of the algorithms and concepts still used today, such as residual connections (originally from ResNet), Adam (training algorithm), ReLU, normalization, dropout, etc.: all the fundamentals that made building large neural nets possible.

Schmidhuber continually reminding everyone that he was working on neural nets back in the 1990s is beyond tiresome. Yes, he should have been recognized alongside Hinton/Bengio/LeCun as one of the pioneers, but it's time for him to get over it.

• nextos 21 minutes ago

I agree. I also think it's about the hardware and, obviously, recognizing AD as the fundamental primitive.

Particular architectures don't matter so much yet. It's quite possible that S3-Mamba or xLSTM could be used in lieu of transformers and we would still have LLMs.

• LogicFailsMe 5 hours ago

And Google's acquisition of DNN Research to get the ball rolling with conv nets and AI moneyball, followed by the acquisition of DeepMind. Schmidhuber IMO *has* been recognized as one of the 4 horsemen and rightly so, but what has he done lately? Just noticed they now say the 3 godfathers of AI. This is what people hate about academia. It's not academia itself, it's the mean girl politics that emerge from the tenure system. And at this point, tenure should be abolished IMO, having been utterly weaponized to defend the status quo.

• alephnerd an hour ago

> The current AI boom has more to do with NVIDIA, and the popularity of computer gaming giving us GPU compute, than who was using neural networks back in 1990's

I disagree. But more critically, I'd argue it's the legacy of the PDP project that led to what became foundation models today.

• HarHarVeryFunny 15 minutes ago

The PDP project was very early - relevant in terms of neural net history of course, but hard to see much there relevant to today's large models other than Hinton's reinvention of SGD as an alternative to the layer-wise training that was then the norm.

One interesting thing to note from the PDP handbook are mentions by LeCun and Hinton of what would later be called CNNs, which LeCun claims to have invented. It seems that Hinton deserves just as much credit as LeCun, and in any case these are discussed just as locally connected models using shared weights as an optimization.

• AndrewKemendo 3 hours ago

This is well put.

2012 really fundamentally changed everything for the AI community, I’d argue because tensorflow/keras/pytorch followed and that made the infrastructure accessible for distributed training.

• MeteorMarc 9 hours ago

Also see Schmidhuber's take on the Hinton + Hopfield Nobel prize: https://people.idsia.ch/~juergen/physics-nobel-2024-plagiari...

• Hoasi 8 hours ago

Not that surprising since the whole LLM ecosystem is based on plagiarism.

• h8hawk 8 hours ago

It's sad that he is the only one speaking out about Hinton. This whole Hinton glorification seems like it's being pushed by an agenda. I'm not sure if he would receive this much attention if he held a different view (closer to LeCun or Ng), rather than these Effective Altruism takes on current AI.

• ks2048 2 hours ago

I don't associate Hinton with Effective Altruism. He did switch to focus on warning of the dangers of AI, but that was after he already was established as the father (or one of them) of deep learning.

• larodi 4 hours ago

Read through his papers, and these are substantial accusations. Perhaps Hinton et al. should investigate, and either a) correct themselves by properly citing these Munich researchers; or b) prove they did not base their work (unlikely) on these papers.

And then, as a whole, this weighs in favor of European scholars and also should properly inform the funding of similar research in the EU.

Writing the last in the light of a month-and-a-half wait (to date) for EuroHPC to process their own form where we submitted a funding request by no less than University + Private Company already established in the area + 4 alumni, two PhDs and one postdoc. Zero response since.

• practal 9 hours ago

TU Munich and Nipkow, Makarius et al. are also at the center of the influential Isabelle theorem prover. TU Munich is cool :-)

• cold_harbor 5 hours ago

worth separating: LSTM (Hochreiter & Schmidhuber 1997) is ironclad and widely cited. the transformer attention priority claims are far shakier. conflating them is how Schmidhuber undermines himself

• HarHarVeryFunny 2 hours ago

Yes, and notable how Alex Graves, one of Schmidhuber's students, later at DeepMind, doesn't even mention Schmidhuber in his historical overview of attention mechanisms "Attention and Memory in Deep Learning".

https://www.youtube.com/watch?v=AIiwuClvH6k

When it comes to attention, details matter, since the idea itself is obvious - weighted inputs, and implicit attention is present in every neural network - this is what weights are.

The specific form of attention used by the Transformer is key-based associative attention, aka "Bahdanau attention" introduced in Bahdanau's paper "Neural Machine Translation by Jointly Learning to Align and Translate". It's perhaps worth noting that the word "attention" is barely even mentioned in this paper, other than noting that this weighted input mechanism can be seen as a form of attention (presumably mentioned since attention was at that time a recurring theme in various types of neural network).

Bahdanau attention - not just the general concept of attention - seems to be a very critical piece of the Transformer architecture since this is what allows the Transformer to find things in context and is behind the "induction head" mechanism that appears central to how Transformers operate.
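
For readers who want the "find things in context" idea in concrete form, here is a minimal sketch of key-based soft lookup in plain Python. It is an illustration only, not code from either paper: it scores with a scaled dot product as the Transformer does, whereas Bahdanau's paper scores query/key pairs with a small feed-forward network. The function names `softmax` and `attend` and the toy keys and values are made up for this example.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Soft lookup: weight each value by how well its key matches the query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy context: three positions, each with a 2-d key and a 1-d value.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0], [20.0], [30.0]]

# A query pointing along the second axis matches keys 1 and 2 equally.
weights, output = attend([0.0, 5.0], keys, values)
print(weights, output)
```

The weights are a learned-free stand-in here; in a real model the queries, keys and values are produced by trained projections of the input.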

• jcattle 10 hours ago

There's this crowd on HN which is very vocal against academia. From what I've seen, the main points are that academia isn't efficient, most of the science coming out of academia is useless and that the whole system is just a waste of taxpayers' money. Instead, what is often argued, all good research is done in private labs. Then pointing to SpaceX, Moderna, OpenAI, Google, etc.

And while it is very true that often the research coming out of Academia is useless, what is always neglected are the roots of the research done in private labs.

When Jürgen Schmidhuber and team published their work on Neural Nets back in 1991 it was also useless. Unless you had a supercomputer and very, very deep pockets you were not going to do anything with what came out of their lab.

But still, 30 years later here we are, standing on top of the shoulders of this useless research.

• yorwba 9 hours ago

Like half of what Schmidhuber is always complaining about is that (except for LSTMs) people aren't standing on the shoulders of his research very much. They try to solve some of the same problems people have always wanted to solve, try some of the same approaches people always tend to try, and then tinker until it works. At no point do they consult Schmidhuber's decade-old papers where he tried something kind of similar but didn't get very impressive results, and hence they also do not think to cite him. Then he comes out of the woodwork to assert priority.

• romaniv 3 hours ago

What you're describing is people who fail to follow the most basic principles of academic research. (Check existing academic literature, mention and give credit to prior work.) This would be fine if these people didn't claim to be doing scientific research, didn't boast their academic credentials, didn't publish their findings as original work and didn't demand credit for their work in academia. Of course, they do all of these things. They benefit from a system they're actively denigrating (and in some ways degrading).

To put it more simply, people with academic credentials should not demand acknowledgement of their current intellectual work while denigrating and ridiculing the importance of very similar work done in the past.

• fantod 3 hours ago

It's not just the papers. It's also the students and their students, many of whom ended up working in the top labs today.

• suddenlybananas 9 hours ago

You can be influenced downstream by papers you haven't personally read.

• bonzini 9 hours ago

Shane Legg was in Schmidhuber's lab at IDSIA before being one of the founders of DeepMind, so he probably read the papers personally and knows what influenced him or not...

• gillesjacobs 8 hours ago

Of course, but if you haven't read them you also shouldn't cite them.

And that's where Schmidhuber goes off the rails: publicly shaming published papers into citing you isn't good academic practice. It's bullying.

• psb217 7 hours ago

"if you haven't read them you also shouldn't cite them" -- this is wildly incorrect in an academic context. If I'm using ResNets, I should cite the original ResNet paper, even if I haven't read it. If I'm using Transformers, I should cite the original Transformer paper, even if I haven't read it. If my work is a direct extension of method B, and method B is a direct extension of method A, I should cite the source of A, even if I haven't read it.

You can't claim independence from past work simply because you didn't look directly at it. The job of an academic researcher is to know the landscape of relevant ideas, where they come from, where they're going, and to hopefully contribute a few new good ones.

Citation chains should extend back from your work, along a reasonable line of conceptual inheritance, back to a reasonable point of origin. Schmidhuber has different definitions for both of these reasonables than the bulk of the ML research community, to a point that makes him difficult to satisfy.

• robotresearcher 3 minutes ago

Your Paper C does not need to cite Paper A unless you are discussing some aspect of it that Paper B is not. Otherwise you inherit the A citation via B.

Spamming citations is unnecessary.

• jasonhong 6 hours ago

It's worth pointing out that sometimes, some papers just become part of the general context of things and are no longer explicitly cited. Or people cite textbooks or general survey papers instead.

For example, take a look at Albert Einstein's Google Scholar profile. He's not the top cited physicist. Not even close. It's because other researchers don't explicitly cite his papers. https://scholar.google.com/citations?user=qc6CJjYAAAAJ&hl=en...

Same with Tim Berners-Lee and the World Wide Web. Imagine if his original paper were cited every time someone deployed a web site.

• jagged-chisel 4 hours ago

You make it sound like all original ideas from academia must be cited all the time, even if that was not the source of someone’s inspiration.

If I’m in the private sector, and I rediscover something from first principles, it is not my responsibility to go search all academia to see if someone’s done it before so I can cite their work.

If I rely on a code library that doesn’t explicitly cite papers it was built on, it is also not my responsibility to go find all the papers that it might’ve been built from and cite those papers.

• inigyou 7 hours ago

You should read those papers then

• dividedbyzero 7 hours ago

> Of course, but if you haven't read them you also shouldn't cite them.

But if you build on them you should have read them. I don't know about the specifics and I don't know if Schmidhuber is out of line or not, and citations and impact factors are a terrible mess, but generally speaking, you are responsible for finding and reading and citing any related work that needs to be cited, and if you work on neural networks in an academic context you probably have been forced to read that particular one at some point. Citation obligations don't just disappear because you don't want to do the research.

• elorant 8 hours ago

I do a lot of work that is based on academic research, aka building a proprietary sparse embedding model. My issue with academia is that they don’t bother to solve the practical issues. They tell you how to build a PPMI model, but what about hitting a database that’s 500TB to find co-occurrence numbers? This isn’t even touched so you’d then have to go and invent a bazillion of algorithms yourself to make your life easier. So while the bedrock is based on academic research and we thank them for that, scaling anything requires a lot of work in uncharted territories.
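
To make the PPMI step mentioned above concrete, here is a minimal in-memory sketch using the textbook definition, PPMI(w, c) = max(0, log2(P(w, c) / (P(w) P(c)))). It deliberately ignores the scaling problem the comment is about: the counts live in a small dictionary rather than a 500TB store, and the function name `ppmi` and the toy counts are invented for illustration.

```python
import math
from collections import Counter

def ppmi(pair_counts):
    """Map (word, context) co-occurrence counts to PPMI scores."""
    total = sum(pair_counts.values())
    word_totals = Counter()
    context_totals = Counter()
    for (word, context), n in pair_counts.items():
        word_totals[word] += n
        context_totals[context] += n

    scores = {}
    for (word, context), n in pair_counts.items():
        # P(w, c) / (P(w) * P(c)) simplifies to n * total / (count(w) * count(c)).
        pmi = math.log2(n * total / (word_totals[word] * context_totals[context]))
        scores[(word, context)] = max(0.0, pmi)  # clip negative PMI to zero
    return scores

# Toy co-occurrence counts (made up for this example).
counts = Counter({
    ("ice", "cold"): 8,
    ("ice", "hot"): 1,
    ("steam", "hot"): 8,
    ("steam", "cold"): 1,
})
scores = ppmi(counts)
print(scores)
```

Everything hard at scale (streaming the counts, sharding the marginals, storing the sparse result) sits outside this function, which is the gap the comment describes.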

• candu 6 hours ago

Well, yeah. That's why we have "research & development" as a term.

What you're referring to is the "development" part of that. In some sense: the job you have _exists precisely because it's not part of the research phase_, and it's equally as valuable as the research part. Research is the proof of concept; development is scaling up and making production-ready and finding small efficiencies and so on.

From an industry perspective, it's tempting to conflate these, because that's what industry research labs are designed to do: integrated R&D. But that is not at all how academic research labs work.

• jhbadger 7 hours ago

But that isn't the purpose of academia -- the purpose of it is to discover new phenomena not to make products. It is true that there is a lot of work to turn a new advance into a product whether it is software or turning biological knowledge into a drug, but without discovery of new phenomena new products will come to a halt. While it is true that some corporate labs, most famously Bell Labs in its heyday, but also for example IBM's T.J. Watson and Xerox's PARC did do basic research besides product-focused work, this is pretty rare because it is hard to justify the cost of something that may only be practical in decades and often help your competitors as much as yourself.

• tchalla 3 hours ago

> My issue with academia is that they don’t bother to solve the practical issues. They tell you how to build a PPMI model, but what about hitting a database that’s 500TB to find co-occurrence numbers?

Soon we will also blame academia for not providing iOS and android apps

• erispoe 22 minutes ago

Yeah, that's your job.

• gessha 6 hours ago

I jest but database design is its own sub field of computer science, maybe look into their papers?

• elorant 5 hours ago

I did that too. Ended up building my own reverse index with a fixed-size vocabulary. But that's my issue, you start building one product and you end up building ten in the process to solve all edge cases because no one bothered to research how things scale.

• jimbokun 2 hours ago

Well it sounds like you did?

• genxy 2 hours ago

The science is on them, the engineering is on you.

• utopiah 6 hours ago

The practical issue of academia is epistemological. It's about learning how a phenomenon came to exist. If you are looking for efficiency the field of academia related to learning how to do so is computational complexity and it works quite well.

The goal of academia isn't to be practical, "only" learning.

• fedeb95 6 hours ago

I think most people forget the graph-like nature of scientific research. You don't have n useful papers and m useless ones by themselves, you have an interconnection of those. There may be isolated cliques of uselessness, but there isn't a clear correlation between academia and private research.

Many ideas come from philosophy, which many find useless.

Heraclitus discovered change back in ancient Greece, I don't know where we would be in scientific research without that (deliberately ignoring the debate about the originality of what we know about Heraclitus's work). I bet his contemporaries found his "research" useless.

• ACCount37 9 hours ago

Where is "this crowd" that you are talking about?

The closest to that that I've seen is that traditional academia approaches are too far removed from practical applications for highly applied fields like software engineering, or too slow for fast-moving fields like modern day ML (thus, all the preprints).

• jillesvangurp 4 hours ago

Private labs feed off academia. Without academics to staff them, they'd get a lot less far.

I used to work at Nokia Research when they still made phones. Probably the closest thing Europe had to Silicon Valley twenty years ago. Except it was in Helsinki. Lots of stuff got invented there. Nokia didn't really manage to capitalize on its own inventions of course. Or rather it got caught up in its own clumsy attempts throwing babies out of the window by the bucket load. But others sure did. A lot of modern smart phones still have tech in them that Nokia pioneered before either Google or Apple shipped a smart phone.

At the time there was a lot of talk about the demise of industrial research labs. Bell Labs (now actually owned by Nokia!), Xerox PARC, IBM, and all the other big US labs that produced amazing stuff are shadows of their former selves. There is some truth in that.

But you could argue that Google and Apple picked up some of the slack. And the current AI boom came out of Google cherry picking all the best universities for their AI talent and putting them all together in a research group that then got free rein. Like Nokia, that involved a lot of ejecting of babies with the bath water. But it seems to have spawned lots of new startups that can trace their roots back to that research group in Google.

• tcp_handshaker 8 hours ago

I think most of the criticism of academia is about the rampant fraud and unreproducible results, due to the way the incentives are structured.

• FrustratedMonky 5 hours ago

It's like the old saying "only 10% of my marketing budget is making a difference, I just don't know which 10%"

You don't know ahead of time, where the breakthrough will come from.

There is a ton of research that sits on the shelf, and then years later, it gets re-combined with some other useless research, and boom, some big breakthrough.

This current attitude of all research is worthless, so it should be cancelled, is shooting our future selves in the face.

• contingencies 6 hours ago

Every western academic nearly systematically ignores eastern science and philosophy: classicism means "western European". Never mind Europe only flourished intellectually post Islam, which imported the science and engineering of China and India, critically including printing and zero[0]. IMHO this is why distaste for academia grows: it's based on appeals to authority which are demonstrably farcically misplaced. Alternatively stated: the emperor has no clothes, much less silk or paper!

Just as the Dewey Decimal System really only served the purpose of providing the facetious nominal linearization of an arbitrary depth ontological oversimplification, so too humans are much more like random pattern matching machines than fastidious sense-makers glued to absolutes derived from false appeals to static mono-perspective ontological hierarchies. The same is becoming lived experience in the LLM age, although the tiktokked youth apparently cannot string ten words together or focus longer than three seconds to attest; I'd wager they can feel it. Are we losing something by rejecting the habit of rigorously manually tending to spurious and temporary ontologies? Yes. Is it necessarily a loss in the long term? Probably not, in the same way we no longer write long-form letters or leave calling cards. Are we gaining something in response? Yes, at a minimum much stronger cross-pollination between ivory towers by fearless exploratory pragmatists who disrespect the would-be scope of nominal professions in favor of holistic thinking... both AI and human.

[0] https://en.wikipedia.org/wiki/Science_and_Civilisation_in_Ch...

• wolfi1 8 hours ago

and you still need tons of money

• MrBuddyCasino 8 hours ago

This is a straw-man if I ever saw one.

Practically no one is against hard science research, properly conducted. The issues are rampant fraud / p-hacking / unreproducible garbage mixed with an unhealthy dose of ideological monoculture and indoctrination, garnished with rising tuition prices while sitting on huge endowments in case of the Ivy Leagues.

• eru 7 hours ago

> Practically no one is against hard science research, properly conducted.

As long as you do that with your own money (or money got freely given from other people), sure.

If you use taxpayer money, that's a different game.

• MrBuddyCasino 6 hours ago

There is a long list of grievances I have regarding the (mis-) use of taxpayer money, and funding the hard sciences is way, way down. I can’t even see it from where I stand.

• jcattle 8 hours ago

Yes all good points showing issues that academia has at the moment.

However I often see this going from "there's issues" to discounting academia altogether and positioning private labs as a good or only alternative.

After all, most people in the open science collaboration which published the seminal paper kicking off the replication crisis were from academia.

• MrBuddyCasino 7 hours ago

Yes there is no substitute for academia. Monopolist's research labs get close (Bell Labs etc), but they tend to be more "applied".

• mschuster91 5 hours ago

> From what I've seen, the main points are that academia isn't efficient, most of the science coming out of academia is useless and that the whole system is just a waste of taxpayers money. Instead, what is often argued, all good research is done in private labs. Then pointing to SpaceX, Moderna, OpenAI, Google, etc.

Well... that's "starve the beast" in action. A lot of things we take for granted, that underpin our modern ways of life, came to be due to government investing. Laser, radar, microwaves, the early Internet, that all was military R&D.

"Unfortunately" (well, for the rich and the MIC, at least) there is no way for people to siphon off money in government-funded research, so once the libertarian/small-state BS completely took over following the collapse of the USSR, a lot of that got torn down or supplemented with enough bureaucracy to make Germans cry... and that's why reusable rockets were not invented at NASA but at SpaceX instead.

• tsunamifury 4 hours ago

Reusable rockets were not invented by nasa because their mission was exploration not commercialization.

• genxy an hour ago

Reusable rockets were "invented" by Lars Blackmore when he was working at JPL (Jet Propulsion Lab). I say invented because like anything in the evolution of engineering, credit is messy.

https://en.wikipedia.org/wiki/Jet_Propulsion_Laboratory

> Founded in 1936 by California Institute of Technology (Caltech) researchers, the laboratory is now owned and sponsored by NASA and administered and managed by Caltech.

Minimum-Landing-Error Powered-Descent Guidance for Mars Landing Using Convex Optimization http://larsjamesblackmore.com/BlackmoreEtAlJGCD10.pdf

Elon originally wanted parachutes and was convinced by Lars to go with self landing rockets.

• mschuster91 4 hours ago

Cheap reusable rockets allow for a lot more research for a lot less money.

Unfortunately, as the early history of SpaceX shows, it required a lot of failures to learn from to design the current crop of rockets. And that's the advantage that private R&D has... as long as the person in charge has money, failure is an option, because in anything publicly funded, any failure will relentlessly be blamed on the currently governing party by the opposition.

• pembrook 7 hours ago

I feel like you're constructing a strawman to argue against. I visit this site almost daily and the prevailing sentiment is usually the polar opposite of what you're suggesting.

If sentiment on HN were as you say, how could your pro-academia and anti-big tech comment be sitting at the top as the most upvoted comment?

• trashburger 7 hours ago

This article, too, was originally discovered by Jürgen Schmidhuber in 1991!

• trilogic 4 hours ago

> it is easy to forget that the foundations of this trillion-dollar industry were laid down over 30 years ago in Munich

Yes, it is very easy to forget, because the trillion is not being made in Europe. If it was really conceived in Munich (like the maps that got stolen also), it shows how incompetent Europe is at keeping its technology and protecting European companies.

It is painful to read this article.

• Sharlin 3 hours ago

Somehow "protecting companies" by keeping basic research, done openly at a university lab, from being "stolen"? What?

It's like saying it's painful that the Web was invented in Europe and opened for everybody rather than being kept at CERN to protect European companies.

• trilogic an hour ago

Right, it is totally fine to create new inventions, but let others take credit and financial benefits. It is our duty to protect and get the benefits of European inventions, especially the ones financed with public tax. Open for everybody means benefits for everybody.

• emmelaich 10 hours ago

https://en.wikipedia.org/wiki/J%C3%BCrgen_Schmidhuber

• davidw 43 minutes ago

This sort of seems like a pattern in CS - someone creates something and then it blows up 20 or 30 years later when the world is ready for it.

• gillesjacobs 8 hours ago

Which work has more value: the abstract description of a catalogue of potential model architectures or their validated application trained on real data?

In the Schmidhuber case there is 20 years and a chain of countless other works in between the two.

• throwa356262 4 hours ago

Hot take:

The real root of the current AI boom is a master's thesis from the University of Toronto.

The thesis demonstrated that neural networks much longer than before could be trained by simply having a random fraction of the neurons excluded during forward and back propagation.

That's how we got practical deep neural networks. Without that we would still be in AI winter.
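
The "random fraction of the neurons excluded" idea can be sketched in a few lines of plain Python. This is an assumption-laden illustration, not the thesis's code: it uses the now-common "inverted" variant that rescales the surviving activations during training, and the function name `dropout` and the toy activations are made up for this example.

```python
import random

def dropout(activations, drop_prob, rng, training=True):
    """Zero each activation with probability drop_prob during training.

    Survivors are scaled by 1 / (1 - drop_prob) so the expected value of
    each unit is unchanged; at inference time the input passes through.
    """
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)      # seeded so the run is repeatable
hidden = [1.0] * 10         # toy hidden-layer activations

dropped = dropout(hidden, 0.5, rng)
at_inference = dropout(hidden, 0.5, rng, training=False)
print(dropped)
print(at_inference)
```

Because the same mask is applied when gradients flow back, a dropped unit also receives no update on that step, which is what "excluded during forward and back propagation" amounts to.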

• jacknews 11 hours ago

Surely the roots, if we skip over the early perceptron work, are in backpropagation and Hinton, and the work going on at Edinburgh and elsewhere in the 80s.

Indeed I remember buying a set of three conference-papers-as-books around that time, titled Artificial Neural Networks .. proceedings of the whatever the conference was.

No doubt Schmidhuber made important contributions, but I see him pop up claiming to be the 'root' of it all every couple of years.

• h8hawk 10 hours ago

Hinton did not invent backpropagation.

related paragraph from Wikipedia:

Modern backpropagation was first published by Seppo Linnainmaa as "reverse mode of automatic differentiation" (1970)[26] for discrete connected networks of nested differentiable functions.[27][28][29]

In 1982, Paul Werbos applied backpropagation to MLPs in the way that has become standard.

• ogrisel 9 hours ago

Paul Werbos did not apply backprop to MLPs as cleanly described in Hinton's paper, but rather to some kind of autoregressive non-linear parametrized functions with a much more specific application scope.

Both papers are direct applications of the chain rule applied to estimate the gradient of a multivariate function.
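
The "chain rule applied to estimate the gradient" point can be shown on a two-weight toy network. The sketch below is only an illustration under assumed choices (a tanh hidden unit, squared error, the names `loss_and_grads` and `numeric_grad`); it is not taken from any of the papers discussed. It computes the gradients by hand in reverse order and checks them against finite differences.

```python
import math

def loss_and_grads(w1, w2, x, target):
    # Forward pass: y = w2 * tanh(w1 * x), loss = (y - target)^2.
    z = w1 * x
    h = math.tanh(z)
    y = w2 * h
    loss = (y - target) ** 2

    # Backward pass: apply the chain rule from the loss back to each weight.
    dloss_dy = 2.0 * (y - target)
    dloss_dw2 = dloss_dy * h
    dloss_dh = dloss_dy * w2
    dloss_dz = dloss_dh * (1.0 - h * h)   # derivative of tanh is 1 - tanh^2
    dloss_dw1 = dloss_dz * x
    return loss, dloss_dw1, dloss_dw2

def numeric_grad(f, value, eps=1e-6):
    # Central finite difference, used only to sanity-check the analytic result.
    return (f(value + eps) - f(value - eps)) / (2 * eps)

loss, g1, g2 = loss_and_grads(0.5, -1.5, 2.0, 1.0)
n1 = numeric_grad(lambda w: loss_and_grads(w, -1.5, 2.0, 1.0)[0], 0.5)
n2 = numeric_grad(lambda w: loss_and_grads(0.5, w, 2.0, 1.0)[0], -1.5)
print(g1, n1)
print(g2, n2)
```

Reverse-mode automatic differentiation automates exactly this bookkeeping: intermediate values are stored on the way forward and reused on the way back.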

• hyttioaoa 10 hours ago

That's what bugs me about him. So much work has gone into today's models that calling his contributions "the root" isn't really warranted. He's always complaining that Hinton, LeCun, and Bengio get more credit than they deserve, and now he's over-claiming himself.

• BoredPositron 9 hours ago

Both can be right.

• HarHarVeryFunny 5 hours ago

They could be, but they really aren't.

Name a single aspect of something modern like the Transformer architecture or how it is trained, that is even indirectly attributable to Schmidhuber.

No doubt he'd be jumping up and down wanting to take credit for residual connections, but where was Schmidhuber in the ImageNet era when everyone else was discovering how to build deep neural nets? Why didn't Schmidhuber invent ResNets, but instead waited until someone else (Kaiming He) did, then claim credit for it?

I'll bet Schmidhuber isn't done yet ... when someone eventually comes up with an architecture for AGI, Schmidhuber will come out of the woodwork and point to a note he made on a napkin in 1800 that predicted it all.

• emil-lp 10 hours ago

Surely the roots go back to Turing, Gödel, Hilbert, Frege, Leibniz, Aristotle.

• jongjong 7 hours ago

It's crazy to think that if Elon Musk hadn't mentioned Schmidhuber, most people would have no idea.

It's nauseating how all the researchers who happened to work for big tech got tons of media coverage but Schmidhuber and his team were getting zero coverage yet they made massive contributions. I bet there are many others not mentioned.

Nobody even knows about Frank Rosenblatt. It's insane how distorted our perception of innovation is.

Even science has been corrupted. It makes one doubt every story we're told about who invented what.

• gom_jabbar 6 hours ago

Yes, Rosenblatt is another good example. I recently looked deeper into the development of the perceptron and it's absolutely fascinating.

• ks2048 3 hours ago

> Nobody even knows about Frank Rosenblatt.

Very Trump-like statement - "Not many people know this, but ...". Yes, a lot of people know this. Any class that even says a little about the history of NNs will talk about Rosenblatt and the Perceptron.

• gom_jabbar 2 hours ago

> Any class that even says a little about the history of NNs will talk about Rosenblatt and the Perceptron.

Sure. I think it starts to get more interesting when the influences that Rosenblatt explicitly cites in his seminal Perceptron paper (e.g. Hayek) become part of the discussion (which rarely happens in my experience).

• storus 6 hours ago

Instead of focusing on the future, EU is busy rewriting history to please some eccentric researcher that claims he invented it all.

• greggoB 5 hours ago

How does the EU feature in TFA exactly?

• storus 4 hours ago

There seems to be a coordinated push around Schmidhuber all around media in the EU, even LinkedIn is full of "random" posts about him in the past week.

• logicchains 3 hours ago

You clearly aren't familiar with Schmidhuber if this kind of thing seems new to you. It's basically his thing.

• impossiblefork 6 hours ago

Schmidhuber isn't in the EU, nor Switzerland at the moment.

• greenavocado 4 hours ago

Schmidhuber will NEVER stop trying to aggressively preserve his relevance and it's endlessly amusing. Good for him.

• sagex 8 hours ago

I believe the invention of Transformers, and especially the attention mechanism, was influenced by past research, but definitely not only by Schmidhuber's work. That said, if we removed the papers mentioned by Schmidhuber from history, I am quite certain it would have had no effect on the discovery of Transformers, hence his work cannot be the root. He has to grow up and accept that work and equations can appear similar; looking at the inverse-square law and saying Newton stole it from someone is dishonest.