The limits of deep learning

Indeed, we may already be running into scaling limits in deep learning, perhaps already approaching a point of diminishing returns. In the last several months, research from DeepMind and elsewhere on models even larger than GPT-3 has shown that scaling starts to falter on some measures, such as toxicity, truthfulness, reasoning, and common sense.12 A 2022 paper from Google concludes that making GPT-3-like models bigger makes them more fluent, but no more trustworthy.13

This seems to be what our minds are really, really good at: abstracting sensory data or concepts into a thing, and then using that thing as shorthand to reason.

Manipulating symbols has been essential to computer science since the beginning, at least since the pioneer papers of Alan Turing and John von Neumann, and is still the fundamental staple of virtually all software engineering—yet is treated as a dirty word in deep learning. … To think that we can simply abandon symbol-manipulation is to suspend disbelief. … Ultimately, it means two things: having sets of symbols (essentially just patterns that stand for things) to represent information, and processing (manipulating) those symbols in a specific way, using something like algebra (or logic, or computer programs) to operate over those symbols. 
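To make the "two things" concrete, here is a minimal sketch in Python. It is only an illustration, not anything from the quoted article: the names Var, Add, Mul and evaluate are made up for this example. The small classes are the symbols (patterns that stand for things), and evaluate is the processing step that operates over them with ordinary algebra.

```python
from dataclasses import dataclass
from typing import Dict, Union


# Symbols: small patterns that stand for things (hypothetical names).
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[int, Var, Add, Mul]


def evaluate(expr: Expr, env: Dict[str, int]) -> int:
    """Process the symbols by applying the same algebraic rules every time."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, Var):
        return env[expr.name]  # look up what the symbol stands for
    if isinstance(expr, Add):
        return evaluate(expr.left, env) + evaluate(expr.right, env)
    if isinstance(expr, Mul):
        return evaluate(expr.left, env) * evaluate(expr.right, env)
    raise TypeError(f"not an expression: {expr!r}")


# The formula x * (y + 2), written as a tree of symbols.
formula = Mul(Var("x"), Add(Var("y"), 2))
result = evaluate(formula, {"x": 3, "y": 4})
print(result)  # 18
```

The same formula works for any values bound to x and y, including ones never seen before, which is the kind of systematic behaviour the quote is pointing at.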

Symbols still far outstrip current neural networks in many fundamental aspects of computation. They are much better positioned to reason their way through complex scenarios,22 can do basic operations like arithmetic more systematically and reliably, and are better able to precisely represent relationships between parts and wholes (essential both in the interpretation of the 3-D world and the comprehension of human language). They are more robust and flexible in their capacity to represent and query large-scale databases. Symbols are also more conducive to formal verification techniques, which are critical for some aspects of safety and ubiquitous in the design of modern microprocessors. To abandon these virtues rather than leveraging them into some sort of hybrid architecture would make little sense.

This makes sense: you wouldn’t do math by predicting the next word or symbol.

Deep learning on its own continues to struggle even in domains as orderly as arithmetic.21 A hybrid system may have more power than either system on its own.

Hybrid models are going to get us closer to AGI…

Artur Garcez and Luis Lamb wrote a manifesto for hybrid models in 2009, called Neural-Symbolic Cognitive Reasoning. And some of the best-known recent successes in board-game playing (Go, Chess, and so forth, led primarily by work at Alphabet’s DeepMind) are hybrids. AlphaGo used symbolic-tree search, an idea from the late 1950s (and souped up with a much richer statistical basis in the 1990s) side by side with deep learning; classical tree search on its own wouldn’t suffice for Go, and nor would deep learning alone. DeepMind’s AlphaFold2, a system for predicting the structure of proteins from their amino acid sequences, is also a hybrid model, one that brings together some carefully constructed symbolic ways of representing the 3-D physical structure of molecules, with the awesome data-trawling capacities of deep learning.
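A toy sketch of the "search side by side with a learned evaluator" idea, in Python. This is not AlphaGo's algorithm (which pairs Monte Carlo tree search with trained policy and value networks); it is a plain depth-limited negamax search on a made-up stone-taking game, and learned_value is a hand-written placeholder standing in for where a trained network would be called. All names here are hypothetical.

```python
from typing import Callable, List

MOVES = (1, 2, 3)  # toy game: take 1-3 stones; whoever takes the last stone wins


def legal_moves(pile: int) -> List[int]:
    return [m for m in MOVES if m <= pile]


def learned_value(pile: int) -> float:
    """Placeholder for a trained value network (an assumption of this sketch).

    Returns an estimate of the position from the viewpoint of the player
    about to move: +1.0 looks winning, -1.0 looks losing.
    """
    return -1.0 if pile % 4 == 0 else 1.0


def negamax(pile: int, depth: int, value_fn: Callable[[int], float]) -> float:
    """Symbolic part: exhaustive tree search down to a fixed depth."""
    if pile == 0:
        return -1.0  # the opponent took the last stone, so the mover has lost
    if depth == 0:
        return value_fn(pile)  # statistical part: ask the evaluator at the frontier
    return max(-negamax(pile - m, depth - 1, value_fn) for m in legal_moves(pile))


def best_move(pile: int, depth: int,
              value_fn: Callable[[int], float] = learned_value) -> int:
    """Pick the move whose resulting position is worst for the opponent (depth >= 1)."""
    return max(legal_moves(pile),
               key=lambda m: -negamax(pile - m, depth - 1, value_fn))


print(best_move(10, depth=2))  # 2
```

The search supplies exact lookahead for a few moves, and the evaluator supplies a judgment about positions the search cannot afford to expand; in a real system that judgment would come from a model trained on data rather than from a one-line rule.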

From and his conversation with Ezra Klein is good as well.
