Everyone thinks they know, but no one can agree. And that's a problem.
AI is sexy, AI is cool. AI is entrenching inequality, upending the job market, and wrecking education. AI is a theme-park ride, AI is a magic trick. AI is our final invention, AI is a moral obligation. AI is the buzzword of the decade, AI is marketing jargon from 1955. AI is humanlike, AI is alien. AI is super-smart and as dumb as dirt. The AI boom will boost the economy, the AI bubble is about to burst. AI will increase abundance and empower humanity to maximally flourish in the universe. AI will kill us all.
What the hell is everybody talking about?
Artificial intelligence is the hottest technology of our time. But what is it? It sounds like a stupid question, but it's one that's never been more urgent. Here's the short answer: AI is a catchall term for a set of technologies that make computers do things that are thought to require intelligence when done by people. Think of recognizing faces, understanding speech, driving cars, writing sentences, answering questions, creating pictures. But even that definition contains multitudes.
And that right there is the problem. What does it mean for machines to understand speech or write a sentence? What kinds of tasks could we ask such machines to do? And how much should we trust the machines to do them?
As this technology moves from prototype to product faster and faster, these have become questions for all of us. But (spoilers!) I don't have the answers. I can't even tell you what AI is. The people making it don't know what AI is either. Not really. "These are the kinds of questions that are important enough that everyone feels like they can have an opinion," says Chris Olah, a cofounder of the San Francisco-based AI lab Anthropic. "I also think you can argue about this as much as you want and there's no evidence that's going to contradict you right now."
But if you're willing to buckle up and come for a ride, I can tell you why nobody really knows, why everybody seems to disagree, and why you're right to care about it.
Let's start with an offhand joke.
Back in 2022, partway through the first episode of Mystery AI Hype Theater 3000, a party-pooping podcast in which the irascible cohosts Alex Hanna and Emily Bender have a lot of fun sticking "the sharpest needles" into some of Silicon Valley's most inflated sacred cows, they make a ridiculous suggestion. They're hate-reading aloud from a 12,500-word Medium post by a Google VP of engineering, Blaise Agüera y Arcas, titled "Can machines learn how to behave?" Agüera y Arcas makes a case that AI can understand concepts in a way that's somehow analogous to the way humans understand concepts - concepts such as moral values. In short, perhaps machines can be taught to behave.
Hanna and Bender are having none of it. They decide to replace the term "AI" with "mathy math" - you know, just lots and lots of math.
The irreverent phrase is meant to collapse what they see as bombast and anthropomorphism in the sentences being quoted. Pretty soon Hanna, a sociologist and director of research at the Distributed AI Research Institute, and Bender, a computational linguist at the University of Washington (and internet-famous critic of tech industry hype), open a gulf between what Agüera y Arcas wants to say and how they choose to hear it.
"How should AIs, their creators, and their users be held morally accountable?" asks Agüera y Arcas.
How should mathy math be held morally accountable? asks Bender.
"There's a category error here," she says. Hanna and Bender don't just reject what Agüera y Arcas says; they claim it makes no sense. "Can we please stop it with the 'an AI' or 'the AIs' as if they are, like, individuals in the world?" Bender says.
It might sound as if they're talking about different things, but they're not. Both sides are talking about large language models, the technology behind the current AI boom. It's just that the way we talk about AI is more polarized than ever. In May, OpenAI CEO Sam Altman teased the latest update to GPT-4, his company's flagship model, by tweeting, "Feels like magic to me."
There's a lot of road between math and magic.
AI has acolytes, with a faith-like belief in the technology's current power and inevitable future improvement. Artificial general intelligence is in sight, they say; superintelligence is coming behind it. And it has heretics, who pooh-pooh such claims as mystical mumbo jumbo.
The buzzy popular narrative is shaped by a pantheon of big-name players, from Big Tech marketers in chief like Sundar Pichai and Satya Nadella to edgelords of industry like Elon Musk and Altman to celebrity computer scientists like Geoffrey Hinton. Sometimes these boosters and doomers are one and the same, telling us that the technology is so good it's bad.
As AI hype has ballooned, a vocal anti-hype lobby has risen in opposition, ready to smack down its ambitious, often wild claims. Pulling in this direction are a raft of researchers, including Hanna and Bender, and also outspoken industry critics like influential computer scientist and former Googler Timnit Gebru and NYU cognitive scientist Gary Marcus. All have a chorus of followers bickering in their replies.
In short, AI has come to mean all things to all people, splitting the field into fandoms. It can feel as if different camps are talking past one another, not always in good faith.
Maybe you find all this silly or tiresome. But given the power and complexity of these technologies (which are already used to determine how much we pay for insurance, how we look up information, how we do our jobs, etc., etc., etc.), it's about time we at least agreed on what it is we're even talking about.
Yet in all the conversations I've had with people at the cutting edge of this technology, no one has given a straight answer about exactly what it is they're building. (A quick side note: This piece focuses on the AI debate in the US and Europe, largely because many of the best-funded, most cutting-edge AI labs are there. But of course there's important research happening elsewhere, too, in countries with their own varying perspectives on AI, particularly China.) Partly, it's the pace of development. But the science is also wide open. Today's large language models can do amazing things. The field just can't find common ground on what's really going on under the hood.
These models are trained to complete sentences. They appear to be able to do a lot more, from solving high school math problems to writing computer code to passing law exams to composing poems. When a person does these things, we take it as a sign of intelligence. What about when a computer does it? Is the appearance of intelligence enough?
These questions go to the heart of what we mean by "artificial intelligence," a term people have actually been arguing about for decades. But the discourse around AI has become more acrimonious with the rise of large language models that can mimic the way we talk and write with thrilling/chilling (delete as applicable) realism.
We have built machines with humanlike behavior but haven't shrugged off the habit of imagining a humanlike mind behind them. This leads to over-egged evaluations of what AI can do, hardens gut reactions into dogmatic positions, and plays into the wider culture wars between techno-optimists and techno-skeptics.
Add to this stew of uncertainty a truckload of cultural baggage, from the science fiction that I'd bet many in the industry were raised on to far more malign ideologies that influence the way we think about the future. Given this heady mix, arguments about AI are no longer simply academic (and perhaps never were). AI inflames people's passions and makes grownups call each other names.
"It's not in an intellectually healthy place right now," Marcus says of the debate. For years Marcus has pointed out the flaws and limitations of deep learning, the tech that launched AI into the mainstream, powering everything from LLMs to image recognition to self-driving cars. His 2001 book The Algebraic Mind argued that neural networks, the foundation on which deep learning is built, are incapable of reasoning by themselves. (We'll skip over it for now, but I'll come back to it later and we'll see just how much a word like "reasoning" matters in a sentence like this.)
Marcus says that he has tried to engage Hinton - who last year went public with existential fears about the technology he helped invent - in a proper debate about how good large language models really are. "He just won't do it," says Marcus. "He calls me a twit." (Having talked to Hinton about Marcus in the past, I can confirm that. "ChatGPT clearly understands neural networks better than he does," Hinton told me last year.) Marcus also drew ire when he wrote an essay titled "Deep learning is hitting a wall." Altman responded to it with a tweet: "Give me the confidence of a mediocre deep learning skeptic."
At the same time, banging his drum has made Marcus a one-man brand and earned him an invitation to sit next to Altman and give testimony last year before the US Senate's AI oversight committee.
And that's why all these fights matter more than your average internet nastiness. Sure, there are big egos and vast sums of money at stake. But more than that, these disputes matter when industry leaders and opinionated scientists are summoned by heads of state and lawmakers to explain what this technology is and what it can do (and how scared we should be). They matter when this technology is being built into software we use every day, from search engines to word-processing apps to assistants on your phone. AI is not going away. But if we don't know what we're being sold, who's the dupe?
"It is hard to think of another technology in history about which such a debate could be had - a debate about whether it is everywhere, or nowhere at all," Stephen Cave and Kanta Dihal write in Imagining AI, a 2023 collection of essays about how different cultural beliefs shape people's views of artificial intelligence. "That it can be held about AI is a testament to its mythic quality."
Above all else, AI is an idea - an ideal - shaped by worldviews and sci-fi tropes as much as by math and computer science. Figuring out what we are talking about when we talk about AI will clarify many things. We won't agree on them, but common ground on what AI is would be a great place to start talking about what AI should be.
In late 2022, soon after OpenAI released ChatGPT, a new meme started circulating online that captured the weirdness of this technology better than anything else. In most versions, a Lovecraftian monster called the Shoggoth, all tentacles and eyeballs, holds up a bland smiley-face emoji as if to disguise its true nature. ChatGPT presents as humanlike and accessible in its conversational wordplay, but behind that façade lie unfathomable complexities - and horrors. ("It was a terrible, indescribable thing vaster than any subway train - a shapeless congeries of protoplasmic bubbles," H.P. Lovecraft wrote of the Shoggoth in his 1936 novella At the Mountains of Madness.)
For years one of the best-known touchstones for AI in pop culture was The Terminator, says Dihal. But by putting ChatGPT online for free, OpenAI gave millions of people firsthand experience of something different. "AI has always been a sort of really vague concept that can expand endlessly to encompass all kinds of ideas," she says. But ChatGPT made those ideas tangible: "Suddenly, everybody has a concrete thing to refer to." What is AI? For millions of people the answer was now: ChatGPT.
The AI industry is selling that smiley face hard. Consider how The Daily Show recently skewered the hype, as expressed by industry leaders. Silicon Valley's VC in chief, Marc Andreessen: "This has the potential to make life much better ... I think it's honestly a layup." Altman: "I hate to sound like a utopic tech bro here, but the increase in quality of life that AI can deliver is extraordinary." Pichai: "AI is the most profound technology that humanity is working on. More profound than fire."
Jon Stewart: "Yeah, suck a dick, fire!"
But as the meme points out, ChatGPT is a friendly mask. Behind it is a monster called GPT-4, a large language model built from a vast neural network that has ingested more words than most of us could read in a thousand lifetimes. During training, which can last months and cost tens of millions of dollars, such models are given the task of filling in blanks in sentences taken from millions of books and a significant fraction of the internet. They do this task over and over again. In a sense, they are trained to be supercharged autocomplete machines. The result is a model that has turned much of the world's written information into a statistical representation of which words are most likely to follow other words, captured across billions and billions of numerical values.
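To make the "supercharged autocomplete" idea concrete, here is a deliberately tiny sketch in Python. It is a caricature, not how GPT-4 works: real models learn a neural network over billions of parameters rather than a word-count table, and the little corpus below is invented for illustration.

    from collections import Counter, defaultdict
    import random

    # A toy corpus standing in for "millions of books and a significant
    # fraction of the internet" (invented for this sketch).
    corpus = "the cat sat on the mat . the dog sat on the rug .".split()

    # Count which word tends to follow which: a crude statistical picture of
    # "which words are most likely to follow other words."
    follows = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        follows[prev][nxt] += 1

    def autocomplete(word, steps=6):
        out = [word]
        for _ in range(steps):
            options = follows.get(out[-1])
            if not options:
                break
            words, counts = zip(*options.items())
            out.append(random.choices(words, weights=counts)[0])
        return " ".join(out)

    print(autocomplete("the"))  # e.g. "the cat sat on the rug ."

Swap the count table for a neural network, scale the corpus and the parameter count up by many orders of magnitude, and you have the gist of what the training run described above is optimizing.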
It's math - a hell of a lot of math. Nobody disputes that. But is it just that, or does this complex math encode algorithms capable of something akin to human reasoning or the formation of concepts?
Many of the people who answer yes to that question believe we're close to unlocking something called artificial general intelligence, or AGI, a hypothetical future technology that can do a wide range of tasks as well as humans can. A few of them have even set their sights on what they call superintelligence, sci-fi technology that can do things far better than humans. This cohort believes AGI will drastically change the world - but to what end? That's yet another point of tension. It could fix all the world's problems, or bring about its doom.
"kinda mad how the so called godfathers of AI managed to convince seemingly smart people within AI field & many regulators to buy into the absurd idea that a sophisticated curve fitting (to a dataset) machine can have the urge to exterminate humans" - Abeba Birhane (@Abebab), June 30, 2024
Today AGI appears in the mission statements of the world's top AI labs. But the term was invented in 2007 as a niche attempt to inject some pizzazz into a field that was then best known for applications that read handwriting on bank deposit slips or recommended your next book to buy. The idea was to reclaim the original vision of an artificial intelligence that could do humanlike things (more on that soon).
It was really an aspiration more than anything else, Google DeepMind cofounder Shane Legg, who coined the term, told me last year: "I didn't have an especially clear definition."
AGI became the most controversial idea in AI. Some talked it up as the next big thing: AGI was AI but, you know, much better. Others claimed the term was so vague that it was meaningless.
"AGI used to be a dirty word," Ilya Sutskever told me before he resigned as chief scientist at OpenAI.
But large language models, and ChatGPT in particular, changed everything. AGI went from dirty word to marketing dream.
Which brings us to what I think is one of the most illustrative disputes of the moment - one that sets up the sides of the argument and the stakes in play.
A few months before the public launch of OpenAI's large language model GPT-4 in March 2023, the company shared a prerelease version with Microsoft, which wanted to use the new model to revamp its search engine Bing.
At the time, Sébastien Bubeck was studying the limitations of LLMs and was somewhat skeptical of their abilities. In particular, Bubeck - the vice president of generative AI research at Microsoft Research in Redmond, Washington - had been trying and failing to get the technology to solve middle school math problems. Things like: x - y = 0; what are x and y? "My belief was that reasoning was a bottleneck, an obstacle," he says. "I thought that you would have to do something really fundamentally different to get over that obstacle."
Then he got his hands on GPT-4. The first thing he did was try those math problems. "The model nailed it," he says. "Sitting here in 2024, of course GPT-4 can solve linear equations. But back then, this was crazy. GPT-3 cannot do that."
But Bubeck's real road-to-Damascus moment came when he pushed it to do something new.
The thing about middle school math problems is that they are all over the internet, and GPT-4 may simply have memorized them. "How do you study a model that may have seen everything that human beings have written?" asks Bubeck. His answer was to test GPT-4 on a range of problems that he and his colleagues believed to be novel.
Playing around with Ronen Eldan, a mathematician at Microsoft Research, Bubeck asked GPT-4 to give, in verse, a mathematical proof that there are an infinite number of primes.
Here's a snippet of GPT-4's response: "If we take the smallest number in S that is not in P / And call it p, we can add it to our set, don't you see? / But this process can be repeated indefinitely. / Thus, our set P must also be infinite, you'll agree."
Cute, right? But Bubeck and Eldan thought it was much more than that. "We were in this office," says Bubeck, waving at the room behind him via Zoom. "Both of us fell from our chairs. We couldn't believe what we were seeing. It was just so creative and so, like, you know, different."
The Microsoft team also got GPT-4 to generate the code to add a horn to a cartoon picture of a unicorn drawn in LaTeX, a typesetting system. Bubeck thinks this shows that the model could read the existing LaTeX code, understand what it depicted, and identify where the horn should go.
"There are many examples, but a few of them are smoking guns of reasoning," he says - reasoning being a crucial building block of human intelligence.
Bubeck, Eldan, and a team of other Microsoft researchers described their findings in a paper that they called "Sparks of artificial general intelligence": "We believe that GPT-4's intelligence signals a true paradigm shift in the field of computer science and beyond." When Bubeck shared the paper online, he tweeted: "time to face it, the sparks of #AGI have been ignited."
The Sparks paper quickly became infamous - and a touchstone for AI boosters. Agüera y Arcas and Peter Norvig, a former director of research at Google and coauthor of Artificial Intelligence: A Modern Approach, perhaps the most popular AI textbook in the world, cowrote an article called "Artificial General Intelligence Is Already Here." Published in Noema, a magazine backed by an LA think tank called the Berggruen Institute, their argument uses the Sparks paper as a jumping-off point: "Artificial General Intelligence (AGI) means many different things to different people, but the most important parts of it have already been achieved by the current generation of advanced AI large language models," they wrote. "Decades from now, they will be recognized as the first true examples of AGI."
Since then, the hype has continued to balloon. Leopold Aschenbrenner, who at the time was a researcher at OpenAI focusing on superintelligence, told me last year: "AI progress in the last few years has been just extraordinarily rapid. We've been crushing all the benchmarks, and that progress is continuing unabated. But it won't stop there. We're going to have superhuman models, models that are much smarter than us." (He was fired from OpenAI in April because, he claims, he raised security concerns about the tech he was building and "ruffled some feathers." He has since set up a Silicon Valley investment fund.)
In June, Aschenbrenner put out a 165-page manifesto arguing that AI will outpace college graduates by "2025/2026" and that "we will have superintelligence, in the true sense of the word" by the end of the decade. But others in the industry scoff at such claims. When Aschenbrenner tweeted a chart to show how fast he thought AI would continue to improve, given how fast it had improved in the last few years, the tech investor Christian Keil replied that by the same logic his baby son, who had doubled in size since he was born, would weigh 7.5 trillion tons by the time he was 10.
It's no surprise that "sparks of AGI" has also become a byword for over-the-top buzz. "I think they got carried away," says Marcus, speaking about the Microsoft team. "They got excited, like 'Hey, we found something! This is amazing!' They didn't vet it with the scientific community." Bender refers to the Sparks paper as a "fan fiction novella."
Not only was it provocative to claim that GPT-4's behavior showed signs of AGI, but Microsoft, which uses GPT-4 in its own products, has a clear interest in promoting the capabilities of the technology. "This document is marketing fluff masquerading as research," one tech COO posted on LinkedIn.
Some also felt the paper's methodology was flawed. Its evidence is hard to verify because it comes from interactions with a version of GPT-4 that was not made available outside OpenAI and Microsoft. The public version has guardrails that restrict the model's capabilities, admits Bubeck. This made it impossible for other researchers to re-create his experiments.
One group tried to re-create the unicorn example with a coding language called Processing, which GPT-4 can also use to generate images. They found that the public version of GPT-4 could produce a passable unicorn but not flip or rotate that image by 90 degrees. It may seem like a small difference, but such things really matter when you're claiming that the ability to draw a unicorn is a sign of AGI.
The key thing about the examples in the Sparks paper, including the unicorn, is that Bubeck and his colleagues believe they are genuine examples of creative reasoning. This means the team had to be certain that examples of these tasks, or ones very like them, were not included anywhere in the vast data sets that OpenAI amassed to train its model. Otherwise, the results could be interpreted instead as instances where GPT-4 reproduced patterns it had already seen.
Bubeck insists that they set the model only tasks that would not be found on the internet. Drawing a cartoon unicorn in LaTeX was surely one such task. But the internet is a big place. Other researchers soon pointed out that there are indeed online forums dedicated to drawing animals in LaTeX. "Just fyi we knew about this," Bubeck replied on X. "Every single query of the Sparks paper was thoroughly looked for on the internet."
(This didn't stop the name-calling: "I'm asking you to stop being a charlatan," Ben Recht, a computer scientist at the University of California, Berkeley, tweeted back before accusing Bubeck of "being caught flat-out lying.")
Bubeck insists the work was done in good faith, but he and his coauthors admit in the paper itself that their approach was not rigorous: notebook observations rather than foolproof experiments.
Still, he has no regrets: "The paper has been out for more than a year and I have yet to see anyone give me a convincing argument that the unicorn, for example, is not a real example of reasoning."
That's not to say he can give me a straight answer to the big question, though his response reveals what kind of answer he'd like to give. "What is AI?" Bubeck repeats back to me. "I want to be clear with you. The question can be simple, but the answer can be complex."
"There are many simple questions out there to which we still don't know the answer. And some of those simple questions are the most profound ones," he says. "I'm putting this on the same footing as, you know, What is the origin of life? What is the origin of the universe? Where did we come from? Big, big questions like this."
Before Bender became one of the chief antagonists of AI's boosters, she made her mark on the AI world as a coauthor on two influential papers. (Both peer-reviewed, she likes to point out - unlike the Sparks paper and many of the others that get much of the attention.) The first, written with Alexander Koller, a fellow computational linguist at Saarland University in Germany, and published in 2020, was called "Climbing towards NLU" (NLU is natural-language understanding).
"The start of all this for me was arguing with other people in computational linguistics whether or not language models understand anything," she says. (Understanding, like reasoning, is typically taken to be a basic ingredient of human intelligence.)
Bender and Koller argue that a model trained exclusively on text will only ever learn the form of a language, not its meaning. Meaning, they argue, consists of two parts: the words (which could be marks or sounds) plus the reason those words were uttered. People use language for many reasons, such as sharing information, telling jokes, flirting, warning somebody to back off, and so on. Stripped of that context, the text used to train LLMs like GPT-4 lets them mimic the patterns of language well enough for many sentences generated by the LLM to look exactly like sentences written by a human. But there's no meaning behind them, no spark. It's a remarkable statistical trick, but completely mindless.
They illustrate their point with a thought experiment. Imagine two English-speaking people stranded on neighboring deserted islands. There is an underwater cable that lets them send text messages to each other. Now imagine that an octopus, which knows nothing about English but is a whiz at statistical pattern matching, wraps its suckers around the cable and starts listening in to the messages. The octopus gets really good at guessing what words follow other words. So good that when it breaks the cable and starts replying to messages from one of the islanders, she believes that she is still chatting with her neighbor. (In case you missed it, the octopus in this story is a chatbot.)
The person talking to the octopus would stay fooled for a reasonable amount of time, but could that last? Does the octopus understand what comes down the wire?
Imagine that the islander now says she has built a coconut catapult and asks the octopus to build one too and tell her what it thinks. The octopus cannot do this. Without knowing what the words in the messages refer to in the world, it cannot follow the islander's instructions. Perhaps it guesses a reply: "Okay, cool idea!" The islander will probably take this to mean that the person she is speaking to understands her message. But if so, she is seeing meaning where there is none. Finally, imagine that the islander gets attacked by a bear and sends calls for help down the line. What is the octopus to do with these words?
Bender and Koller believe that this is how large language models learn and why they are limited. "The thought experiment shows why this path is not going to lead us to a machine that understands anything," says Bender. "The deal with the octopus is that we have given it its training data, the conversations between those two people, and that's it. But then here's something that comes out of the blue and it won't be able to deal with it because it hasn't understood."
The other paper Bender is known for, "On the Dangers of Stochastic Parrots," highlights a series of harms that she and her coauthors believe the companies making large language models are ignoring. These include the huge computational costs of making the models and their environmental impact; the racist, sexist, and other abusive language the models entrench; and the dangers of building a system that could fool people by "haphazardly stitching together sequences of linguistic forms ... according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot."
Google senior management wasn't happy with the paper, and the resulting conflict led two of Bender's coauthors, Timnit Gebru and Margaret Mitchell, to be forced out of the company, where they had led the AI ethics team. It also made "stochastic parrot" a popular put-down for large language models - and landed Bender right in the middle of the name-calling merry-go-round.
The bottom line for Bender and for many like-minded researchers is that the field has been taken in by smoke and mirrors: "I think that they are led to imagine autonomous thinking entities that can make decisions for themselves and ultimately be the kind of thing that could actually be accountable for those decisions."
Always the linguist, Bender is now at the point where she won't even use the term AI "without scare quotes," she tells me. Ultimately, for her, it's a Big Tech buzzword that distracts from the many associated harms. "I've got skin in the game now," she says. "I care about these issues, and the hype is getting in the way."
Agüera y Arcas calls people like Bender "AI denialists" - the implication being that they won't ever accept what he takes for granted. Bender's position is that extraordinary claims require extraordinary evidence, which we do not have.
But there are people looking for it, and until they find something clear-cut - sparks or stochastic parrots or something in between - they'd prefer to sit out the fight. Call this the wait-and-see camp.
As Ellie Pavlick, who studies neural networks at Brown University, tells me: "It's offensive to some people to suggest that human intelligence could be re-created through these kinds of mechanisms."
She adds, "People have strongly held beliefs about this issue - it almost feels religious. On the other hand, there's people who have a little bit of a God complex. So it's also offensive to them to suggest that they just can't do it."
Pavlick is ultimately agnostic. She's a scientist, she insists, and will follow wherever the science leads. She rolls her eyes at the wilder claims, but she believes there's something exciting going on. "That's where I would disagree with Bender and Koller," she tells me. "I think there's actually some sparks - maybe not of AGI, but like, there's some things in there that we didn't expect to find."
The problem is finding agreement on what those exciting things are and why they're exciting. With so much hype, it's easy to be cynical.
Researchers like Bubeck seem a lot more cool-headed when you hear them out. He thinks the infighting misses the nuance in his work. "I don't see any problem in holding simultaneous views," he says. "There is stochastic parroting; there is reasoning - it's a spectrum. It's very complex. We don't have all the answers."
"We need a completely new vocabulary to describe what's going on," he says. "One reason why people push back when I talk about reasoning in large language models is because it's not the same reasoning as in human beings. But I think there is no way we can not call it reasoning. It is reasoning."
Anthropic's Olah plays it safe when pushed on what we're seeing in LLMs, though his company, one of the hottest AI labs in the world right now, built Claude 3, an LLM that has received just as much hyperbolic praise as GPT-4 (if not more) since its release earlier this year.
"I feel like a lot of these conversations about the capabilities of these models are very tribal," he says. "People have preexisting opinions, and it's not very informed by evidence on any side. Then it just becomes kind of vibes-based, and I think vibes-based arguments on the internet tend to go in a bad direction."
Olah tells me he has hunches of his own. "My subjective impression is that these things are tracking pretty sophisticated ideas," he says. "We don't have a comprehensive story of how very large models work, but I think it's hard to reconcile what we're seeing with the extreme 'stochastic parrots' picture."
That's as far as he'll go: "I don't want to go too much beyond what can be really strongly inferred from the evidence that we have."
Last month, Anthropic released results from a study in which researchers gave Claude 3 the neural network equivalent of an MRI. By monitoring which bits of the model turned on and off as they ran it, they identified specific patterns of neurons that activated when the model was shown specific inputs.
Anthropic also reported patterns that it says correlate with inputs that attempt to describe or show abstract concepts. "We see features related to deception and honesty, to sycophancy, to security vulnerabilities, to bias," says Olah. "We find features related to power seeking and manipulation and betrayal."
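The basic move behind this kind of study - run inputs through a model and record which internal units light up - can be sketched in a few lines of Python. Anthropic's actual analysis is far more sophisticated and runs on a production model; the toy network and the inputs below are invented purely to show the shape of the idea.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy stand-in for a trained model: one hidden layer with fixed random weights.
    W = rng.normal(size=(16, 8))

    def hidden_activations(x):
        return np.maximum(W @ x, 0.0)  # ReLU hidden layer

    # Hypothetical inputs; in the real study these would be text prompts.
    inputs = {"prompt_a": rng.normal(size=8), "prompt_b": rng.normal(size=8)}

    for name, x in inputs.items():
        acts = hidden_activations(x)
        active_units = np.flatnonzero(acts > acts.mean())
        print(name, "units that light up:", active_units.tolist())

Finding a pattern of units that reliably fires for one kind of input and stays quiet for others is the sort of correlation being reported here; what the model then does with that pattern is a separate question.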
"ASK IT FOR A RECIPE" - heron (@iamaheron_), May 23, 2024 (pic.twitter.com/0ZM3uGRJi9)
These results give one of the clearest looks yet at what's inside a large language model. It's a tantalizing glimpse at what look like elusive humanlike traits. But what does it really tell us? As Olah admits, they do not know what the model does with these patterns. "It's a relatively limited picture, and the analysis is pretty hard," he says.
Even if Olah won't spell out exactly what he thinks goes on inside a large language model like Claude 3, it's clear why the question matters to him. Anthropic is known for its work on AI safety: making sure that powerful future models will behave in ways we want them to and not in ways we don't (known as "alignment" in industry jargon). Figuring out how today's models work is not only a necessary first step if you want to control future ones; it also tells you how much you need to worry about doomer scenarios in the first place. "If you don't think that models are going to be very capable," says Olah, "then they're probably not going to be very dangerous."
In a 2014 interview with the BBC that looked back on her career, the influential cognitive scientist Margaret Boden, now 87, was asked if she thought there were any limits that would prevent computers (or "tin cans," as she called them) from doing what humans can do.
"I certainly don't think there's anything in principle," she said. "Because to deny that is to say that [human thinking] happens by magic, and I don't believe that it happens by magic."
But, she cautioned, powerful computers won't be enough to get us there: the AI field will also need "powerful ideas" - new theories of how thinking happens, new algorithms that might reproduce it. "But these things are very, very difficult and I see no reason to assume that we will one of these days be able to answer all of those questions. Maybe we will; maybe we won't."
Boden was reflecting on the early days of the current boom, but this will-we-or-won't-we teetering speaks to decades in which she and her peers grappled with the same hard questions that researchers struggle with today. AI began as an ambitious aspiration 70-odd years ago, and we are still disagreeing about what is and isn't achievable, and how we'll even know if we have achieved it. Most, if not all, of these disputes come down to this: We don't have a good grasp on what intelligence is or how to recognize it. The field is full of hunches, but no one can say for sure.
We've been stuck on this point ever since people started taking the idea of AI seriously. Or even before that, when the stories we consumed started planting the idea of humanlike machines deep in our collective imagination. The long history of these disputes means that today's fights often reinforce rifts that have been around since the beginning, making it even more difficult for people to find common ground.
To understand how we got here, we need to understand where we've been. So let's dive into AI's origin story - one that also played up the hype in a bid for cash.
The computer scientist John McCarthy is credited with coming up with the term "artificial intelligence" in 1955 when writing a funding application for a summer research program at Dartmouth College in New Hampshire.
The plan was for McCarthy and a small group of fellow researchers, a who's who of postwar US mathematicians and computer scientists - or "John McCarthy and the boys," as Harry Law, a researcher who studies the history of AI at the University of Cambridge and ethics and policy at Google DeepMind, puts it - to get together for two months (not a typo) and make some serious headway on this new research challenge they'd set themselves.
"The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it," McCarthy and his coauthors wrote. "An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves."
That list of things they wanted to make machines do - what Bender calls "the starry-eyed dream" - hasn't changed much. Using language, forming concepts, and solving problems are defining goals for AI today. The hubris hasn't changed much either: "We think that a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it together for a summer," they wrote. That summer, of course, has stretched to seven decades. And the extent to which these problems are in fact now solved is something that people still shout about on the internet.
But what's often left out of this canonical history is that artificial intelligence almost wasn't called "artificial intelligence" at all.
More than one of McCarthy's colleagues hated the term he had come up with. "The word 'artificial' makes you think there's something kind of phony about this," Arthur Samuel, a Dartmouth participant and creator of the first checkers-playing computer, is quoted as saying in historian Pamela McCorduck's 2004 book Machines Who Think. The mathematician Claude Shannon, a coauthor of the Dartmouth proposal who is sometimes billed as "the father of the information age," preferred the term "automata studies." Herbert Simon and Allen Newell, two other AI pioneers, continued to call their own work "complex information processing" for years afterwards.
In fact, "artificial intelligence" was just one of several labels that might have captured the hodgepodge of ideas that the Dartmouth group was drawing on. The historian Jonnie Penn has identified possible alternatives that were in play at the time, including "engineering psychology," "applied epistemology," "neural cybernetics," "non-numerical computing," "neuraldynamics," "advanced automatic programming," and "hypothetical automata." This list of names reveals how diverse the inspiration for their new field was, pulling from biology, neuroscience, statistics, and more. Marvin Minsky, another Dartmouth participant, has described AI as a "suitcase word" because it can hold so many divergent interpretations.
But McCarthy wanted a name that captured the ambitious scope of his vision. Calling this new field "artificial intelligence" grabbed people's attention - and money. Don't forget: AI is sexy, AI is cool.
In addition to terminology, the Dartmouth proposal codified a split between rival approaches to artificial intelligence that has divided the field ever since - a divide Law calls the "core tension in AI."
McCarthy and his colleagues wanted to describe in computer code "every aspect of learning or any other feature of intelligence" so that machines could mimic them. In other words, if they could just figure out how thinking worked - the rules of reasoning - and write down the recipe, they could program computers to follow it. This laid the foundation of what came to be known as rule-based or symbolic AI (sometimes referred to now as GOFAI, "good old-fashioned AI"). But coming up with hard-coded rules that captured the processes of problem-solving for actual, nontrivial problems proved too hard.
The other path favored neural networks, computer programs that would try to learn those rules by themselves in the form of statistical patterns. The Dartmouth proposal mentions the idea almost as an aside (referring variously to "neuron nets" and "nerve nets"). Though it seemed less promising at first, some researchers nevertheless continued to work on versions of neural networks alongside symbolic AI. But it would take decades - plus vast amounts of computing power and much of the data on the internet - before they really took off. Fast-forward to today, and this approach underpins the entire AI boom.
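One way to feel the difference between the two camps is to contrast a rule a person writes down with a rule a program recovers from data. Both examples below are invented caricatures, not anything from the Dartmouth era or from any modern lab:

    import numpy as np

    # Symbolic / GOFAI style: a human writes the rule explicitly.
    def is_grandparent(parent_of, a, c):
        # a is a grandparent of c if a is a parent of some b who is a parent of c
        return any(c in parent_of.get(b, set()) for b in parent_of.get(a, set()))

    family = {"ada": {"bob"}, "bob": {"cal"}}
    print(is_grandparent(family, "ada", "cal"))  # True

    # Learning style, in spirit: nobody writes the rule; a model adjusts its
    # parameters to fit examples. Here a one-line "learner" recovers y = 2x + 1
    # from data instead of being told the formula.
    xs = np.array([0.0, 1.0, 2.0, 3.0])
    ys = 2 * xs + 1
    slope, intercept = np.polyfit(xs, ys, 1)
    print(round(slope, 2), round(intercept, 2))  # ~2.0, ~1.0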
The big takeaway here is that, just like today's researchers, AI's innovators fought about foundational concepts and got caught up in their own promotional spin. Even team GOFAI was plagued by squabbles. Aaron Sloman, a philosopher and fellow AI pioneer now in his late 80s, recalls how "old friends" Minsky and McCarthy "disagreed strongly" when he got to know them in the '70s: "Minsky thought McCarthy's claims about logic could not work, and McCarthy thought Minsky's mechanisms could not do what could be done using logic. I got on well with both of them, but I was saying, 'Neither of you have got it right.'" (Sloman still thinks no one can account for the way human reasoning uses intuition as much as logic, but that's yet another tangent!)
As the fortunes of the technology waxed and waned, the term "AI" went in and out of fashion. In the early '70s, both research tracks were effectively put on ice after the UK government published a report arguing that the AI dream had gone nowhere and wasn't worth funding. All that hype, effectively, had led to nothing. Research projects were shuttered, and computer scientists scrubbed the words "artificial intelligence" from their grant proposals.
When I was finishing a computer science PhD in 2008, only one person in the department was working on neural networks. Bender has a similar recollection: "When I was in college, a running joke was that AI is anything that we haven't figured out how to do with computers yet. Like, as soon as you figure out how to do it, it wasn't magic anymore, so it wasn't AI."
But that magic - the grand vision laid out in the Dartmouth proposal - remained alive and, as we can now see, laid the foundations for the AGI dream.
In 1950, five years before McCarthy started talking about artificial intelligence, Alan Turing had published a paper that asked: Can machines think? To address that question, the famous mathematician proposed a hypothetical test, which he called the imitation game. The setup imagines a human and a computer behind a screen and a second human who types questions to each. If the questioner cannot tell which answers come from the human and which come from the computer, Turing claimed, the computer may as well be said to think.
What Turing saw - unlike McCarthy's crew - was that thinking is a really difficult thing to describe. The Turing test was a way to sidestep that problem. "He basically said: Instead of focusing on the nature of intelligence itself, I'm going to look for its manifestation in the world. I'm going to look for its shadow," says Law.
In 1952, BBC Radio convened a panel to explore Turing's ideas further. Turing was joined in the studio by two of his Manchester University colleagues - Maxwell Newman, a professor of mathematics, and Geoffrey Jefferson, a professor of neurosurgery - and Richard Braithwaite, a philosopher of science, ethics, and religion at the University of Cambridge.
Braithwaite kicked things off: "Thinking is ordinarily regarded as so much the specialty of man, and perhaps of other higher animals, the question may seem too absurd to be discussed. But of course, it all depends on what is to be included in 'thinking.'"
The panelists circled Turing's question but never quite pinned it down.
When they tried to define what thinking involved, what its mechanisms were, the goalposts moved. "As soon as one can see the cause and effect working themselves out in the brain, one regards it as not being thinking but a sort of unimaginative donkey work," said Turing.
Here was the problem: When one panelist proposed some behavior that might be taken as evidence of thought - reacting to a new idea with outrage, say - another would point out that a computer could be made to do it.
As Newman said, it would be easy enough to program a computer to print "I don't like this new program." But he admitted that this would be a trick.
Exactly, Jefferson said: He wanted a computer that would print "I don't like this new program" because it didn't like the new program. In other words, for Jefferson, behavior was not enough. It was the process leading to the behavior that mattered.
But Turing disagreed. As he had noted, uncovering a specific process - the donkey work, to use his phrase - did not pinpoint what thinking was either. So what was left?
"From this point of view, one might be tempted to define thinking as consisting of those mental processes that we don't understand," said Turing. "If this is right, then to make a thinking machine is to make one which does interesting things without our really understanding quite how it is done."
It is strange to hear people grapple with these ideas for the first time. "The debate is prescient," says Tomer Ullman, a cognitive scientist at Harvard University. "Some of the points are still alive - perhaps even more so. What they seem to be going round and round on is that the Turing test is first and foremost a behaviorist test."
For Turing, intelligence was hard to define but easy to recognize. He proposed that the appearance of intelligence was enough - and said nothing about how that behavior should come about.
And yet most people, when pushed, will have a gut instinct about what is and isn't intelligent. There are dumb ways and clever ways to come across as intelligent. In 1981, Ned Block, a philosopher at New York University, showed that Turing's proposal fell short of those gut instincts. Because it said nothing of what caused the behavior, the Turing test can be beaten through trickery (as Newman had noted in the BBC broadcast).
"Could the issue of whether a machine in fact thinks or is intelligent depend on how gullible human interrogators tend to be?" asked Block. (Or as computer scientist Mark Riedl has remarked: "The Turing test is not for AI to pass but for humans to fail.")
Imagine, Block said, a vast look-up table in which human programmers had entered all possible answers to all possible questions. Type a question into this machine, and it would look up a matching answer in its database and send it back. Block argued that anyone using this machine would judge its behavior to be intelligent: "But actually, the machine has the intelligence of a toaster," he wrote. "All the intelligence it exhibits is that of its programmers."
Block concluded that whether behavior is intelligent behavior is a matter of how it is produced, not how it appears. Block's toasters, which became known as Blockheads, are one of the strongest counterexamples to the assumptions behind Turing's proposal.
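Block's machine is easy to sketch, which is part of his point. A minimal, hypothetical version in Python (the questions and answers are invented for illustration):

    # A toy "Blockhead": every answer was typed in ahead of time by the
    # programmers, so any intelligence on display is theirs, not the machine's.
    canned_answers = {
        "what is the capital of france?": "Paris, of course.",
        "do you like poetry?": "I adore it, especially sonnets.",
    }

    def blockhead(question: str) -> str:
        return canned_answers.get(question.strip().lower(), "Hmm, tell me more.")

    print(blockhead("Do you like poetry?"))  # sounds thoughtful; it's a lookup

Judged purely on its outputs, a big enough table could pass; judged on how those outputs are produced, it is, as Block put it, a toaster.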
The Turing test is not meant to be a practical metric, but its implications are deeply ingrained in the way we think about artificial intelligence today. This has become particularly relevant as LLMs have exploded in the past several years. These models get ranked by their outward behaviors, specifically how well they do on a range of tests. When OpenAI announced GPT-4, it published an impressive-looking scorecard that detailed the model's performance on multiple high school and professional exams. Almost nobody talks about how these models get those results.
That's because we don't know. Today's large language models are too complex for anybody to say exactly how their behavior is produced. Researchers outside the small handful of companies making those models don't know what's in their training data; none of the model makers have shared details. That makes it hard to say what is and isn't a kind of memorization - a stochastic parroting. But even researchers on the inside, like Olah, don't know what's really going on when faced with a bridge-obsessed bot.
This leaves the question wide open: Yes, large language models are built on math - but are they doing something intelligent with it?
And the arguments begin again.
"Most people are trying to armchair through it," says Brown University's Pavlick, meaning that they are arguing about theories without looking at what's really happening. "Some people are like, 'I think it's this way,' and some people are like, 'Well, I don't.' We're kind of stuck and everyone's unsatisfied."
Bender thinks that this sense of mystery plays into the mythmaking. ("Magicians do not explain their tricks," she says.) Without a proper appreciation of where the LLM's words come from, we fall back on familiar assumptions about humans, since that is our only real point of reference. When we talk to another person, we try to make sense of what that person is trying to tell us. "That process necessarily entails imagining a life behind the words," says Bender. That's how language works.
"The parlor trick of ChatGPT is so impressive that when we see these words coming out of it, we do the same thing instinctively," she says. "It's very good at mimicking the form of language. The problem is that we are not at all good at encountering the form of language and not imagining the rest of it."
For some researchers, it doesn't really matter if we can't understand the how. Bubeck used to study large language models to try to figure out how they worked, but GPT-4 changed the way he thought about them. "It seems like these questions are not so relevant anymore," he says. "The model is so big, so complex, that we can't hope to open it up and understand what's really happening."
But Pavlick, like Olah, is trying to do just that. Her team has found that models seem to encode abstract relationships between objects, such as that between a country and its capital. Studying one large language model, Pavlick and her colleagues found that it used the same encoding to map France to Paris and Poland to Warsaw. That almost sounds smart, I tell her. "No, it's literally a lookup table," she says.
But what struck Pavlick was that, unlike a Blockhead, the model had learned this lookup table on its own. In other words, the LLM figured out for itself that Paris is to France as Warsaw is to Poland. But what does this show? Is encoding its own lookup table instead of using a hard-coded one a sign of intelligence? Where do you draw the line?
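The flavor of "the same encoding" can be sketched with made-up vectors in which a single shared offset maps every country to its capital. This is only an illustration of the general idea; Pavlick's experiments probe a trained LLM, and the numbers below are invented:

    import numpy as np

    # Invented embeddings in which one "capital-of" offset does all the work.
    capital_offset = np.array([1.0, -2.0, 0.5])
    country = {
        "France": np.array([0.2, 0.7, 1.1]),
        "Poland": np.array([-0.4, 1.3, 0.9]),
    }
    capital = {
        "Paris": country["France"] + capital_offset,
        "Warsaw": country["Poland"] + capital_offset,
    }

    # If the same offset works for every pair, the relation is stored once,
    # like a learned lookup rule, rather than memorized pair by pair.
    predicted_warsaw = country["Poland"] + (capital["Paris"] - country["France"])
    print(np.allclose(predicted_warsaw, capital["Warsaw"]))  # True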
"Basically, the problem is that behavior is the only thing we know how to measure reliably," says Pavlick. "Anything else requires a theoretical commitment, and people don't like having to make a theoretical commitment because it's so loaded."
Not all people. A lot of influential scientists are just fine with theoretical commitment. Hinton, for example, insists that neural networks are all you need to re-create humanlike intelligence. "Deep learning is going to be able to do everything," he told MIT Technology Review in 2020.
It's a commitment that Hinton seems to have held onto from the start. Sloman, who recalls the two of them arguing when Hinton was a graduate student in his lab, remembers being unable to persuade him that neural networks cannot learn certain crucial abstract concepts that humans and some other animals seem to have an intuitive grasp of, such as whether something is impossible. We can just see when something's ruled out, Sloman says. "Despite Hinton's outstanding intelligence, he never seemed to understand that point. I don't know why, but there are large numbers of researchers in neural networks who share that failing."
And then there's Marcus, whose view of neural networks is the exact opposite of Hinton's. His case draws on what he says scientists have discovered about brains.
Brains, Marcus points out, are not blank slates that learn fully from scratch: they come ready-made with innate structures and processes that guide learning. It's how babies can learn things that the best neural networks still can't, he argues.
"Neural network people have this hammer, and now everything is a nail," says Marcus. "They want to do all of it with learning, which many cognitive scientists would find unrealistic and silly. You're not going to learn everything from scratch."
Not that Marcus, a cognitive scientist, is any less sure of himself. "If one really looked at who's predicted the current situation well, I think I would have to be at the top of anybody's list," he tells me from the back of an Uber on his way to catch a flight to a speaking gig in Europe. "I know that doesn't sound very modest, but I do have this perspective that turns out to be very important if what you're trying to study is artificial intelligence."
Given his well-publicized attacks on the field, it might surprise you that Marcus still believes AGI is on the horizon. It's just that he thinks today's fixation on neural networks is a mistake. "We probably need a breakthrough or two or four," he says. "You and I might not live that long, I'm sorry to say. But I think it'll happen this century. Maybe we've got a shot at it."
Over Dor Skuler's shoulder on the Zoom call from his home in Ramat Gan, Israel, a little lamp-like robot is winking on and off while we talk about it. "You can see ElliQ behind me here," he says. Skuler's company, Intuition Robotics, develops these devices for older people, and the design - part Amazon Alexa, part R2-D2 - is meant to make it very clear that ElliQ is a computer. If any of his customers show signs of being confused about that, Intuition Robotics takes the device back, says Skuler.
ElliQ has no face, no humanlike shape at all. Ask it about sports, and it will crack a joke about having no hand-eye coordination because it has no hands and no eyes. "For the life of me, I don't understand why the industry is trying to fulfill the Turing test," Skuler says. "Why is it in the best interest of humanity for us to develop technology whose goal is to dupe us?"
Instead, Skuler's firm is betting that people can form relationships with machines that present as machines. "Just like we have the ability to build a real relationship with a dog," he says. "Dogs provide a lot of joy for people. They provide companionship. People love their dog - but they never confuse it to be a human."
ElliQ's users, many in their 80s and 90s, refer to the robot as an entity or a presence - sometimes a roommate. "They're able to create a space for this in-between relationship, something between a device or a computer and something that's alive," says Skuler.
But no matter how hard ElliQ's designers try to control the way people view the device, they are competing with decades of pop culture that have shaped our expectations. Why are we so fixated on AI that's humanlike? "Because it's hard for us to imagine something else," says Skuler (who indeed refers to ElliQ as "she" throughout our conversation). "And because so many people in the tech industry are fans of science fiction. They try to make their dream come true."
How many developers today grew up thinking that building a smart machine was seriously the coolest thing, if not the most important thing, that they could possibly do?
It was not long ago that OpenAI launched its new voice-controlled version of ChatGPT with a voice that sounded like Scarlett Johansson, after which many people, including Altman, flagged the connection to Spike Jonze's 2013 movie Her.
Science fiction co-invents what AI is understood to be. As Cave and Dihal write in Imagining AI: "AI was a cultural phenomenon long before it was a technological one."
Stories and myths about remaking humans as machines have been around for centuries. People have been dreaming of artificial humans for probably as long as they have dreamed of flight, says Dihal. She notes that Daedalus, the figure in Greek mythology famous for building a pair of wings for himself and his son, Icarus, also built what was effectively a giant bronze robot called Talos that threw rocks at passing pirates.
The word "robot" comes from robota, a Czech term for forced labor, and was introduced by the playwright Karel Čapek in his 1920 play Rossum's Universal Robots. The "laws of robotics" outlined in Isaac Asimov's science fiction, forbidding machines from harming humans, are inverted by movies like The Terminator, which is an iconic reference point for popular fears about real-world technology. The 2014 film Ex Machina is a dramatic riff on the Turing test. Last year's blockbuster The Creator imagines a future world in which AI has been outlawed because it set off a nuclear bomb, an event that some doomers consider at least an outside possibility.
Cave and Dihal relate how another movie, 2014's Transcendence, in which an AI expert played by Johnny Depp gets his mind uploaded to a computer, served a narrative pushed by ur-doomers Stephen Hawking, fellow physicist Max Tegmark, and AI researcher Stuart Russell. In an article published in the Huffington Post on the movie's opening weekend, the trio wrote: "As the Hollywood blockbuster Transcendence debuts this weekend with ... clashing visions for the future of humanity, it's tempting to dismiss the notion of highly intelligent machines as mere science fiction. But this would be a mistake, and potentially our worst mistake ever."
Right around the same time, Tegmark founded the Future of Life Institute, with a remit to study and promote AI safety. Depp's costar in the movie, Morgan Freeman, was on the institute's board, and Elon Musk, who had a cameo in the film, donated $10 million in its first year. For Cave and Dihal, Transcendence is a perfect example of the multiple entanglements between popular culture, academic research, industrial production, and "the billionaire-funded fight to shape the future."
On the London leg of his world tour last year, Altman was asked what he'd meant when he tweeted: "AI is the tech the world has always wanted." Standing at the back of the room that day, behind an audience of hundreds, I listened to him offer his own kind of origin story: "I was, like, a very nervous kid. I read a lot of sci-fi. I spent a lot of Friday nights home, playing on the computer. But I was always really interested in AI and I thought it'd be very cool." He went to college, got rich, and watched as neural networks became better and better. "This can be tremendously good but also could be really bad. What are we going to do about that?" he recalled thinking in 2015. "I ended up starting OpenAI."
Okay, you get it: No one can agree on what AI is. But what everyone does seem to agree on is that the current debate around AI has moved far beyond the academic and the scientific. There are political and moral components in playâwhich doesnât help with everyone thinking everyone else is wrong.
Untangling this is hard. It can be difficult to see whatâs going on when some of those moral views take in the entire future of humanity and anchor them in a technology that nobody can quite define.
But we can't just throw our hands up and walk away. Because no matter what this technology is, itâs coming, and unless you live under a rock, youâll use it in one form or another. And the form that technology takesâand the problems it both solves and createsâwill be shaped by the thinking and the motivations of people like the ones you just read about. In particular, by the people with the most power, the most cash, and the biggest megaphones.
Which leads me to the TESCREALists. Wait, come back! I realize itâs unfair to introduce yet another new concept so late in the game. But to understand how the people in power may mold the technologies they build, and how they explain them to the worldâs regulators and lawmakers, you need to really understand their mindset.
Gebru, who founded the Distributed AI Research Institute after leaving Google, and Ămile Torres, a philosopher and historian at Case Western Reserve University, have traced the influence of several techno-utopian belief systems on Silicon Valley. The pair argue that to understand whatâs going on with AI right nowâboth why companies such as Google DeepMind and OpenAI are in a race to build AGI and why doomers like Tegmark and Hinton warn of a coming catastropheâthe field must be seen through the lens of what Torres has dubbed the TESCREAL framework .
The clunky acronym (pronounced tes-cree-all ) replaces an even clunkier list of labels: transhumanism , extropianism , singularitarianism , cosmism , rationalism , effective altruism , and longtermism . A lot has been written (and will be written) about each of these worldviews, so Iâll spare you here. (There are rabbit holes within rabbit holes for anyone wanting to dive deeper. Pick your forum and pack your spelunking gear.)
This constellation of overlapping ideologies is attractive to a certain kind of galaxy-brain mindset common in the Western tech world. Some anticipate human immortality; others predict humanityâs colonization of the stars. The common tenet is that an all-powerful technologyâAGI or superintelligence, choose your teamâis not only within reach but inevitable. You can see this in the do-or-die attitude thatâs ubiquitous inside cutting-edge labs like OpenAI: If we donât make AGI, someone else will.
Whatâs more, TESCREALists believe that AGI could not only fix the worldâs problems but level up humanity. âThe development and proliferation of AIâfar from a risk that we should fearâis a moral obligation that we have to ourselves, to our children and to our future,â Andreessen wrote in a much-dissected manifesto last year. I have been told many times over that AGI is the way to make the world a better placeâby Demis Hassabis , CEO and cofounder of Google DeepMind; by Mustafa Suleyman , CEO of the newly minted Microsoft AI and another cofounder of DeepMind; by Sutskever , Altman , and more.
But as Andreessen noted, itâs a yin-yang mindset. The flip side of techno-utopia is techno-hell. If you believe that you are building a technology so powerful that it will solve all the worldâs problems, you probably also believe thereâs a non-zero chance it will all go very wrong. When asked at the World Government Summit in February what keeps him up at night, Altman replied: âItâs all the sci-fi stuff.â
Itâs a tension that Hinton has been talking up for the last year. Itâs what companies like Anthropic claim to address. Itâs what Sutskever is focusing on in his new lab , and what he wanted a special in-house team at OpenAI to focus on last year before disagreements over the way the company balanced risk and reward led most members of that team to leave.
Sure, doomerism is part of the spin. (âClaiming that you have created something that is super-intelligent is good for sales figures,â says Dihal. âItâs like, âPlease, someone stop me from being so good and so powerful.ââ) But boom or doom, exactly what (and whose) problems are these guys supposedly solving? Are we really expected to trust what they build and what they tell our leaders?
Gebru and Torres (and others) are adamant: No, we should not. They are highly critical of these ideologies and how they may influence the development of future technology, especially AI. Fundamentally, they link several of these worldviewsâwith their common focus on âimprovingâ humanityâto the racist eugenics movements of the 20th century.
One danger, they argue, is that a shift of resources toward the kind of technological innovations that these ideologies demand, from building AGI to extending life spans to colonizing other planets, will ultimately benefit people who are Western and white at the cost of billions of people who arenât. If your sight is set on fantastical futures, itâs easy to overlook the present-day costs of innovation, such as labor exploitation, the entrenchment of racist and sexist bias, and environmental damage. Â
âAre we trying to build a tool thatâs useful to us in some way?â asks Bender, reflecting on the casualties of this race to AGI. If so, whoâs it for, how do we test it, how well does it work? âBut if what weâre building it for is just so that we can say that weâve done it, thatâs not a goal that I can get behind. Thatâs not a goal thatâs worth billions of dollars.â
Bender says that seeing the connections between the TESCREAL ideologies is what made her realize there was something more to these debates. âTangling with those people wasââ she stops. âOkay, thereâs more here than just academic ideas. Thereâs a moral code tied up in it as well.â
Of course, laid out like this without nuance, it doesnât sound as if weâas a society, as individualsâare getting the best deal. It also all sounds rather silly. When Gebru described parts of the TESCREAL bundle in a talk last year, her audience laughed. Itâs also true that few people would identify themselves as card-carrying students of these schools of thought, at least in their extremes.
But if we donât understand how those building this tech approach it, how can we decide what deals we want to make? What apps we decide to use, what chatbots we want to give personal information to, what data centers we support in our neighborhoods, what politicians we want to vote for?
It used to be like this: There was a problem in the world, and we built something to fix it. Here, everything is backward: The goal seems to be to build a machine that can do everything, and to skip the slow, hard work that goes into figuring out what the problem is before building the solution.
And as Gebru said in that same talk, "A machine that solves all problems: if that's not magic, what is it?"
When asked outright what AI is, a lot of people dodge the question. Not Suleyman. In April, the CEO of Microsoft AI stood on the TED stage and told the audience what he'd told his six-year-old nephew in response to that question. The best answer he could give, Suleyman explained, was that AI was "a new kind of digital species" – a technology so universal, so powerful, that calling it a tool no longer captured what it could do for us.
"On our current trajectory, we are heading toward the emergence of something we are all struggling to describe, and yet we cannot control what we don't understand," he said. "And so the metaphors, the mental models, the names – these all matter if we are to get the most out of AI whilst limiting its potential downsides."
Language matters! I hope that's clear from the twists and turns and tantrums we've been through to get to this point. But I also hope you're asking: Whose language? And whose downsides? Suleyman is an industry leader at a technology giant that stands to make billions from its AI products. Describing the technology behind those products as a new kind of species conjures something wholly unprecedented, something with agency and capabilities that we have never seen before. That makes my spidey sense tingle. You?
I can't tell you if there's magic here (ironically or not). And I can't tell you how math can realize what Bubeck and many others see in this technology (no one can yet). You'll have to make up your own mind. But I can pull back the curtain on my own point of view.
Writing about GPT-3 back in 2020, I said that the greatest trick AI ever pulled was convincing the world it exists. I still think that: We are hardwired to see intelligence in things that behave in certain ways, whether it's there or not. In the last few years, the tech industry has found reasons of its own to convince us that AI exists, too. This makes me skeptical of many of the claims made for this technology.
With large language models – via their smiley-face masks – we are confronted by something we've never had to think about before. "It's taking this hypothetical thing and making it really concrete," says Pavlick. "I've never had to think about whether a piece of language required intelligence to generate because I've just never dealt with language that didn't."
AI is many things. But I don't think it's humanlike. I don't think it's the solution to all (or even most) of our problems. It isn't ChatGPT or Gemini or Copilot. It isn't neural networks. It's an idea, a vision, a kind of wish fulfillment. And ideas get shaped by other ideas, by morals, by quasi-religious convictions, by worldviews, by politics, and by gut instinct. "Artificial intelligence" is a helpful shorthand to describe a raft of different technologies. But AI is not one thing; it never has been, no matter how often the branding gets seared into the outside of the box.
"The truth is these words" – intelligence, reasoning, understanding, and more – "were defined before there was a need to be really precise about it," says Pavlick. "I don't really like when the question becomes 'Does the model understand – yes or no?' because, well, I don't know. Words get redefined and concepts evolve all the time."
Humans and machines: a match made in productivity heaven. Our species wouldn't have gotten very far without our mechanized workhorses. From the wheel that revolutionized agriculture to the screw that held together increasingly complex construction projects to the robot-enabled assembly lines of today, machines have made life as we know it possible. And yet, despite their seemingly endless utility, humans have long feared machines – more specifically, the possibility that machines might someday acquire human intelligence and strike out on their own.
But we tend to view the possibility of sentient machines with fascination as well as fear. This curiosity has helped turn science fiction into actual science. Twentieth-century theoreticians, like computer scientist and mathematician Alan Turing, envisioned a future where machines could perform functions faster than humans. The work of Turing and others soon made this a reality. Personal calculators became widely available in the 1970s, and by 2016, the US census showed that 89 percent of American households had a computer. Machines – smart machines at that – are now just an ordinary part of our lives and culture.
Those smart machines are also getting faster and more complex. Some computers have now crossed the exascale threshold, meaning they can perform as many calculations in a single second as an individual could in 31,688,765,000 years. And beyond computation, which machines have long been faster at than we are, computers and other devices are now acquiring skills and perception that were once unique to humans and a few other species.
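That comparison is easy to verify with a back-of-the-envelope calculation, assuming an exascale machine performs 10^18 operations per second and a person manages one calculation per second:

```python
# Rough check of the exascale comparison above, assuming one human
# calculation per second and a Julian year of 365.25 days.
exa_ops_per_second = 1e18                 # the exascale threshold
human_ops_per_second = 1                  # assumed human rate
seconds_per_year = 365.25 * 24 * 3600

years_for_human = exa_ops_per_second / (human_ops_per_second * seconds_per_year)
print(f"{years_for_human:,.0f} years")    # about 31,688,765,000 years
```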
AI is a machine's ability to perform the cognitive functions we associate with human minds, such as perceiving, reasoning, learning, interacting with the environment, problem-solving, and even exercising creativity. You've probably interacted with AI even if you don't realize it – voice assistants like Siri and Alexa are founded on AI technology, as are some customer service chatbots that pop up to help you navigate websites.
Applied AI – simply, artificial intelligence applied to real-world problems – has serious implications for the business world. By using artificial intelligence, companies have the potential to make business more efficient and profitable. But ultimately, the value of AI isn't in the systems themselves. Rather, it's in how companies use these systems to assist humans – and their ability to explain to shareholders and the public what these systems do – in a way that builds trust and confidence.
For more about AI, its history, its future, and how to apply it in business, read on.
What is machine learning?
Machine learning is a form of artificial intelligence that can adapt to a wide range of inputs, including large sets of historical data, synthesized data, or human inputs. (Some machine learning algorithms are specialized in training themselves to detect patterns; this is called deep learning.) These algorithms can detect patterns and learn how to make predictions and recommendations by processing data, rather than by receiving explicit programming instruction. Some algorithms can also adapt in response to new data and experiences to improve over time.
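To make the distinction between learned patterns and explicit programming concrete, here is a minimal sketch in Python using scikit-learn; the synthetic dataset and the choice of model are illustrative and not taken from this article:

```python
# Minimal sketch: a model learns a pattern from labeled examples instead of
# following hand-written rules. The synthetic data and model are illustrative.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))

# Retraining (or updating) on fresh data is how such a model adapts over time.
```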
The volume and complexity of data that is now being generated, too vast for humans to process and apply efficiently, has increased the potential of machine learning, as well as the need for it. In the years since its widespread deployment, which began in the 1970s, machine learning has had an impact on a number of industries, including achievements in medical-imaging analysis  and high-resolution weather forecasting.
Deep learning is a more advanced version of machine learning that is particularly adept at processing a wider range of data resources (text as well as unstructured data including images), requires even less human intervention, and can often produce more accurate results than traditional machine learning. Deep learning uses neural networks – based on the ways neurons interact in the human brain – to ingest data and process it through multiple neuron layers that recognize increasingly complex features of the data. For example, an early layer might recognize something as being in a specific shape; building on this knowledge, a later layer might be able to identify the shape as a stop sign. Similar to machine learning, deep learning uses iteration to self-correct and improve its prediction capabilities. For example, once it "learns" what a stop sign looks like, it can recognize a stop sign in a new image.
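The layered idea can be sketched in a few lines of Keras. The architecture below is purely illustrative, a tiny image classifier in the spirit of the stop-sign example, not a model used by any system mentioned here:

```python
# Illustrative deep network in Keras: each layer learns progressively more
# abstract features of the input, as described above. Not a production model.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),             # small grayscale images
    tf.keras.layers.Conv2D(16, 3, activation="relu"),     # early layer: edges, simple shapes
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),     # later layer: composite shapes
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),      # e.g. 10 sign/object classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Training would be model.fit(images, labels, epochs=5) on a labeled dataset;
# each pass is an iteration in which the network self-corrects.
```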
Case study: Vistra and the Martin Lake power plant
Vistra is a large power producer in the United States, operating plants in 12 states with a capacity to power nearly 20 million homes. Vistra has committed to achieving net-zero emissions by 2050. In support of this goal, as well as to improve overall efficiency, QuantumBlack, AI by McKinsey worked with Vistra to build and deploy an AI-powered heat rate optimizer (HRO) at one of its plants.
"Heat rate" is a measure of the thermal efficiency of the plant; in other words, it's the amount of fuel required to produce each unit of electricity. To reach the optimal heat rate, plant operators continuously monitor and tune hundreds of variables, such as steam temperatures, pressures, oxygen levels, and fan speeds.
Vistra and a McKinsey team, including data scientists and machine learning engineers, built a multilayered neural network model. The model combed through two years' worth of data at the plant and learned which combination of factors would attain the most efficient heat rate at any point in time. When the models were accurate to 99 percent or higher and run through a rigorous set of real-world tests, the team converted them into an AI-powered engine that generates recommendations every 30 minutes for operators to improve the plant's heat rate efficiency. One seasoned operations manager at the company's plant in Odessa, Texas, said, "There are things that took me 20 years to learn about these power plants. This model learned them in an afternoon."
Overall, the AI-powered HRO helped Vistra improve the plant's heat rate efficiency. Read more about the Vistra story here.
Generative AI (gen AI) is an AI model that generates content in response to a prompt. It's clear that generative AI tools like ChatGPT and DALL-E (a tool for AI-generated art) have the potential to change how a range of jobs are performed. Much is still unknown about gen AI's potential, but there are some questions we can answer – like how gen AI models are built, what kinds of problems they are best suited to solve, and how they fit into the broader category of AI and machine learning.
For more on generative AI and how it stands to affect business and society, check out our Explainer "What is generative AI?"
The term "artificial intelligence" was coined in 1956 by computer scientist John McCarthy for a workshop at Dartmouth. But he wasn't the first to write about the concepts we now describe as AI. Alan Turing introduced the concept of the "imitation game" in a 1950 paper. That's the test of a machine's ability to exhibit intelligent behavior, now known as the "Turing test." He believed researchers should focus on areas that don't require too much sensing and action, things like games and language translation. Research communities dedicated to concepts like computer vision, natural language understanding, and neural networks are, in many cases, several decades old.
MIT roboticist Rodney Brooks shared details on the four previous stages of AI:
Symbolic AI (1956). Symbolic AI is also known as classical AI, or even GOFAI (good old-fashioned AI). The key concept here is the use of symbols and logical reasoning to solve problems. For example, we know a German shepherd is a dog, which is a mammal; all mammals are warm-blooded; therefore, a German shepherd should be warm-blooded.
The main problem with symbolic AI is that humans still need to manually encode their knowledge of the world into the symbolic AI system, rather than allowing it to observe and encode relationships on its own. As a result, symbolic AI systems struggle with situations involving real-world complexity. They also lack the ability to learn from large amounts of data.
Symbolic AI was the dominant paradigm of AI research until the late 1980s.
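The German shepherd example can be made concrete with a toy sketch of the symbolic approach, in which facts and rules are hand-encoded and conclusions follow by chaining over them; the representation below is our own illustration, not a historical system:

```python
# Toy sketch of the symbolic approach: knowledge is hand-encoded as facts and
# rules, and new conclusions are derived by chaining over them.
facts = {
    ("german_shepherd", "is_a", "dog"),
    ("dog", "is_a", "mammal"),
}
rules = [
    ("mammal", "warm_blooded"),   # if something is a mammal, it is warm-blooded
]

def categories_of(entity, facts):
    """Follow is_a links transitively to collect every category of an entity."""
    found, frontier = set(), {entity}
    while frontier:
        nxt = {o for (s, r, o) in facts if s in frontier and r == "is_a"}
        frontier = nxt - found
        found |= nxt
    return found

def infer_properties(entity, facts, rules):
    cats = categories_of(entity, facts)
    return {prop for (cat, prop) in rules if cat in cats}

print(infer_properties("german_shepherd", facts, rules))  # {'warm_blooded'}
```

The weakness described above is visible even in this toy version: every fact and rule has to be typed in by hand, and nothing is learned from data.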
Neural networks (1954, 1969, 1986, 2012). Neural networks are the technology behind the recent explosive growth of gen AI. Loosely modeling the ways neurons interact in the human brain, neural networks ingest data and process it through multiple iterations that learn increasingly complex features of the data. The neural network can then make determinations about the data, learn whether a determination is correct, and use what it has learned to make determinations about new data. For example, once it "learns" what an object looks like, it can recognize the object in a new image.
Neural networks were first proposed in 1943 in an academic paper by neurophysiologist Warren McCulloch and logician Walter Pitts. Decades later, in 1969, two MIT researchers mathematically demonstrated that neural networks could perform only very basic tasks. In 1986, there was another reversal, when computer scientist and cognitive psychologist Geoffrey Hinton and colleagues solved the neural network problem presented by the MIT researchers. In the 1990s, computer scientist Yann LeCun made major advancements in neural networks' use in computer vision, while Jürgen Schmidhuber advanced the application of recurrent neural networks as used in language processing.
In 2012, Hinton and two of his students highlighted the power of deep learning. They applied Hinton's algorithm to neural networks with many more layers than was typical, sparking a new focus on deep neural networks. These have been the main AI approaches of recent years.
Traditional robotics (1968). During the first few decades of AI, researchers built robots to advance research. Some robots were mobile, moving around on wheels, while others were fixed, with articulated arms. Robots used the earliest attempts at computer vision to identify and navigate through their environments or to understand the geometry of objects and maneuver them. This could include moving around blocks of various shapes and colors. Most of these robots, just like the ones that have been used in factories for decades, rely on highly controlled environments with thoroughly scripted behaviors that they perform repeatedly. They have not contributed significantly to the advancement of AI itself.
But traditional robotics did have significant impact in one area, through a process called "simultaneous localization and mapping" (SLAM). SLAM algorithms helped contribute to self-driving cars and are used in consumer products like vacuum cleaning robots and quadcopter drones. Today, this work has evolved into behavior-based robotics, also referred to as haptic technology because it responds to human touch.
The term "artificial general intelligence" (AGI) was coined to describe AI systems that possess capabilities comparable to those of a human. In theory, AGI could someday replicate human-like cognitive abilities including reasoning, problem-solving, perception, learning, and language comprehension. But let's not get ahead of ourselves: the key word here is "someday." Most researchers and academics believe we are decades away from realizing AGI; some even predict we won't see AGI this century, or ever. Rodney Brooks, an MIT roboticist and cofounder of iRobot, doesn't believe AGI will arrive until the year 2300.
The timing of AGI's emergence may be uncertain. But when it does emerge – and it likely will – it's going to be a very big deal, in every aspect of our lives. Executives should begin working now to understand the path toward machines achieving human-level intelligence and to prepare for the transition to a more automated world.
For more on AGI, including the four previous attempts at AGI, read our Explainer.
Narrow AI is the application of AI techniques to a specific and well-defined problem, such as chatbots like ChatGPT, algorithms that spot fraud in credit card transactions, and natural-language-processing engines that quickly process thousands of legal documents. Most current AI applications fall into the category of narrow AI. AGI is, by contrast, AI that's intelligent enough to perform a broad range of tasks.
AI is a big story for all kinds of businesses, but some companies are clearly moving ahead of the pack. Our state of AI in 2022 survey showed that adoption of AI models has more than doubled since 2017 – and investment has increased apace. What's more, the specific areas in which companies see value from AI have evolved well beyond manufacturing and risk.
One group of companies is pulling ahead of its competitors. Leaders of these organizations consistently make larger investments in AI, level up their practices to scale faster, and hire and upskill the best AI talent. More specifically, they link AI strategy to business outcomes and "industrialize" AI operations by designing modular data architecture that can quickly accommodate new applications.
We have yet to see the long-tail effect of gen AI models. This means there are some inherent risks involved in using them – both known and unknown.
The outputs gen AI models produce may often sound extremely convincing. This is by design. But sometimes the information they generate is just plain wrong. Worse, sometimes it's biased (because it's built on the gender, racial, and other biases of the internet and society more generally).
It can also be manipulated to enable unethical or criminal activity. Since gen AI models burst onto the scene, organizations have become aware of users trying to "jailbreak" the models – that means trying to get them to break their own rules and deliver biased, harmful, misleading, or even illegal content. Gen AI organizations are responding to this threat in two ways: for one thing, they're collecting feedback from users on inappropriate content. They're also combing through their databases, identifying prompts that led to inappropriate content, and training the model against these types of generations.
But awareness and even action don't guarantee that harmful content won't slip the dragnet. Organizations that rely on gen AI models should be aware of the reputational and legal risks involved in unintentionally publishing biased, offensive, or copyrighted content.
These risks can be mitigated, however, in a few ways. "Whenever you use a model," says McKinsey partner Marie El Hoyek, "you need to be able to counter biases and instruct it not to use inappropriate or flawed sources, or things you don't trust." How? For one thing, it's crucial to carefully select the initial data used to train these models to avoid including toxic or biased content. Next, rather than employing an off-the-shelf gen AI model, organizations could consider using smaller, specialized models. Organizations with more resources could also customize a general model based on their own data to fit their needs and minimize biases.
It's also important to keep a human in the loop (that is, to make sure a real human checks the output of a gen AI model before it is published or used) and avoid using gen AI models for critical decisions, such as those involving significant resources or human welfare.
It can't be emphasized enough that this is a new field. The landscape of risks and opportunities is likely to continue to change rapidly in the coming years. As gen AI becomes increasingly incorporated into business, society, and our personal lives, we can also expect a new regulatory climate to take shape. As organizations experiment – and create value – with these tools, leaders will do well to keep a finger on the pulse of regulation and risk.
The Blueprint for an AI Bill of Rights, prepared by the US government in 2022, provides a framework for how government, technology companies, and citizens can collectively ensure more accountable AI. As AI has become more ubiquitous, concerns have surfaced about a potential lack of transparency surrounding the functioning of gen AI systems, the data used to train them, issues of bias and fairness, potential intellectual property infringements, privacy violations, and more. The Blueprint comprises five principles that the White House says should "guide the design, use, and deployment of automated systems to protect [users] in the age of artificial intelligence." They are: safe and effective systems; algorithmic discrimination protections; data privacy; notice and explanation; and human alternatives, consideration, and fallback.
At present, more than 60 countries or blocs have national strategies governing the responsible use of AI. These include Brazil, China, the European Union, Singapore, South Korea, and the United States. The approaches taken vary from guidelines-based approaches, such as the Blueprint for an AI Bill of Rights in the United States, to comprehensive AI regulations that align with existing data protection and cybersecurity regulations, such as the EU's AI Act, due in 2024.
There are also collaborative efforts between countries to set out standards for AI use. The US–EU Trade and Technology Council is working toward greater alignment between Europe and the United States. The Global Partnership on Artificial Intelligence, formed in 2020, has 29 members including Brazil, Canada, Japan, the United States, and several European countries.
Even though AI regulations are still being developed, organizations should act now to avoid legal, reputational, organizational, and financial risks. In an environment of public concern, a misstep could be costly. Here are four no-regrets, preemptive actions organizations can implement today:
Most organizations are dipping a toe into the AI pool – not cannonballing. Slow progress toward widespread adoption is likely due to cultural and organizational barriers. But leaders who effectively break down these barriers will be best placed to capture the opportunities of the AI era. And – crucially – companies that can't take full advantage of AI are already being sidelined by those that can, in industries like auto manufacturing and financial services.
To scale up AI, organizations can make three major shifts:
This article was updated in April 2024; it was originally published in April 2023.
Problems in Artificial Intelligence (AI) come in different forms, each with its own set of challenges and potential for innovation. From image recognition to natural language processing, AI problems exhibit distinct characteristics that shape the strategies and techniques used to tackle them effectively. In this article, we delve into the fundamental characteristics of AI problems, shedding light on what makes them so fascinating and formidable.
Table of Contents: Addressing the Challenges of AI Problems; Examples of AI Applications and Challenges Across Domains; Characteristics of Artificial Intelligence Problems – FAQs
Before exploring the characteristics, let’s clarify some essential AI concepts:
By understanding these key terminologies, we can better grasp the characteristics of AI problems and the techniques used to address them. These concepts form the foundation of AI problem-solving and provide the framework for developing innovative solutions to real-world challenges.
Let’s explore the core characteristics that differentiate AI problems:
These characteristics collectively shape the challenges and opportunities involved in developing and deploying AI systems across various domains and applications.
The characteristics of AI problems present unique challenges that require innovative approaches to solution development. Some of the key aspects to consider in tackling these challenges include:
By addressing these challenges through innovative methodologies and interdisciplinary collaboration, we can harness the full potential of AI to solve complex problems and drive societal progress.
Problem: A delivery robot navigating a busy warehouse to locate and retrieve a specific item.
Problem: A sentiment analysis system in NLP classifying customer reviews as positive, negative, or neutral.
Problem: A medical image recognition system in Computer Vision designed to detect tumors in X-rays or MRI scans.
The characteristics of AI-based problems – complexity, uncertainty, subjectivity, and more – make them inherently difficult, and understanding these characteristics is essential for building appropriate AI systems. By combining tools such as machine learning, probabilistic reasoning, and knowledge representation with careful attention to ethics, designers and researchers can manage this complexity and shape AI in ways that benefit society.
The core characteristics of AI problems include complexity, uncertainty and ambiguity, lack of clear problem definition, non-linearity, dynamism, subjectivity, interactivity, context sensitivity, and ethical considerations.
Problem-solving in AI involves creating algorithms and methods that enable machines to approximate human reasoning in specific situations.
The search space is the set of all possible states or configurations that an agent can examine while searching for a solution to a problem.
AI algorithms are designed to handle unclear circumstances and make decisions based on imperfect data or noisy information.
Examples include robotics (e.g., delivery robots navigating busy warehouses), natural language processing (e.g., sentiment analysis of customer reviews), and computer vision (e.g., medical image recognition for detecting tumors).
Ethical considerations are crucial in AI development to address issues such as bias, justice, privacy, and responsibility, ensuring that AI technologies are deployed responsibly and ethically.
When ChatGPT and other large language models began entering the mainstream two years ago, it quickly became apparent the technology could excel at certain business functions, yet it was less clear how well artificial intelligence could handle more creative tasks.
Sure, generative AI can summarize the content of an article, identify patterns in data, and produce derivative work – say, a song in the style of Taylor Swift or a poem in the mood of Langston Hughes – but can the technology develop truly innovative ideas?
Specifically, Harvard Business School Assistant Professor Jacqueline Ng Lane was determined to find out "how AI handled open-ended problems that haven't been solved yet – the kind where you need diverse expertise and perspectives to make progress."
In a working paper published in the journal Organization Science, Lane and colleagues compare ChatGPT's creative potential to crowdsourced innovations produced by people. Ultimately, the researchers found that both humans and AI have their strengths – people contribute more novel suggestions while AI creates more practical solutions – yet some of the most promising ideas are the ones people and machines develop together.
Lane cowrote the paper with Léonard Bouissioux, assistant professor at the University of Washington's Foster School of Business; Miaomiao Zhang, an HBS doctoral student; Karim Lakhani, the Dorothy & Michael Hintze Professor of Business Administration at HBS; and Vladimir Jacimovic, CEO and founder of ContinuumLab.ai and executive fellow at HBS.
Any innovation process usually starts with brainstorming, says Lane, whose research has long looked at how creative ideas are produced.
"It's like a funnel," she says. "You start with defining the problem, then you generate ideas, then you evaluate them and choose which ones to implement."
Research has shown that crowdsourcing can be an effective way to generate initial ideas. However, the approach can be time-consuming and expensive. Creative teams typically offer incentives to respondents for their ideas. Then teams often must wait for input and then comb through ideas to come up with the most promising leads.
An off-the-shelf large language model such as ChatGPT, however, is free or low cost for end users, and can generate an infinite number of ideas quickly, Lane says. But are the ideas any good?
To find out, Lane and her fellow researchers asked people to come up with business ideas for the sustainable circular economy, in which products are reused or recycled to make new products. They disseminated a request on an online platform, offering $10 for participating and $1,000 for the best idea. Here's part of their request:
We would like you to submit your circular economy idea, which can be a unique new idea or an existent idea that is used in the industry.
Here is an example: Car sharing in order to reduce the carbon footprint associated with driving. …
Submit your real-life use cases on how companies can implement the circular economy in their businesses. New ideas are also welcome, even if they are "moonshots."
The researchers asked for ideas that would involve "sharing, leasing, reusing, repairing, refurbishing [or] recycling existing materials and products as long as possible." Suggestions would be scored for uniqueness, environmental benefits, profit potential, and feasibility.
Some 125 people replied with contributions, offering insights from a variety of industries and professional backgrounds. One, for example, proposed a dynamic pricing algorithm for supermarkets to cut down on food waste, while another suggested a mobile app that could store receipts to reduce paper waste.
At the same time, the research team employed prompt engineering techniques to craft a variety of AI prompts. Using these carefully designed prompts, they generated several hundred additional solutions through ChatGPT. The team strategically modified their prompts to:
The team then recruited some 300 evaluators well-versed in the circular economy to evaluate a randomized selection of the ideas based on the scoring criteria.
The evaluators judged the human solutions as more novel, employing more unique "out of the box" thinking. However, they found the AI-generated ideas to be more valuable and feasible.
For example, one participant from Africa proposed creating interlocking bricks using foundry dust and waste plastic, creating a new construction material and cutting down on air pollution at the same time. "The evaluators said, 'Wow, this is really innovative, but it would never work,'" Lane says.
One ChatGPT response, meanwhile, proposed converting food waste into biogas, a renewable energy source that could be used for electricity and fertilizer. Not the most novel idea, the researchers noted, but one that could be implemented and might show a clear financial return.
"We were surprised at how powerful these technologies were," Lane says, "especially in these early stages in the creative process."
The "best" ideas, Lane says, may come from those in which humans and AI collaborate, with people engineering prompts and continually working with AI to develop more original ideas.
"We consistently achieved higher quality results when AI would come up with an idea and then we had an instruction that said: Make sure before you create your next idea, it's different from all the ones before it," Lane explains.
Additional prompts increased the novelty of the ideas, generating everything from waste-eating African flies to beverage containers tracked by smart chips that instantly pay consumers for recycling them.
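That iterative pattern is straightforward to sketch. In the code below, generate_idea is a hypothetical stand-in for whichever LLM API you use, and the prompt wording paraphrases the kind of novelty instruction Lane describes rather than the study's actual prompts:

```python
# Sketch of the iterative "make it different from everything so far" pattern
# described above. generate_idea() is a hypothetical stand-in for a call to
# whichever LLM API you use; the prompt wording is illustrative only.
def generate_idea(prompt: str) -> str:
    raise NotImplementedError("replace with a call to your LLM of choice")

BASE_PROMPT = ("Propose a circular-economy business idea that reuses, repairs, "
               "or recycles existing materials and products.")

def brainstorm(n_ideas: int) -> list[str]:
    ideas: list[str] = []
    for _ in range(n_ideas):
        prompt = BASE_PROMPT
        if ideas:
            prompt += ("\nBefore you answer, make sure your new idea is "
                       "different from all of these:\n- " + "\n- ".join(ideas))
        ideas.append(generate_idea(prompt))
    return ideas
```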
Based on the findings, the researchers suggest business leaders keep a few points in mind when implementing AI to develop creative solutions:
The most productive way to use generative AI, the research suggests, is to combine the novelty that people excel at with the practicality of the machine. Says Lane, "We still need to put our minds toward being forward-looking and envisioning new things as we are guiding the outputs of AI to create the best solutions."
August 31, 2024
by Scott Lilwall, Alberta Machine Intelligence Institute
A team of Alberta Machine Intelligence Institute (Amii) researchers has revealed more about a mysterious problem in machine learningâa discovery that might be a major step towards building advanced AI that can function effectively in the real world.
The paper, titled "Loss of Plasticity in Deep Continual Learning," is published in Nature . It was authored by Shibhansh Dohare, J. Fernando Hernandez-Garcia, Qingfeng Lan, Parash Rahman, as well as Amii Fellows & Canada CIFAR AI Chairs A. Rupam Mahmood and Richard S. Sutton.
In their paper, the team explores a vexing problem that has long been suspected in deep learning models but has not received much attention: for some reason, many deep learning agents engaged in continual learning lose the ability to learn and have their performance degrade drastically.
"We have established that there is definitely a problem with current deep learning," said Mahmood. "When you need to adapt continually, we have shown that deep learning eventually just stops working. So effectively you can't keep learning."
He points out that not only does the AI agent lose the ability to learn new things, but it also fails to relearn what it learned in the past after it is forgotten. The researchers dubbed this phenomenon "loss of plasticity," borrowing a term from neuroscience where plasticity refers to the brain's ability to adapt its structure and form new neural connections.
The researchers say that loss of plasticity is a major challenge to developing AI that can effectively handle the complexity of the world and would need to be solved to develop human-level artificial intelligence.
Many existing models aren't designed for continual learning. Sutton references ChatGPT as an example; it doesn't learn continually. Instead, its creators train the model for a certain amount of time. When training is over, the model is then deployed without further learning.
Even with this approach, merging new and old data into a model's memory can be difficult. Most of the time, it is more effective to just start from scratch, erasing the memory, and training the model on everything again. For large models like ChatGPT, that process can take a lot of time and cost millions of dollars each time.
It also limits the kind of things a model can do. For fast-moving environments that are constantly changing, like financial markets for instance, Sutton says continual learning is a necessity.
The first step to addressing loss of plasticity, according to the team, was to show that it happens and it matters. The problem is one that was "hiding in plain sight" – there were hints suggesting that loss of plasticity could be a widespread problem in deep learning, but very little research had been done to actually investigate it.
Rahman says he first became interested in exploring the problem because he kept seeing hints of the issueâand that intrigued him.
"I'd be reading through a paper, and you'd see something in the appendices about how performance dropped off. And then you'd see it in another paper a while later," he said.
The research team designed several experiments to search for loss of plasticity in deep learning systems. In supervised learning, they trained networks in sequences of classification tasks. For example, a network would learn to differentiate between cats and dogs in the first task, then between beavers and geese on the second task, and so on for many tasks. They hypothesized that as the networks lost their ability to learn, their ability to differentiate would decrease in each subsequent task.
And that's exactly what happened.
"We used several different data sets to test, to show that it could be widespread. It really shows that it isn't happening in a little corner of deep learning, " Sutton said.
With the problem established, the researchers then had to ask: could it be solved? Was loss of plasticity an inherent issue for continual deep-learning networks, or was there a way to allow them to keep learning?
They found some hope in a method based on modifying one of the fundamental algorithms that make neural networks work: backpropagation.
Neural networks are built to echo the structure of the human brain: They contain units that can pass information and make connections with other units, just like neurons. Individual units can pass information along to other layers of units, which do the same. All of this contributes to the network's overall output.
However, when adapting the connection strength or "weights" of the network with backpropagation, a lot of the time these units will calculate outputs that don't actually contribute to learning. They also won't learn new outputs, so they will become dead weight to the network and stop contributing to the learning process.
Over long-term continual learning, as many as 90% of a network's units might become dead, Mahmood notes. And when enough of them stop contributing, the model loses plasticity.
So, the team came up with a modified method that they call "continual backpropagation."
Dohare says that it differs from backpropagation in a key way: While backpropagation randomly initializes the units only at the very beginning, continual backpropagation does so continually. Once in a while, during learning, it selects some of the useless units, like the dead ones, and reinitializes them with random weights. By using continual backpropagation, they find that models can continually learn much longer, sometimes seemingly indefinitely.
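A rough sketch of that reinitialization step, as we understand it from the description above, is shown below. The utility measure (mean absolute activation) and the replacement schedule are simplifications for illustration, not the paper's exact algorithm:

```python
# Rough sketch of continual backpropagation's key step: every so often, pick a
# few of the least useful hidden units and reinitialize them. The utility proxy
# (mean absolute activation) and schedule are simplifications for illustration.
import torch
import torch.nn as nn

def reinit_dead_units(layer_in: nn.Linear, layer_out: nn.Linear,
                      hidden_activations: torch.Tensor, fraction: float = 0.01):
    """Reinitialize the lowest-utility hidden units between two Linear layers.

    hidden_activations: hidden-layer outputs for a recent batch, shape (batch, hidden).
    """
    utility = hidden_activations.abs().mean(dim=0)             # crude utility estimate
    n_replace = max(1, int(fraction * utility.numel()))
    dead = torch.topk(utility, n_replace, largest=False).indices
    with torch.no_grad():
        fresh = torch.empty(n_replace, layer_in.in_features)
        nn.init.kaiming_uniform_(fresh)
        layer_in.weight[dead] = fresh                           # new incoming weights
        layer_in.bias[dead] = 0.0
        layer_out.weight[:, dead] = 0.0                         # outgoing weights start at zero
```

Called every so often inside a training loop like the one sketched earlier, this keeps a supply of fresh, trainable units available, which is what lets the network keep learning.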
Sutton says that other researchers might come up with better solutions to tackle loss of plasticity, but their continual backprop approach at least shows the problem can be solved, and this tricky problem isn't inherent to deep networks.
He's hopeful that the team's work will bring more attention to loss of plasticity and encourage other researchers to examine the issue.
"We established this problem in a way that people sort of have to acknowledge it. The field is gradually getting more willing to acknowledge that deep learning, despite its successes, has fundamental issues that need addressing," he said. "So, we're hoping this will open up this question a little bit."
By Paul Solman and Ryan Connelly Holmes
The development of artificial intelligence is speeding up so quickly that it was addressed briefly at both Republican and Democratic conventions. Science fiction has long theorized about the ways in which machines might one day usurp their human overlords. As the capabilities of modern AI grow, Paul Solman looks at the existential threats some experts fear and that some see as hyperbole.
Notice: Transcripts are machine and human generated and lightly edited for accuracy. They may contain errors.
Geoff Bennett:
The development of artificial intelligence is speeding up so quickly that it was addressed briefly at both political conventions, including the Democratic gathering this week.
Of course, science fiction writers and movies have long theorized about the ways in which machines might one day usurp their human overlords.
As the capabilities of modern artificial intelligence grow, Paul Solman looks at the existential threats some experts fear and that some see as hyperbole.
Eliezer Yudkowsky, Founder, Machine Intelligence Research Institute:
From my perspective, there's inevitable doom at the end of this, where, if you keep on making A.I. smarter and smarter, they will kill you.
Paul Solman:
Kill you, me and everyone, predicts Eliezer Yudkowsky, tech pundit and founder back in the year 2000 of a nonprofit now called the Machine Intelligence Research Institute to explore the uses of friendly A.I.; 24 years later, do you think everybody's going to die in my lifetime, in your lifetime?
Eliezer Yudkowsky:
I would wildly guess my lifetime and even your lifetime.
Now, we have heard it before, as when the so-called Godfather of A.I., Geoffrey Hinton, warned Geoff Bennett last spring.
Geoffrey Hinton, Artificial Intelligence Pioneer:
The machines taking over is a threat for everybody. It's a threat for the Chinese and for the Americans and for the Europeans, just like a global nuclear war was.
And more than a century ago, the Czech play "R.U.R.," Rossum's Universal Robots, from which the word robot comes, dramatized the warning.
And since 1921 — that's more than 100 years ago — people have been imagining that the robots will become sentient and destroy us.
Jerry Kaplan, Author, "Generative Artificial Intelligence: What Everyone Needs to Know": That's right.
A.I. expert Stanford's Jerry Kaplan at Silicon Valley's Computer History Museum.
Jerry Kaplan:
That's created a whole mythology, which, of course, has played out in endless science fiction treatments.
Like the Terminator series.
Michael Biehn, Actor:
A new order of intelligence decided our fate in a microsecond, extermination.
Judgment Day forecast for 1997. But, hey, that's Hollywood. And look on the bright side, no rebel robots or even hoverboards or flying cars yet.
On the other hand, robots will be everywhere soon enough, as mass production drives down their cost. So will they soon turn against us?
I got news for you. There's no they there. They don't want anything. They don't need anything. We design and build these things to our own specifications. Now, that's not to say we can't build some very dangerous machines and some very dangerous tools.
Kaplan thinks what humans do with A.I. is much scarier than A.I. on its own, create super viruses, mega drones, God knows what else.
But whodunit aside, the big question still is, will A.I. bring doomsday?
A.I. Reid Hoffman avatar: I'd rate the existential threat of A.I. around a three or four out of 10.
That's the avatar of LinkedIn founder Reid Hoffman, to which we fed the question, 1 being no threat, 10 extinction. What does the real Reid Hoffman say?
Reid Hoffman, Creator, LinkedIn Corporation:
I'm going to go for two on that answer.
I'm going to tell you that your avatar said 3 to 4.
Reid Hoffman:
All right. Well, my avatar is trying to approximate something that is what kind of a bulk of educated, reasonable people would want to hear.
But, hey, even if you told me there was only a one in 10 chance of a ticking time bomb in my room, I would be out of there.
Yudkowsky started warning of rogue A.I. back in 2005. And since?
Things have gone a bit worse than hoped for. And, to be clear, the sting at the end of this is A.I. gets smarter than us, is poorly controlled, and probably humanity becomes collateral damage to its own expansion.
What's the doomsday scenario here?
It is smarter than humanity. From its perspective, it now wants to get independence of humanity. It doesn't want to be running on computers that require electricity that humans have to generate.
There are many more A.I. companies now than just OpenAI. And some of the people there are explicitly like, yes, sure like maybe it'll wipe out humanity, but I don't care about that.
And of all the world's problems, tribalism, climate change, et cetera, you think A.I. is a bigger problem?
I think climate change is unlikely to kill literally everyone, down to the last man, woman and child. At all of the top three companies, I expect the leaders would prefer not to wipe out humanity, but, as they see it, it's not up to them. They got to stay ahead of their competitors.
So what does A.I. itself think?
I asked Amica, a robot hooked up to ChatGPT. Destruction, 1 to 10? The response, a 3.
Amica, ChatGPT A.I.:
The potential for harmony and enhancement exists, but vigilance is paramount. Would you roll the dice on that?
I'm afraid we're all going to be rolling the dice on that. And I'm worried that in your answer, you're saying there's something like a 30 percent chance that A.I., artificial intelligence, will destroy humanity. Shouldn't I be scared?
Your concern is valid, Paul. The 30 percent isn't a direct prediction, but a cautious nod to the uncertainties and ethical dilemmas of handling powerful technology.
But I'm afraid that human beings may not be capable of protecting themselves from the risks of A.I.
That dread is profound. It's daunting, but not without hope. Humanity's resilience has often defied expectations.
Perhaps it's no surprise that the actual human who created ChatGPT, Sam Altman, thinks the same.
Sam Altman, CEO, OpenAI:
I believe that A.I. will be very net good, tremendously net good, but I think, like with any other tool, it'll be misused. Like, you can do great things with a hammer and you can, like, kill people with a hammer. I don't think that absolves us, or you all, or society from trying to mitigate the bad as much as we can and maximize the good.
And Reid Hoffman thinks we can maximize the good.
We have a portfolio risk. We have climate change as a possibility. We have pandemic as a possibility. We have nuclear war as a possibility. We have asteroids as a possibility. We have human world war as a possibility. We have all of these existential risks.
And you go, OK, A.I., is it also an additional existential risk? And the answer is, yes, potentially. But you look at its portfolio and say, what improves our overall portfolio? What reduces existential risk for humanity? And A.I. is one of the things that adds a lot in the positive column.
So, if you think, how do we prevent future natural or manmade pandemic, A.I. is the only way that I think can do that. And also, like, it might even help us with climate change things. So you go, OK, in the net portfolio, our existential risk may go down with A.I.
For the sake of us all, grownups, children, grandchildren, let's hope he's right.
For the "PBS News Hour" in Silicon Valley, Paul Solman.
Two-dimensional (plane) elasticity equations in solid mechanics are solved numerically with an ensemble of physics-informed neural networks (PINNs). The system of equations consists of the kinematic definitions, i.e. the strain-displacement relations, the equilibrium equations connecting the stress tensor with the external loading forces, and the isotropic constitutive relations for the stress and strain tensors. Different boundary conditions for the strain tensor and the displacements are considered. The proposed computational approach is based on principles of artificial intelligence and uses TensorFlow, an open-source machine learning platform written in Python, together with the Keras library, an application programming interface for deep learning. Deep learning is performed by training the physics-informed neural network model to fit the plane elasticity equations and the given boundary conditions at collocation points. The numerical technique is first tested on an example for which the exact solution is known. Two plane stress problems are then calculated with the proposed multi-PINN model, and the numerical solution is compared with results obtained with commercial finite element software. The numerical results show that the multi-network approach is more beneficial than using a single PINN with many outputs, confirming the efficiency of the introduced methodology. The proposed technique can be extended to structures with nonlinear material properties.
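To make the approach concrete, here is a minimal sketch in Python of a single physics-informed network for plane-stress elasticity, trained at random collocation points with TensorFlow/Keras. It is an illustration under simplifying assumptions (unit material constants, zero body forces, a unit-square domain, a clamped bottom edge, and placeholder network and training settings), not the authors' implementation.

import numpy as np
import tensorflow as tf

# Illustrative material constants (assumed, not from the paper): plane stress.
E, nu = 1.0, 0.3
C = E / (1.0 - nu**2)

def make_net():
    # Small fully connected network mapping (x, y) -> (u, v) displacements.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(2,)),
        tf.keras.layers.Dense(32, activation="tanh"),
        tf.keras.layers.Dense(32, activation="tanh"),
        tf.keras.layers.Dense(2),
    ])

net = make_net()

def stresses(x, y):
    # Strain-displacement relations via automatic differentiation,
    # then the isotropic plane-stress constitutive law.
    with tf.GradientTape(persistent=True) as t:
        t.watch([x, y])
        uv = net(tf.concat([x, y], axis=1))
        u, v = uv[:, 0:1], uv[:, 1:2]
    eps_xx, eps_yy = t.gradient(u, x), t.gradient(v, y)
    eps_xy = 0.5 * (t.gradient(u, y) + t.gradient(v, x))
    sig_xx = C * (eps_xx + nu * eps_yy)
    sig_yy = C * (eps_yy + nu * eps_xx)
    sig_xy = C * (1.0 - nu) * eps_xy
    return sig_xx, sig_yy, sig_xy

def equilibrium_residual(x, y):
    # Equilibrium with zero body forces: div(sigma) = 0.
    with tf.GradientTape(persistent=True) as t:
        t.watch([x, y])
        sxx, syy, sxy = stresses(x, y)
    r1 = t.gradient(sxx, x) + t.gradient(sxy, y)
    r2 = t.gradient(sxy, x) + t.gradient(syy, y)
    return tf.reduce_mean(r1**2) + tf.reduce_mean(r2**2)

# Random collocation points in the unit square and on the (assumed clamped) bottom edge.
xc = tf.convert_to_tensor(np.random.rand(1000, 1), tf.float32)
yc = tf.convert_to_tensor(np.random.rand(1000, 1), tf.float32)
xb = tf.convert_to_tensor(np.random.rand(200, 1), tf.float32)
yb = tf.zeros_like(xb)

opt = tf.keras.optimizers.Adam(1e-3)
for step in range(2000):
    with tf.GradientTape() as tape:
        # Physics loss at interior collocation points plus a Dirichlet
        # boundary term (u = v = 0 on the bottom edge).
        loss = equilibrium_residual(xc, yc) \
               + tf.reduce_mean(net(tf.concat([xb, yb], axis=1))**2)
    grads = tape.gradient(loss, net.trainable_variables)
    opt.apply_gradients(zip(grads, net.trainable_variables))

In the multi-network (multi-PINN) setting summarized above, several such networks would be trained together against a shared loss instead of one network producing all outputs.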
The entire software package is available upon request.
The work of A.D.M. and G.E.S. has been supported by the Project Safe-Aorta, which was implemented in the framework of the Action "Flagship actions in interdisciplinary scientific fields with a special focus on the productive fabric", through the National Recovery and Resilience Fund Greece 2.0 and funded by the European Union-NextGenerationEU (Project ID: TAEDR-0535983).
Authors and affiliations.
School of Production Engineering and Management, Institute of Computational Mechanics and Optimization, Technical University of Crete, Kounoupidiana, 73100, Chania, Crete, Greece
Aliki D. Mouratidou & Georgios E. Stavroulakis
Discipline of Civil Engineering, School of Engineering and Computing, University of Central Lancashire, Preston campus, Preston, PR1 2HE, UK
Georgios A. Drosopoulos
Discipline of Civil Engineering, School of Engineering, University of KwaZulu-Natal, Durban campus, Durban, 4041, South Africa
Correspondence to Aliki D. Mouratidou .
Conflict of interest.
The authors declare that they have no conflict of interest.
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Mouratidou, A.D., Drosopoulos, G.A. & Stavroulakis, G.E. Ensemble of physics-informed neural networks for solving plane elasticity problems with examples. Acta Mech (2024). https://doi.org/10.1007/s00707-024-04053-3
Received : 26 February 2024
Revised : 27 May 2024
Accepted : 31 July 2024
Published : 29 August 2024
DOI : https://doi.org/10.1007/s00707-024-04053-3
A problem-solving agent works by precisely defining a problem and generating several candidate solutions. Problem solving is thus a part of artificial intelligence that encompasses a number of techniques, such as trees, B-trees, and heuristic algorithms. We can also say that a problem-solving agent is a result-driven agent and always ...
In artificial intelligence (AI) and machine learning, an agent is an entity that perceives its environment, processes information and acts upon that environment to achieve specific goals. The process by which an agent formulates a problem is critical, as it lays the foundation for the agent's decision-making and problem-solving capabilities. This a
Artificial intelligence (AI) is technology that enables computers and machines to simulate human learning, comprehension, problem solving, decision making, creativity and autonomy. Applications and devices equipped with AI can see and identify objects. They can understand and respond to human language.
Problem solving, particularly in artificial intelligence, may be characterized as a systematic search through a range of possible actions in order to reach some predefined goal or solution. Problem-solving methods divide into special purpose and general purpose.
Artificial intelligence (AI) is the theory and development of computer systems capable of performing tasks that historically required human intelligence, such as recognizing speech, making decisions, and identifying patterns. AI is an umbrella term that encompasses a wide variety of technologies, including machine learning, deep learning, and ...
May 10, 2024. In artificial intelligence, a problem-solving agent refers to a type of intelligent agent designed to address and solve complex problems or tasks in its environment. These agents are a fundamental concept in AI and are used in various applications, from game-playing algorithms to robotics and decision-making systems.
I Artificial Intelligence: 1 Introduction; 2 Intelligent Agents. II Problem-solving: 3 Solving Problems by Searching; 4 Search in Complex Environments; 5 Adversarial Search and Games; 6 Constraint Satisfaction Problems. III Knowledge, reasoning, and planning.
The abstraction is useful if carrying out each of the actions in the solution is easier than the original problem. 3.2 Example Problems A standardized problem is intended to illustrate or exercise various problem-solving methods. It can be given a concise, exact description and hence is suitable as a benchmark for researchers to compare the ...
Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems.It is a field of research in computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. [1]
artificial intelligence (AI), the ability of a digital computer or computer-controlled robot to perform tasks commonly associated with intelligent beings. The term is frequently applied to the project of developing systems endowed with the intellectual processes characteristic of humans, such as the ability to reason, discover meaning, generalize, or learn from past experience.
Let's explain the concepts of problem, problem space, and search in the context of artificial intelligence: Problem: A problem is a specific task or challenge that requires finding a solution or ...
"Artificial intelligence (AI) is the design, implementation, and use of programs, machines, and systems that exhibit human intelligence, ... from defining the problem and picking a GPU or foundation model to production deployment and continual learning to user experience design." The LLM Bootcamp is an open course designed to teach you how ...
Problem-solving in AI is a multi-step process that allows you to tackle complex problems using various techniques and algorithms. By understanding and following these steps, you can effectively solve problems in the field of artificial intelligence. Step 1: Problem Definition. Problem definition is the first crucial step in problem-solving.
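To make the problem-definition step concrete, here is a minimal sketch in Python of how a search problem can be formulated (initial state, actions, transition model, goal test) and solved with breadth-first search. The toy route-finding graph and the class and function names are illustrative, not taken from any of the sources quoted here.

from collections import deque

# A tiny, illustrative search-problem formulation: initial state, actions,
# transition model, and goal test, solved with breadth-first search.
GRAPH = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}

class RouteProblem:
    def __init__(self, initial, goal):
        self.initial, self.goal = initial, goal

    def actions(self, state):
        # Available actions: move to any neighbouring node.
        return GRAPH[state]

    def result(self, state, action):
        # Transition model: the action is simply the next node.
        return action

    def goal_test(self, state):
        return state == self.goal

def breadth_first_search(problem):
    # Explore states level by level, remembering how each state was reached.
    frontier = deque([problem.initial])
    parents = {problem.initial: None}
    while frontier:
        state = frontier.popleft()
        if problem.goal_test(state):
            path = []
            while state is not None:      # reconstruct the path to the goal
                path.append(state)
                state = parents[state]
            return list(reversed(path))
        for action in problem.actions(state):
            nxt = problem.result(state, action)
            if nxt not in parents:        # avoid revisiting states
                parents[nxt] = state
                frontier.append(nxt)
    return None

print(breadth_first_search(RouteProblem("A", "F")))  # e.g. ['A', 'B', 'D', 'F']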
We would effectively be defining the phenomenon out of existence. A common definition of AI is that it is a technology that enables machines to imitate various complex human skills. This, however, does not give us much to go on. In fact, it does no more than render the term 'artificial intelligence' in different words.
The initial stage of problem-solving always involves setting a goal. This goal serves as a reference point, guiding the intelligent agent to act in a way that maximizes its performance measure ...
Artificial intelligence (AI) problem-solving often involves investigating potential solutions through reasoning techniques, making use of polynomial and differential equations and modelling frameworks. The same problem can have a number of solutions, each reached with a different algorithm.
The term "artificial general intelligence" (AGI) was coined to describe AI systems that possess capabilities comparable to those of a human. In theory, AGI could someday replicate human-like cognitive abilities including reasoning, problem-solving, perception, learning, and language comprehension.
The application of this definition may return different results (for the same molecule) with respect to structure elucidation and with regard to synthesis design; however, in both cases a distinctive attribute of this concept is its fuzzy character. Using a classical concept of the functional group ...
In addition to the limited definition of artificial intelligence above, there are four definitions that have emerged within the literature that, ... Faster processing speeds are thought to facilitate more efficient learning, problem-solving, and decision-making, which are key components of general intelligence. 13 Furthermore, Jensen ...
Subscribe to our new channel: https://www.youtube.com/@varunainashots Breadth First Search (BFS): https://youtu.be/qul0f79gxGs Depth First Search (DFS): ht...
Key Terminologies in Artificial Intelligence Problems. Before exploring the characteristics, let's clarify some essential AI concepts: Problem-solving: Problem-solving is the process of providing a solution to a complex problem or task. When dealing with AI, problem-solving involves creating algorithms and methods of artificial intelligence that empower machines to imitate humans ...
Generative AI handles a variety of business tasks, but can it develop creative solutions to problems? Yes, although some of the best ideas emerge when humans and machines work together, according to research by Jacqueline Ng Lane, Karim Lakhani, Miaomiao Zhang, and colleagues.
Problem-Solving Methods in Artificial Intelligence. ... Artificial intelligence (AI) is the Science and Engineering domain concerned with the ...
Intelligence has been defined in many ways: the capacity for abstraction, logic, understanding, self-awareness, learning, emotional knowledge, reasoning, planning, creativity, critical thinking, and problem-solving. It can be described as the ability to perceive or infer information, and to retain it as knowledge to be applied to adaptive behaviors within an environment or context.
The researchers say that loss of plasticity is a major challenge to developing AI that can effectively handle the complexity of the world and would need to be solved to develop human-level artificial intelligence. Many existing models aren't designed for continual learning. Sutton references ChatGPT as an example; it doesn't learn continually.
The development of artificial intelligence is speeding up so quickly that it was addressed briefly at both Republican and Democratic conventions. Science fiction has long theorized about the ways ...
In this work, a computational intelligence scheme based on an ensemble of PINNs is applied to the problem of two-dimensional elasticity. The proposed computational approach is based on principles of artificial intelligence and uses TensorFlow, open-source machine learning scientific software, and the Keras library, an application programming ...
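As a rough illustration of the ensemble (multi-PINN) idea, the sketch below uses two small networks, one for displacements and one for stresses, whose weights are updated together against a shared loss. Only a constitutive-consistency term is shown; the split of fields between networks, the material constants, and the training settings are assumptions made for illustration, not the paper's exact setup.

import numpy as np
import tensorflow as tf

# Illustrative plane-stress constants (assumed).
E, nu = 1.0, 0.3
C = E / (1.0 - nu**2)

def small_net(n_out):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(2,)),
        tf.keras.layers.Dense(32, activation="tanh"),
        tf.keras.layers.Dense(n_out),
    ])

disp_net = small_net(2)    # (x, y) -> (u, v)
stress_net = small_net(3)  # (x, y) -> (sig_xx, sig_yy, sig_xy)

def constitutive_mismatch(x, y):
    # Stresses predicted directly by stress_net should match those obtained
    # from disp_net via the strain-displacement and constitutive relations.
    with tf.GradientTape(persistent=True) as t:
        t.watch([x, y])
        uv = disp_net(tf.concat([x, y], axis=1))
        u, v = uv[:, 0:1], uv[:, 1:2]
    eps_xx, eps_yy = t.gradient(u, x), t.gradient(v, y)
    eps_xy = 0.5 * (t.gradient(u, y) + t.gradient(v, x))
    s_from_u = tf.concat([C * (eps_xx + nu * eps_yy),
                          C * (eps_yy + nu * eps_xx),
                          C * (1.0 - nu) * eps_xy], axis=1)
    s_direct = stress_net(tf.concat([x, y], axis=1))
    return tf.reduce_mean((s_direct - s_from_u) ** 2)

# All members of the ensemble are trained jointly on one combined loss;
# equilibrium and boundary-condition terms would be added to it in practice.
variables = disp_net.trainable_variables + stress_net.trainable_variables
opt = tf.keras.optimizers.Adam(1e-3)
x = tf.convert_to_tensor(np.random.rand(500, 1), tf.float32)
y = tf.convert_to_tensor(np.random.rand(500, 1), tf.float32)
for _ in range(1000):
    with tf.GradientTape() as tape:
        loss = constitutive_mismatch(x, y)
    grads = tape.gradient(loss, variables)
    opt.apply_gradients(zip(grads, variables))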