Large language models can do jaw-dropping things. But nobody knows exactly why.

MIT Technology Review – “And that’s a problem. Figuring it out is one of the biggest scientific puzzles of our time and a crucial step towards controlling more powerful future models… The biggest models are now so complex that researchers are studying them as if they were strange natural phenomena, carrying out experiments and trying to explain the results. Many of those observations fly in the face of classical statistics, which had provided our best set of explanations for how predictive models behave. So what, you might say. In the last few weeks, Google DeepMind has rolled out its generative models across most of its consumer apps. OpenAI wowed people with Sora, its stunning new text-to-video model. And businesses around the world are scrambling to co-opt AI for their needs. The tech works—isn’t that enough? But figuring out why deep learning works so well isn’t just an intriguing scientific puzzle. It could also be key to unlocking the next generation of the technology—as well as getting a handle on its formidable risks…”

Posted in: AI, Cybercrime, Internet, Knowledge Management, Search Engines
