
Rethinking Search: Making Experts out of Dilettantes

MIT Technology Review: “…a team of Google researchers has published a proposal for a radical redesign that throws out the ranking approach and replaces it with a single large AI language model—a future version of BERT or GPT-3. The idea is that instead of searching for information in a vast list of web pages, users would ask questions and have a language model trained on those pages answer them directly. The approach could change not only how search engines work, but how we interact with them. Many issues with existing language models will need to be fixed first. For a start, these AIs can sometimes generate biased and toxic responses to queries—a problem that researchers at Google and elsewhere have pointed out… Metzler and his colleagues are interested in a search engine that behaves like a human expert. It should produce answers in natural language, synthesized from more than one document, and back up its answers with references to supporting evidence, as Wikipedia articles aim to do…”

Source – Cornell University, arXiv:2105.02274 – Rethinking Search: Making Experts out of Dilettantes. Authors: Donald Metzler, Yi Tay, Dara Bahri, Marc Najork. Abstract: “When experiencing an information need, users want to engage with an expert, but often turn to an information retrieval system, such as a search engine, instead. Classical information retrieval systems do not answer information needs directly, but instead provide references to (hopefully authoritative) answers. Successful question answering systems offer a limited corpus created on-demand by human experts, which is neither timely nor scalable. Large pre-trained language models, by contrast, are capable of directly generating prose that may be responsive to an information need, but at present they are dilettantes rather than experts – they do not have a true understanding of the world, they are prone to hallucinating, and crucially they are incapable of justifying their utterances by referring to supporting documents in the corpus they were trained over. This paper examines how ideas from classical information retrieval and large pre-trained language models can be synthesized and evolved into systems that truly deliver on the promise of expert advice.”
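To make the contrast concrete, here is a minimal, purely illustrative Python sketch of the two interaction styles the abstract describes: a classical retriever that returns ranked document references, and an “expert-style” responder that returns a synthesized natural-language answer backed by citations. The toy corpus, the keyword scoring, and the expert_answer template are hypothetical stand-ins chosen for illustration; they are not the system Metzler et al. propose, which envisions a single large pre-trained language model taking the place of the index and ranking pipeline altogether.

```python
# Illustrative sketch only (hypothetical corpus and scoring); NOT the method
# proposed in arXiv:2105.02274, which replaces the index with a large
# pre-trained language model rather than layering templates over retrieval.

from collections import Counter

# A tiny toy corpus of "web pages".
CORPUS = {
    "doc1": "Classical search engines rank web pages and return a list of links.",
    "doc2": "Large language models can generate prose answers but may hallucinate.",
    "doc3": "An expert-style system should cite supporting documents for its answers.",
}


def classical_search(query: str, k: int = 3) -> list[tuple[str, int]]:
    """Classical IR behaviour: return ranked references, not a direct answer."""
    terms = query.lower().split()
    scores = {
        doc_id: sum(Counter(text.lower().split())[t] for t in terms)
        for doc_id, text in CORPUS.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(doc_id, score) for doc_id, score in ranked[:k] if score > 0]


def expert_answer(query: str) -> str:
    """Envisioned behaviour: a synthesized answer with supporting citations.

    Here the 'synthesis' is just a template over the top-ranked documents,
    standing in for what the paper imagines a language model would generate.
    """
    refs = [doc_id for doc_id, _ in classical_search(query)]
    if not refs:
        return "I don't have enough supporting evidence to answer that."
    summary = " ".join(CORPUS[r] for r in refs[:2])
    citations = ", ".join(f"[{r}]" for r in refs)
    return f"{summary} (supporting documents: {citations})"


if __name__ == "__main__":
    q = "how should search answer questions"
    print("Classical result:", classical_search(q))
    print("Expert-style result:", expert_answer(q))
```

The point of the sketch is the shape of the output, not the mechanics: the classical path hands the user a ranked list to read through, while the expert-style path hands back prose plus the references needed to justify it, which is the behaviour the paper argues a future search system should deliver.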
