To replace indexes with a single, consolidated model, it must be possible for the model itself to have knowledge about the universe of document identifiers, in the same way that traditional indexes do. One way to accomplish this is to move away from traditional LMs and towards corpus models that jointly model term-term, term-document, and...
When experiencing an information need, users want to engage with a domain expert, but often turn to an information retrieval (IR) system, such as a search engine, instead. Classical information retrieval systems do not answer information needs directly, but instead provide references to (hopefully authoritative) answers.
[...] Today’s cutting edge IR systems are not fundamentally different than classical IR systems developed many decades ago. Indeed, a majority of today’s systems boil down to: (a) building an efficient queryable index for each document in the corpus, (b) retrieving a set of candidates for a given query, and (c) computing a relevance score for each...
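As a rough illustration of the three-stage pattern quoted above, here is a minimal Python sketch, not the paper's method. The function names (`build_index`, `retrieve`, `score`, `search`), the toy corpus, and the TF-IDF-style weighting are all assumptions chosen for brevity. The excerpt is cut off at step (c), so scoring each retrieved candidate is also an assumption.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer; real systems use far richer analysis.
    return text.lower().split()


def build_index(corpus):
    """(a) Inverted index: term -> {doc_id: term count}."""
    index = defaultdict(dict)
    for doc_id, text in corpus.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def retrieve(index, query):
    """(b) Candidates: documents sharing at least one term with the query."""
    candidates = set()
    for term in tokenize(query):
        candidates.update(index.get(term, {}))
    return candidates


def score(index, query, doc_id, n_docs):
    """(c) Hypothetical TF-IDF-style relevance score for one candidate."""
    total = 0.0
    for term in tokenize(query):
        postings = index.get(term, {})
        if doc_id in postings:
            # Term count weighted by how rare the term is in the corpus.
            total += postings[doc_id] * math.log(1 + n_docs / len(postings))
    return total


def search(corpus, query):
    index = build_index(corpus)
    candidates = retrieve(index, query)
    return sorted(
        candidates,
        key=lambda d: score(index, query, d, len(corpus)),
        reverse=True,
    )


# Toy corpus, invented for illustration only.
corpus = {
    "d1": "neural ranking models for search",
    "d2": "classical inverted index search",
    "d3": "cooking with cast iron",
}
results = search(corpus, "inverted index")
```

Note that the output is a ranked list of document identifiers, that is, references to answers rather than answers themselves, which is the limitation the highlights keep returning to.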
Pre-trained language models (LM), by contrast, are capable of directly generating prose that may be responsive to an information need, but at present they are *dilettantes* rather than domain experts – they do not have a true understanding of the world, they are prone to hallucinating, and crucially they are incapable of justifying their utterances...
[Curator's note: there are numerous other technical challenges addressed with alternative prescriptions throughout the paper. These highlights are narrative-centric, and I invite you to review the paper if you are a keen technologist looking for answers to the following:
- Zero- and Few-Shot Learning
- Response Generation
- Arithmetic, Logical,...