We envision using the same corpus model as a multi-task learner for multiple IR tasks. To this end, once a corpus model has been trained, it can of course be used for the most classical of all IR tasks – document retrieval. However, by leveraging recent advances in multi-task learning, such a model can very likely be applied to a diverse range of t...
[Curator's note: there are numerous other technical challenges addressed with alternative prescriptions throughout the paper. These highlights are narrative-centric, and I invite you to review the paper if you are a keen technologist looking for answers to the following: Zero- and Few-Shot Learning; Response Generation; Arith...
[...] Today’s cutting-edge IR systems are not fundamentally different from classical IR systems developed many decades ago. Indeed, a majority of today’s systems boil down to: (a) building an efficient queryable index for each document in the corpus, (b) retrieving a set of candidates for a given query, and (c) computing a relevance score for each ...
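To make the three steps in the excerpt above concrete, here is a minimal Python sketch of an index-retrieve-then-rank pipeline. It is an illustration only, not the paper's system: the toy corpus, the whitespace tokenizer, the TF-IDF-style scoring, and all function names (`build_index`, `retrieve_candidates`, `score`, `search`) are assumptions chosen for brevity.

```python
import math
from collections import defaultdict

# Hypothetical toy corpus, used only for illustration.
CORPUS = {
    "d1": "neural ranking models for search",
    "d2": "classical inverted index search",
    "d3": "cooking pasta at home",
}

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def build_index(corpus):
    """(a) Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in corpus.items():
        for term in tokenize(text):
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index

def retrieve_candidates(index, query):
    """(b) Collect every document containing at least one query term."""
    candidates = set()
    for term in tokenize(query):
        candidates.update(index.get(term, {}))
    return candidates

def score(index, n_docs, query, doc_id):
    """(c) Score one candidate with a simple TF-IDF-style sum (an assumed scorer)."""
    total = 0.0
    for term in tokenize(query):
        postings = index.get(term, {})
        tf = postings.get(doc_id, 0)
        if tf:
            total += tf * math.log(1 + n_docs / len(postings))
    return total

def search(corpus, index, query, k=3):
    """Run steps (b) and (c), returning the top-k document ids."""
    candidates = retrieve_candidates(index, query)
    ranked = sorted(
        candidates,
        key=lambda doc_id: score(index, len(corpus), query, doc_id),
        reverse=True,
    )
    return ranked[:k]

INDEX = build_index(CORPUS)
```

Calling `search(CORPUS, INDEX, "neural search")` runs candidate retrieval and scoring in sequence. Production systems replace each stage with something far more sophisticated, but the overall shape is the same, which is the excerpt's point.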
If all of these research ambitions were to come to fruition, the resulting system would be a very early version of the system that we envisioned in the introduction. That is, the resulting system would be able to provide domain expert answers to a wide range of information needs in a way that neither modern IR systems, question answering systems, o...
Building such domain experts would likely require developing an artificial general intelligence, which is beyond the scope of this paper. Instead, by “domain expert” we specifically mean that the system is capable of producing results (with or without actual “understanding”) that are of the same quality as a human expert in the given domain.
To replace indexes with a single, consolidated model, it must be possible for the model itself to have knowledge about the universe of document identifiers, in the same way that traditional indexes do. One way to accomplish this is to move away from traditional LMs and towards corpus models that jointly model term-term, term-document, and document-...