
Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language
Finds that LLMs with LENS perform highly competitively with much bigger and much more sophisticated systems, without any multimodal training whatsoever.
https://t.co/GxxW9iOW8z...
Understanding AI | Lee Robinson
leerob.com
OpenAI releases GPT-4V(ision) system card
paper: https://t.co/lWqSHhlCUP
GPT-4 with vision (GPT-4V) enables users to instruct GPT-4 to analyze image inputs provided by the user, and is the latest capability we are making broadly available. Incorporating additional modalities (such as image...