Models
Text embeddings are a critical piece of many pipelines, from search, to RAG, to vector databases and more. Most embedding models are BERT/Transformer-based and typically have short context lengths (e.g., 512 tokens). That's only about two pages of text, but documents can be very long – books, legal cases, TV screenplays, code repositories, etc. can be tens…
Long-Context Retrieval Models with Monarch Mixer
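To make the context-length constraint concrete, here is a minimal retrieval sketch: with a typical 512-token embedding model, a long document has to be chunked before it can be embedded, and retrieval then ranks chunks by cosine similarity against the query. The model name, file path, and chunk size are illustrative assumptions, not recommendations from the post.

```python
# Minimal chunk-and-retrieve sketch for a short-context embedding model.
# The model name, input file, and chunk size are placeholder assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # typical short-context encoder

document = open("long_document.txt").read()
# Naive chunking: a 512-token limit forces splitting long documents.
chunks = [document[i:i + 2000] for i in range(0, len(document), 2000)]

query_emb = model.encode("What did the court decide about liability?")
chunk_embs = model.encode(chunks)

scores = util.cos_sim(query_emb, chunk_embs)[0]   # one score per chunk
best = int(scores.argmax())
print(f"best chunk #{best}, score {scores[best]:.3f}")
```

A long-context retrieval model removes the need for this chunking step, since whole documents fit in a single forward pass.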
Stable Beluga 2
Use Stable Chat (Research Preview) to test Stability AI's best language models for free
Model Description
Stable Beluga 2 is a Llama 2 70B model fine-tuned on an Orca-style dataset.
stabilityai/StableBeluga2 · Hugging Face
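For context, a minimal generation sketch using the standard Hugging Face transformers API. The "### System / ### User / ### Assistant" prompt layout follows the Orca-style template described on the model card, but verify the exact template there; the sampling settings are arbitrary.

```python
# Minimal Stable Beluga 2 generation sketch (assumes enough GPU memory for a
# 70B model in fp16). Prompt layout follows the card's Orca-style template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/StableBeluga2", use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/StableBeluga2", torch_dtype=torch.float16, device_map="auto"
)

prompt = (
    "### System:\nYou are a helpful, harmless assistant.\n\n"
    "### User:\nExplain what an Orca-style dataset is in one sentence.\n\n"
    "### Assistant:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.95)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```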
DeepSeek Coder comprises a series of code language models trained from scratch on a mix of 87% code and 13% natural language in both English and Chinese, with each model pre-trained on 2T tokens. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on a repo-level code corpus by employing a window size of 16K…
DeepSeek Coder
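A hedged completion sketch, assuming the standard transformers API; the checkpoint name below is just one of the published sizes, so pick whichever fits your hardware.

```python
# Minimal code-completion sketch with one of the smaller published checkpoints.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto"
)

prefix = "# Python: in-place quicksort\ndef quicksort(arr, lo=0, hi=None):\n"
inputs = tokenizer(prefix, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```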
LLaVA v1.5 is a new open-source multimodal model stepping onto the scene as a contender against GPT-4 with multimodal capabilities. It uses a simple projection matrix to connect the pre-trained CLIP ViT-L/14 vision encoder with the Vicuna LLM, resulting in a robust model that can handle images and text. The model is trained in two stages: first, the projection matrix is updated…
This AI newsletter is all you need #68
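To illustrate the "simple projection matrix" idea, here is a toy sketch: a learned linear layer maps frozen CLIP patch features into the LLM's token-embedding space so the image can be consumed as soft tokens. The dimensions are assumptions for illustration, not the released LLaVA configuration.

```python
# Illustrative vision-to-LLM projection: CLIP patch features are mapped into
# the language model's embedding space. Dimensions are assumed for the sketch.
import torch
import torch.nn as nn

class VisionProjector(nn.Module):
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim) from CLIP ViT-L/14
        return self.proj(patch_features)  # (batch, num_patches, llm_dim)

projector = VisionProjector()
fake_clip_features = torch.randn(1, 576, 1024)   # 24x24 patches at 336px
image_tokens = projector(fake_clip_features)
print(image_tokens.shape)  # torch.Size([1, 576, 4096])
```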
We are excited to release the first version of our multimodal assistant Yasa-1, a language assistant with visual and auditory sensors that can take actions via code execution.
We trained Yasa-1 from scratch, including pretraining the base models from the ground up, aligning them, and heavily optimizing both our training and serving infrastructure.
Announcing our Multimodal AI Assistant - Reka AI
What is Pandalyst
Pandalyst is a general large language model specifically trained to process and analyze data using the pandas library.
How is Pandalyst
Pandalyst has strong generalization capabilities for data tables in different fields and different data analysis needs.
Why is Pandalyst
Pandalyst is open source and free to use, and its small parameter…
pipizhao/Pandalyst-7B-V1.2 · Hugging Face
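A hedged sketch of how a pandas code-generation model like this is typically used: describe the table schema and the analysis request, let the model write pandas code, then run it against the DataFrame. The prompt wording below is an assumption for illustration; the model card documents the exact template.

```python
# Hedged usage sketch for a pandas-codegen model. Prompt wording is assumed;
# consult the model card for the real template before relying on it.
import pandas as pd
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "pipizhao/Pandalyst-7B-V1.2"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

df = pd.DataFrame({"region": ["EU", "US", "EU"], "sales": [120, 340, 90]})
prompt = (
    f"You are a data analyst. The DataFrame `df` has columns {list(df.columns)}.\n"
    "Write pandas code that computes total sales per region.\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```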
The text embedding set trained by Jina AI's Finetuner team.
Intended Usage & Model Info
jina-embeddings-v2-base-en is an English, monolingual embedding model supporting an 8192-token sequence length.
It is based on a BERT architecture (JinaBERT) that supports the symmetric bidirectional variant of ALiBi to allow longer sequence lengths.
The backbone jina-bert-v2-…
jinaai/jina-embeddings-v2-base-en · Hugging Face
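A minimal long-context embedding sketch, following the usage shown on the model card: the encode() helper comes from the model's custom code loaded via trust_remote_code, and max_length can go up to 8192 tokens.

```python
# Minimal long-context embedding sketch; encode() is provided by the model's
# custom code (hence trust_remote_code=True).
from numpy.linalg import norm
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "jinaai/jina-embeddings-v2-base-en", trust_remote_code=True
)

docs = [
    "A very long legal brief ... " * 500,   # thousands of tokens in one pass
    "A short product description.",
]
embs = model.encode(docs, max_length=8192)

cos = embs[0] @ embs[1] / (norm(embs[0]) * norm(embs[1]))
print(f"cosine similarity: {cos:.3f}")
```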
Glaive-coder-7b
Glaive-coder-7b is a 7B-parameter code model trained on a dataset of ~140k programming-related problems and solutions generated from Glaive's synthetic data generation platform.
It is fine-tuned from the CodeLlama-7b model.
Usage:
The model is trained to act as a code assistant, and can do both single instruction following and multi-turn…
glaiveai/glaive-coder-7b · Hugging Face
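A hedged multi-turn sketch, assuming the standard transformers API; the plain-text turn formatting below is an assumption for illustration, and the model card documents the actual instruction format the model was fine-tuned with.

```python
# Hedged multi-turn sketch: ask for a function, then follow up with a change.
# The "User:/Assistant:" formatting is assumed; use the card's real template.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "glaiveai/glaive-coder-7b"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

def ask(conversation: str) -> str:
    inputs = tokenizer(conversation, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    # Return only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

turn1 = "User: Write a Python function that checks if a string is a palindrome.\nAssistant:"
reply1 = ask(turn1)

turn2 = f"{turn1} {reply1}\nUser: Now make it ignore punctuation and spaces.\nAssistant:"
print(ask(turn2))
```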
pair-preference-model-LLaMA3-8B by RLHFlow: a really strong reward model, trained to take in two inputs at once; it is the top open reward model on RewardBench (beating one of Cohere's).
DeepSeek-V2 by deepseek-ai (21B active, 236B total params): another strong MoE base model from the DeepSeek team. Some people are questioning the very high MMLU score…
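A conceptual sketch of how a "two inputs at once" preference model can be queried: both candidate responses are packed into one prompt and the next-token logits for the labels "A" and "B" are compared. The prompt wording and label tokens here are assumptions for illustration; the RLHFlow model card documents the actual template.

```python
# Conceptual pairwise-preference sketch: compare next-token logits for the
# labels "A" and "B". Prompt wording and label tokens are assumed, not the
# model's documented template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "RLHFlow/pair-preference-model-LLaMA3-8B"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = (
    "Question: What is the capital of France?\n"
    "Response A: Paris.\n"
    "Response B: I think it might be Lyon.\n"
    "Which response is better? Answer A or B:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
logits = model(**inputs).logits[0, -1]
id_a = tokenizer.convert_tokens_to_ids("A")
id_b = tokenizer.convert_tokens_to_ids("B")
prob_a = torch.softmax(logits[[id_a, id_b]], dim=-1)[0].item()
print(f"P(Response A preferred) ~= {prob_a:.3f}")
```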