GitHub - deepseek-ai/DeepSeek-Coder: DeepSeek Coder: Let the Code Write Itself
DeepSeek Coder comprises a series of code language models trained from scratch on 87% code and 13% natural language in both English and Chinese, with each model pre-trained on 2T tokens. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on a repo-level code corpus employing a window size of 16K ...
DeepSeek Coder
Nicolay Gerold added
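For reference, a minimal sketch of loading one of these checkpoints from Hugging Face for code completion; the 1.3B base model ID and the generation settings are assumptions for the sketch, not taken from the snippet above.

```python
# Minimal sketch: code completion with a DeepSeek Coder checkpoint.
# The model ID and generation settings below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-coder-1.3b-base"  # assumed checkpoint
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
)

prompt = "# write a function that checks whether a number is prime\n"
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```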
Deep-ML
deep-ml.com
Replit AI is now free for all users. Over the past year, we’ve witnessed the transformative power of building software collaboratively with the power of AI. We believe AI will be part of every software developer’s toolkit and we’re excited to provide Replit AI for free to our 25+ million developer community.
To accompany AI for all, we’re releasing ...
Replit’s new AI Model now available on Hugging Face
Nicolay Gerold added
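A minimal sketch of pulling the released checkpoint from Hugging Face; the replit-code-v1_5-3b model ID is an assumption about which model the announcement refers to, and the prompt and sampling settings are illustrative.

```python
# Minimal sketch: completion with Replit's code model from Hugging Face.
# Model ID and sampling settings are assumptions, not from the announcement.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "replit/replit-code-v1_5-3b"  # assumed checkpoint
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
)

prompt = "def fibonacci(n):"
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.2)
print(tok.decode(out[0], skip_special_tokens=True))
```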
Our training data is pretty much from the same place as everybody else’s — which is pretty much the internet. Pretty much every big AI model just pulls off all the data it can, all the text it can, all the images it can. Scientifically speaking, we’re at an early point in the space, where everyone grabs everything they can, they dump it in a huge f...
The Verge • “An engine for the imagination”: an interview with David Holz, CEO of AI image generator Midjourney
sari added
Many of these projects are saving time by training on small, highly curated datasets. This suggests there is some flexibility in data scaling laws. The existence of such datasets follows from the line of thinking in Data Doesn't Do What You Think, and they are rapidly becoming the standard way to do training outside Google.
semianalysis.com • Google "We Have No Moat, and Neither Does OpenAI"
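As an illustration of the "small, highly curated dataset" approach the memo describes, a minimal sketch of parameter-efficient LoRA fine-tuning on a few thousand curated examples; the base model, dataset file, and hyperparameters here are all assumptions for the sketch, not from the memo.

```python
# Minimal sketch: LoRA fine-tuning on a small, curated dataset.
# Base model, dataset path, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "facebook/opt-350m"  # small stand-in base model
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA adapters: train a tiny fraction of the weights instead of all of them.
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))

# A few thousand carefully curated examples, not web-scale data (hypothetical file).
data = load_dataset("json", data_files="curated_examples.jsonl")["train"]
data = data.map(lambda ex: tok(ex["text"], truncation=True, max_length=512),
                remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments("lora-out", per_device_train_batch_size=4,
                           num_train_epochs=3, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```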
The Software Pro's Best Kept Secret.
app.codecrafters.io
Text embeddings are a critical piece of many pipelines, from search, to RAG, to vector databases and more. Most embedding models are BERT/Transformer-based and typically have short context lengths (e.g., 512 tokens). That’s only about two pages of text, but documents can be very long – books, legal cases, TV screenplays, code repositories, etc. can be tens...
Long-Context Retrieval Models with Monarch Mixer
Nicolay Gerold added
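A minimal sketch of the chunk-and-pool workaround that short-context embedding models force, using a standard short-context model as a stand-in; a long-context retrieval model like the ones in the post would instead embed the whole document in one forward pass. The model name, chunk size, and file path are assumptions.

```python
# Minimal sketch: most embedding models truncate at ~512 tokens, so long
# documents must be chunked, embedded per chunk, and pooled into one vector.
# Model name, chunk size, and input file are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")  # short-context baseline

def embed_long_document(text: str, chunk_words: int = 256) -> np.ndarray:
    """Split a long document into chunks that fit the model's window
    (word count as a rough proxy for tokens), embed each chunk, and
    mean-pool into a single document vector."""
    words = text.split()
    chunks = [" ".join(words[i:i + chunk_words])
              for i in range(0, len(words), chunk_words)]
    vectors = model.encode(chunks, normalize_embeddings=True)
    return vectors.mean(axis=0)

# A long-context retrieval model would skip the chunking entirely.
doc_vec = embed_long_document(open("long_contract.txt").read())  # hypothetical file
```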