Collections

Prompt Engineering• 2

Nicolay Gerold

AI Template & SPR Library Featuring advanced prompts and SPRs Prompt Engineering Advan

Simply adding "Repeat the question before answering it." somehow makes the models answer the trick question correctly. Probable explanations: Repea

Command Line Mastery• 5

Nicolay Gerold

xo xo is a command-line tool to generate idiomatic code for different languages based on a database schema or a custom query. Installing | B

We had a CLI tool, written mostly as a bunch of shell scripts, with a ton of available commands that performed all kinds of utility functions related

ShellCheck finds bugs in your shell scripts. You can cabal, apt, dnf, pkg or brew install it locally right now.

dasel Dasel (short for data-selector) allows you to query and modify data structures using selector strings. Comparable to jq / yq, but sup

Embeddings• 4

Nicolay Gerold

Nomic Atlas Python Client Explore, label, search and share massive datasets in your web browser. This repository contains Python bindings for working

Introduction This library provides utilities for generating and scoring text explanations of sparse autoencoder (SAE) features. The explainer and scor

Gemma Scope Tutorial This is a barebones tutorial on how to use Gemma Scope, Google DeepMind's suite of Sparse Autoencoders (SAEs) on every layer and

What is feder Feder is a JavaScript tool designed to aid in the comprehension of embedding vectors. It visualizes index files from Faiss, HNSWlib, and

AI on the Edge // Local First• 22

Nicolay Gerold

Jazz.Tools, PowerSync, Fireproof, Automerge, DXOS, ElectricSQL, and of course, Berlin's own: Yjs.

Mem0: The Memory Layer for Personalized AI Mem0 provides a smart, self-improving memory layer for Large Language Models, enabling personalized AI expe

🎤 audapolis An editor for spoken-word media with transcription. audapolis aims to make the workflow for spoken-word-heavy media editing easier, fas

Instant is a client-side database that makes it easy to build real-time and co

Frontend-Tools• 69

Nicolay Gerold

Build UIs without the grunt work Storybook is a frontend workshop for building UI components and pages in isolation. Thousands of teams use it for UI d

Papermark The open-source DocSend alternative. papermark.io Papermark is the open-source document sharing alternative to DocSend

Empower JavaScript with native APIs Liberate your development by using platform APIs directly without leaving JavaScript.

Welcome to Extension Extension is a plug-and-play, zero-config, cross-browser extension development tool for browser extensions with bu

Cloud Engineering• 7

Nicolay Gerold

#sapling Over winter break last year, I tried learning everything I could about AWS. I took a bunch of notes and decided to compile them all here. Whi

Terraform Providers: Terraform is primarily used for defining the infrastructure resources. Its strength lies in its vast collection of providers that

Pulumi Examples This repository contains examples of using Pulumi to build and deploy cloud applications and infrastructure across major programming

Burrow Burrow is a serverless and globally-distributed HTTP proxy for Go built on AWS Lambda. It is designed to be completely compatible with the stan

Golang• 1

Golang

Nicolay Gerold

Burrow Burrow is a serverless and globally-distributed HTTP proxy for Go built on AWS Lambda. It is designed to be completely compatible with the stan

LLMs• 119

Nicolay Gerold

Large variety of ready-to-use LLM evaluation metrics (all with explanations) powered by ANY LLM of your choice, statistical methods, or NLP models tha

You can think your way into solving a deterministic system, but you cannot think your way into solving a probabilistic system. The first thing that I

Mem0: The Memory Layer for Personalized AI Mem0 provides a smart, self-improving memory layer for Large Language Models, enabling personalized AI expe

Zerox OCR A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation after all. With weird layouts, t

Rust• 3

Nicolay Gerold

What is Pingora Pingora is a Rust framework to build fast, reliable and programmable networked systems. Pingora is battle tested as it has been servin

Luminal is a deep learning library that uses composable compilers to achieve high performance. Current ML libraries tend to be large and comple

orch orch is a library for building language model powered applications and agents for the Rust programming language. It was primarily built for usa

Data Storage• 37

Nicolay Gerold

Rottnest : Data Lake Indices You don't need ElasticSearch or some vector database to do full text search or vector search. Parquet + Rottnest is all y

Who will this data model serve? These are the stakeholders and users of the data model. Why does this data model need to be built? What is the purpos

Our Goals We made it lightweight and kept the efficiency in mind: Self-contained We ship

SQL Studio Single binary, single command SQL database explorer. SQL Studio supports SQLite, libSQL, PostgreSQL, MySQL and DuckDB. Local SQLite DB File

Data• 7

Nicolay Gerold

Optimizing Further Creating so many indices and aggregating so many tables is sub-optimal. To optimize this, we employ materialized views, which creat

Datasette is a tool for exploring and publishing data. It helps people take data of any shape, analyze and explore it, and publish it as an interactiv

Local database for development Each table in the database had an accompanying script that would generate a subset of the data for use in local develop

Who will this data model serve? These are the stakeholders and users of the data model. Why does this data model need to be built? What is the purpos

Data Loading• 22

Nicolay Gerold

Indexify - Extraction and Retrieval from Videos, PDF and Audio for Interactive AI Applications

The solution: The ingestion service To meet these unique demands, the Search Infrastructure team implemented the Ingestion Service to gracefully handl

Surya Surya is a document OCR toolkit that does: OCR in 90+ languages that benchmarks favorably vs cloud services Line-level text detection in any la

ETL The part of the system I'm most proud of, and on which I spent the most effort, is the ETL process. We had a series of shell scripts for each data

Backend-Tools• 18

Nicolay Gerold

Public APIs A collective list of free APIs for use in software and web development

A better type of backend Convex is the fullstack TypeScript development platform. Replace your database, server functions and glue code.

What is Pingora Pingora is a Rust framework to build fast, reliable and programmable networked systems. Pingora is battle tested as it has been servin

The complete Protobuf platform Accelerate gRPC adoption with the Buf Schema Registry, built by the world's Protobuf experts.

Software Engineering• 41

Nicolay Gerold

Using feature flags It's always a good idea to put new features behind a feature flag. This contributes to a rollout strategy that can surface user fe

Shipping to production Before shipping to production, we think about all the different artifacts that might be affected by a new feature. Here's a non

Build & Deployments Our build process starts by pushing changes to a repository on GitHub. When code is pushed to a repository through a pull request,

The complete Protobuf platform Accelerate gRPC adoption with the Buf Schema Registry, built by the world's Protobuf experts.

LLM-Stack• 12

Nicolay Gerold

Portkey's AI Gateway is the interface between your app and hosted LLMs. It streamlines API requests to OpenAI, Anthropic, Mistral, Llama 2, Anyscale, G

dstack is an open-source toolkit and orchestration engine for running GPU workloads. It's designed for development, training, and deployment of gen AI

LanceDB LanceDB is an open-source vector database for AI that's designed to store, manage, query and retrieve embeddings on large-scale multi-modal da

Large variety of ready-to-use LLM evaluation metrics (all with explanations) powered by ANY LLM of your choice, statistical methods, or NLP models tha

LLM Evaluation• 2

Nicolay Gerold

Take a look at our official page for user documentation and examples: langtest.org Key Features Generate and execute more than 50 distinct types of t

Large variety of ready-to-use LLM evaluation metrics (all with explanations) powered by ANY LLM of your choice, statistical methods, or NLP models tha

Cool Projects / Repos• 11

Nicolay Gerold

TurboSeek An open source AI search engine. Powered by Together.ai. Tech stack Next.js app router with Tailwind Together AI for LLM inference Mix

Welcome to Quartz 4 (Jun 13, 2024, 2 min read) Quartz is a fast, batteries-included static-site generator that transforms Markdown content into fully functi

libsearch 🔎 Simple, index-free text search for JavaScript, used across my personal projects like YC Vibe Check, linus.zone/entr, and my personal pr

Turn questions into data insights. Make your team more informed and save time, by using AI for Data Analysis on your

Training• 2

Nicolay Gerold

In your sport of choice: Perform a 5-minute Zone 5 effort. Make this a Very Hard effort but leave yourself room to improve next time. Calculate the

Tempo training feels different than we expect. Not a lot of huffing of the breath. No burning in the legs. Heart rate responds slowly. Aim for a 3

NLP• 4

Nicolay Gerold

We can detect factually inconsistent summaries via the natural language inference (NLI) task. The NLI task works like this: Given a premise sentence a

Google DeepMind used a similar idea to make LLMs faster in Accelerating Large Language Model Decoding with Speculative Sampling. Their algorithm uses a

FlashText This module can be used to replace keywords in sentences or extract keywords from sentences. It is based on the FlashText algorithm.

RAG• 21

Nicolay Gerold

FuzzTypes FuzzTypes is a set of "autocorrecting" annotation types that expands upon Pydantic's included data conversions. Designed for simplicity, it

Unlike some other popular algorithms, DiskANN is designed to keep memory usage to a minimum. This makes it a great match for use cases where Turso alr

Welcome to RAGatouille Easily use and train state of the art retrieval methods in any RAG pipeline. Designed for modularity and ease-of-use, backed by

rerankers A lightweight unified API for various reranking models. Developed by @bclavie as a member of answer.ai Welcome to rerankers! Our goal is