In order to build conviction, we rely on founders to tell us a compelling story, almost always in the form of slides. We’ve funded companies almost entirely because of the quality of their seed decks. Poor deck? We’ll likely pass on the opportunity.
Let’s first look at how to calculate self-attention using vectors, then proceed to look at how it’s actually implemented – using matrices.
The first step in calculating self-attention is to create three vectors from each of the encoder’s input vectors (in this case, the embedding of each word). So for each word, we create a...
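A minimal sketch of that first step, in plain Python. The excerpt is cut off before it names the three vectors; this sketch assumes they are the query, key, and value vectors of the standard Transformer formulation, each obtained by multiplying a word embedding by its own weight matrix. The helper names (`matvec`, `random_matrix`, `project_qkv`), the toy sizes, and the random weights are all illustrative assumptions, not taken from the source; in a real model the weights are learned.

```python
import random

def matvec(vector, matrix):
    """Multiply a row vector (length d_in) by a d_in x d_out matrix."""
    return [sum(v * row[j] for v, row in zip(vector, matrix))
            for j in range(len(matrix[0]))]

def random_matrix(rows, cols, rng):
    """Stand-in for a learned weight matrix (assumption: random values)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def project_qkv(embeddings, w_q, w_k, w_v):
    """Return one (query, key, value) triple per input embedding."""
    return [(matvec(x, w_q), matvec(x, w_k), matvec(x, w_v))
            for x in embeddings]

rng = random.Random(0)
d_model, d_head = 4, 3  # toy sizes chosen for readability

# Two "words", each represented by a d_model-dimensional embedding.
embeddings = [[rng.uniform(-1.0, 1.0) for _ in range(d_model)]
              for _ in range(2)]

# One weight matrix per projection.
w_q = random_matrix(d_model, d_head, rng)
w_k = random_matrix(d_model, d_head, rng)
w_v = random_matrix(d_model, d_head, rng)

triples = project_qkv(embeddings, w_q, w_k, w_v)
for i, (q, k, v) in enumerate(triples):
    print(f"word {i}: len(q)={len(q)}, len(k)={len(k)}, len(v)={len(v)}")
```

Stacking the embeddings into one matrix and doing a single matrix multiplication per projection gives the same result in one shot, which is the matrix form the excerpt alludes to.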
The opposite of process is chaos, but an in-between is rapid adaptation, emphasis on rapid. Limit scope as much as possible, ship small things fast, and react to feedback quickly.
The design process assumes that dev time is the utmost priority, and this moves a bulk of labor into the design process. This is a faulty premise. The most valuable thing...
Don’t divide your attention: focusing on one thing yields increasing returns for each unit of effort.
At a micro level, an extra hour of focus on the current project has a much higher return than an hour on something new, or worse, 5 minutes each on 12 new things. Before you ever do something new, you should understand the opportunity cost vs....