Ajeesh Garg
@ajeesh
Patchifying Images for Computer Vision
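The note above is about patchifying, i.e. splitting an image into fixed-size non-overlapping tiles that are flattened into token vectors (the first step of a Vision Transformer). A minimal sketch using NumPy reshape/transpose; the `patchify` helper and the toy 4x4x3 image are illustrative, not from the note:

```python
import numpy as np

def patchify(img, patch):
    # img: (H, W, C) array; split into non-overlapping patch x patch tiles,
    # each flattened to a vector of length patch * patch * C.
    H, W, C = img.shape
    assert H % patch == 0 and W % patch == 0, "image must divide evenly into patches"
    x = img.reshape(H // patch, patch, W // patch, patch, C)
    x = x.transpose(0, 2, 1, 3, 4)              # (H/p, W/p, p, p, C)
    return x.reshape(-1, patch * patch * C)     # (num_patches, p*p*C)

# Toy example: a 4x4 RGB image split into 2x2 patches -> 4 patch tokens.
img = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
patches = patchify(img, 2)
print(patches.shape)  # (4, 12)
```

Each row of `patches` is one patch token; a ViT would then project these rows with a learned linear layer and add position embeddings.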
Transformers and
Vision Transformers original paper
Attention and
SAM model
MAE is less sensitive to outliers, but I am still deciding whether to use MAE as my loss function. Why might someone choose MAE as the loss but still monitor MSE as a metric? What behavior during training might that help catch?
My data distribution is really sparse, so the sparser it gets, the more I prefer MAE over MSE. Basically, it will app…
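The trade-off discussed above can be seen numerically: MAE penalizes errors linearly, while MSE squares them, so a single outlier dominates MSE but barely moves MAE. A minimal sketch with hypothetical targets and predictions (the values are illustrative only):

```python
import numpy as np

# Hypothetical data: predictions track targets closely except one outlier.
y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = np.array([1.1, 2.1, 2.9, 4.2, 25.0])  # last prediction is an outlier

mae = np.mean(np.abs(y_true - y_pred))   # linear penalty: outlier adds 20/5
mse = np.mean((y_true - y_pred) ** 2)    # quadratic penalty: outlier adds 400/5

print(f"MAE: {mae:.3f}")  # 4.100
print(f"MSE: {mse:.3f}")  # 80.014
```

This is one answer to the question above: training with MAE keeps gradients stable on sparse, outlier-heavy data, while watching MSE as a metric makes it obvious when the model starts producing occasional large misses that MAE alone would hide.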
Loss and Loss function
Loss function notes - MAE
Attention and
Transformers Explained Visually I
Introduces the Transformer, a novel neural network architecture based solely on attention mechanisms for sequence transduction, improving machine translation quality, training speed, and parallelization over recurrent and convolutional models.
proceedings.neurips.cc
Models and
Attention is all you need paper
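The abstract above describes an architecture built solely on attention. Its core operation is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal single-head sketch in NumPy; the random matrices and shapes are illustrative assumptions, not from the paper:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # (seq_q, seq_k) similarity scores
    weights = softmax(scores)         # each row sums to 1
    return weights @ V                # weighted average of value vectors

# Toy example: 3 tokens, model dimension 4.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

The sqrt(d_k) scaling keeps the dot products from pushing the softmax into regions of vanishing gradient as the key dimension grows; multi-head attention runs several of these in parallel on projected Q, K, V and concatenates the results.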