Forget Privacy: You're Terrible at Targeting Anyway
Saved by Alex Dobrenko and
This is, by the way, the dirty secret of the machine learning movement: almost everything produced by ML could have been produced, more cheaply, using a very dumb heuristic you coded up by hand, because mostly the ML is trained by feeding it examples of what humans did while following a very dumb heuristic. There's no magic here. If you use ML to teach a computer how to sort through resumes, it will recommend you interview people with male, white-sounding names, because it turns out that's what your HR department already does. If you ask it what video a person like you wants to see next, it will recommend some political propaganda crap, because 50% of the time 90% of the people do watch that next, because they can't help themselves, and that's a pretty good success rate.
Since generative AIs have been trained on the entirety of human work, most of it mediocre, they produce “wisdom of the crowd”-like results. They may hit the mark, but only because they are average.
In little more than a decade, machine learning has moved from a highly specialized technique to something that almost anyone with data and computational power can do. That is to be welcomed — yet it remains essential that the industry can navigate both the proliferation of tools and frameworks in the space and the ethical issues that are becoming
There is a broad assumption underlying many machine-learning models that the model itself will not change the reality it’s modeling. In almost all cases, this is false.
both beautiful and tragic. It is beautiful because on a good day it requires very little work; you often don’t need to spend much time on the pesky job of feature engineering, and in the best case the machine takes care of a large fraction of what needs to be done. It is tragic because nothing ever guarantees that any system in the real world will
IBM, for example, managed to fix the problem of poor gender identification that Joy Buolamwini discovered by building a new training set with more pictures of black women. Google solved its gorilla challenge in the opposite way: by removing pictures of gorillas from the training set. Neither solution is general; both are instead hacks, designed to
A couple of months ago, Oxford and Cambridge researchers illustrated the risk of homogeneity in a study of AI-generated content published in Nature. The risk increases as AI gets trained not only on human-created content, but also on other AI-generated content.
As an example, the researchers studied an AI model trained on images of different breeds of dogs.
The essential lesson here is that, when datasets are large enough, the knowledge encapsulated in all that data will often trump the efforts of even the best programmers.