A regular reader sends a WSJ link. 
Apparently social media software shows cultural bias. The initial data set for one image-scanning program was predominantly white. (The article does not say what the percentages were, or whether the predominance simply tracked American demographics or went beyond that.) Additionally, because there are more white faces than black ones on the internet, software that continually learns from experience amplifies this bias.
Fixing the imbalance is certainly the first priority for the software developers.  But another use immediately occurred to me.  Observing how the software goes wrong might be useful in learning how we learn our biases, including the black box of what "initial data set" is hard-wired into us, and whether that can be compensated for.  In the examples from the article, there was some intuitive connection between how we act and how the software acts, making it quickly understandable.  There might be more that is not immediately noticeable.
There may be a hidden problem in that.  If software modification gives us insight into how we ourselves might be modified, couldn't it be used to increase our biases rather than decrease them, in a manner which reflected the desires of the more powerful?
I think that experiment has already been run. People over-estimate the percentage of the population that identifies as gay/lesbian by an order of magnitude due to media exposure.
Several issues come into play here.
Properly used, neural net analyses produce empirical algorithms with no clear path to understanding causes. Sometimes that's what you want--you understand the causes and just want some selection that maximizes A and minimizes B, in order to use that set of events to study something else. If you use it for some hiring procedure you have no way to know why some trait is selected for, so you have no way to know whether it introduces an invidious bias. A friend who used to run a plastic recycling firm said he asked somebody to figure out the characteristics of an ideal employee, based on the firm's successes with people they'd hired in the past. For that hot noisy job, the top two predictors of a successful hire were 1) a beard and 2) a criminal conviction.
The models give statistical answers, and, without training on truly gigantic data sets, aren't going to be as sensitive to variations in minority populations. A 5% effect on 95% of the data is going to have a larger effect on the prediction than a 50% effect on the remaining 5%. You can, as the article suggests, post-process your results with studies sensitive to other factors, such as the effects on minority groups. But it isn't obvious that under- or over-representation in the final set is necessarily invidious.
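The arithmetic behind the 5%-versus-50% comparison can be sketched in a few lines of Python. This is only an illustration, assuming that a group's pull on an averaged prediction is roughly its effect size multiplied by its share of the data; the function name `weighted_contribution` is made up for the example, and the numbers are the ones quoted above.

```python
def weighted_contribution(effect_size, data_share):
    """Approximate pull of one group on an averaged prediction.

    Assumption (not from the article): the model effectively averages
    over all samples, so a group's influence is effect size times the
    fraction of the data that group makes up.
    """
    return effect_size * data_share


# A 5% effect spread across 95% of the data.
majority = weighted_contribution(0.05, 0.95)

# A 50% effect confined to the remaining 5% of the data.
minority = weighted_contribution(0.50, 0.05)

print(f"majority contribution: {majority:.4f}")  # 0.0475
print(f"minority contribution: {minority:.4f}")  # 0.0250
print(f"ratio: {majority / minority:.2f}")       # 1.90
```

Under that assumption the small, widespread effect carries nearly twice the weight of the large, concentrated one, which is why the minority effect tends to get lost unless the data set is very large or the results are checked separately for that group.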
You can try to restrict models to only those factors you think are or should be relevant, but this kind of reductionism is likely to be biased no matter which direction you go. Our groups are lucky--we don't have to consider income or history or favorite TV program in making models for NN analysis: just particle interactions (to the best of our current understanding). With people you find that favorite alcoholic beverage can correlate strongly with political party--which item do you want to include in your model of relevant factors for picking the best liver transplant candidates?
Garbage in, Garbage out. Biases of omission or commission on the part of coders, or memory drop-outs, or lack of sufficient experience.