Software 'no more accurate than untrained humans' at judging reoffending risk
Source: The Guardian
Program used to assess more than a million US defendants may not be accurate enough for potentially life-changing decisions, say experts
Hannah Devlin Science correspondent
Wed 17 Jan 2018 19.00 GMT
The credibility of a computer program used for bail and sentencing decisions has been called into question after it was found to be no more accurate at predicting the risk of reoffending than people with no criminal justice experience who were provided with only the defendant's age, sex and criminal history.
The algorithm, called Compas (Correctional Offender Management Profiling for Alternative Sanctions), is used throughout the US to weigh up whether defendants awaiting trial or sentencing are at too much risk of reoffending to be released on bail.
Since being developed in 1998, the tool is reported to have been used to assess more than one million defendants. But a new paper has cast doubt on whether the software's predictions are sufficiently accurate to justify its use in potentially life-changing decisions.
Hany Farid, a co-author of the paper and professor of computer science at Dartmouth College in New Hampshire, said: "The cost of being wrong is very high, and at this point there's a serious question over whether it should have any part in these decisions."
-snip-
Read more: https://www.theguardian.com/us-news/2018/jan/17/software-no-more-accurate-than-untrained-humans-at-judging-reoffending-risk
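For context on the comparison the paper is making, here is a minimal sketch of the kind of simple baseline involved: a logistic regression given only the same three inputs the untrained participants saw (age, sex, criminal history). Everything below is invented stand-in data for illustration; it is not the paper's code or the COMPAS dataset.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Invented stand-in features: age in years, sex as 0/1, prior offense count.
n = 5000
age = rng.integers(18, 70, n)
sex = rng.integers(0, 2, n)
priors = rng.poisson(2.0, n)

# Invented outcome: reoffending made (noisily) more likely for younger
# defendants with more priors. Real labels would come from court records.
logit = 1.5 - 0.05 * age + 0.4 * priors
reoffended = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([age, sex, priors])
X_tr, X_te, y_tr, y_te = train_test_split(X, reoffended, random_state=0)

model = LogisticRegression().fit(X_tr, y_tr)
print(f"three-feature baseline accuracy: {model.score(X_te, y_te):.2f}")

The paper's claim is essentially that a model this simple, and untrained people given the same three facts, land in the same accuracy range as the commercial tool.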
janterry
(4,429 posts)
is age.
The older you get, the more stable (and less likely to reoffend) you become.
As far as those awaiting sentencing go... I don't know that data.
Loki Liesmith
(4,602 posts)
1998 tech is worse than useless.
bettyellen
(47,209 posts)
inputs.
Loki Liesmith
(4,602 posts)
bettyellen
(47,209 posts)
bias. Garbage in, garbage out. You need a wide range of experience and opinions to have a well-rounded world view; the internet is a shit show, and there are reasons for that.
PatentlyDemocratic
(89 posts)
That is the entire point of machine learning: to figure things out without being given explicit "rules" by programmers.
Loki Liesmith
(4,602 posts)
The main problem with machine learning is not the programmers, though. You can't really bias a result obtained via optimal statistical learning; what can be biased is the training data. There is a real lack of quality data sets out there, so everyone falls back on the same ancient training sets. But that's not the programmers' fault.
We need public investment in quality data stores.
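To make that point concrete, here is a small hypothetical sketch: the learning algorithm itself is standard and neutral, but because the training labels were generated with a group-dependent skew, the fitted model faithfully reproduces that skew. All data, names, and numbers are invented for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 10_000

group = rng.integers(0, 2, n)      # 0/1 group membership
risk = rng.normal(0.0, 1.0, n)     # the "true" underlying risk

# Biased historical labels: at the same underlying risk, group 1 was
# flagged more often (think heavier policing), so the labels are skewed.
label = (risk + 0.8 * group + rng.normal(0.0, 0.5, n)) > 0.5

X = np.column_stack([risk, group])
model = LogisticRegression().fit(X, label)

# The fitted model inherits the skew: group membership gets a large
# positive weight even though it carries no real risk information.
print(dict(zip(["risk", "group"], model.coef_[0].round(2))))

Nothing in the optimizer is "biased"; the coefficients simply reflect what the labels encode.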
bettyellen
(47,209 posts)
said that they discovered that sentence-suggesting software (could be the same program, but maybe not?) was faulty because of biased human inputs. Too tired to look for it, but it totally made sense when I read it. And honestly it's a nightmarish scenario. So it was memorable.
Loki Liesmith
(4,602 posts)
Machine learning... think of it like a dog you want to do a trick. The programmer comes up with a way to train the dog... let's say give him a biscuit.
So the programmer tests lots of different kinds of biscuits to get the dog to do a trick when it hears a sound, and picks the best one.
But the problem isn't how to train the dog. It's what sound to get the dog to respond to. Say all you have on hand are recordings of Van Halen. Then the dog will efficiently learn to respond to David Lee Roth's voice with a trick.
But of course that's not a very useful trick. The algorithm the programmer created was perfect and unbiased. But the training data was flawed.
That's the problem with bias in machine learning. Our training data sets are flawed. They are randomly pulled from public data, which has tended to over-represent white people, because historically internet users have been more white than the average population.
Why use public internet data? It's cheap and easy to find. Data cleansing is time consuming and not always productive.
What we need to do is fund a big public works project to create good and representative training data, because data is increasingly a public good and society can't function with biased and dirty data.
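A toy illustration of the unrepresentative-training-data argument above (again, invented data, not any real dataset): a model trained almost entirely on one group looks accurate overall but falls apart on the under-represented group whose pattern differs.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

def make_group(n, slope):
    # Each group links the feature to the outcome with a different slope.
    x = rng.normal(0.0, 1.0, (n, 1))
    y = (slope * x[:, 0] + rng.normal(0.0, 0.5, n)) > 0
    return x, y

# 95% of the training data comes from group A; group B, whose pattern is
# reversed, is barely represented.
xa, ya = make_group(9500, slope=1.0)
xb, yb = make_group(500, slope=-1.0)
X, y = np.vstack([xa, xb]), np.concatenate([ya, yb])

model = LogisticRegression().fit(X, y)

# Evaluated separately: strong on the over-represented group, far worse
# than a coin flip on the group the training data barely covered.
xa_t, ya_t = make_group(2000, slope=1.0)
xb_t, yb_t = make_group(2000, slope=-1.0)
print(f"group A accuracy: {model.score(xa_t, ya_t):.2f}")
print(f"group B accuracy: {model.score(xb_t, yb_t):.2f}")

An aggregate accuracy number would hide this completely, which is why representativeness of the training set matters as much as the training method.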
bettyellen
(47,209 posts)
Programmers' lack of awareness and experience, leading to promoting a racial bias, however inadvertently. It's interesting to hear Silicon Valley folks insist that it's all about bytes and there can't be any bias at all happening, or need for diversity. They're unaware because their bias is like society's default setting - which is biased against women and POC much of the time.
Orrex
(63,172 posts)
That kind of thinking should be familiar to anyone working in the corporate world.
Jim__
(14,063 posts)
I would expect experienced people to do better than untrained people. If that's true, then the effect of the software is actually negative - its predictions are worse than what we would expect from people in the field.