Software 'no more accurate than untrained humans' at judging reoffending risk
Source: The Guardian
Program used to assess more than a million US defendants may not be accurate enough for potentially life-changing decisions, say experts
Hannah Devlin Science correspondent
Wed 17 Jan 2018 19.00 GMT
The credibility of a computer program used for bail and sentencing decisions has been called into question after it was found to be no more accurate at predicting the risk of reoffending than people with no criminal justice experience who were provided with only the defendant's age, sex and criminal history.
The algorithm, called Compas (Correctional Offender Management Profiling for Alternative Sanctions), is used throughout the US to weigh up whether defendants awaiting trial or sentencing are at too much risk of reoffending to be released on bail.
Since being developed in 1998, the tool is reported to have been used to assess more than one million defendants. But a new paper has cast doubt on whether the software's predictions are sufficiently accurate to justify its use in potentially life-changing decisions.
Hany Farid, a co-author of the paper and professor of computer science at Dartmouth College in New Hampshire, said: "The cost of being wrong is very high, and at this point there's a serious question over whether it should have any part in these decisions."
-snip-
Read more: https://www.theguardian.com/us-news/2018/jan/17/software-no-more-accurate-than-untrained-humans-at-judging-reoffending-risk
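For context on the comparison the paper is making, here is a minimal sketch of the kind of simple baseline involved: a logistic regression given only the same three inputs the untrained participants saw (age, sex, criminal history). Everything below is invented stand-in data for illustration; it is not the paper's code or the COMPAS dataset.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Invented stand-in features: age in years, sex as 0/1, prior offense count.
n = 5000
age = rng.integers(18, 70, n)
sex = rng.integers(0, 2, n)
priors = rng.poisson(2.0, n)

# Invented outcome: reoffending made (noisily) more likely for younger
# defendants with more priors. Real labels would come from court records.
logit = 1.5 - 0.05 * age + 0.4 * priors
reoffended = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([age, sex, priors])
X_tr, X_te, y_tr, y_te = train_test_split(X, reoffended, random_state=0)

model = LogisticRegression().fit(X_tr, y_tr)
print(f"three-feature baseline accuracy: {model.score(X_te, y_te):.2f}")

The paper's claim is essentially that a model this simple, and untrained people given the same three facts, land in the same accuracy range as the commercial tool.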
janterry
(4,429 posts)
is age.
The older you get, the more stable (and less likely to reoffend) you become.
As far as those awaiting sentencing go... I don't know that data.
Loki Liesmith
(4,602 posts)
1998 tech is worse than useless.
bettyellen
(47,209 posts)
inputs.
Loki Liesmith
(4,602 posts)
bettyellen
(47,209 posts)
bias. Garbage in, garbage out. You need a wide range of experience and opinions to have a well-rounded world view; the internet is a shit show, and there are reasons for that.
PatentlyDemocratic
(89 posts)
That is the entire point of machine learning: to figure things out without being given explicit "rules" by programmers.
Loki Liesmith
(4,602 posts)
The main problem with machine learning is not the programmers, though. You can't really bias a result obtained via optimal statistical learning; what can be biased is the training data. There is a real lack of quality data sets out there, so everyone falls back on the same ancient training sets. But that's not the programmers' fault.
We need public investment in quality data stores.
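To make that point concrete, here is a small hypothetical sketch: the learning algorithm itself is standard and neutral, but because the training labels were generated with a group-dependent skew, the fitted model faithfully reproduces that skew. All data, names, and numbers are invented for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 10_000

group = rng.integers(0, 2, n)      # 0/1 group membership
risk = rng.normal(0.0, 1.0, n)     # the "true" underlying risk

# Biased historical labels: at the same underlying risk, group 1 was
# flagged more often (think heavier policing), so the labels are skewed.
label = (risk + 0.8 * group + rng.normal(0.0, 0.5, n)) > 0.5

X = np.column_stack([risk, group])
model = LogisticRegression().fit(X, label)

# The fitted model inherits the skew: group membership gets a large
# positive weight even though it carries no real risk information.
print(dict(zip(["risk", "group"], model.coef_[0].round(2))))

Nothing in the optimizer is "biased"; the coefficients simply reflect what the labels encode.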
bettyellen
(47,209 posts)
said that they discovered that sentence-suggesting software (could be the same program, but maybe not?) was faulty because of biased human inputs. Too tired to look for it, but it totally made sense when I read it. And honestly it's a nightmarish scenario. So it was memorable.
Loki Liesmith
(4,602 posts)
Machine learning... think of it like a dog you want to do a trick. The programmer comes up with a way to train the dog... let's say give him a biscuit.
So the programmer tests lots of different kinds of biscuits to get the dog to do a trick when it hears a sound, and picks the best one.
But the problem isn't how to train the dog. It's what sound to get the dog to respond to. Say all you have on hand are recordings of Van Halen. Then the dog will efficiently learn to respond to David Lee Roth's voice with a trick.
But of course that's not a very useful trick. The algorithm the programmer created was perfect and unbiased. But the training data was flawed.
That's the problem with bias in machine learning. Our training data sets are flawed. They are randomly pulled from public data, which has tended to over-represent white people, because historically internet users have been more white than the average population.
Why use public internet data? It's cheap and easy to find. Data cleansing is time consuming and not always productive.
What we need to do is fund a big public works project to create good and representative training data, because data is increasingly a public good and society can't function with biased and dirty data.
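A toy illustration of the unrepresentative-training-data argument above (again, invented data, not any real dataset): a model trained almost entirely on one group looks accurate overall but falls apart on the under-represented group whose pattern differs.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

def make_group(n, slope):
    # Each group links the feature to the outcome with a different slope.
    x = rng.normal(0.0, 1.0, (n, 1))
    y = (slope * x[:, 0] + rng.normal(0.0, 0.5, n)) > 0
    return x, y

# 95% of the training data comes from group A; group B, whose pattern is
# reversed, is barely represented.
xa, ya = make_group(9500, slope=1.0)
xb, yb = make_group(500, slope=-1.0)
X, y = np.vstack([xa, xb]), np.concatenate([ya, yb])

model = LogisticRegression().fit(X, y)

# Evaluated separately: strong on the over-represented group, far worse
# than a coin flip on the group the training data barely covered.
xa_t, ya_t = make_group(2000, slope=1.0)
xb_t, yb_t = make_group(2000, slope=-1.0)
print(f"group A accuracy: {model.score(xa_t, ya_t):.2f}")
print(f"group B accuracy: {model.score(xb_t, yb_t):.2f}")

An aggregate accuracy number would hide this completely, which is why representativeness of the training set matters as much as the training method.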
bettyellen
(47,209 posts)
Programmers' lack of awareness and experience, leading to promoting a racial bias, however inadvertently. It's interesting to hear Silicon Valley folks insist that it's all about bytes and there can't be any bias at all happening, or need for diversity. They're unaware because their bias is like society's default setting - which is biased against women and POC much of the time.
Orrex
(63,172 posts)
That kind of thinking should be familiar to anyone working in the corporate world.
Jim__
(14,063 posts)
I would expect experienced people to do better than untrained people. If that's true, then the effect of the software is actually negative - its predictions are worse than what we would expect from people in the field.