r/science 2d ago

Medicine One of the most cited AI models used to scan chest x-rays doesn’t accurately detect potentially life-threatening diseases in women & Black people. Black women fell to the bottom, with the AI not detecting disease in half of them for conditions such as cardiomegaly, or enlargement of the heart.

https://www.science.org/content/article/ai-models-miss-disease-black-female-patients
4.5k Upvotes

251 comments

u/AutoModerator 2d ago

Welcome to r/science! This is a heavily moderated subreddit in order to keep the discussion on science. However, we recognize that many people want to discuss how they feel the research relates to their own personal lives, so to give people a space to do that, personal anecdotes are allowed as responses to this comment. Any anecdotal comments elsewhere in the discussion will be removed and our normal comment rules apply to all other comments.


Do you have an academic degree? We can verify your credentials in order to assign user flair indicating your area of expertise. Click here to apply.


User: u/MistWeaver80
Permalink: https://www.science.org/content/article/ai-models-miss-disease-black-female-patients


I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

1.1k

u/Seraph199 2d ago

This is the massive problem with AI. It can seem perfectly accurate, then it turns out the scientists were only testing it on specific subjects for "reliability," and, ope, it turns out that defeats the entire purpose of AI and trains it to literally discriminate just like the people who made it.

252

u/STLtachyon 2d ago

Or the initial training data were skewed one way or another. A similar case was an AI that determined whether a patient had a disease partly by looking at the hospital where the x-ray was taken. It did so because the initial data included cases from a local epidemic, which meant the patient's location was factored into the "diagnosis".

22

u/sold_snek 2d ago

Oof, that's a huge one.

17

u/psymunn 1d ago

I heard of a case of an AI model that could tell the difference between a cancerous and a non-cancerous mole by identifying whether the photo had a ruler or measuring device in it. That's one problem with AI models not being human-readable. It's like regex, but many times worse.

19

u/vg1220 2d ago

I’m a little surprised this paper got past the reviewers. They show that sex (female), race (Black), and age (older) are associated with lower rates of detection. Women have more breast tissue on average than men, and racial minorities and the elderly correlate with obesity, all of which are known to detrimentally affect x-ray image quality. There's not one mention in the methods of controlling for BMI, chest circumference, or anything like that.

4

u/spookmann 2d ago

Well, to be fair, the blood donation center in NZ did that for years.

They wouldn't accept my blood because I had visited the UK in the 10-year window of the BSE occurrences.

And we did that way more recently for COVID, by asking where people had been.

13

u/tokynambu 2d ago

It’s a not-unreasonable strategy. It looks like, although it will take a generation or more to know, that the risks of CJD in humans triggered by BSE in meat were overstated. Incidence of CJD in the UK has not risen substantially, and there were 0 (zero) vCJD (the variant caused by BSE) cases in 2020. That said, in the 1990s and 2000s no-one knew, the incubation period is long and there had been a lot of BSE in the UK food chain. Since transmission by blood transfusion has been recorded, and the blood products industry is still recovering from AIDS and hepatitis transmission in the 1980s, broad-spectrum elimination of UK blood from a nation’s supply is and was a reasonable response.

2

u/spookmann 1d ago

Yeah. Shame they couldn't test, though.

That was a lot of regular donors that it cost them!

137

u/HerbaciousTea 2d ago

Neural networks are pattern-finding engines, and pattern-finding engines only. To the network, a pattern resulting from biased data is absolutely no different from a pattern resulting from actual real-world correlations.

88

u/Anxious-Tadpole-2745 2d ago

We often don't pay attention to all the patterns so we miss crucial ones. 

We tried to breed Chocolate Labs for intelligence without realizing that food motivation accelerates task compliance. So we ended up trying to breed for intelligence and simply made very hungry dogs.

47

u/VoilaVoilaWashington 2d ago

Describing labs as "very hungry dogs" is hilariously apt. I have a golden who tries so so hard but he's just a hungry moron that means well.

8

u/evergleam498 2d ago

One time our yellow lab got into the 40lb bag of dog food in the garage when we weren't home. He ate so much he got sick, then ate so much he got sick again. He probably would've kept eating if we hadn't come home when we did.

-7

u/MarsupialMisanthrope 2d ago

It’s at least discriminating based on data, unlike doctors who do it based on personal prejudices. Data can be corrected for by adding more training data containing groups that were underweighted in the original dataset. Convincing a doctor to stop giving lousy care to patients in demographics they dislike is a lot harder, not least because they’ll fight to the last to avoid admitting they’re treating some patients based on how they look and not their symptoms.

10

u/snubdeity 2d ago

unlike doctors who do it based on personal prejudices

This just isn't true, most of the time. Doctors, as a whole, are probably about as left-leaning as this damned site. And even black doctors perform worse with black patients than they do with white ones.

Why? Because they were trained on the same skewed data these AIs were.

And it's really hard to get better data.

9

u/ebbiibbe 2d ago

If you study health care informatics in college there are numerous studies about bias from health care professionals.

16

u/son_of_abe 2d ago

Doctors, as a whole, are probably about as left-leaning as this damned site

Sorry, this could not be more wrong. This was my impression as well before being introduced to networks of medical doctors. Roughly half I've met were conservative.

It makes more sense once you consider the financial barrier to entry that medical school poses. Many MDs come from wealth and have politics that align more with those interests than that of their profession (science).

16

u/Bakoro 2d ago

Doctors aren't magically immune from prejudice, no one is.

There are racist doctors and serial killer doctors, same with nurses, same with everything else. Positions of power and prestige are especially attractive to bad people of whatever flavor. Also, some doctors are just bad at their job.
That's just life.

Getting better data is not hard at all, it's just socially and politically unattractive to say that we're going to start collecting everyone's anonymized medical data as a matter of course. It's what we should do, but people would freak out about it.

2

u/yukonwanderer 1d ago

Women are still largely excluded from medical studies. Don't tell me it's hard to get good data. It's critical that we get good data.

300

u/redditonlygetsworse 2d ago edited 2d ago

trains it to literally discriminate just like the people who made it.

Yes: garbage in, garbage out. AI can only replicate our biases, not remove them.

Still, though, once the problem is identified it's not a big mystery how to fix it. It might not be cheap or fast to re-train, but it's not like we don't know how.

98

u/spoons431 2d ago

But honestly they'll just use it and say it's fine - they're like who cares about more than half the population.

Medical bias is real, and even now in 2025 little or nothing is being done about it. As an example, and I tend to use this one a lot: there's still no real research into how ADHD affects women differently, or into how oestrogen fluctuations, monthly for decades and across their lifetime, affect the symptoms and severity of it. This is despite two conclusions that are known: 1. ADHD is a chronic lack of dopamine in the brain. 2. Oestrogen levels affect dopamine levels.

There have been issues with this reported in the community for decades at this point, but it is only now beginning to be looked at.

71

u/Fifteen_inches 2d ago

To also add, they only recently started publishing a visual encyclopedia of how rashes appear on dark skin tones, because even black doctors are taught on the white skin patient standard.

12

u/ineffective_topos 2d ago

The idea that ADHD is a chronic lack of dopamine in the brain is a misconception or oversimplification as far as I know. It's somewhat more accurate to say that it involves failures in certain dopamine pathways.

4

u/nagi603 2d ago

See also "a kid is just a small adult, right?"

5

u/Rhywden 1d ago

I'll one-up you on this: only recently has a study been done on women's peri-menopausal issues with iron deficiency due to increased menstrual bleeding.

One of the big issues exclusively for women and only this year someone finally got around to establishing key facts about it.

65

u/Mausel_Pausel 2d ago

How do you fix it? You can’t train it with data you don’t have, and the medical community has routinely minimized the participation of women and minorities in their studies. 

83

u/redditonlygetsworse 2d ago

Yep, 100%. Like I said above: replicate our biases.

So you fix it by getting that data. Again, like I said, not necessarily cheap or fast; but we know exactly how to do it. We're not back at square one.

19

u/OldBuns 2d ago

This is technically the case, but it comes with an important caveat.

The tendency of human bias to bleed into AI is almost unavoidable.

I'm not saying it's bad or shouldn't be used or anything, but we need to be wary of treating this as "just a tool" that can be used for good or bad depending on the person using it, because this isn't a case where you can just fix it by being cognizant enough.

Bias is innate in us. The methods and procedures we use to test and train these things exacerbate those biases because they are built into the process as assumptions.

In addition to this, sometimes, even if you are intentionally addressing the biases, the bias comes FROM the algorithm itself.

"Algorithmic oppression" by safiya noble is a fantastic read on the issue, and uses a very succinct example.

Imagine an algorithm or AI that's trained to put the most popular barbershops at the top of the list.

In a community of 80% white individuals and 20% Black, there will NEVER be a case where a barbershop that caters to Black hair appears at the top of that list. This inherently means less access to a specific service for a specific group of people.

But also, how would you even TRY to go about solving this issue in the algorithm other than creating 2 different ones altogether?

What new problems might that cause?

This is obviously oversimplified, but it's a real life example of how bias can appear in these systems without that bias existing in the people that create it.

3

u/Dragoncat_3_4 2d ago

But also, how would you even TRY to go about solving this issue in the algorithm other than creating 2 different ones altogether?

Well... yeah.

Once you've identified your currently existing formula/ratio/normal range/etc doesn't work with a specific sub group within your population, you split the data and revise your formula for both groups.

In this case they would probably need to re-label all of their training data to include race as well as obtain more images of both pathological and healthy people of the underrepresented racial group.

Of course, the researchers procuring the data need to take extra care to avoid underreporting said pathology due to their pre-existing biases, but these things should work themselves out with enough revisions.
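The subgroup check described above (split the data by group and re-evaluate) can be sketched as a per-group sensitivity calculation. This is a toy illustration with made-up labels, not the paper's actual evaluation:

```python
import numpy as np

def sensitivity_by_group(y_true, y_pred, groups):
    """True-positive rate (sensitivity) computed separately per subgroup."""
    out = {}
    for g in np.unique(groups):
        mask = (groups == g) & (y_true == 1)   # diseased patients in this group
        out[g] = float(y_pred[mask].mean())    # fraction the model actually flagged
    return out

# Toy labels: 1 = disease present; predictions from some hypothetical model
y_true = np.array([1, 1, 1, 1, 1, 1, 0, 0])
y_pred = np.array([1, 1, 1, 0, 0, 0, 0, 0])
groups = np.array(["A", "A", "A", "B", "B", "B", "A", "B"])

print(sensitivity_by_group(y_true, y_pred, groups))
# → {'A': 1.0, 'B': 0.0}: an aggregate accuracy would hide that
#   every diseased patient in group B was missed.
```

A single pooled accuracy number over this data looks decent, which is exactly why per-group reporting matters.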

8

u/OldBuns 2d ago

these things should work themselves out with enough revisions

Maybe, but at what cost? How many, and how large, are the mistakes we are willing to unleash onto society in the hopes that "eventually they'll be worked out"?

7

u/Dragoncat_3_4 2d ago

I'd imagine the mistake count would be a lot lower than when these things were initially formulated at least.

That's how it works in medical science in general though. People do studies, other people collate the results into guidelines, and then somebody inevitably comes along and publishes "X and Y are inadequate diagnostic criteria for A, B or C groups; we propose revised X and Y criteria for these groups". And eventually the guidelines get updated.

Appropriate use of AI could speed up the process and potentially expose biases and faults in the data more quickly.

5

u/OldBuns 2d ago

That's how it works in medical science in general though.

100%. This study is a good example of that. But remember that the use case, in this instance, is diagnostically assistive, not actionably prescriptive.

The big, existential risks and mistakes we should be worried about are the processes in which AI takes an active role in creating and building our world, material or digital.

As McLuhan would say, once we create and shape a tool that fundamentally changes the way we engage with the world, the tool then inevitably shapes us.

Social media AI algorithms are a perfect example of how the system itself breeds bias in its consumers, even though the bias wasn't built in, nor is the AI "aware" of this bias.

And yet, even knowing all its faults and issues, we can't really "put the genie back in the bottle" so to speak.

This broadly fits into the question of "the alignment problem" where you simply cannot know for sure whether the AI is learning what you ACTUALLY want it to learn vs something that LOOKS like what you want it to learn.

Two minute papers and Robert Miles are great YouTube channels with lots of videos about this specific topic if you're interested.

1

u/F0sh 2d ago

But also, how would you even TRY to go about solving this issue in the algorithm other than creating 2 different ones altogether?

Create an algorithm that first automatically segments the population and then uses the estimated segment in the recommendation part.

It's utterly routine already - you've seen it everywhere with messages like "people like you read/bought/watched/listened to..." and is the basis of recommender systems.

The methods and procedures we use to test and train these things exacerbates those biases because they are built into the process as assumptions.

I think that is often not true: when the process makes any attempt to address bias, you can do a very good job of mitigating it. You will generally end up with some other bias, but it won't run along the same lines that societal biases take.
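The segment-first idea above can be sketched as a toy: rank popularity within each inferred segment instead of globally. All names and segments here are invented for illustration; a real recommender would infer segments from behavior rather than hard-code them:

```python
from collections import Counter, defaultdict

# Toy visit log: (user_segment, barbershop)
visits = [
    ("straight_hair", "Shop A"), ("straight_hair", "Shop A"),
    ("straight_hair", "Shop B"), ("straight_hair", "Shop A"),
    ("coily_hair", "Shop C"), ("coily_hair", "Shop C"),
]

def top_shop_per_segment(visits):
    """Rank popularity within each segment instead of globally."""
    by_segment = defaultdict(Counter)
    for segment, shop in visits:
        by_segment[segment][shop] += 1
    return {seg: counts.most_common(1)[0][0] for seg, counts in by_segment.items()}

print(top_shop_per_segment(visits))
# → {'straight_hair': 'Shop A', 'coily_hair': 'Shop C'}
```

In a single global ranking, Shop C (2 visits) would never outrank Shop A (3 visits); within its own segment it tops the list.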

1

u/OldBuns 2d ago

I know I didn't do it justice, but the essay I referenced covers it more in depth and addresses all of these things.

I have also replied to other comments giving additional details.

The major factor is what's called "the alignment problem" and it has not been solved.

It's utterly routine already - you've seen it everywhere with messages like "people like you read/bought/watched/listened to..." and is the basis of recommender systems.

Well exactly, but we know that this causes many other issues and we now have the unique problem of not quite knowing what in the algorithms is causing it to behave this way, and therefore we don't have a fix because they are opaque systems.

1

u/F0sh 1d ago

You may end up with new problems, but what I'm getting at is that once you can measure a problem, you can take action to fix it algorithmically. If the problem is hard to measure, then you don't really know that the algorithm has made it any worse.

In the case of the most commonly raised issue with recommender systems - "bubbles" - you don't really know that this was any worse than without recommender systems. The system itself may recommend things in a very bubbly way, but people tend to behave the same way already because they're also trying to get recommendations that are likely to match their own preferences; and people tend not to only use recommender systems to get their recommendations even when they exist.

I saw a study last year that said despite the undeniable filter bubbles on social media, a large majority of people are still aware of the news stories that would generally be outside their bubble, because most people don't just get their news from facebook.

we don't have a fix because they are opaque systems.

It's always worth remembering that humans are pretty opaque too.

1

u/Bakoro 2d ago

"Algorithmic oppression" by safiya noble is a fantastic read on the issue, and uses a very succinct example.

Imagine an algorithm or AI that's trained to put the most popular barbershops at the top of the list.

In a community of 80% white individuals and 20% black, there will NEVER be a case where a barbershop that caters to that specific hair type will ever appear on that algorithm.

Part of the problem is that people don't even understand the questions they are asking, the meaning is glossed over or framed with a particular perspective. Then the data is usually interpreted through the lens of a malformed or biased question.
A question of popularity is literally a question of bias.

In your example, a black barbershop could make the list. It could even top the list. It would do that by being in a community where there are only one or two black barbershops, but many white barbershops.
One barbershop could be overwhelmed, catering to an underserved community.

You asked about "popularity" and stumbled into a much greater issue of economic and social inequity.

That's not just a convenient hypothetical that I pulled from the air, we can see parallels in so called "food deserts" where people don't have easy access to grocery stores, and often times poor public transportation.
I'd wager that if you did a "popularity" study, you'd find weirdly "popular" spots, which are literally just people going to whatever is available.

You're likely to get problematic results whenever you're trying to regress down to a single point, stripped of context.

But also, how would you even TRY to go about solving this issue in the algorithm other than creating 2 different ones altogether?

By asking better questions, and then giving multiple answers based on different factors, and giving contextualized results.

Everything has bias to it, from the questions we ask, to the data collection, to the data processing.
What we can do is offer insights which are open to investigation, rather than presenting things as absolute facts.


2

u/Victuz 2d ago

But even assuming that somehow you gather the data and "tie off" the bias. How do you ensure no different bias enters into the model? How do you ensure that the new data doesn't somehow "poison" the model making it less reliable?

The problem with black box solutions like these is that beyond extensive testing, and using other black boxes to test your own black box there isn't any good solution so far as I know.

34

u/AuDHD-Polymath 2d ago

I mean it’s actually rather straightforward to address. Model generalization is often not a priority when engineering AI, because doing it properly will make it seem like it gives marginally worse results (on the biased data you do have).

  • Get more data and be more careful about how you sample it
  • Weight the rarer samples (like Black women) higher in training to balance out their importance
  • Choose a loss function that penalizes this effect
  • Remove data selectively until the training dataset is more balanced
  • Use various other training techniques like regularization and "dropout"

I make medical computer vision models, and things like robustness, reliability, and generalization just aren't valued by the higher-ups as much, because they can't easily show those things off.
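The re-weighting bullet above can be sketched as a weighted cross-entropy loss. This is a minimal numpy illustration with made-up numbers, not anyone's production training code:

```python
import numpy as np

def weighted_bce(y_true, p_pred, sample_weights):
    """Binary cross-entropy where under-represented samples count for more."""
    eps = 1e-12
    losses = -(y_true * np.log(p_pred + eps)
               + (1 - y_true) * np.log(1 - p_pred + eps))
    return float(np.average(losses, weights=sample_weights))

y_true = np.array([1.0, 1.0, 0.0, 0.0])
p_pred = np.array([0.9, 0.4, 0.2, 0.1])   # the model is worst on sample 1

# Uniform weighting vs. upweighting the rare-group sample (index 1)
uniform = weighted_bce(y_true, p_pred, np.ones(4))
upweighted = weighted_bce(y_true, p_pred, np.array([1.0, 3.0, 1.0, 1.0]))

# The miss on the rare-group sample now dominates the loss,
# so gradient descent pushes harder to fix exactly that error.
print(uniform, upweighted)
```

The same idea appears in mainstream libraries as `class_weight` or per-sample weight arguments; the toy above just makes the mechanism explicit.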

10

u/F0sh 2d ago

And an important one: don't use models that are unreliable on certain populations within those populations.

This model is better than doctors on the population it was evaluated on. If you can use it on that population, it frees up doctors to spend more time diagnosing scans of the patients it doesn't work well on.

You're right, it shouldn't be hard to fix the model, and retraining once an architecture and data pipeline has been found is cheap in comparison to the initial research. But in the worst case, having a biased model is still better than having no model.

2

u/vannak139 1d ago

A lot of the time, the population a model does or doesn't work on isn't remotely clear. Often, instrumentation settings, bias in how data labeling is done, or even crazier stuff like the sun being high up when images were taken, can drive bias as much as or more than racial or population-based bias.

1

u/F0sh 1d ago

A perfectly reasonable health policy is that any procedure (be it surgery, how to handle scans or, in this case, the use of AI) be evaluated on particular populations (men, women, specific minorities, etc) before widespread use. So that if the original studies didn't track subpopulation performance, it cannot be used without further study.

6

u/00kyb 2d ago

It really is a shame the stark difference between the good things we can do with AI and what shareholders and executives want to do with AI


6

u/RobfromHB 2d ago

How do you fix it? You can’t train it with data you don’t have

No, but you can balance training data or use something like SMOTE to correct for this. It's a fairly common problem and there are a lot of techniques to manage it.
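In spirit, SMOTE generates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. A bare-bones numpy sketch of that idea follows (the real, battle-tested implementation lives in the `imbalanced-learn` package):

```python
import numpy as np

def smote_like(X_min, n_new, k=2, rng=None):
    """Generate n_new synthetic samples by interpolating each chosen
    minority point toward one of its k nearest minority neighbors."""
    rng = rng if rng is not None else np.random.default_rng(0)
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]      # skip the point itself
        j = rng.choice(neighbors)
        t = rng.random()                        # interpolation factor in [0, 1)
        synth.append(X_min[i] + t * (X_min[j] - X_min[i]))
    return np.array(synth)

# Three minority-class points; ask for four synthetic ones
X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
new_points = smote_like(X_minority, n_new=4)
print(new_points.shape)  # → (4, 2)
```

Each synthetic point lies on a line segment between two real minority samples, so the oversampled class fills in plausible feature-space territory instead of just duplicating rows.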

2

u/VitaminPb 2d ago

The data most likely already exists but was not part of the training data.

But I think the most interesting observation you can make is that lung scans of women and black people apparently are different from those of white men. Is it how the scans are made, or actual biological differences significant enough to affect the detection? Why would a black man's lung scan be significantly different from a white man's? Women's breasts might be an issue, but for men?

-21

u/SolarStarVanity 2d ago

If you think it's the medical community that minimizes it, and not women and minorities choosing not to volunteer for said research, then you've done very little research-volunteer recruiting in your life.

27

u/redditonlygetsworse 2d ago

You know perfectly well that it is both. And people in those groups have very good historical reasons to be skeptical.

-17

u/SolarStarVanity 2d ago

Regardless of their reasons, it's on them to change it. Historically, sure, there's been reluctance to include both groups, but said reluctance has been gone for years: researchers bend over backwards to increase representation in their data. At this point, the only people that can change said bias in data are those missing from it.

17

u/SkellySkeletor 2d ago

I really can't stand the whole "well of course they're anti-science/medicine" bit. It's a positive feedback loop: certain groups don't participate in trials, scientists can't invent data that isn't coming in, and then that very lack of data is used as further cause not to participate. What are they supposed to do?

1

u/dftba-ftw 2d ago

Woosh

They are not saying: scientists ignore minority data, results don't benefit minorities, therefore minorities don't participate.

They're saying that historically we've done horrible things to minorities, secretly, during medical studies.


1

u/SolarStarVanity 2d ago

I... think I know what you are saying? Could you clarify who each "they" refers to?

0

u/ChefDeCuisinart 2d ago

This may sound crazy, but they could put women and minorities on their ethical review boards. But they tend to not do that.

2

u/SkellySkeletor 2d ago

It’s frustrating, because we’ve come so far in science and equality of treatment, and yet we’re still decades behind where we need to be. The next generation of scientists seems to finally be showing a better mix of backgrounds and contexts outside of, you know, rich white male, so maybe that ball will finally roll.

2

u/ChefDeCuisinart 2d ago

"It's your own fault we can't help you, after we historically used and abused you!"

14

u/prof_the_doom 2d ago

I can't imagine why they don't choose to volunteer.

It's a pretty long history of abuse and sidelining that you have to overcome, and there's a lot of people out there that aren't actually all that interested in overcoming it.

18

u/randynumbergenerator 2d ago

Also, even when women and minorities seek treatment, their symptoms are often minimized. At least in the US, there are still practicing medical professionals that think "black people feel less pain," resulting in less access to pain relief for the same conditions.

https://www.aamc.org/news/how-we-fail-black-patients-pain

3

u/magus678 2d ago

People are dumping on you, but as someone who has worked in the clinical trial space, you are correct. Women in particular simply are not interested in doing these trials, even when incentivized.

People are quick to validate them in this but the most commonly cited reason has always been "work life balance," nothing conspiratorial. Doing these trials is not fun, and women simply decline, like many other jobs, to participate.

Which is of course fine. But it is very strange for them to decry the industry for their lack of inclusion. Doctors cannot design around data women refuse to give them.

6

u/Potential_Being_7226 PhD | Psychology | Neuroscience 2d ago

But this has not historically been true. Women have been historically omitted from clinical trials for several reasons. First, people thought that men could generalize to all people. Now we know that’s not true. Second, researchers thought it was best to exclude women on the chance they might be pregnant, avoiding any catastrophes like that which happened with thalidomide. Researchers have also incorrectly thought that women’s hormone fluctuations would lead to widely variable data, making it difficult to glean any meaningful information from the inclusion of women participants. 

https://www.aamc.org/news/why-we-know-so-little-about-women-s-health

https://www.ncbi.nlm.nih.gov/books/NBK236583/

https://www.labiotech.eu/in-depth/women-clinical-trial/

In clinical settings, women's health issues are consistently downplayed and discounted. Women's pain is underestimated. Women are assumed to be exaggerating. Health issues are often attributed to psychological causes by practitioners who are not fully qualified to make psychological diagnoses.

Women with families are also responsible for a greater proportion of domestic responsibilities than men. If women enroll in clinical trials, who picks up their “third shift?” 

https://www.icelandreview.com/news/tens-of-thousands-participate-in-womens-strike/

You say you have “worked in the clinical trial space,” but you seem not to understand all the historical and cultural factors that have limited women being included in clinical trials. You chalk it up to a lack of motivation, but you fail to recognize the situational factors that influence women’s behavior in this context. That’s called a fundamental attribution error.

https://en.m.wikipedia.org/wiki/Fundamental_attribution_error

Given your cursory understanding of the issues here, I genuinely hope you're no longer working in the "clinical trial space." And if you are, then I hope you do some more reading and check your assumptions about the factors that actually limit and have limited women's inclusion in clinical trials.

-1

u/magus678 2d ago

You say you have “worked in the clinical trial space,” but you seem not to understand all the historical and cultural factors that have limited women being included in clinical trials.

I am saying that the women, when asked why they are not interested, are not citing any of this. They are just saying they don't want to stay in a clinic for a week plus at a time for the compensation being offered. Perhaps you should talk to those women to better explain their motivations to them.

1

u/yukonwanderer 1d ago

What studies are you referring to?

1

u/Potential_Being_7226 PhD | Psychology | Neuroscience 1d ago

I’m saying there’s more to it than not being “interested.” There are a variety of reasons why people might not be “interested.”

Also, the plural of anecdote is not data. 


1

u/Mausel_Pausel 2d ago

5

u/SolarStarVanity 2d ago

Tell me why I should, and what I'll find there.

7

u/foamy_da_skwirrel 2d ago

I'm sure you won't read this either but in case someone else scrolling through wants to, this was a really interesting read about why it's more complicated than just who is volunteering

https://pmc.ncbi.nlm.nih.gov/articles/PMC3222419/#:~:text=Issues%20of%20trust%2C%20physician%20perceptions,participation%20in%20therapeutic%20clinical%20trials.


1

u/vannak139 1d ago

I think that you're a bit off on how you're reading this, tbh. Garbage in, garbage out is a huge simplification; it's simply not true, or at the very minimum not that simple. Models such as "Noise2Noise" are a pretty clear indication that you can train output of higher quality than the input. In that work, they start with clean images, add noise, and then add even more noise. They train a model to map More Noise to Less Noise, and get out data that's cleaner than the Less Noise level. You throw noisy data in and get clean data out. Of course, good data is important, but the GIGO rule isn't some hard fact we can't escape; it's not conservation of energy or something.

On the opposite side of things, even if you do identify some kind of bias issue, a subtype that isn't being classified correctly, this doesn't automatically lead you to a solution. The plain fact is, we have many strategies and sometimes, even often, they don't work at all. On the r/learnmachinelearning subreddit right now, there's a post asking whether "SMOTE ever works". SMOTE is one such strategy for dealing with under-represented data; it stands for Synthetic Minority Over-sampling Technique. This isn't exactly the same problem being addressed, but it's pretty clear we have many more ideas for how to address these issues than we have one-click solutions that actually work.

It is very common in ML to have "an answer" for some problem, and it just doesn't work. I don't think you actually need to be in the weeds of technical details to see this is the case.
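The Noise2Noise intuition, that you can learn a clean mapping from noisy targets because the noise averages out, can be illustrated with a toy least-squares fit. This is a hand-rolled sketch, not the actual Noise2Noise setup (which trains a CNN on pairs of noisy images); the signal and noise levels here are invented:

```python
import numpy as np

rng = np.random.default_rng(42)

# Clean relationship the model never sees directly: y = 2x + 1
x = rng.uniform(-1, 1, size=20000)
clean = 2 * x + 1

# Targets are corrupted with zero-mean noise; we fit against these only
noisy_target = clean + rng.normal(0, 0.5, size=x.shape)

# Least-squares fit (a stand-in for "training a model")
A = np.stack([x, np.ones_like(x)], axis=1)
slope, intercept = np.linalg.lstsq(A, noisy_target, rcond=None)[0]

# With enough samples the zero-mean noise cancels out:
# slope lands close to 2 and intercept close to 1, despite
# the fit never once seeing a clean target value.
print(slope, intercept)
```

That's the escape hatch from naive GIGO: if the "garbage" is unbiased noise, averaging over lots of it recovers the signal. Systematically biased data, like a demographic group that's underrepresented or mislabeled, does not cancel out this way.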

35

u/Strict-Brick-5274 2d ago

It's also a problem with data sets available.

Data that AI is trained on tends to be homogenised because data comes from rich places that tend to have homogeneous groups of people.

This is a nuanced issue.

21

u/WTFwhatthehell 2d ago

If you go to figure 2 you'll see that the results from the radiologists and the AI largely overlap.

The radiologists had roughly the same shortfall in roughly the same groups.

19

u/justgetoffmylawn 2d ago

Unfortunately, this is a problem with medicine in general.

Up until not that long ago, research trials often used only men because women's pesky hormone system confused the study results. Therefore, the 'results' were only really valid for men, but were used for rx'ing to women as well.

This is a massive problem - with AI, our medical system (good luck being a women in her 50's suffering a heart attack), our justice system, etc.

Bias is not unique to AI, but hopefully we'll pay attention to it more than we do in humans.

8

u/Optimoprimo Grad Student | Ecology | Evolution 2d ago

It's the massive problem with the current algorithms that we have started conflating with AI. The current models don't truly "learn," they just identify patterns and replicate them. That foundational approach will forever cause them to be susceptible to replication error and will make them incapable of scaling to generally useful applications.

3

u/never3nder_87 2d ago

Hey look, it's the Xbox Kinect phenomenon

2

u/K340 2d ago

Good thing the current U.S. administration hasn't effectively banned any research to address this kind of issue from receiving federal funds.

2

u/Icy_Fox_749 2d ago

So it’s not a problem with the AI itself but the person operating the AI.

The AI did exactly what it was prompted to do.

31

u/InnuendoBot5001 2d ago

Yeah, then corporations tell us that we can trust everything to AI, meanwhile black resumes get canned because the AI that reads them is built on racist data, because basically all the data america has is tainted by racial bias. These models spit out what we put in, and the world has too much hatred for us to expect anything else out of them.

3

u/OldBuns 2d ago

Yes. This is technically the case, but it comes with an important caveat.

The tendency of human bias to bleed into AI is almost unavoidable.

I'm not saying it's bad or shouldn't be used or anything, but we need to be wary of treating this as "just a tool" that can be used for good or bad depending on the person using it, because this isn't a case where you can just fix it by being cognizant enough.

Bias is innate in us. The methods and procedures we use to test and train these things exacerbates those biases because they are built into the process as assumptions.

In addition to this, sometimes, even if you are intentionally addressing the biases, the bias comes FROM the algorithm itself.

"Algorithms of Oppression" by Safiya Noble is a fantastic read on the issue, and uses a very succinct example.

Imagine an algorithm or AI that's trained to put the most popular barbershops at the top of the list.

In a community of 80% white individuals and 20% black, a barbershop that caters to that specific hair type will NEVER appear at the top of that algorithm's list. This inherently means less access to a specific service for a specific group of people.

But also, how would you even TRY to go about solving this issue in the algorithm other than creating 2 different ones altogether?

What new problems might that cause?

This is obviously oversimplified, but it's a real life example of how bias can appear in these systems without that bias existing in the people that create it.
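The barbershop example can be made concrete in a few lines. With made-up visit counts and an 80/20 population split, a rank-by-raw-popularity algorithm never surfaces the minority-serving shop:

```python
# Hypothetical visit counts: shops A-C serve the 80% majority, shop D
# serves a hair type common in the 20% minority.
visits = {"shop_A": 800, "shop_B": 650, "shop_C": 500, "shop_D": 190}

# Rank purely by overall popularity and keep the top results.
top3 = sorted(visits, key=visits.get, reverse=True)[:3]
print(top3)  # ['shop_A', 'shop_B', 'shop_C'] - shop_D never surfaces
```

No one coded "exclude shop D"; the exclusion falls out of the raw-count objective plus the population mix, which is the point of the example.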

1

u/vannak139 1d ago

Bias is not only innate in us, it's critical in ML as well, critical for analysis itself. Just talking about getting rid of bias, or suggesting we just use two models, are kind of practical examples of this; you can't just "take out" the bias.

Anyways, the answer no one will like but is workable is that the model should look at your chest xray and tell you your race, or that you're fat, or old, or in a high background radiation area. I think that would work better than a second, smaller model.

1

u/OldBuns 1d ago

Yes, I realize I absolutely butchered the example in hindsight.

See my other comments for clarification.

You're absolutely right, and this is something that not many people are able to accept it seems.

The alignment problem HAS NOT been solved, and in my opinion, that should be priority One.

1

u/WTFwhatthehell 2d ago

But also, how would you even TRY to go about solving this issue in the algorithm other than creating 2 different ones altogether?

Modern social media handles it by sorting people by what they like and matching them with similar people.

Do you like [obscure thing] ? Well the system has found the 10 other people in the world that like it and shows you things they like.

 Nothing needs universal popularity, you can be popular with one weird group and the algorithm will unite you with them.

It does however automatically put people in a media filter bubble with those most like them which can lead to some weird worldviews. 
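That "match people with similar people" idea is, at its core, just a similarity score over liked-item sets. A minimal sketch with hypothetical users and likes:

```python
def jaccard(a, b):
    """Similarity between two users as the overlap of their liked-item sets."""
    return len(a & b) / len(a | b)

# Hypothetical users and their likes.
likes = {
    "you":   {"obscure_thing", "jazz", "chess"},
    "user1": {"obscure_thing", "chess", "go"},
    "user2": {"pop", "football"},
}

# Match "you" with the most similar other user and surface their likes,
# even though "obscure_thing" is globally unpopular.
best = max((u for u in likes if u != "you"),
           key=lambda u: jaccard(likes["you"], likes[u]))
print(best)  # user1
```

Nothing here needs universal popularity, as the comment says; the same mechanism is also what builds the filter bubble, since you only ever see what your nearest neighbors liked.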

3

u/OldBuns 2d ago

It does however automatically put people in a media filter bubble with those most like them which can lead to some weird worldviews. 

Exactly. We may try to shape our tools, but in turn they shape us.

3

u/WTFwhatthehell 2d ago

I vaguely remember an analysis looking at politicians who posted a lot on twitter and how likely they were to embrace fringe policies that flop at election time.

People can be totally deluded about what's actually popular with the public because only a tiny fraction of the public get shown their posts.


-1

u/Cautious_Parsley_898 2d ago

This isn't a meaningful argument against AI. It's an argument against researchers using one model and making bold assumptions about its usefulness.

They can likely create a second model for women or black individuals now that they know the issue.

31

u/prof_the_doom 2d ago

It's an argument for more regulation, and to make sure that we never stop verifying.

Imagine somebody didn't do this study, and we got to a point where for costs/insurance reasons, everyone just stopped using actual x-ray technicians and just did whatever the AI told them to?

10

u/aedes 2d ago

This is why proper studies of diagnostic tests of any variety in medicine require multiple stages of study in multiple patient cohorts and settings. 

The whole process of clinical validation (not just developing the test) can easily take 5-10y - it takes time to enroll patients into a study, wait for the outcomes to happen, etc.

It’s one reason why anyone who says AI will be widespread in clinical medicine within less than 5y has no idea what they’re talking about. 

3

u/Anxious-Tadpole-2745 2d ago

It's an argument against AI. We're clearly oversold on how it works, and implementing it is difficult because we don't understand it. It means we shouldn't adopt it without knowing all the possible issues.

The fact that they keep coming out with new models is a case against using them, because there are so many untested unknowns.

It's like if we had iOS 1, then iOS 5, then next year it's an Ubuntu Linux distro. The shift is too great to reliably implement.


1

u/FoghornFarts 2d ago

This is a massive problem with science. Far too many scientists see women and non-whites as "unnecessary variables". The "default white man" is pervasive across every area of study.

8

u/oviforconnsmythe 2d ago

What a quintessentially 'reddit' take on things... The effectiveness of a predictive AI model is only as good as the data set it's trained on. The availability of data, especially medical data, is tricky due to several factors. In this case, the Stanford team which built the chest Xray model (cheXzero) used a dataset of ~400000 chest xray images to train the model, but it seems only 666 (0.16%) of those images actually contained both diagnostic (from a radiologist) and demographic (race, age, sex) data.

In the UWash study cited in this news article, their findings of AI bias are based on these 666 images which contained the necessary metadata. It's not an issue with the scientists from the Stanford study - the more data available for training, the more robust the model will be. Given the limited metadata they had to work with, taking into account demographic biases was outside the scope of their project and they used the full dataset. It's also worth noting (only because you mention this as an issue) that only two of the six authors on the Stanford team are white and one of them is female (the rest appear of east/south Asian origin). The UWash team highlighted an important issue that demonstrates major pitfalls in the Stanford model which need to be addressed - but I think the baseless claim that the Stanford team is racist/sexist is very unfair, and it's even more unfair to generalize it across scientists.

It's also worth pointing out that the UWash study itself has "sampling bias" (not with malicious intent, of course; they had the same limitations as the Stanford team). Their model is trained on only the 666 images with demographic data - no one knows the demographics of the other ~400000 images used. It's difficult to tell whether their findings hold true across the entire data set simply because the necessary metadata doesn't exist. This is the core of the issue here:

Using chest Xray images as an example, medical privacy laws and patient consent can make it difficult to publish these kinds of data to public databases. And that's just the images, never mind the demographic data. Add to that the other variables that need to be controlled (eg quality of the Xray, reliability of patient health records, agreements between database administration and clinical teams etc), and it's tricky to get a large enough data set to robustly train a ML model while accounting for things like demographics. I'm of the opinion that consent for release of medical data should be a prerequisite and obligation for access to health care (assuming data security is robust and discrete patient identifiers are removed). Likewise, hospitals/clinics should be obliged to upload their data to freely available public datasets.

-2

u/FoghornFarts 2d ago

This isn't a "Reddit" take. Go read Invisible Women. Maybe you're part of the problem.

1

u/Days_End 2d ago

I mean that's just the fault of our regulations. It's so expensive to run studies that confounding variables are never worth the risk to any company.

It also doesn't help that people really like to bury their head in the sand and pretend "races" aren't different enough to have very different interactions with the same drug.

-1

u/plot_hatchery 2d ago

Most of my peers in my life have been very left leaning. The politics in your echo chamber is causing you more suffering than you realize. Please try to get out of it and attain a more balanced view. You'll be happier and have a more clear picture of the world.

1

u/FoghornFarts 2d ago

Go read Invisible Women and then tell me that again with a straight face.

1

u/not_today_thank 2d ago

trains it to literally discriminate just like the people who made it.

After reading the article that might be exactly what they need to do, build discrimination (as in the ability or power to see or make fine distinctions) into the model so to speak. Reading the chest x-ray of an 80 year old white man compared to a 30 year black woman with the same model is probably not going to yield the best results.

1

u/Red_Carrot 2d ago

The upside to discovering its errors is that you can either use it only on the subset it is good for, while giving it additional training for other areas, or, if that won't work, start from scratch.

1

u/VoilaVoilaWashington 2d ago

And controlling for that is almost impossible. Which is more work, writing a contract, or finding a stray "not" in a legalese contract? Finding a mistake in a pattern recognition system is so so so hard because you really don't know what you're looking for.

1

u/Ryslin 2d ago

That's not really a problem with AI, though. It's a problem with our methods of training AI.

We've had a very similar issue with automatic hand dryers. Some of the earlier hand dryers worked based on light reflectivity. Guess what - white people have more reflective skin. It refused to dry the hands of people with a critical threshold of melanin in their skin. If they tested with non-white people, they would have realized that their thresholds needed adjustment. We're dealing with something similar here. With all the attention put on racism and equity, we still keep forgetting to implement diversity in our product design.

1

u/Bakoro 2d ago

It's a problem across a lot of technology and science.

Essentially every image recognition/analysis tool or toy I've ever encountered has had significant issues with darker skinned people.

A disproportionate amount of what we know about humans is mostly from studying European descendants, and men. Even when it comes to animals, many studies have been limited to males, to reduce complexity and variance.

We really need high quality, diverse public data sets. This is something the government should be funding. AI isn't going away, we need to find ways to make it work for everyone.
Medical diagnostics, of all things, should not be exclusively in private hands.

1

u/vannak139 1d ago

As someone who does do AI research in medical stuff, this is actually a pretty good idea. They're one of the few who could actually do it without getting HIPAA'd.

1

u/WhiteRaven42 1d ago

I know of the issue in general but I'm pretty surprised race affects their reading of x-rays of all things.

-3

u/TheKabbageMan 2d ago

This isn’t really an “AI” problem. What you are describing is human error

0

u/hellschatt 2d ago

I didn't read the study, but usually, this problem occurs due to lack of data from certain groups of people.

I assume there is simply less data available from black women, and this is usually due to the history of people of African origin, as well as their current living conditions.

We simply have less data available since these people don't visit (for many reasons like poverty) the doctor as often, or since the majority of these people live in countries where we don't have easy ways of collecting data from them.


103

u/Risk_E_Biscuits 2d ago

It's clear that a lot of people don't understand how AI works. AI is only as good as its training, and most AI currently takes a LOT of human input for training. If an AI is fed poor data, then it will simply replicate that poor data. We've known our medical data has been biased against minority groups for many years (both inadvertently and intentionally).

There are also different types of AI. There are AI that analyze speech patterns specifically, or images specifically, or even parallel data sets specifically. Ask a speech pattern AI to give you a picture and you'll get a strange result. Ask an image recognizing AI to write you a poem, it will come out all sorts of weird.

The big problem is most people think AI is all just like ChatGPT. Those types of AI are like a "Swiss army knife", great for a variety of uses, but poor for specific uses. You wouldn't ask a surgeon to do an operation with a "Swiss army knife". So the AI model used really does matter, and it will take some time to get the proper models implemented in each industry.

Since studies like these are done with AI trained on medical data, it is obvious that it will have bias since most medical data has bias. The key here is to improve the medical industry to provide more accurate data for minority groups.
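The "AI is only as good as its training" point has a classic numeric face, the accuracy paradox: a model fit to skewed data can score high overall while missing nearly every case in the underrepresented slice. A toy example:

```python
# Toy labels: 90 healthy (0), 10 diseased (1) - imbalanced the way a
# skewed training set is.
labels = [0] * 90 + [1] * 10

# A degenerate "model" that learned to always predict the majority class.
preds = [0] * len(labels)

accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
sensitivity = sum(p == 1 and y == 1 for p, y in zip(preds, labels)) / labels.count(1)
print(accuracy, sensitivity)  # 0.9 0.0
```

90% accuracy, zero disease detection. This is why headline accuracy on a biased dataset says nothing about the groups the dataset shortchanges.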

37

u/314159265358979326 1d ago

Yeah, the old "garbage in, garbage out" is still perfectly relevant. The algorithm isn't the problem here - it can't choose to discriminate - it's the human-generated training data, which is a much more fundamental, much harder to solve issue.

1

u/colacolette 1d ago

Exactly. When people talk about "racist AI" they don't mean it is literally racist, they mean the data it is being fed is racially biased.

0

u/vannak139 1d ago

This isn't a technical limit of AI/ML, and in many ways it's wrong. Certain models such as Noise2Noise specifically push against this idea of garbage in, garbage out. In that paper they show you can very easily clean noisy data, without clean examples.

It's not magic, and there are limits. But this hard line you're imagining has lots of caveats, and research is making it more wrong every day.

5

u/IsNotAnOstrich 1d ago

This isn't about noisy data though, it's about bad data or a lack of data.


454

u/Spaghett8 2d ago

Yeah, unfortunately, tech development faces a lot of biases. At the bottom is most often black women.

The same happened with facial recognition. While white men had an error recognition rate of 1%, black women had an error rate of around 35%. From a 1/100 mistake to a 35/100.

Lack of inclusivity is a well known and common algorithmic bias. It’s quite sad that even large companies and heavily funded studies constantly repeat it.

32

u/CTRexPope 1d ago

It’s not just an AI problem, it’s a general science problem. For example, they’ve shown that the ability to taste bitterness varies by race, and can affect how effective bitter flavors in things like children’s medicine are.

63

u/The_ApolloAffair 2d ago

While that’s probably true to some extent, there are other unintentional factors. Cameras simply aren’t as good at picking up details on a darker face, leading to worse facial recognition results. Plus, having fewer variations in hair/eye color doesn’t help.

28

u/Ostey82 1d ago

Ok so this I can totally understand when we are talking about a normal camera with varying lights etc etc but an x-ray?

Why does it happen with the x-ray? Does the disease actually look different in a black person vs a white person? I would have thought that lung cancer is lung cancer, and if you've got it, it looks the same.

3

u/montegue144 1d ago

Wait... How can you even tell if someone's black or white on an X-ray... How does the machine know?

2

u/Ostey82 1d ago

That's what I mean, the x-ray won't know the colour of the skin, so unless cancer looks different in different races and sexes, which I don't think it would, how does the AI get it wrong?

39

u/X-Aceris-X 2d ago

This is some really wonderful research on the subject, showing that the current 10-point Monk Scale for skin tones is not good enough for ensuring camera systems capture diverse skin tones.

Improving Image Equity: Representing diverse skin tones in photographic test charts for digital camera characterization

https://www.imatest.com/2025/03/improving-image-equity-representing-diverse-skin-tones-in-photographic-test-charts-for-digital-camera-characterization/?trk=feed-detail_main-feed-card_reshare_feed-article-content

71

u/Anxious-Tadpole-2745 2d ago

Black women are often categorized as male by white humans in the real world at the same rate. That makes sense.

64

u/RobinsEggViolet 2d ago

Somebody once called me racist for pointing this out. As if acknowledging bias means you're in favor of it? So weird.


354

u/Levofloxacine 2d ago

I remember telling this dude that many modern technologies have a bias against people of colour. I didn’t even say it was due to sinister reasons and done on purpose. He replied calling me a « woke ».

Interesting article. Thank you.

It’s somewhat dire because, as a black woman and an MD as well, I would have never been able to tell a patient's race by their chest xray alone. Quite crazy what AI is capable of now.

It’s great that this research took the time to think about biases. Let's hope they keep pushing to dismantle them.

67

u/Pyrimidine10er 2d ago edited 1d ago

N=1 here, and also an MD- but a physician scientist working in the AI space. I’m actually not surprised there was a performance degradation for women (which can have some plausible factors that need consideration like physical size differences + a shadow from breast tissue, etc) but am surprised about the drop in accuracy for black people.

For all of the models I’ve developed I’ve also required demographic and other factor breakdowns (age, race, ethnicity, geographic location, sex/gender, different weights, BMI, presence of DM, HTN, other comorbidities, month and year of when a given test occurred, etc) and also build combos: obese white women, obese white men, obese black women, etc. I also think about the devices - the machines may be different brands. Did all of our black folks only get their X-rays from a Siemens machine that’s 40 yrs old and thus more likely to be used at the safety net hospital? I’ve gotten pushback about it being too much from some academic contributors, but this finding provides more motivation to make sure we don’t inadvertently discriminate. There sometimes are sample size limitations after applying 5 layers of filters, but I’d rather do our best to understand the impact of these models across as broad a swath of people as possible. I say all this to give you hope that at least some of us take this problem seriously and are actively thinking about how to stop health disparities.

This is also why the work in AI explainability is starting to gain more traction. What the model is using for its prediction can shine a light into why there’s bias. But with the current neural networks and LLMs, the ability to peek into the black box is limited. As the explainability research progresses we may see some really interesting physiology differences that are not perceptible to standard human senses (the AI work in ECGs over the last few yrs has been crazy). Or we find that the AI is focusing on things that it really should not - like the L or R side sticker indicator magnet thing on a CXR.

7

u/ASpaceOstrich 1d ago

The fact you got pushback is wild. These are supposed to be scientists and they aren't trying to eliminate variables from the tests? Are they insane?

110

u/hoofie242 2d ago

Yeah, a lot of white people have a fantasy view of how they think the world works, and hate when people pop their ignorance bubble, and react with hostility.

90

u/JazzyG17 2d ago

I still remember white people getting pissed off and calling bandaids woke when they came out with the other colors. The original is literally their skin color so they never had to worry about it being literally highlighted on their bodies

13

u/proboscisjoe 1d ago

I have literally been told the words “I don’t believe you” when I describe an experience I had to someone, and they could not conceive in their naive, privileged mind how it was possible for what happened to me to happen to anyone.

I pointed out that the war in Ukraine was happening. How is that possible? They still didn’t accept it.

Since then I have started telling white people “I’m not going to explain that to you. It’s not worth the effort.”

8

u/anomnib 2d ago

The underlying study shows the plots for how well it predicts demographics. It is crazy good. This is also a danger for potentially outing trans people.

I wonder how much of this can be fixed by training models that place the same value on performance accuracy across demographic groups.

That’s what I was experimenting with when I worked in tech.
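One common version of "valuing accuracy equally across groups" is inverse-frequency sample weighting, so each group contributes the same total weight to the training loss regardless of its size. A sketch with a hypothetical 80/20 split:

```python
from collections import Counter

# Hypothetical group label for each training sample: 80 from group A,
# 20 from group B.
group_of = ["A"] * 80 + ["B"] * 20

# Inverse-frequency sample weights: each group's weights sum to 1, so a
# weighted loss values both groups equally regardless of group size.
counts = Counter(group_of)
weights = [1.0 / counts[g] for g in group_of]

total_A = sum(w for w, g in zip(weights, group_of) if g == "A")
total_B = sum(w for w, g in zip(weights, group_of) if g == "B")
print(round(total_A, 9), round(total_B, 9))  # 1.0 1.0
```

This is only one of many strategies (and, as noted elsewhere in the thread, none of them is a guaranteed fix), but it shows the basic lever: change what the objective rewards, not just the data.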

-15

u/caltheon 2d ago

medically you kind of need to know if someone is trans though. And socially, hiding that is deceptive.

7

u/IsamuLi 2d ago

"And socially, hiding that is deceptive."

Why?

2

u/Epiccure93 2d ago

Only in dating. Otherwise it doesn’t matter

4

u/BalladofBadBeard 1d ago

We all hide all kinds of stuff. It's called privacy. Nobody owes you all the details about their bodies, that's such a creepy take.

3

u/Agasthenes 1d ago

Probably because of your wording. Modern technology doesn't discriminate. That's something only humans do.

It was just trained on incomplete data. Which is a valid approach when you try to get something to work at all.

The only problem happens when it is then sold as a finished or complete product and no further work is done to complete it.


107

u/TheRealBobbyJones 2d ago edited 1d ago

I think the bigger thing to take away is that the difference between black people and white people is big enough to throw off a model designed to generalize (to an extent). An enlarged heart should be an enlarged heart. Presumably the model was not fed racial or gender information during training. As such it probably compared to the general average rather than the average per grouping. They should redo the original training but feed in demographic data with the scan. 

Edit: or a fine-tuning with the demographic data. 

Edit2: perhaps instead of demographic data they could use genetic information. But the variance in heart size or other such data is probably influenced by both lifestyle and genetics. Idk what would be the best data to add in to correct for this sort of thing. Just racial data would likely miss certain things. For example, if a guy who identifies as white was 1/64th native, would that 1/64 be enough to throw off AI diagnostics? If so, how could we correct for that? Most people probably wouldn't even know their ancestry to such a degree. Or alternatively, if someone was malnourished growing up but is otherwise healthy today, would AI diagnostics throw a false positive? 
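The "general average vs average per grouping" point can be illustrated with a toy screening rule (all numbers made up): a measurement that is unremarkable against the pooled population can be clearly abnormal against the patient's own reference group.

```python
import statistics

# Hypothetical cardiothoracic-ratio measurements by group (made up).
measurements = {
    "group_A": [0.42, 0.44, 0.43, 0.45],
    "group_B": [0.48, 0.50, 0.49, 0.51],
}

def flag_enlarged(value, group=None, z_cut=2.0):
    """Flag a heart as enlarged if it sits far above the reference mean.
    With group=None the reference is the pooled data; with a group,
    it is that group's own distribution."""
    pool = (measurements[group] if group
            else [v for vs in measurements.values() for v in vs])
    mu, sd = statistics.mean(pool), statistics.stdev(pool)
    return (value - mu) / sd > z_cut

# 0.47 looks unremarkable against the pooled mean, but is clearly high
# for group_A once the model knows which reference population to use.
print(flag_enlarged(0.47))             # False
print(flag_enlarged(0.47, "group_A"))  # True
```

Whether the grouping variable should be self-reported race, genetics, or something else entirely is the open question the edit raises; the sketch only shows why some grouping signal changes the answer.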

28

u/Chicken_Water 2d ago

Curious if this implies black women typically have smaller hearts, whereas an enlarged heart for them is typical size for white men. This shouldn't be a very difficult issue to resolve, we just need more training data for medical models.

73

u/JimiDarkMoon 2d ago

This has been known for a long time in pharmaceutical therapy treatments: all of our available data was based on Caucasian men. Imagine medication not working right on a woman, or an elderly Asian male, because of who was allowed in the trial phase.

The women in your lives are the most susceptible to medical errors based on the gender bias alone, not being heard.

This absolutely does not surprise me.

11

u/Roy4Pris 2d ago

Roger that. Also, the number of white men who have ever received chest x-rays will be orders of magnitude greater than the number of black women, so the data set was skewed from the get-go. Pretty disappointing if that wasn’t factored in.

12

u/Days_End 2d ago

Races are both shockingly similar and surprisingly different at the same time.

6

u/Dirty_Dragons 2d ago

Yeah I had no idea that the internal organs would be different across ethnicities. That's wild.

33

u/DarwinsTrousers 2d ago

So what is the difference in the chest x-rays of women and black people?

I would have thought ribs are ribs.

6

u/ninjagorilla 2d ago

Ya, I'm confused about this. I definitely cannot diagnose someone’s race off a CXR and wouldn’t have thought skin color was a confounding factor on this sort of imaging.

18

u/ADHD_Avenger 2d ago

I wonder if the doctors they compared to were really a good set to compare to as well - it's not like AI is the only thing that misses issues due to bias - cross-racial bias is a big problem with doctors, as is cross-gender bias, among other issues. From what I can see, they compared the AI to doctors who managed to catch these issues - with a set where doctors both caught and missed issues, would it be different? The real immediate value of AI is as a filter flagging potential items for review, either prior to or after human review.

4

u/ninjagorilla 2d ago

It said the model could predict a patient's race with 80% accuracy while a radiologist could only hit 50%… But they weren’t sure how, and what the confounding factor was that caused the miss rate to go up.

1

u/Dirty_Dragons 2d ago

A 50% rate is just guessing. How can the AI tell?

1

u/ninjagorilla 2d ago

Depending on the choices… it didn’t specifically say if it was white/black or if there were more races to pick from.

10

u/[deleted] 2d ago

[deleted]

12

u/FaultElectrical4075 2d ago

AI doesn’t process images the same way humans do. What is obvious to humans might not be obvious to AI and vice versa.

3

u/eldred2 2d ago

Feed these misses back in as training data, so they will learn it. This is how you improve the models.

3

u/soparklion 2d ago

Are there different parameters for identifying cardiomegaly in black women?  Or is it using the pretest probability for white women to underdiagnose black women? 

4

u/febrileairplane 2d ago

Why is model training conducting with datasets that lead to these shortfalls?

Could you improve the training and validation sets to be more representative of the whole population?

If these variables (race/gender) would reduce the power of the model, could you break the training and validation sets out into separate race/gender sets?

So an AI/ML model trained specifically on white men, then one trained specifically on black men, and so on...

4

u/FaultElectrical4075 2d ago

The datasets have these shortfalls because the humans that created them are biased. There is no such thing as an unbiased dataset.

0

u/caltheon 2d ago

What's normal for one race is not normal for another, so the training data needs to be made aware of these differences. There is also a movement in medicine to disregard race as a social construct, with people trying to treat everyone the same (noble goals) but is having the opposite effect since the premise is wrong. You can see that false bias in this article. https://www.nejm.org/doi/full/10.1056/NEJMms2206281 Basically, in trying not to be racist, they are being racist


5

u/NedTaggart 2d ago

how did the AI know they were black just from an x-ray of the chest?

7

u/ALLoftheFancyPants 2d ago

I wish I was not still disappointed in medical researchers for stuff like this. Bias in medical research and then practice has caused large discrepancies in people’s healthcare and expected mortality. It shouldn’t still be happening.

6

u/[deleted] 2d ago

[deleted]

1

u/DeltaVZerda 2d ago

They already admitted that when they excluded them from the initial training.

2

u/hidden_secret 2d ago

People have told me all my life that skin color was just skin color. But there are actually big differences in the organs?!

2

u/Bakoro 2d ago

This isn't only a problem with AI, nearly this exact same situation is repeated across science and technology. Even when it comes to studying rats, a lot of studies will only study male rats to reduce variables.

I wholeheartedly stand by AI tools as a class of technology, but these things need massive amounts of data. This kind of thing simply should not be just left to a private company, and the anonymized data need to be freely available to researchers.

2

u/simplyunknown8 1d ago

I haven't read the document.

But how does the AI know the race from an x-ray?

5

u/Droidatopia 2d ago

Is anyone else confused why including demographic information in the prompts reduced the effect of bias?

This seems counterintuitive.

3

u/omega884 2d ago

If you would expect demographics to be diagnostically relevant, then you'd expect them to reduce the effect of the "bias". That is, if you're looking for "enlarged hearts" and your training has a bimodal distribution correlated with sex, then if you don't tell your model the sex of the patient, it just has to guess whether a heart that falls into the higher mode is abnormally large for the patient or completely average. If your bimodal distribution also happens to be weighted to the upper mode, your model will be right more often than not by guessing that the heart is normal sized. But in the specific case of the sex correlated to the lower mode, it will be wrong more often than not.

Give it the diagnostically relevant sex data though, and now it has a better chance to decide "if sex A and high mode size, then it's average because sex A has sizes clustered around this mode, but if sex B, then it's enlarged because sex B clusters around the lower mode."
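The bimodal argument above can be simulated directly (all numbers made up): when the two sex-specific modes overlap, no single global cutoff separates normal from enlarged, while a sex-aware cutoff does much better.

```python
import random

random.seed(0)

# Simulate a size measurement that is bimodal by sex (made-up numbers):
# sex A's normal mean is 10, sex B's is 8; "enlarged" adds 1.5 to that
# sex's own normal mean.
def sample(sex, enlarged):
    base = 10.0 if sex == "A" else 8.0
    return base + (1.5 if enlarged else 0.0) + random.gauss(0, 0.4)

cases = [(sex, enlarged, sample(sex, enlarged))
         for sex in "AB" for enlarged in (False, True) for _ in range(200)]

# Without sex info, the model has one global cutoff - but sex A's normal
# mode (10) sits ABOVE sex B's enlarged mode (9.5), so no single
# threshold can separate them.
global_cut = 9.75
# With sex info: a cutoff relative to each sex's own mean.
per_sex_cut = {"A": 10.75, "B": 8.75}

def accuracy(predict):
    return sum(predict(s, x) == e for s, e, x in cases) / len(cases)

acc_global = accuracy(lambda s, x: x > global_cut)
acc_aware = accuracy(lambda s, x: x > per_sex_cut[s])
print(f"global cutoff: {acc_global:.2f}, sex-aware: {acc_aware:.2f}")
```

With these made-up parameters the global cutoff lands in the mid-60s percent range while the sex-aware cutoff is near ceiling, which mirrors the comment's point: the demographic variable isn't prejudice, it's the covariate that disambiguates the two modes.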

-4

u/Ok-Background-502 2d ago

It helps with the bias that everybody is white.

6

u/Droidatopia 2d ago

That doesn't make any sense when compared to the context of that part of the paper though.

It found the model was much better at determining patient's race and age than the human doctors were.

4

u/Ok-Background-502 2d ago

It's probably not using that information without being prompted because it's AI. I think human doctors ALWAYS factor in race, but it's not obvious to me that AI would use that information by default.

More likely that specialized AI lives in a race-less world with only white people by construction.

6

u/Droidatopia 2d ago

That's the counterintuitive part.

The paper says the model is better than humans at figuring out the race and age of the patient from the image alone.

But then the model's pro white/pro man bias is lessened by including the demographics in the prompt.

So the model has the ability to discern race/sex from the image, but won't use that information to produce a better diagnosis that it is capable of creating unless specifically told to?

2

u/Ok-Background-502 2d ago

That's how, in my experience, AI works at this point. AI knows how to answer a lot of questions directly, but needs to be prompted to answer lateral questions like "think about what race you are looking at", or it will not, because that was never the question.

It's like when you are trained to use your gut feeling to decide something. And then you are trained to use your gut feeling to decide another thing.

Your answer to question 2 might inform the answer to question 1. But if you were asked to use your gut feeling to decide question 1 again, your gut decision might not take your answer to question 2 into account.

You have to train the model with supervision to use a specific piece of information if you want it to reliably use it in future problems.

4

u/Commemorative-Banana 2d ago edited 2d ago

You and the person you’re responding to are thinking about AI from an LLM-prompting perspective, which is wrong. Medical imaging ML models are not using LLMs, and they don’t need to be “told” to “think” about race, or “convinced” to not “withhold” conclusions. Quotations for anthropomorphization.

ML models already consider every detail of the data they are given, and shortfalls like this simply mean they were not given good enough data.

→ More replies (2)
→ More replies (2)

2

u/trufus_for_youfus 2d ago

This is very interesting. I had no idea that women and/or various ethnicities had marked differences in cardiovascular systems to begin with.

2

u/Petrichordates 2d ago

Good thing we banned research on diverse populations then!

1

u/YorkiMom6823 2d ago

When computers and programming were still pretty new, I was introduced to the phrase "garbage in, garbage out." Since then I've wondered why people don't recall it more often. Programmers, including researchers and AI trainers, are still operating under the GIGO rule. No program, AI included, is one whit better than the comprehension and biases of its creators.

1

u/West_Many4674 1d ago

The ingrained biases of AI are a feature, not a bug. This technology will be used to further oppress minority groups. It’s designed to make us miserable, not happier. 

1

u/blazbluecore 1d ago

Ahh yes the racist machines. First it was the racist people, now it’s the boogeymen racist machines. Next it’s gonna be racist air. If only we could solve racism, the world would be a perfect place for everyone to live in peace and prosperity! Darn it all!

-2

u/armchairdetective 2d ago

We know. We know!

We have been shouting about this issue with all types of AI models for at least a decade! We're just ignored.

Self-driving cars will kill black pedestrians.

Algorithms to select job applicants disadvantage people with career breaks for care or pregnancy, as well as people with non-white-sounding names.

Two years of articles about how AI is going to diagnose better than any doctor and then, obviously, no. It'll make sure black women die.

I am tired.