8–10% of algorithmic recommendations are ‘bad’, but… an exploratory risk-utility meta-analysis and its regulatory implications
| Published in | International Journal of Information Management, Vol. 75; p. 102743 |
|---|---|
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | Elsevier Ltd, 01.04.2024 |
| ISSN | 0268-4012, 1873-4707 |
| DOI | 10.1016/j.ijinfomgt.2023.102743 |
| Summary: | We conducted a quantitatively coarse-grained but wide-ranging evaluation of how frequently recommender algorithms provide ‘good’ and ‘bad’ recommendations, with a focus on the latter. We found 151 algorithmic audits from 33 studies that report fitting risk-utility statistics from YouTube, Google Search, Twitter, Facebook, TikTok, Amazon, and others. Our findings indicate that roughly 8–10% of algorithmic recommendations are ‘bad’, while about a quarter actively protect users from self-induced harm (‘do good’). This average is remarkably consistent across the audits, irrespective of the platform and the kind of risk (bias/discrimination, mental health and child harm, misinformation, or political extremism). Algorithmic audits find negative feedback loops that can ensnare users in spirals of ‘bad’ recommendations (being ‘dragged down the rabbit hole’), but they also highlight an even larger likelihood of positive spirals of ‘good’ recommendations. While our analysis refrains from any judgment about the causal consequences and severity of these risks, the detected levels surpass those associated with many other consumer products and are comparable to the risk levels of generic food defects monitored by public authorities such as the FDA or FSIS in the United States. Consequently, our findings inform the ongoing discussion regarding regulatory oversight of the potential risks posed by recommender algorithms. |
|---|---|
Highlights:
•Analyzed 151 algorithmic audits for the frequency of ‘good’ and ‘bad’ recommendations.
•8–10% of algorithmic recommendations found to be ‘bad’; about a quarter ‘do good’.
•A downward spiral of ‘bad’ recommendations exists, but the upward spiral of ‘good’ ones is even larger.
•Risk is consistent across platforms and harms (bias, mental health, misinformation, etc.).
•Risk levels are akin to those of generic food defects, suggesting a need for regulatory oversight.
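To make the kind of aggregation behind the 8–10% figure in the summary above concrete, here is a minimal sketch of pooling audit-level ‘bad’-recommendation rates. The audit counts are hypothetical placeholders, and the simple sample-size-weighted pooling shown is an illustrative assumption, not the paper’s actual weighting scheme.

```python
# Minimal sketch: pooling 'bad'-recommendation rates across audits.
# The counts below are hypothetical; the paper's real data and
# meta-analytic weighting are not reproduced here.

audits = [
    # (platform, n_recommendations_sampled, n_rated_bad)
    ("YouTube", 2000, 190),
    ("Google Search", 1500, 120),
    ("TikTok", 800, 75),
    ("Amazon", 1200, 96),
]

# Sample-size-weighted pooled proportion (fixed-effect-style pooling).
total_n = sum(n for _, n, _ in audits)
total_bad = sum(bad for _, _, bad in audits)
pooled_rate = total_bad / total_n

# Unweighted mean of audit-level rates, for comparison.
mean_rate = sum(bad / n for _, n, bad in audits) / len(audits)

print(f"Pooled 'bad' rate:     {pooled_rate:.1%}")   # ~8.7% on this toy data
print(f"Mean audit-level rate: {mean_rate:.1%}")
```

On this toy data both estimators land near 9%, inside the 8–10% band the meta-analysis reports; with real audits the two can diverge when sample sizes vary widely across studies.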