• classic example of Goodhart’s law (“When a measure becomes a target, it ceases to be a good measure”) or, if you prefer, Campbell’s law (“The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”).

    While most film-PR companies aim to get the attention of critics from top publications, Bunker 15 takes a more bottom-up approach, recruiting obscure, often self-published critics who are nevertheless part of the pool tracked by Rotten Tomatoes. In another break from standard practice, several critics say, Bunker 15 pays them $50 or more for each review. (These payments are not typically disclosed, and Rotten Tomatoes says it prohibits “reviewing based on a financial incentive.”)

    The Bunker 15 employee replied that of course journalists are free to write whatever they like but that “super nice ones (and there are more critics like this than I expected)” often agreed not to publish bad reviews on their usual websites but to instead quarantine them on “a smaller blog that RT never sees. I think it’s a very cool thing to do.” If done right, the trick would help ensure that Rotten Tomatoes logged positive reviews but not negative ones.

  • I honestly can’t even remember the last time I used Rotten Tomatoes. From its inception, it was always somewhat of a flawed concept in that it wouldn’t tell you how good a movie was, but how favorable it would be to general audiences. If 100% of reviewers gave it a 7/10, it would still be “100% fresh.”

    I had heard that studios would only invite positive reviewers to early releases, and I had heard studios were gaming the Rotten Tomatoes reviews; so I’m not entirely surprised, but I am a little surprised to see an article about it.

    My personal recommendation would be to find a reviewer/critic whose tastes are usually similar to yours and follow them instead of using Rotten Tomatoes (though if I’m being honest, I’ve been checking those reviews less often too).
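To make the “100% fresh” point concrete, here is a rough sketch of the arithmetic. It assumes a review counts as positive at 6/10 or above; that cutoff, the function names, and the sample ratings are all made up for illustration and are not Rotten Tomatoes’ actual rules or data.

```python
def percent_fresh(ratings, positive_cutoff=6.0):
    """Share of reviews at or above the (assumed) cutoff, as a percentage."""
    positive = sum(1 for r in ratings if r >= positive_cutoff)
    return 100 * positive / len(ratings)


def average_rating(ratings):
    """Plain mean of the ratings, for comparison with the percentage."""
    return sum(ratings) / len(ratings)


# Hypothetical films: one that everyone mildly likes, one that splits critics.
lukewarm = [7.0] * 10
divisive = [10.0] * 7 + [3.0] * 3

for name, ratings in [("lukewarm", lukewarm), ("divisive", divisive)]:
    print(name, percent_fresh(ratings), average_rating(ratings))
```

With these made-up numbers the lukewarm film comes out at 100% with a 7.0 average, while the divisive one gets 70% despite a higher 7.9 average. The percentage measures how widely a film is liked, not how much.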

      • You watch the stuff that sounds interesting and maybe don’t watch the stuff that doesn’t? You can base this on trailers, reviews, or just listening to people discuss movies and whether they make it sound like something you want to watch. The same goes for games. I have never in my life cared about a number score or a tomato meter percentage. I have never thought a movie “should” get a different number assigned to it and gotten angry over it. Who cares? That number isn’t actually part of the movie and doesn’t impact the quality of the movie. It’s just some number other people came up with that doesn’t in any way determine whether or not you will like a movie.

        • I’ve found when I watch low rated movies that they are typically worthy of their low rating.
          Sometimes I do find that high rated games / movies aren’t as great for me (BOTW is one example) as the majority sees them.
          One example is the zombie genre.
          I watch a ton of them and I have yet to come across one where the aggregate review score is much different than what I felt about the film / show.
          For sure I disagree with the hatred / burnout that people had with TWD, as I liked it to the end.

      • I tried that (many) years ago.

        In order to get a good prediction of “should watch / shouldn’t watch”, the system scored each film on a 0-10 scale for how much of each of 20+ categories it contained… then each user would give their category preferences on a -5…+5 scale… and the sum of a film’s category scores × user preferences would end up being highly correlated with the user’s like/dislike of the film.

        From the end user’s perspective, it only required entering 20+ preferences… but scoring each film on 20+ categories proved much more difficult. People would give different scores for their perceived amount of a category in a film, and while the personal sum[score×preference] was highly correlated with their like/dislike verdict, the sum[avg(scores)×preference] was all over the place, and we weren’t able to find a way to assign film category scores that would work reasonably well for everyone.

        Turns out people not only have different category preferences, but also different category perceptions for the same film.

        Maybe revisiting the idea today, with the help of some AI, could find some effective grouping or a different predictor, but back then we just mothballed the whole thing.
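
For anyone curious, the scoring described above can be sketched roughly as follows. This is only a minimal illustration, not the original system: the function names, the three categories, and all the numbers are hypothetical, and it assumes every viewer scores the same set of categories.

```python
from statistics import mean


def predicted_appeal(category_scores, preferences):
    """Sum of (category amount, 0-10) x (user preference, -5..+5)."""
    return sum(score * preferences.get(cat, 0)
               for cat, score in category_scores.items())


def average_scores(per_user_scores):
    """Average each category's score across everyone who scored the film."""
    categories = per_user_scores[0].keys()
    return {cat: mean(scores[cat] for scores in per_user_scores)
            for cat in categories}


# Hypothetical data: two viewers perceive the same film very differently.
alice_prefs = {"action": 4, "romance": -3, "humor": 2}
alice_view = {"action": 8, "romance": 1, "humor": 5}
bob_view = {"action": 3, "romance": 7, "humor": 5}

# sum[score x preference] using Alice's own perception of the film.
personal = predicted_appeal(alice_view, alice_prefs)

# sum[avg(scores) x preference] using the perception averaged over viewers.
averaged = predicted_appeal(average_scores([alice_view, bob_view]), alice_prefs)

print(personal, averaged)
```

With these made-up numbers, Alice’s own scores give 39 while the averaged scores give 20, which shows how averaging perceptions across viewers can wash out the signal that made the personal sum a useful predictor.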