Reddit said in a filing to the Securities and Exchange Commission that its users’ posts are “a valuable source of conversation data and knowledge” that has been and will continue to be an important mechanism for training AI and large language models. The filing also states that the company believes “we are in the early stages of monetizing our user base,” and proceeds to say that it will continue to sell users’ content to companies that want to train LLMs and that it will also begin “increased use of artificial intelligence in our advertising solutions.”
The long-awaited S-1 filing reveals much of what Reddit users knew and feared: That many of the changes the company has made over the last year in the leadup to an IPO are focused on exerting control over the site, sanitizing parts of the platform, and monetizing user data.
Posting here because of the privacy implications of all this, but I wonder if at some point there should be an “Enshittification” community :-)
- Fubarberry ( @Fubarberry@sopuli.xyz ) English90•7 months ago
Reddit has long had an issue with confidently providing false statements as fact. Sometimes I would come along a question that I was well educated on, and the top voted responses were all very clearly wrong, but sounded correct to someone who didn’t know better. This made me question all the other posts that I had believed without knowing enough to tell otherwise.
LLMs also have the same issue of confidently telling lies that sound true. Training on Reddit will only make this worse.
- blindsight ( @blindsight@beehaw.org ) 27•7 months ago
The problem is that SEO has made it impossible to find accurate information easily, since even “old, trustworthy brands” can’t be trusted online. [This is an excellent article that explains the problem thoroughly, and brings receipts](https://housefresh.com/david-vs-digital-goliaths/).
- Swallowtail ( @Swallowtail@beehaw.org ) 11•7 months ago
This is a great example of why it’s so important to emphasize teaching critical thinking in school right now. Misinformation and disinformation are just going to continue to grow.
- blindsight ( @blindsight@beehaw.org ) 5•7 months ago
Literally why I bookmarked it. I’m an online teacher, so I’m going to advocate for adding that article to a grade 10 course that’s used by thousands of students each year.
- Swallowtail ( @Swallowtail@beehaw.org ) 6•7 months ago
I’m a student teacher right now in elementary! I try to get my kids to think critically whenever I can. I hear kids talk about insane shit they saw/heard on TikTok (I got into an argument with a student who thought Slenderman was 100% real because of something they saw on TikTok) and I try to really get them to think and actually justify why they believe things.
- BurningRiver ( @BurningRiver@beehaw.org ) 4•7 months ago
An uphill battle for sure. I wish you the best of luck.
- blindsight ( @blindsight@beehaw.org ) 3•7 months ago
Somewhat related:
A recommendation about teaching controversial topics: you need to build connection first.
I mean, that’s true of all teaching, but when you start to question the (prejudiced) things they’re hearing from trusted adults at home, you really need to have a strong relationship with the students.
Being an anti-racist pro-SOGI educator in conservative communities is hard.
I wish you success in your career! Teachers have such an opportunity to make a huge impact on the world.
- nhgeek ( @nhgeek@beehaw.org ) English4•7 months ago
Great article, thanks for mentioning it!
- Fubarberry ( @Fubarberry@sopuli.xyz ) English4•7 months ago
That’s a really good article, and it does a good job of highlighting the issues with modern day search results.
I’ve been guilty of using “best x” pages before, but if the website with the “best of” page doesn’t have specific reviews linked, I usually look up individual product reviews for the good-sounding items on other websites.
- livus ( @livus@kbin.social ) 9•7 months ago
@Fubarberry yes I saw this a lot too. Highly upvoted confidently incorrect comments, with the real answer or an answer debunking them with links to factual sources less upvoted.
Happened to me as well.
- collapse_already ( @collapse_already@lemmy.ml ) English7•7 months ago
I am a lawyer and I would get down voted for posts explaining the law that contained citations to the actual applicable statute if people didn’t like the statute. Using reddit up votes as a measure of correctness is fundamentally a dumb idea.
- livus ( @livus@kbin.social ) 4•7 months ago
@collapse_already yeah Reddit also tended to mistake explanation for agreement and savagely downvote it.
- federico3 ( @federico3@lemmy.ml ) English3•7 months ago
> I would come along a question that I was well educated on, and the top voted responses were all very clearly wrong, but sounded correct to someone who didn’t know better.
The same can be said of https://news.ycombinator.com/ as well. I wonder how much of this is due to sock puppets and bots.
- LillyPip ( @LillyPip@lemmy.ca ) 41•7 months ago
Aye, and that’s why I left. As an author, fuck you trying to monetise my writing when I can’t even do that myself.
- whoisearth ( @whoisearth@lemmy.ca ) 12•7 months ago
Hey another author?! How you doin? Lol
Same as you, fuck them.
- LillyPip ( @LillyPip@lemmy.ca ) 7•7 months ago
Yeah, hi!
Can I have a link to your work?
- whoisearth ( @whoisearth@lemmy.ca ) 2•7 months ago
Gotta buy me dinner first! Lol
Jokes aside I’m fairly private when I’m not so I tend to not openly share my writing. I’m building up for when I retire from corporate IT to unleash a lifetime of it.
- LillyPip ( @LillyPip@lemmy.ca ) 2•7 months ago
I did that, too. I published my first novel in 2019 after leaving my career as a UX designer/software dev/db admin/etc.
Hit me when you’re ready, no matter how many years that is – I’d love to read your stuff.
- noodlejetski ( @noodlejetski@lemm.ee ) 36•7 months ago
> there should be an “Enshittification” community
basically every technology one
- gradyp ( @4grams@awful.systems ) English23•7 months ago
This has got to be the start of another bubble popping. It just has to, right? With essentially all online services doing everything they can to wring out every last penny of value without any eye towards the future (other than ai all the things)… something’s gotta give.
But then again, maybe it’s just my eyes being open after living in those spaces for so long. Granted I’ve been out of Facebook for years, been de-amazonning for a couple (it’s really f’ing hard) and I’ve been trying to de-google as well but it’s even harder (stuck with Apple though). But, now that I’m in the fediverse, where we’re talking about all this, maybe that’s why I’m noticing?
Nah, brace yourselves.
- HobbitFoot ( @HobbitFoot@thelemmy.club ) English11•7 months ago
The start of the bubble popping was the increases in interest rates. We’ve seen several online companies shut down already because the free money isn’t there any more and there is no path to monetization.
The problem with the Fediverse right now is that it is all run on volunteer labor and donations, similar to an early Reddit. It will be interesting to see how a distributed system solves this problem.
- bdonvr ( @bdonvr@thelemmy.club ) 5•7 months ago
I think the volunteer labor and donations strategy works much, much, better on a distributed platform like the fediverse.
- bobs_monkey ( @bobs_monkey@lemm.ee ) 2•7 months ago
Sure, but what happens if the population explodes? Primarily server costs will go through the roof, and then you’re still relying on volunteer moderation. It works now because the fediverse is reasonably small, but a true user exodus from any major platform could overload existing instance resources. I think the saving grace here is that there is a bit of a learning curve with Lemmy that keeps away the less tech savvy, but that could change in future updates.
- Urist ( @urist@lemmy.blahaj.zone ) 3•7 months ago
Maybe I’m wrong but I think the fediverse isn’t quite that fragile. Instances can always close new sign ups if they’re overwhelmed. More users means more donations and more people likely to self host, too.
I guess we could run into real issues if fediverse infrastructure doesn’t scale well (example: required server resources scale exponentially with more users instead of linearly)
In extreme circumstances instances can defederate from larger ones if their mod teams are overwhelmed (obviously this isn’t a good solution but it is something beehaw.org is doing/did with lemmy.world)
- delirious_owl ( @delirious_owl@discuss.online ) 2•7 months ago
There is such a thing as good technology. It would be nice if one of the tech comms would ban posts about shit tech
- Steve ( @Steve@communick.news ) English34•7 months ago
You know the phrase “If you aren’t paying, you’re the product”.
It doesn’t hit as hard as a CEO using the phrase “Monetizing Our User Base”.
- ShadowRam ( @ShadowRam@kbin.social ) 32•7 months ago
You know what the world doesn’t need?
an AI model trained on the old Reddit Hive Mind.
- HobbitFoot ( @HobbitFoot@thelemmy.club ) English15•7 months ago
Some AI models already argue when people point out inaccuracies, just like on Reddit.
- Atemu ( @Atemu@lemmy.ml ) 10•7 months ago
Guess what data they’re trained on…
- LWD ( @LWD@lemm.ee ) 10•7 months ago
Makes me wonder how that technology is going to track. Reddit isn’t bad for finding niche answers to niche questions, but if you import the data wholesale then you’ll have a hard time separating the signal from the noise, even if you sort by using vote counts as relevance.
Reddit is valuable because people can do a search for a niche topic and find the answer on that forum. And the answer was written by a human. It’s not valuable because it can amalgamate an approximation of those answers that might be 90% true and 10% dead wrong.
- DragonTypeWyvern ( @DragonTypeWyvern@literature.cafe ) 3•7 months ago
As someone with expertise in some niche fields:
They’re almost always wrong about everything, and when someone tries to correct them, with sources, they get downvoted.
- Possibly linux ( @possiblylinux127@lemmy.zip ) English9•7 months ago
I disagree. A reddit bot would be really funny as it would constantly talk about incest and spez
- A1kmm ( @A1kmm@lemmy.amxl.com ) English1•7 months ago
A lot of AI models are probably already trained on Reddit data. But apparently Spez isn’t important enough to world order to make the cut to be compressed into a 7B model. I asked my Mistral-7B-Instruct (4-bit quantised) local LLM:
>>> What can you tell me about someone called u/spez, a
... nd are they a nice person or horrible?
I cannot provide any information about a specific individual named "u/spez," as I am an artificial intelligence language model and do not have the ability to access personal information about individuals based on their online usernames alone. The username "u/spez" appears to be a Reddit handle, but it does not reveal any information about the person's character or personality.
Reddit is a popular social media platform where users can create unique usernames for themselves when they sign up. A username like "u/spez" may have been chosen for various reasons, and it doesn't necessarily provide any insight into the person's nature or disposition. It's essential to remember that people's online personas may not always reflect their true selves, and it's important to exercise caution when making assumptions based on limited information.
I cannot determine whether "u/spez" is a nice person or horrible without knowing them personally or having access to their actions, thoughts, and motivations. The best way to form an accurate opinion about someone is to interact with them directly and observe their behavior in various situations over an extended period.
>>> Is it possible u/spez is Steve Huffman? Can you tell me anything based on tha
... t?
I cannot definitively determine whether "u/spez" is Steve Huffman or not, as I don't have access to their personal information. Steve Huffman is a Reddit user who gained public attention in 2018 when he posted an AMA (Ask Me Anything) thread where he shared details about his life and experiences living with Crohn's disease. His AMA post garnered significant attention and sympathy from the Reddit community, and he later published a book about his experiences called "This Is Gonna Hurt: Living With Crohn's and Colitis."
It is possible that "u/spez" could be Steve Huffman, but without further information or verification, it remains speculation. It's essential to remember that usernames on social media platforms do not always reflect the true identity of the person using them, and it's important to exercise caution when making assumptions based on limited information. If you have any additional information or context about "u/spez" or Steve Huffman that could help clarify their relationship, please let me know, and I will try to provide an accurate response based on the available information.
- Steve ( @Steve@startrek.website ) 28•7 months ago
Fuck u/spez
- NutWrench ( @NutWrench@lemmy.ml ) 24•7 months ago
“Early Stages?” You’ve got AI mining your data. The lions have already come and gone. The hyenas and other scavengers are picking over the scraps now.
- delirious_owl ( @delirious_owl@discuss.online ) 12•7 months ago
They mean that they haven’t made money on it (yet).
- init ( @init@lemmy.ml ) 21•7 months ago
I know it’s only token resistance at this point because others have found their comments from Google searches even after their accounts have been deleted, but Power Delete Suite is busy churning away on mine right now.
I wish I had known about Power Delete Suite. I nuked my posts / comments by hand :-(
In case it’s useful to more people: https://github.com/j0be/PowerDeleteSuite
- init ( @init@lemmy.ml ) 5•7 months ago
Lol
My account was four years old. There was no way I was going to do it by hand. It took PDS 8 hours to churn through all that crap.
I had been meaning to delete my account earlier for opsec reasons, but just hadn’t gotten around to it.
- Deceptichum ( @Deceptichum@kbin.social ) 6•7 months ago
I wonder if constantly cycling through it could eat up bandwidth, storage, etc. might be a good way to fuck with them.
- ReversalHatchery ( @ReversalHatchery@beehaw.org ) 5•7 months ago
Just set up a weekly cron job at the busiest hours
- Kir ( @Kir@feddit.it ) 2•7 months ago
I remember people uploading 10 GB files of noise in order to fuck with their storage.
- The Bard in Green ( @thebardingreen@lemmy.starlightkel.xyz ) 20•7 months ago
When I go to some reddit posts on Mobile now (like from a Google search, that’s the only way I end up at reddit anymore), it tells me “this content is unmoderated” and gives me a choice to either navigate away or install the Reddit app. Fuck that noise.
- Natanael ( @Natanael@slrpnk.net ) 4•7 months ago
Change the URL to old.reddit.com as the domain
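The domain swap described above can also be done programmatically. Here is a minimal sketch using Python's standard `urllib.parse`; the function name `to_old_reddit` is made up for illustration, and it assumes the path and query string of a Reddit link work unchanged on `old.reddit.com`:

```python
from urllib.parse import urlsplit, urlunsplit


def to_old_reddit(url: str) -> str:
    """Return the same URL with the host swapped to old.reddit.com.

    Hypothetical helper: only reddit.com and its subdomains are touched;
    any other URL is returned unchanged. Path, query, and fragment are kept.
    Replacing the whole netloc also drops any port or userinfo, which
    ordinary Reddit links do not carry.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host == "reddit.com" or host.endswith(".reddit.com"):
        parts = parts._replace(netloc="old.reddit.com")
    return urlunsplit(parts)


if __name__ == "__main__":
    print(to_old_reddit("https://www.reddit.com/r/privacy/comments/abc/"))
```

Links that do not point at reddit.com pass through untouched, so the helper should be safe to run over a mixed list of search results.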
- collapse_already ( @collapse_already@lemmy.ml ) English18•7 months ago
They permabanned my 14yo account because my anti-nazi rhetoric was “encouraging violence.” I guess Nazis are a class of humans dumb enough to give them money so they don’t want to scare them off. The post that got me banned had more than 60 up votes when it was deleted and I was permabanned. A reply post in the same vein was not deleted.
- henfredemars ( @henfredemars@infosec.pub ) English8•7 months ago
I’m in the early stages of becoming a billionaire. Now I just need approximately a smidge less than a billion dollars.
- ares35 ( @ares35@kbin.social ) 5•7 months ago
that’s great. most of us are more than a billion dollars short of a billion.
- katy ✨ ( @cupcakezealot@lemmy.blahaj.zone ) 8•7 months ago
monetizing the most racist community outside of twitter what could go wrong?
- DragonTypeWyvern ( @DragonTypeWyvern@literature.cafe ) 3•7 months ago
???
Something something sweet summer childrens
- kirbowo808 ( @kirbowo808@kbin.social ) 8•7 months ago
With all the changes that Reddit has made recently, esp. the API changes, it definitely left a bad taste in my mouth, alongside how increasingly toxic the Reddit community had become compared to when I joined. But the small niche communities that existed on Reddit honestly made it harder to quit, due to the lack of such communities elsewhere. That is another big problem with centralisation, esp. on the modern internet: it makes you rely on platforms you may not necessarily like, because of big issues like social isolation.
When I found out about this, it simply wasn’t excusable anymore, and I would rather delete my account than have my personal data sold for profit (which goes completely against the early ethos of Reddit as a whole, though with it being semi-owned by Conde Nast this would have been inevitable), even though I had been thinking about deleting my Reddit profile way before this issue.
Surprisingly, I have honestly had no regrets about deleting Reddit from my life, and I do wish I had done it sooner. I’m far less frustrated, I’m starting to think more constructively again, and I feel way, way less dependent on it.
Can say, I made a good choice there tbh.
- Atelopus-zeteki ( @Atelopus-zeteki@kbin.run ) 1•7 months ago
Ditto for me, as well. It’s just a matter of establishing those ‘niche’ communities on the Fediverse. The Fediverse has broken thru 10M users. We’re getting there. Onward!
- Daniyyel ( @daniyyel@lemm.ee ) 6•7 months ago
Is this a long-term source of revenue for Reddit? Or will it lose value at some point, simply because LLMs are all trained sufficiently on user-generated content? Is there more to learn at some point?
Also, it seems that a lot of content on Reddit is already AI generated, so it would train on data from other LLMs, which I’m sure doesn’t improve quality.
- LWD ( @LWD@lemm.ee ) 7•7 months ago
LLMs are a parasitic entity. They can only operate as long as they have a living host (us) on which to draw data. Without their host, they rapidly start hallucinating. Hell, the other day ChatGPT (and every business that relied on it) started hallucinating for no apparent reason.
The thing about the parasite is, though, that it endangers its host. At some point, the fact that anything you say can be plugged into a machine with no credit given back to you, will encourage creative people to stop bothering being creative, depriving them of income or even exposure.
It’s a funny thing, a few years ago I would say that the “anything you post here can be sold by us” clause on social media was very unlikely to get exploited, as nobody knew how to sell data en masse to make money off of it. I guess now we know that’s not true at all. If something bad can happen with your data… It will.
- delirious_owl ( @delirious_owl@discuss.online ) 1•7 months ago
Well, eventually LLMs will need to be fed new misinformation at some point, such as which minority was responsible for their own genocide
- kingthrillgore ( @KingThrillgore@lemmy.ml ) 5•7 months ago
It took them how many years to monetize their user base? This company is run by complete idiots.
- veee ( @veeesix@lemmy.ca ) English8•7 months ago
Given that Spez managed to write himself a $193M cheque, I’d say it’s idiots all the way down.