Google Says It'll Scrape Everything You Post Online for AI

misk ( @misk@lemm.ee ) · 1 year ago

millie ( @millie@beehaw.org ) · 1 year ago

Crazy that Google feeds on all our data and has for years, but when OpenAI puts the benefit of that data back into the hands of users it catches flak.

Rentlar ( @Rentlar@beehaw.org ) · 1 year ago

Perhaps we lived in blissful ignorance all this time. Before AI large language models became what they are today, Google Translate was where most of the data was going, and it was mainly about getting an adequate translation. Now the data is being used to answer questions on all sorts of subjects using parts of real people’s answers, which could be more frightening to people.

shanghaibebop ( @shanghaibebop@beehaw.org ) · edit-2 1 year ago

I think it’s a problem of value capture.

People had no problem posting on Reddit and spending tons of hours helping strangers solve their problems. But now that Reddit puts that information behind a paywall, people will have massive issues with that.

Similarly, Google scraped data, but didn’t APPEAR (and I can’t emphasize that enough) to use that data to deliver value that cannot be shared by the people who created that data. Most of the time your interests are aligned: you give up your “data” to Google so that Google can either send you better traffic through its search engine or serve better ads that generate revenue for you.

OpenAI does not benefit the original publisher of that information whatsoever.

millie ( @millie@beehaw.org ) · 1 year ago

I don’t know about that. When’s the last time you looked something up on Google and the first link was driving traffic to a website rather than scraping one and presenting it in-engine?