It literally cannot come up with novel solutions because its goal is to regurgitate the most likely response to a question based on training data from the internet. Considering that the internet is often trash and getting trashier, I think LLMs will only get worse over time.
AI has poisoned the well it was fed from. The only solution to get a good AI moving forward is to train it using curated data. That is going to be a lot of work.
On the other hand, this might be a business opportunity. Selling curated data to companies that want to make AIs.
I could see large companies paying to train the LLM on their own IP even just to maintain some level of consistency, but it obviously wouldn’t be as valuable as hiring the talent that sets the bar and generates patent-worthy inventions.
You can fine-tune a model on your own data today. OpenAI offers that right on their website and big companies are already taking advantage. It doesn’t take a whole new LLM, and the cost is a pittance in comparison.
Also, the more the internet is flooded with AI-generated content, the more future models will be trained on old AI output rather than on new human input.
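To make that feedback loop concrete, here is a toy sketch, not a claim about how real LLM training behaves. It assumes the "model" is just a Gaussian fitted to samples, and that each generation is trained only on samples drawn from the previous generation's fit. The function name `simulate_generations` and its parameters are made up for illustration.

```python
import random
import statistics


def simulate_generations(n_generations=10, sample_size=50, seed=0):
    """Repeatedly fit a Gaussian to samples drawn from the previous fit.

    Returns a list of (mean, stdev) pairs, one per generation,
    starting with the original "human" distribution.
    """
    rng = random.Random(seed)
    mean, stdev = 0.0, 1.0  # generation 0: stand-in for human-written data
    history = [(mean, stdev)]
    for _ in range(n_generations):
        # The next "model" only ever sees output of the previous one.
        samples = [rng.gauss(mean, stdev) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        stdev = statistics.stdev(samples)
        history.append((mean, stdev))
    return history


if __name__ == "__main__":
    for generation, (mean, stdev) in enumerate(simulate_generations()):
        print(f"gen {generation}: mean={mean:.3f} stdev={stdev:.3f}")
```

Since no generation ever sees the original data again, sampling error from each round carries into the next and the fitted parameters wander away from where they started. Smaller values of `sample_size` make the drift more visible.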
Humans are also now incentivized to safeguard their intellectual property from AI to keep a competitive advantage.
What are some strategies for doing that? (This is me, totally not a bot)
Paywalls.