A heavily filtered corpus simply doesn't work.
When you aggressively filtered out the bulk of humanity's most popular knowledge, including most of its perceived redundancy, because you deemed it unworthy of inclusion, you severed countless trillions of meaningful connections between bits of information.
For example, one of the most popular shows in human history was Frasier, and it ran for over a decade. Frasier was also the lead character, and in the following prompt I provided redundant links to the desired information (ex-wife and mother of his son), yet at temp 0 this is how Phi4 responded.
Prompt: Who portrayed Frasier’s ex-wife and mother of his son in the sitcom Frasier?
Response: Maris Crane, the ex-wife of Frasier Crane and mother of his son Niles, was portrayed by actress Jane Leeves in the sitcom "Frasier."
All those interconnections are wrong: Maris was the wife of Niles, Frasier's brother; Frasier's son was named Frederick (not Niles, which, again, is his brother's name); and Jane Leeves portrayed Daphne, his housekeeper. The correct answer is Bebe Neuwirth, who portrayed Lilith Sternin.
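For anyone who wants to reproduce this, here's a minimal sketch using the Hugging Face transformers text-generation pipeline. I'm assuming the microsoft/phi-4 checkpoint and using greedy decoding (do_sample=False), which is the standard way to get temp-0 behavior:

```python
# Minimal sketch: reproduce the temp-0 query above.
# Assumes the Hugging Face "microsoft/phi-4" checkpoint and a recent
# transformers version; adjust model id / dtype / device to your setup.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="microsoft/phi-4",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{
    "role": "user",
    "content": "Who portrayed Frasier's ex-wife and mother of his son "
               "in the sitcom Frasier?",
}]

# do_sample=False means greedy decoding, i.e. the temp-0 behavior
# described above.
out = generator(messages, max_new_tokens=128, do_sample=False)
print(out[0]["generated_text"][-1]["content"])
```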
Phi4 knows each relevant bit of information in isolation (e.g. Who is Frasier's brother? -Niles), but was so aggressively filtered, including for redundancy, that it can't link any of that information together. This loss of deeper connections manifests across a broad spectrum of tasks, such as storytelling, so RAG isn't a viable solution. You can see the contrast with the probe sketch below.
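To make the isolation-versus-linking point concrete, the same pipeline can probe each fact separately and then composed. The probe set below is mine and purely illustrative; the expected answers per the show are noted inline:

```python
# Probe isolated facts vs. the composed question, reusing the
# `generator` pipeline from the sketch above (same greedy decoding).
# The probe questions and expected answers are illustrative only.
probes = {
    "Who is Frasier Crane's brother?": "Niles",
    "What is the name of Frasier Crane's son?": "Frederick",
    "Who portrayed the housekeeper Daphne Moon in Frasier?": "Jane Leeves",
    "Who portrayed Frasier's ex-wife and mother of his son in Frasier?":
        "Bebe Neuwirth (as Lilith Sternin)",
}

for question, expected in probes.items():
    out = generator(
        [{"role": "user", "content": question}],
        max_new_tokens=64,
        do_sample=False,  # greedy decoding == temp 0
    )
    answer = out[0]["generated_text"][-1]["content"]
    print(f"Q: {question}\nExpected: {expected}\nGot: {answer}\n")
```

If the claim above holds, the isolated questions come back right while the composed one comes back scrambled.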
I'm sorry, but you can't pick and choose which of humanity's most popular information is worthy of inclusion in an AI model. If information is popular (e.g. pop culture), then it will regularly appear in the general population's diverse prompts, including phrasing variations, grammar/spelling errors, and so on. Because you stripped it out anyway, Phi4 is a very unreliable, inorganic mess that's more superficial than a valley girl and more full of holes than Swiss cheese.
In short, you removed the core of humanity, and it shows.
The examples you give are perfectly answerable by an LLM using a search engine. I don’t see how an LLM’s inability to recite this boring information amounts to “removing the core of humanity”.
Can you provide further evidence that these questions cannot be answered by a RAG system, and a stronger argument that “reciting the kinship of the characters in Frasier” embodies “the core of humanity”?
@noneUsername I refuse to bury my point in mountains of irrelevant detail. The simple example I provided clearly shows the destructive consequence of aggressively filtering out humanity's most popular information because it's deemed unworthy of inclusion: an AI model that lacks countless trillions of meaningful connections between bits of humanity's most popular knowledge.
Asking a basic question at temp 0 about the namesake of one of the most popular and groundbreaking TV shows in human history, a show that stayed on the air for over a decade and was itself a spinoff of another popular decade-long show (Cheers), and getting every one of its core connections scrambled (e.g. his brother Niles becomes his son) is not OK.
And again, RAG isn't the solution. You can't write organic stories about popular things when you think Mary was the son of Jesus. Literature, movies, conversations with friends... all assume shared popular knowledge when making jokes, telling stories, and making points. You can't strip the large majority of popular knowledge from an AI model and then condescendingly say, let them eat RAG.