Evaluating the Factual Accuracy of ChatGPT-4o, Gemini, and Perplexity.ai in Real-World Queries
DevelopmentCorporate
AUGUST 16, 2024
Large language models (LLMs) such as ChatGPT-4o, Gemini, and Perplexity.ai are assessed using the WildHallucinations benchmark, which measures how well they avoid "hallucinations," that is, generating incorrect information. ChatGPT-4o excels in well-documented areas, Gemini prioritizes accuracy over responsiveness, and Perplexity.ai uses real-time retrieval to keep its responses current. Each has strengths and weaknesses, and all three need further improvement.