AIModels.fyi

Latest

LLMs outperform humans in predicting the results of neuro experiments

LLMs surpass humans in predicting which neuroscience experiments will succeed (81% vs 64%)

AI could help researchers prioritize promising experiments, accelerating discovery and reducing waste.
aimodels-fyi Mar 9, 2024
Deepfake watermark system

Meta can now secretly watermark deepfake audio

Researchers have found a way to imperceptibly watermark fake audio
aimodels-fyi Feb 3, 2024
WebVoyager framework

Teaching AI to see websites like a human made it more capable

Tencent's AI can now complete the majority of its tasks on Google, Amazon, and Wikipedia
aimodels-fyi Jan 28, 2024

"Flow engineering" doubles code generation accuracy (19% vs 44%)

The authors of a new paper present an approach that "intensifies" code generation.
aimodels-fyi Jan 20, 2024
AMIE is much more accurate than a real doctor

Google's new LLM doctor is right way more often than a real doctor

The LLM's differential diagnosis list had the correct diagnosis 59% of the time, vs. 34% for human doctors.
aimodels-fyi Jan 13, 2024

All LLM improvements are just task contamination?

Current benchmarks are probably overestimating the true capabilities of LLMs
aimodels-fyi Jan 4, 2024

Prompting with unified diffs makes GPT-4 write much better code

A developer of an open-source pair-programming tool discovered the trick
aimodels-fyi Dec 30, 2023
