#gpt4

Powerful #AI models, such as #OpenAI’s #GPT4 and Google #Gemini, also face extra obligations, such as having to be more transparent about how models are trained. Meta’s Joel Kaplan said the code risked imposing “unworkable and technically unfeasible requirements”.

"Separately, the authors also tested several contemporaneous large language models (GPT-4, GPT-3.5 and Llama 3 8B). GPT-4's edit summaries in particular were rated as significantly better than those provided by the human Wikipedia editors who originally made the edits in the sample – both using an automated scoring method based on semantic similarity, and in a quality ranking by human raters (where "to ensure high-quality results, instead of relying on the crowdsourcing platforms [like Mechanical Turk, frequently used in similar studies], we recruited 3 MSc students to perform the annotation").

This outcome joins some other recent research indicating that modern LLMs can match or even surpass the average Wikipedia editor in certain tasks (see e.g. our coverage: "'Wikicrow' AI less 'prone to reasoning errors (or hallucinations)' than human Wikipedia editors when writing gene articles").

A substantial part of the paper is devoted to showing that this particular task (generating good edit summaries) is both important and in need of improvements, motivating the use of AI to "overcome this problem and help editors write useful edit summaries":"

meta.wikimedia.org/wiki/Resear

meta.wikimedia.org · Research:Newsletter/2025/January - Meta

Modern AI models (#KI-Modelle) astonish with their capabilities: they solve complex tasks, analyze scientific texts, and even write poems, factually precise and linguistically elegant. But a new test, "Humanity's Last Exam", shows the limits of this technology. Even top models such as #GPT4 and Google #Gemini fail in many areas. Interview with Sören Möller 👨‍🔬 about the test that makes even the best AI models fail. 🎙 👉 fz-juelich.de/de/aktuelles/new

Interesting. DuckDuckGo (the privacy search engine folk) has an AI Chat offering. Actually it's four chat apps accessed through one portal (duck.ai of course): ChatGPT, Claude, Llama, and Mistral.

Looks like we're going to be drowning in AI tools soon.

#KINutzen #Retröt
Artificial intelligence (#KünstlicheIntelligenz) can effectively refute conspiracy theories (#Verschwörungstheorien). Through targeted argumentation, participants' belief in such theories dropped by 20%. The chats also had a lasting effect over the following months. The results show that AI could be a promising aid in the fight against misinformation (#Fehlinformationen).

#KünstlicheIntelligenz #Verschwörungstheorien #Faktencheck #Studie #GPT4 #Science

tino-eberl.de/nutzen-kuenstlic

Tino Eberl · Artificial intelligence successfully refutes conspiracy theories