• Evaluating LLMs with Chatbot Arena and Joseph E. Gonzalez

  • Dec 17 2024
  • Duration: 56 minutes
  • Podcast

  • Summary

  • In this episode of Gradient Dissent, Joseph E. Gonzalez, EECS Professor at UC Berkeley and Co-Founder at RunLLM, joins host Lukas Biewald to explore innovative approaches to evaluating LLMs.

    They discuss the concept of vibes-based evaluation, which examines not just accuracy but also the style and tone of model responses, and how Chatbot Arena has become a community-driven benchmark for open-source and commercial LLMs. Joseph shares insights on democratizing model evaluation, refining AI-human interactions, and leveraging human preferences to improve model performance. This episode provides a deep dive into the evolving landscape of LLM evaluation and its impact on AI development.

    🎙 Get our podcasts on these platforms:

    Apple Podcasts: http://wandb.me/apple-podcasts

    Spotify: http://wandb.me/spotify

    Google: http://wandb.me/gd_google

    YouTube: http://wandb.me/youtube


    Follow Weights & Biases:

    https://twitter.com/weights_biases

    https://www.linkedin.com/company/wandb


    Join the Weights & Biases Discord Server:

    https://discord.gg/CkZKRNnaf3
