Episodes

  • #115 Using Time Series to Estimate Uncertainty, with Nate Haines
    Sep 17 2024

    Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!

    • My Intuitive Bayes Online Courses
    • 1:1 Mentorship with me

    Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work!

    Visit our Patreon page to unlock exclusive Bayesian swag ;)

    Takeaways:

    • State space models and traditional time series models are well-suited to forecast loss ratios in the insurance industry, although actuaries have been slow to adopt modern statistical methods.
    • Working with limited data is a challenge, but informed priors and hierarchical models can help improve the modeling process.
    • Bayesian model stacking blends the predictions of different models, taking the best of both worlds (or of all of them, when there are more than two models).
    • Model comparison is done using out-of-sample performance metrics, such as the expected log pointwise predictive density (ELPD). Brute-force leave-future-out cross-validation is often used due to the time-series nature of the data.
    • Stacking or averaging models are trained on out-of-sample performance metrics to determine the weights for blending the predictions. Model stacking can be a powerful approach for combining predictions from candidate models. Hierarchical stacking in particular is useful when weights are assumed to vary according to covariates.
    • BayesBlend is a Python package developed by Ledger Investing that simplifies the implementation of stacking models, including pseudo Bayesian model averaging, stacking, and hierarchical stacking.
    • Evaluating the performance of time series models requires considering multiple metrics, including log likelihood-based metrics like ELPD, as well as more absolute metrics like RMSE and mean absolute error.
    • Using robust variants of metrics like ELPD can help address issues with extreme outliers, for example t-distribution estimators of ELPD instead of sample sum/mean estimators.
    • It is important to evaluate model performance from different perspectives and consider the trade-offs between different metrics. Evaluating models based solely on traditional metrics can limit understanding and trust in the model. Consider additional factors such as interpretability, maintainability, and productionization.
    • Simulation-based calibration (SBC) is a valuable tool for assessing parameter estimation and model correctness. It allows for the interpretation of model parameters and the identification of coding errors.
    • In industries like insurance, where regulations may restrict model choices, classical statistical approaches still play a significant role. However, there is potential for Bayesian methods and generative AI in certain areas.
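
    The blending idea in the takeaways above can be illustrated with pseudo Bayesian model averaging, where each model's weight is proportional to the exponential of its out-of-sample ELPD. This is a minimal sketch under stated assumptions: it does not use the BayesBlend API, the function names are hypothetical, and the ELPD totals and predicted loss ratios are made-up numbers. Full stacking would instead optimize the weights over pointwise log predictive densities.

```python
import math


def pseudo_bma_weights(elpds):
    """Return weights proportional to exp(elpd_k), normalized to sum to 1."""
    top = max(elpds)  # subtract the max before exponentiating, for stability
    unnormalized = [math.exp(e - top) for e in elpds]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def blend_means(weights, model_means):
    """Weighted average of each model's predictive mean."""
    return sum(w * m for w, m in zip(weights, model_means))


# Hypothetical out-of-sample ELPD totals for two candidate models.
elpds = [-120.0, -121.0]
weights = pseudo_bma_weights(elpds)

# Hypothetical predicted loss ratios from the same two models.
blended = blend_means(weights, [0.62, 0.70])
print(weights, blended)
```

    Only the predictive means are blended here to keep the sketch short; in practice the weights would be applied to the full predictive distributions, so that the blend also carries each model's uncertainty.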

    1 hour 40 minutes
  • #114 From the Field to the Lab – A Journey in Baseball Science, with Jacob Buffa
    Sep 5 2024

    Takeaways:

    • Education and visual communication are key in helping athletes understand the impact of nutrition on performance.
    • Bayesian statistics are used to analyze player performance and injury risk.
    • Integrating diverse data sources is a challenge but can provide valuable insights.
    • Understanding the specific needs and characteristics of athletes is crucial in conditioning and injury prevention.
    • The application of Bayesian statistics in baseball science requires experts in Bayesian methods.
    • Traditional statistical methods taught in sports science programs are limited.
    • Communicating complex statistical concepts, such as Bayesian analysis, to coaches and players is crucial.
    • Conveying uncertainties and limitations of the models is essential for effective utilization.
    • Emerging trends in baseball science include the use of biomechanical information and computer vision algorithms.
    • Improving player performance and injury prevention are key goals for the future of baseball science.

    Chapters:

    00:00 The Role of Nutrition and Conditioning

    05:46 Analyzing Player Performance and Managing Injury Risks

    12:13 Educating Athletes on Dietary Choices

    18:02 Emerging Trends in Baseball Science

    29:49 Hierarchical Models and Player Analysis

    36:03 Challenges of Working with Limited Data

    39:49 Effective Communication of Statistical Concepts

    47:59 Future Trends: Biomechanical Data Analysis and Computer Vision Algorithms

    Thank you to my Patrons for making this episode possible!

    Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde,...

    1 hour 2 minutes
  • #113 A Deep Dive into Bayesian Stats, with Alex Andorra, ft. the Super Data Science Podcast
    Aug 22 2024

    Takeaways:

    • Bayesian statistics is a powerful framework for handling complex problems, making use of prior knowledge, and excelling with limited data.
    • Bayesian statistics provides a framework for updating beliefs and making predictions based on prior knowledge and observed data.
    • Bayesian methods allow for the explicit incorporation of prior assumptions, which can provide structure and improve the reliability of the analysis.
    • There are several Bayesian frameworks available, such as PyMC, Stan, and Bambi, each with its own strengths and features.
    • PyMC is a powerful library for Bayesian modeling that allows for flexible and efficient computation.
    • For beginners, it is recommended to start with introductory courses or resources that provide a step-by-step approach to learning Bayesian statistics.
    • PyTensor leverages GPU acceleration and complex graph optimizations to improve the performance and scalability of Bayesian models.
    • ArviZ is a library for post-modeling workflows in Bayesian statistics, providing tools for model diagnostics and result visualization.
    • Gaussian processes are versatile non-parametric models that can be used for spatial and temporal data analysis in Bayesian statistics.
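
    The idea of updating beliefs from prior knowledge and a small amount of data can be made concrete with a conjugate normal model. This is a minimal sketch, not code from the episode: it assumes a normal prior on an unknown mean and a known observation noise, and the function name and the numbers are hypothetical.

```python
def update_normal_mean(prior_mean, prior_sd, data, noise_sd):
    """Conjugate update for the mean of normal data with known noise_sd.

    Precisions (inverse variances) add, and the posterior mean is the
    precision-weighted average of the prior mean and the sample mean.
    """
    prior_precision = 1.0 / prior_sd**2
    data_precision = len(data) / noise_sd**2
    posterior_precision = prior_precision + data_precision
    sample_mean = sum(data) / len(data)
    posterior_mean = (
        prior_precision * prior_mean + data_precision * sample_mean
    ) / posterior_precision
    posterior_sd = posterior_precision ** -0.5
    return posterior_mean, posterior_sd


# Hypothetical prior belief (mean 4.0, sd 1.0) and three noisy observations.
data = [5.1, 4.7, 5.6]
post_mean, post_sd = update_normal_mean(4.0, 1.0, data, noise_sd=1.0)
print(post_mean, post_sd)
```

    With only three observations the posterior mean sits between the prior mean and the sample mean, and the posterior standard deviation is smaller than the prior's. Libraries such as PyMC handle models without closed-form updates like this one.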

    Chapters:

    00:00 Introduction to Bayesian Statistics

    07:32 Advantages of Bayesian Methods

    16:22 Incorporating Priors in Models

    23:26 Modeling Causal Relationships

    30:03 Introduction to PyMC, Stan, and Bambi

    34:30 Choosing the Right Bayesian Framework

    39:20 Getting Started with Bayesian Statistics

    44:39 Understanding Bayesian Statistics and PyMC

    49:01 Leveraging PyTensor for Improved Performance and Scalability

    01:02:37 Exploring Post-Modeling Workflows with ArviZ

    01:08:30 The Power of Gaussian Processes in Bayesian Modeling

    1 hour 31 minutes
  • #112 Advanced Bayesian Regression, with Tomi Capretto
    Aug 7 2024

    Takeaways:

    • Teaching Bayesian Concepts Using M&Ms: Tomi Capretto uses an engaging classroom exercise involving M&Ms to teach Bayesian statistics, making abstract concepts tangible and intuitive for students.
    • Practical Applications of Bayesian Methods: Discussion on the real-world application of Bayesian methods in projects at PyMC Labs and in university settings, emphasizing the practical impact and accessibility of Bayesian statistics.
    • Contributions to Open-Source Software: Tomi’s involvement in developing Bambi and other open-source tools demonstrates the importance of community contributions to advancing statistical software.
    • Challenges in Statistical Education: Tomi talks about the challenges and rewards of teaching complex statistical concepts to students who are accustomed to frequentist approaches, highlighting the shift to thinking probabilistically in Bayesian frameworks.
    • Future of Bayesian Tools: The discussion also touches on the future enhancements for Bambi and PyMC, aiming to make these tools more robust and user-friendly for a wider audience, including those who are not professional statisticians.

    Chapters:

    05:36 Tomi's Work and Teaching

    10:28 Teaching Complex Statistical Concepts with Practical Exercises

    23:17 Making Bayesian Modeling Accessible in Python

    38:46 Advanced Regression with Bambi

    41:14 The Power of Linear Regression

    42:45 Exploring Advanced Regression Techniques

    44:11 Regression Models and Dot Products

    45:37 Advanced Concepts in Regression

    46:36 Diagnosing and Handling Overdispersion

    47:35 Parameter Identifiability and Overparameterization

    50:29 Visualizations and Course Highlights

    51:30 Exploring Niche and Advanced Concepts

    56:56 The Power of Zero-Sum Normal

    59:59 The Value of Exercises and Community

    01:01:56 Optimizing Computation with Sparse Matrices

    01:13:37 Avoiding MCMC and Exploring Alternatives

    01:18:27 Making Connections Between Different Models

    1 hour 27 minutes
  • #111 Nerdinsights from the Football Field, with Patrick Ward
    Jul 24 2024

    Takeaways:

    • Communicating Bayesian concepts to non-technical audiences in sports analytics can be challenging, but it is important to provide clear explanations and address limitations.
    • Understanding the model and its assumptions is crucial for effective communication and decision-making.
    • Involving domain experts, such as scouts and coaches, can provide valuable insights and improve the model's relevance and usefulness.
    • Customizing the model to align with the specific needs and questions of the stakeholders is essential for successful implementation.
    • Understanding the needs of decision-makers is crucial for effectively communicating and utilizing models in sports analytics.
    • Predicting the impact of training loads on athletes' well-being and performance is a challenging frontier in sports analytics.
    • Identifying discrete events in team sports data is essential for analysis and development of models.

    Chapters:

    00:00 Bayesian Statistics in Sports Analytics

    18:29 Applying Bayesian Stats in Analyzing Player Performance and Injury Risk

    36:21 Challenges in Communicating Bayesian Concepts to Non-Statistical Decision-Makers

    41:04 Understanding Model Behavior and Validation through Simulations

    43:09 Applying Bayesian Methods in Sports Analytics

    48:03 Clarifying Questions and Utilizing Frameworks

    53:41 Effective Communication of Statistical Concepts

    57:50 Integrating Domain Expertise with Statistical Models

    01:13:43 The Importance of Good Data

    01:18:11 The Future of Sports Analytics

    1 hour 26 minutes
  • #110 Unpacking Bayesian Methods in AI with Sam Duffield
    Jul 10 2024

    Takeaways:

    • Use mini-batch methods to efficiently process large datasets within Bayesian frameworks in enterprise AI applications.
    • Apply approximate inference techniques, like stochastic gradient MCMC and Laplace approximation, to optimize Bayesian analysis in practical settings.
    • Explore thermodynamic computing to significantly speed up Bayesian computations, enhancing model efficiency and scalability.
    • Leverage the Posteriors Python package for flexible and integrated Bayesian analysis in modern machine learning workflows.
    • Overcome challenges in Bayesian inference by simplifying complex concepts for non-expert audiences, ensuring the practical application of statistical models.
    • Address the intricacies of model assumptions and communicate effectively to non-technical stakeholders to enhance decision-making processes.
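
    To show how mini-batches and stochastic gradient MCMC fit together, here is a minimal sketch of stochastic gradient Langevin dynamics (SGLD) on a toy problem. It is written in plain Python and does not use the Posteriors package or its API. The model (normal data with unit variance and a normal prior on the mean), the function name, the step size, and the simulated data are all assumptions chosen so the result can be checked against the exact posterior mean.

```python
import math
import random


def sgld_normal_mean(data, prior_sd, step, batch_size, n_steps, seed=0):
    """Draw SGLD samples for theta, with y_i ~ N(theta, 1) and theta ~ N(0, prior_sd^2)."""
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)  # mini-batch without replacement
        grad_log_prior = -theta / prior_sd**2
        # Rescale the mini-batch gradient so it estimates the full-data gradient.
        grad_log_lik = (n / batch_size) * sum(y - theta for y in batch)
        # Langevin update: half-step along the gradient plus N(0, step) noise.
        noise = rng.gauss(0.0, math.sqrt(step))
        theta += 0.5 * step * (grad_log_prior + grad_log_lik) + noise
        samples.append(theta)
    return samples


# Simulated data with a true mean of 2.0 (hypothetical example).
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(1000)]

samples = sgld_normal_mean(
    data, prior_sd=10.0, step=1e-4, batch_size=100, n_steps=5000
)
kept = samples[1000:]  # discard early iterations as burn-in
sgld_mean = sum(kept) / len(kept)

# Exact posterior mean for this conjugate model, for comparison.
exact_mean = sum(data) / (len(data) + 1.0 / 10.0**2)
print(sgld_mean, exact_mean)
```

    Each update touches only 100 of the 1,000 observations, which is what makes the approach attractive for large datasets. The fixed step size leaves a small bias in the sampled distribution, so this is an approximate inference method rather than an exact sampler.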

    Chapters:

    00:00 Introduction to Large-Scale Machine Learning

    11:26 Scalable and Flexible Bayesian Inference with Posteriors

    25:56 The Role of Temperature in Bayesian Models

    32:30 Stochastic Gradient MCMC for Large Datasets

    36:12 Introducing Posteriors: Bayesian Inference in Machine Learning

    41:22 Uncertainty Quantification and Improved Predictions

    52:05 Supporting New Algorithms and Arbitrary Likelihoods

    59:16 Thermodynamic Computing

    01:06:22 Decoupling Model Specification, Data Generation, and Inference

    1 hour 12 minutes
  • #109 Prior Sensitivity Analysis, Overfitting & Model Selection, with Sonja Winter
    Jun 25 2024

    Takeaways:

    • Bayesian methods align better with researchers' intuitive understanding of research questions and provide more tools to evaluate and understand models.
    • Prior sensitivity analysis is crucial for understanding the robustness of findings to changes in priors and helps in contextualizing research findings.
    • Bayesian methods offer an elegant and efficient way to handle missing data in longitudinal studies, providing more flexibility and information for researchers.
    • Fit indices in Bayesian model selection are effective in detecting underfitting but may struggle to detect overfitting, highlighting the need for caution in model complexity.
    • Bayesian methods have the potential to revolutionize educational research by addressing the challenges of small samples, complex nesting structures, and longitudinal data.
    • Posterior predictive checks are valuable for model evaluation and selection.
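
    A prior sensitivity analysis, as described in the takeaways above, amounts to refitting the same model under several plausible priors and checking how much the conclusions move. The sketch below assumes a simple beta-binomial model, where the posterior is available in closed form. The prior labels, the function names, and the counts are hypothetical.

```python
def posterior_mean(a, b, successes, trials):
    """Posterior mean of a proportion under a Beta(a, b) prior."""
    return (a + successes) / (a + b + trials)


def sensitivity(priors, successes, trials):
    """Posterior mean under each prior, plus the spread between them."""
    means = {
        name: posterior_mean(a, b, successes, trials)
        for name, (a, b) in priors.items()
    }
    spread = max(means.values()) - min(means.values())
    return means, spread


# Three hypothetical priors encoding different starting beliefs.
priors = {"flat": (1, 1), "skeptical": (2, 8), "optimistic": (8, 2)}

small_means, small_spread = sensitivity(priors, successes=7, trials=10)
large_means, large_spread = sensitivity(priors, successes=70, trials=100)
print(small_means, small_spread)
print(large_means, large_spread)
```

    With 10 trials the posterior means differ noticeably across priors, while with 100 trials they nearly agree. That is the kind of robustness statement a sensitivity analysis is meant to support, and it matters most in the small-sample settings mentioned above.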

    Chapters:

    00:00 The Power and Importance of Priors

    09:29 Updating Beliefs and Choosing Reasonable Priors

    16:08 Assessing Robustness with Prior Sensitivity Analysis

    34:53 Aligning Bayesian Methods with Researchers' Thinking

    37:10 Detecting Overfitting in SEM

    43:48 Evaluating Model Fit with Posterior Predictive Checks

    47:44 Teaching Bayesian Methods

    54:07 Future Developments in Bayesian Statistics

    Thank you to my Patrons for making this episode possible!

    Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi...

    1 hour 11 minutes
  • #108 Modeling Sports & Extracting Player Values, with Paul Sabin
    Jun 14 2024

    Takeaways:

    • Convincing non-stats stakeholders in sports analytics can be challenging, but building trust and confirming their prior beliefs can help in gaining acceptance.
    • Combining subjective beliefs with objective data in Bayesian analysis leads to more accurate forecasts.
    • The availability of massive data sets has revolutionized sports analytics, allowing for more complex and accurate models.
    • Sports analytics models should consider factors like rest, travel, and altitude to capture the full picture of team performance.
    • The impact of budget on team performance in American sports and the use of plus-minus models in basketball and American football are important considerations in sports analytics.
    • The future of sports analytics lies in making analysis more accessible and digestible for everyday fans.
    • There is a need for more focus on estimating distributions and variance around estimates in sports analytics.
    • AI tools can empower analysts to do their own analysis and make better decisions, but it's important to ensure they understand the assumptions and structure of the data.
    • Measuring the value of certain positions, such as midfielders in soccer, is a challenging problem in sports analytics.
    • Game theory plays a significant role in sports strategies, and optimal strategies can change over time as the game evolves.

    Chapters:

    00:00 Introduction and Overview

    09:27 The Power of Bayesian Analysis in Sports Modeling

    16:28 The Revolution of Massive Data Sets in Sports Analytics

    31:03 The Impact of Budget in Sports Analytics

    39:35 Introduction to Sports Analytics

    52:22 Plus-Minus Models in American Football

    01:04:11 The Future of Sports Analytics

    1 hour 18 minutes