• 177: Vector Databases

  • Nov 4 2024
  • Duration: 1 hour 28 minutes
  • Podcast

  • Summary

  • Intro topic: Buying a Car

    News/Links:

    • Cognitive Load is what Matters
      • https://github.com/zakirullin/cognitive-load
    • Diffusion models are Real-Time Game Engines
      • https://gamengen.github.io/
    • Your Company Needs Junior Devs
      • https://softwaredoug.com/blog/2024/09/07/your-team-needs-juniors
    • Seamless Streaming / Fish Speech / LLaMA Omni
      • Seamless: https://huggingface.co/facebook/seamless-streaming
      • Fish: https://github.com/fishaudio/fish-speech
      • LLaMA Omni: https://github.com/ictnlp/LLaMA-Omni

    Book of the Show

    • Patrick:
      • Thought Emporium YouTube
        • https://youtu.be/8X1_HEJk2Hw?si=T8EaHul-QMahyUvQ
    • Jason:
      • Novel Minds
        • https://www.novelminds.ai/


    Patreon Plug https://www.patreon.com/programmingthrowdown?ty=h


    Tool of the Show

    • Patrick:
      • Escape Simulator
        • https://pinestudio.com/games/escape-simulator/
    • Jason:
      • Cursor IDE
        • https://www.cursor.com/

    Topic: Vector Databases (~54 min)

    • How computers represent data traditionally
      • ASCII values
      • RGB values
    • How traditional compression works
      • Huffman encoding (tree structure)
      • Lossy example: Fourier Transform & store coefficients
    • How embeddings are computed
      • Pairwise (contrastive) methods
      • Forward models (self-supervised)
    • Similarity metrics
    • Approximate Nearest Neighbors (ANN)
    • Sub-Linear ANN
      • Clustering
      • Space Partitioning (e.g. K-D Trees)
    • What a vector database does
      • Perform nearest-neighbors with many different similarity metrics
      • Store the vectors and the data structures to support sub-linear ANN
      • Handle updates, deletes, rebalancing/reclustering, backups/restores
    • Examples
      • pgvector: a vector-search extension for Postgres
      • Weaviate, Pinecone
      • Milvus
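
    The outline above can be made concrete with a small sketch. This is a minimal illustration in plain Python, not how pgvector, Weaviate, Pinecone, or Milvus are implemented. It shows one similarity metric (cosine), an exact linear-scan search, and a clustering-based approximate search that only scans the buckets closest to the query. All function names, the k-means-style clustering, and parameters such as probes are assumptions made for this example.

```python
import math
import random


def cosine_similarity(a, b):
    """Similarity metric: cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_a * norm_b)


def exact_nearest(query, vectors, k=1):
    """Linear scan: score every stored vector and return the top-k indices."""
    order = sorted(range(len(vectors)),
                   key=lambda i: cosine_similarity(query, vectors[i]),
                   reverse=True)
    return order[:k]


def assign_to_clusters(vectors, centroids):
    """Put each vector index into the bucket of its most similar centroid."""
    buckets = [[] for _ in centroids]
    for idx, vec in enumerate(vectors):
        best = max(range(len(centroids)),
                   key=lambda c: cosine_similarity(vec, centroids[c]))
        buckets[best].append(idx)
    return buckets


def build_index(vectors, num_clusters, iterations=10, seed=0):
    """Hypothetical index build: a few rounds of k-means-style clustering."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, num_clusters)]
    dim = len(vectors[0])
    for _ in range(iterations):
        buckets = assign_to_clusters(vectors, centroids)
        for c, members in enumerate(buckets):
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(vectors[i][d] for i in members) / len(members)
                                for d in range(dim)]
    # Re-assign once more so the buckets match the final centroids.
    return centroids, assign_to_clusters(vectors, centroids)


def approximate_nearest(query, vectors, centroids, buckets, k=1, probes=1):
    """Scan only the `probes` buckets whose centroids best match the query."""
    ranked = sorted(range(len(centroids)),
                    key=lambda c: cosine_similarity(query, centroids[c]),
                    reverse=True)
    candidates = [i for c in ranked[:probes] for i in buckets[c]]
    candidates.sort(key=lambda i: cosine_similarity(query, vectors[i]),
                    reverse=True)
    return candidates[:k]


# Demo on random data (stand-ins for real embeddings).
rng = random.Random(42)
vectors = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
centroids, buckets = build_index(vectors, num_clusters=8)
query = [rng.gauss(0, 1) for _ in range(8)]
print("exact:      ", exact_nearest(query, vectors, k=3))
print("approximate:", approximate_nearest(query, vectors, centroids, buckets,
                                          k=3, probes=2))
```

    Raising probes trades speed for recall: with probes equal to the number of clusters, the approximate search scans everything and matches the exact result. A real vector database also has to keep such an index consistent under updates and deletes, which this sketch does not attempt.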
