Episodes

  • Podcast: Snowflake vs. Teradata
    Sep 9 2022

    Hi everyone! This is an update to my recent blog post on the final days of the legacy data warehouse (link below).

    The topic of legacy data warehouses slowly fading away struck a chord with many readers. Now we have updates from Snowflake and Teradata.

    On Aug 24, the same day I published “The Final Days of the Legacy Data Warehouse,” Snowflake announced its earnings for Q2 FY2023. Not surprisingly, a question about legacy systems came up during Snowflake’s earnings call. One financial analyst asked Snowflake CEO Frank Slootman about the level of activity of customers migrating from on-premises systems to Snowflake’s data cloud.

    Slootman: “In the last week, I've heard two very, very iconic names in two different industries that were staunch on-premises people, who would never ever go cloud, and that are now going [cloud]. So I just feel that the resistance is completely breaking…. A lot of this is that they’re going to get left behind. You can’t take advantage of innovations that are only available on the cloud. We’re going to see acceleration out of this.”

    Is he right? I have no doubt that he is.

    According to Ocient, 59% of respondents to its survey are actively looking to switch data warehouse providers. They specifically named IBM, Cloudera, and Teradata as the top 3 legacy environments that data managers want to move away from.

    Their reasons:

    * 40% want to modernize their legacy platforms

    * 42% feel their existing system isn’t comprehensive enough

    * 36% say it’s not flexible enough

    This explains why Snowflake, with its data cloud and data marketplace, has become such a tour de force. Other disruptors are Databricks, Firebolt, SingleStore, TileDB, Yellowbrick, and of course AWS, Google, and Microsoft.

    I would include Ocient as well, with its hyperscale data warehouse platform, which is capable of analyzing trillions of records.

    The old guard responds

    Where does that leave traditional data warehouse providers—companies like IBM and Teradata? They know that their customers want newer, cloud-native platforms. And they’re taking steps to modernize their offerings.

    That brings me back to Teradata, which recently made a product announcement that is relevant to this whole discussion.

    Teradata is synonymous with the older data warehouses that many organizations are looking to replace. But Teradata is fighting back, as SVP Ashish Yajnik described to me in an earlier Cloud Database Report podcast conversation (link below).

    Teradata’s new cloud-native architecture

    Now, Teradata has just introduced VantageCloud Lake, a new and improved cloud data warehouse based on a cloud-native architecture, with modern capabilities such as cloud object storage, autoscaling, and self-service. It is available on AWS now and will soon be available on other clouds.

    So the decision to move to a cloud data warehouse is getting easier, but also harder in some respects.

    * Easier because that’s the inevitable direction the industry is heading. For CIOs and CTOs the question is when, not if.

    * Harder because incumbent vendors like Teradata are not standing by while Snowflake and Databricks pick off their installed base. They’re responding with cloud-native platforms of their own.

    Who will be the next leaders in this fast-changing market? We’ll have to wait a while longer for the query results on that question.



    This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit clouddb.substack.com
    Duration unknown
  • Teradata on Top 3 Priorities, AWS Partnership, 'Intelligent' Multi-Clouds
    Feb 6 2022

    In this episode of the Cloud Database Report Podcast, I talk with Ashish Yajnik, SVP of product management for data & analytics at Teradata, the old-school data warehouse company that is transitioning its platform and capabilities to the cloud.

    I’ve been covering Teradata for more than 20 years. Here’s an article I wrote for InformationWeek in 2007 when NCR spun off Teradata as a separate company.

    So I’ve been watching with interest as Teradata continues to modernize its data warehouse environment, Vantage. Many of Teradata's customers continue to manage enterprise data warehouses on premises, while transitioning to the cloud over months or years.

    Yajnik is responsible for Teradata’s product transformation to the cloud, which puts him in the thick of things as the company repositions its traditional data warehouse for use in hybrid and multi-cloud environments. 

    It’s worth noting that Teradata Vantage is available on all of the Big 3 public cloud platforms—AWS, Google Cloud, and Microsoft Azure. In our interview, Yajnik downplayed any competition with those heavy hitters, even though they all offer data warehouse platforms of their own. “We don’t see cloud service providers as competitors,” he said.

    In fact, just last week Teradata announced an expanded partnership with Microsoft to integrate Teradata Vantage with Microsoft Azure. And in November, Teradata announced a three-year “strategic collaboration agreement” with AWS that includes, among other things, joint product development and integration.

    Separately, Teradata has recently announced customer deals with Telefonica, Volkswagen, and Tesco. 

    We will learn more about how Teradata’s strategy is playing out when the company announces its financial results for Q4 and full-year 2021 on Feb. 7.

    Highlights from the Podcast

    Key topics from the interview include: 

    * Teradata's priorities for the year ahead

    * Strategic collaboration with AWS on product development and integration of Vantage on AWS

    * Expanding use of AI & ML in Teradata environments

    * Customer projects, including Volkswagen for smart factories

    * What Teradata is doing to enable more data sharing

    * Teradata’s core strengths in this fast-changing competitive market

    Quotes from the podcast: 

    * "What we are embarking on is to make this whole multi-cloud journey much more intelligent and not so accidental for our customers."

    * "Our customers require a unified architecture from both companies [Teradata and AWS] in order to modernize and build their data and analytics platform."

    * "We are seeing a ton of interest in the analytics roadmaps, especially in the context of these industry data models."

    * "We've seen customers go to competitors, hit a brick wall in terms of their scaling needs, and come back to Vantage."

    * "Not all analytics are created equal."



    Duration unknown
  • Ocient CEO Chris Gladwin: Analyzing the World's Largest Datasets
    Dec 22 2021

    Ocient is a software startup that specializes in complex analysis of the world's largest datasets. Early adopters are hyperscale web companies and enterprises that need to analyze data sets of billions or trillions of records. 

    Prior to Ocient, Gladwin was the founder of object storage vendor Cleversafe, acquired by IBM in 2015. That experience with mega-size data storage carried over to Ocient, whose software is optimized to run on NVMe solid-state storage, industry-standard CPUs, and 100 Gbps networking. 

    John Foley is editor of the Cloud Database Report and senior analyst with Acceleration Economy. 

    Key topics from the interview include: 

    • Ocient is focused on very large datasets—petabytes, exabytes, and trillions of rows of data
    • Leading use cases include digital ad auctions, telecom network traffic, and vehicle fleets
    • Ocient uses a compute-adjacent architecture with storage and compute in the same tier
    • Ocient is available on premises, in the cloud, and as a managed service
    • What’s ahead for Ocient in 2022

    Quotes from the podcast: 

    • "Our focus is on complex analysis of at least hundreds of billions of records, if not trillions or tens of trillions or hundreds of trillions. That's territory that was previously impossible."
    • "Billions is kind of the last scale at which humans can actually make or touch data that big. It's very hard to do, but it's possible. But at trillions scale, it's just not possible."
    • "I've challenged people to give me an example of some new technology, some new version of something that makes less data than the version it replaces."
    • "5G is arguably the largest technology infrastructure investment ever. It's going to create a whole lot more data, at least 10 times the amount of data, for everything."
    • "What we see is, over time, data analysis is going to occur on these hyperscale systems."


    36 minutes
  • Yellowbrick CTO Mark Cusack: What Is a Cloud-Native Data Warehouse?
    Nov 11 2021

    Yellowbrick Data is a 7-year-old startup that continues to grow in the highly competitive cloud data warehouse market. Yellowbrick recently raised $75 million in its latest round of capital funding as it expands into a variety of industries, including telecom, healthcare, retail, and manufacturing. 

    Yellowbrick describes itself as a cloud-native data warehouse.  It is available for deployment on premises and in hybrid cloud and multi-cloud environments.

    Key topics from the interview include: 

    • What makes a database or data warehouse cloud native? APIs, open source, storage tiers, networking. How does Yellowbrick define it?
    • One of the key things with cloud-native data warehouses is the separation of storage and compute. It gives you scalable storage and dynamic compute resources.
    • Not all approaches to storage/compute are the same. Yellowbrick has published a white paper that defines six different levels of storage/compute separation.
    • There are performance and workload advantages, but also important considerations around cost.
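    The storage/compute separation described above can be sketched in a few lines. This is a minimal illustration of the general pattern, not Yellowbrick's architecture or API; all class and method names here are hypothetical.

```python
# Toy sketch of storage/compute separation (hypothetical names, not any
# vendor's API): data lives once in a shared storage tier, while compute
# clusters scale up, down, or out independently without moving the data.

class ObjectStore:
    """Shared, durable storage tier: a single copy of every table."""
    def __init__(self):
        self.tables = {}

class ComputeCluster:
    """Stateless compute tier: reads the shared store, can be resized."""
    def __init__(self, store, nodes=1):
        self.store, self.nodes = store, nodes

    def resize(self, nodes):
        # Scale compute for a workload spike; storage is untouched.
        self.nodes = nodes

    def query(self, table):
        # Count rows in a table read from the shared store.
        return len(self.store.tables.get(table, []))

store = ObjectStore()
store.tables["sales"] = [("2021-11-01", 100), ("2021-11-02", 250)]

etl_cluster = ComputeCluster(store, nodes=8)  # large cluster for loads
bi_cluster = ComputeCluster(store, nodes=2)   # small cluster for dashboards
bi_cluster.resize(4)                          # burst for month-end reporting

# Both clusters see the same single copy of the data.
assert etl_cluster.query("sales") == bi_cluster.query("sales") == 2
```

    The cost consideration in the last bullet falls out of this shape: idle compute clusters can be shrunk or shut down while the data stays put.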

    Quotes from the podcast: 

    • "The separation of storage and compute is table stakes for cloud data warehouses today."
    • "The ultimate goal is a data warehouse that provides the same cloud experience wherever you need to deploy it for business needs or business reasons. That could be data sovereignty, data gravity, regulations, security, latency and things like that, but provide the same easy-to-consume experience throughout."
    • "We're addressing two problems: One, software in data warehouses is not as efficient as it could be. And second, there's a lot of unpredictability around the costs of running these systems." 
    • "Democratization of data and analytics is a key trend. And making a self-service experience for line-of-business users is critical." 


    27 minutes
  • TileDB CEO Stavros Papadopoulos: A Universal Database for Complex Data
    Nov 3 2021

    With a PhD in Computer Science and Engineering from the Hong Kong University of Science and Technology, Papadopoulos worked as a research scientist at Massachusetts Institute of Technology and Intel Labs prior to launching TileDB.  As he explains in this interview, the idea for TileDB originated in that research work in emerging big data systems and the hardware requirements to support those workloads. 

    Universal databases are not new, but they are re-emerging as an alternative to the single-purpose databases that have become popular in the tech industry.  

    Key topics from the interview include: 

    • TileDB stores data in multi-dimensional arrays, or matrices. The data types and workloads it supports.
    • How TileDB differs from object-relational universal databases of a generation earlier.
    • How TileDB compares to purpose-built databases – time-series, graph, document, vector, etc.
    • Use cases and early adopters.
    • TileDB’s availability as a cloud service and for use on-premises.
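    To make the multi-dimensional array model concrete, here is a minimal sketch in plain Python. It illustrates coordinate-addressed cells and rectangular slicing in general, not TileDB's actual storage engine or API.

```python
# Toy sketch of the array data model (illustrative, not TileDB's API):
# cells are addressed by coordinates along dimensions, and a "slice"
# selects a rectangular region -- a natural fit for dense scientific
# data such as genomics matrices or geospatial rasters.

# A tiny dense 2-D array: rows = samples, columns = positions.
array_2d = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]

def slice_2d(arr, row_range, col_range):
    """Select the rectangular region arr[r0:r1, c0:c1]."""
    r0, r1 = row_range
    c0, c1 = col_range
    return [row[c0:c1] for row in arr[r0:r1]]

# Sparse data fits the same model by storing only nonempty cells as
# coordinate -> value pairs, as a point cloud or graph might.
sparse = {(0, 2): 0.3, (2, 1): 1.0}

region = slice_2d(array_2d, (0, 2), (1, 3))
assert region == [[0.2, 0.3], [0.6, 0.7]]
```

    The contrast with purpose-built databases in the bullets above is that one array engine like this can, in principle, back time-series, genomic, and geospatial workloads alike.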

    Quotes from the podcast: 

    • “These ideas were shaped based on interactions we had with practitioners and data scientists across domains. That was key. We did not delve into the traditional, relational query optimization and SQL operations that other people were doing with different architectures in the cloud."
    • "I was very drawn to scientific use cases like geospatial and bio-informatics. And it came as a great surprise to me that none of those verticals and applications were using databases."
    • "Is there a way to build a single storage engine to consolidate this data? A single authentication layer, a single access control layer, and so on. This is how it started."


    33 minutes
  • Yugabyte CTO Karthik Ranganathan: Where Data Lives Forever
    Sep 13 2021

    Ranganathan discusses the design considerations that influenced development of YugabyteDB, including the learnings gleaned from the engineering team’s previous work at Facebook.  YugabyteDB can be deployed on premises or as a cloud service. With built-in replication, YugabyteDB can be used to distribute data across geographic regions in support of data localization requirements and for high availability.

    Key topics in the interview include: 

    • The Yugabyte engineering team worked on the HBase and Cassandra databases at Facebook, experience that is now carrying over to the work they are doing at Yugabyte.
    • How YugabyteDB is different from other distributed SQL databases, including its support for both SQL and NoSQL interfaces.
    • Common use cases for YugabyteDB include real-time transactions, microservices, edge and IoT applications, and geographically distributed workloads.
    • YugabyteDB is available under the Apache 2.0 license and as self-managed and fully managed cloud services.
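    As a rough illustration of how a distributed SQL database can spread data across regions for availability and data localization, here is a generic sketch. It is not YugabyteDB's actual sharding or replication code; the region names and shard count are made up.

```python
# Generic sketch of hash sharding plus cross-region replica placement
# (illustrative only, not YugabyteDB internals).
import hashlib

REGIONS = ["us-east", "eu-west", "ap-south"]  # hypothetical deployment

def shard_for(key, num_shards=16):
    """Hash-partition a row key into one of num_shards shards (tablets)."""
    h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return h % num_shards

def replicas_for(shard, rf=3):
    """Place rf replicas of a shard in distinct regions for HA."""
    return [REGIONS[(shard + i) % len(REGIONS)] for i in range(rf)]

shard = shard_for("order:12345")
placement = replicas_for(shard)

# With replication factor 3 across 3 regions, every shard survives the
# loss of any single region.
assert len(set(placement)) == 3
```

    Built-in replication of this kind is what lets a row's copies be pinned near the users (or jurisdictions) that need them.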

    Quotes from the podcast: 

    • “One of the important characteristics of transactional data is the fact that it needs to live forever.”
    • “We reuse the upper half of Postgres, so it literally is Postgres-compatible and has all of the features.”
    • “We said we're going to meet developers where they develop. We will support both APIs [SQL and NoSQL]. We're not going to invent a new API — that's what people hate.”
    • “It's not the database that people pay money for; it’s the operations of the database and making sure it runs in a turnkey manner that people really find valuable in an enterprise setting.” 


    26 minutes
  • Matillion's Ciaran Dynes: Data Transformation for Cloud Data Warehouses
    Aug 27 2021

    In this episode of the Cloud Database Report Podcast, editor and host John Foley talks with Ciaran Dynes, Chief Product Officer of Matillion, about the  process of integrating and preparing data for cloud data warehouses. Ciaran is responsible for product strategy and incorporating customer requirements into Matillion’s products, which include software tools for data integration and ETL/ELT. 

    Key topics in the interview include: 

    • ETL, which stands for Extract, Transform, and Load, has been standard practice with on-premises data warehouses for 50 years. But ETL is changing in the cloud because data transformation happens in the cloud data warehouse, after data has been extracted and loaded. This new process is called ELT.
    • Data must be integrated from myriad sources. Matillion says that many cloud data warehouses pull data from more than 1,000 databases, applications, and other sources.
    • Data quality is an ongoing challenge, but automation can help. 
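    The ETL-versus-ELT distinction above boils down to where the transform step runs relative to the load step. A toy sketch, with made-up functions rather than Matillion's tools:

```python
# Illustrative sketch of ETL vs. ELT (hypothetical functions, not any
# vendor's API): the same three steps, applied in a different order.

def extract(source):
    """Pull raw rows from a source system."""
    return list(source)

def transform(rows):
    """Normalize rows: uppercase names, drop incomplete records."""
    return [{**r, "name": r["name"].upper()} for r in rows if r.get("name")]

def load(rows, warehouse):
    """Append rows to the (toy, in-memory) warehouse table."""
    warehouse.extend(rows)
    return warehouse

source = [{"name": "acme"}, {"name": None}, {"name": "globex"}]

# ETL: transform before the warehouse ever sees the data.
etl_wh = load(transform(extract(source)), [])

# ELT: load the raw data first, then transform it inside the
# warehouse using the warehouse's own compute.
elt_wh = load(extract(source), [])
elt_wh = transform(elt_wh)

assert etl_wh == elt_wh == [{"name": "ACME"}, {"name": "GLOBEX"}]
```

    The end state is the same; ELT simply pushes the transform work onto the cloud warehouse's scalable compute.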

    Quotes from the podcast conversation: 

    • “We’ve moved to this general concept of bronze, silver, and gold versions of data.”
    • “That’s the game we’re in — can we connect, combine, then synchronize back out into the operational system so we can take an action with a customer in real time?”
    • “Big data forced organizations to make data a board-level and executive-level conversation.”
    • “The culture of data is changing rapidly within companies.”




    31 minutes
  • Google Cloud's Andi Gutmans: What's Driving Database Migrations and Modernization
    Jul 28 2021

    The adoption of cloud databases is accelerating, driven by business transformation and the need for database modernization. 

    In this episode of the Cloud Database Report Podcast, founding editor John Foley talks with Andi Gutmans, Google Cloud's GM and VP of Engineering for Databases, about the platforms and technologies that organizations are using to build and manage these new data environments. 

    Gutmans is responsible for development of Google Cloud's databases and related technologies, including Bigtable, Cloud SQL, Spanner, and Firestore. In this conversation, he discusses the three steps of cloud database adoption: migration, modernization, and transformation. "We're definitely seeing a tremendous acceleration," he says. 

    Gutmans talks about the different types of database migrations, from "homogenous" migrations that are relatively fast and simple to more complex ones that involve different database sources and target platforms. He reviews the tools and services available to help with the process, including Google Cloud's Database Migration Service and Datastream for change data capture. 

    Gutmans provides an overview of the "data cloud" model as a comprehensive data environment that connects multiple databases and reduces the need for organizations to build their own plumbing. Data clouds can "democratize" data while providing security and governance. 

    Looking ahead, Google Cloud will continue to focus on database migrations, developing new enterprise capabilities, and providing a better experience for developers. 



    25 minutes