
Tag: Education

Showing 77 results

Run optimized geospatial queries with Trino

March 23, 2023

The Trino open source distributed query engine is known as a choice for running ad-hoc analysis where there’s no need to model the data and...

How fast access to data and quality ML code can enable competitive differentiation and innovation

February 16, 2023

2022 ended with many successful AI models being deployed, including OpenAI’s ChatGPT. There’s no doubt that there will be plenty more successes in 2023....

Trino for Large-Scale ETL @ Lyft

January 25, 2023

Lyft operates one of the largest transportation networks in the world. A business like ours depends on data on so many levels. Data relating...

Federating Data Products during migration

January 19, 2023

Organizations that are interested in adopting Data Mesh strategies have found success with Starburst and Google Cloud Platform (GCP) by simplifying migrations and connecting...

Why data products matter

January 17, 2023

The increasing interest in data mesh architectures has driven more and more conversation around the concept of data as a product. Yet this idea...

Building a federated data lakehouse with Starburst Galaxy

January 11, 2023

We are eleven days into the new year, and I have spent the past two weeks exerting unreasonable amounts of effort trying to make...

Build great data products and reduce cognitive load

January 10, 2023

Regular readers of our blog know that many businesses have found success managing data as a product. In essence, they have learned how to...

Tableau Cloud + Starburst: New Connector Supports Shift to Cloud-based SaaS

December 19, 2022

The shift to cloud-based software-as-a-service platforms is accelerating in just about every tech industry. So it wasn’t much of a surprise to the analytics...

What Are The Different Types Of Data Products

December 16, 2022

As we’ve gone from Data Mesh theory to practice, organizations have been shifting their focus towards the central tenet of Data Mesh — building...

Optimizing Operations with a Data Product Runbook

December 13, 2022

As we begin to see greater maturity across the Data Mesh designs that early adopters have put in place, we can start to better...

Apache Iceberg Time Travel & Rollbacks in Trino

December 7, 2022

This post is part of the Iceberg blog series. Read the entire series: Introduction to Apache Iceberg in Trino Iceberg Partitioning and Performance Optimizations...

Democratize Data Access and Data Discovery

November 30, 2022

As more data than ever is being produced, organizations strive to capitalize on capturing every customer insight possible. In a perfect world, gathering all...

Apache Iceberg Schema Evolution in Trino

November 22, 2022

This post is part of the Iceberg blog series. Read the entire series: Introduction to Apache Iceberg in Trino Iceberg Partitioning and Performance Optimizations...

Optimize your Data Products for Competitive Differentiation

November 21, 2022

A recent Gartner survey of over 2,000 CIOs highlighted the importance of accelerating the time to value from their digital investments. The survey also...

Apache Iceberg DML (update/delete/merge) & Maintenance in Trino

November 17, 2022

This post is part of the Iceberg blog series. Read the entire series: Introduction to Apache Iceberg in Trino Iceberg Partitioning and Performance Optimizations...

Iceberg Partitioning and Performance Optimizations in Trino

November 8, 2022

This post is part of the Iceberg blog series. Read the entire series: Introduction to Apache Iceberg in Trino Iceberg Partitioning and Performance Optimizations...

Introduction to Apache Iceberg In Trino

October 20, 2022

This post is part of the Iceberg blog series. Read the entire series: Introduction to Apache Iceberg in Trino Iceberg Partitioning and Performance Optimizations...

Second Edition of Trino: The Definitive Guide

October 5, 2022

Starburst has played a key role in the Trino community for a long time now. We contribute to the success of Trino every day....

The Data Virtualization Evolution is Just Beginning

October 4, 2022

Data virtualization revolutionized the data infrastructure space by serving data consumers directly on top of data stores, without the need to move data elsewhere....

Delivering Text Search Capabilities Directly on the Data Lake with Starburst

September 29, 2022

In the big data analytics world, enabling analytics on unstructured text is a powerful capability. For that reason, it would be of use that...

Rethinking SIEM Solutions

September 13, 2022

As organizations strive to become more agile, there has been a mass movement jumping headfirst into what is called a security data lake. Gartner...

The Difference Between Micro-Partitioning vs. Indexing and a Better Way

September 8, 2022

When optimizing your analytics database performance, one of the most important decisions is to choose how data is stored and accessed. There are two...

Identify threats faster with a security data lake

August 26, 2022

The glory days of SIEM are over. Security teams are not only measured by their ability to collect as much data as possible, but...

The choice is yours: Open source Trino and Starburst Galaxy

August 9, 2022

A few months back when Starburst Galaxy launched on AWS, Google Cloud, and Azure, I wrote a blog on What Fully-Managed Means to Starburst....

Starburst Lakehouse: Data Warehouse Functionality, Without The Cost

July 14, 2022

Next-Gen data management and analytics strategies We’ve all lived it. Heard it. Adapted to it. The next analytics strategy with numerous ‘modern’ technologies to...

Scaling Up: When to Migrate from PostgreSQL to a Data Lake

July 13, 2022

One of the true pillars of the tech revolution, PostgreSQL is an OLTP database designed primarily to handle transactional workloads. The technology has been...

Confessions of a Space Quest League Advocate

July 6, 2022

Mission 2 Wrap and Mission 3 Launch We all know at least one pandemic puzzler, a devoted crossworder, or a religious wordler who finds...

Let’s Get Granular: Why Granularity Impacts Role-Based Access Control

June 14, 2022

About a month into my first job I finished building my first data pipeline ever. I soaked in the “I Made THAT!” moment, and...

Transforming Your Data Pipelines with Starburst

June 9, 2022

Current State of ETL/ELT Extract-transform-load, more commonly known by its street name “ETL”, has been around since the early days of computing. Bringing together...

The Past, Present, and Future of Trino

May 24, 2022

Recently, I had the pleasure of chatting with Ravit Jain on his show “The Ravit Show” to discuss the evolution of Trino and where...

Starburst and Data Products: The Key to a Data Mesh

April 28, 2022

The key to success for any company is deriving business value from data in a robust, scalable, and timely fashion. A huge part of...

Unlocking the Value of a Data Mesh Through Metadata

April 21, 2022

Discoverable. Understandable. Trustworthy. These are just a few of the key ideas of a Data Mesh infrastructure. Go through all of them and you’ll...

Starburst Enriches Google BigQuery’s Customer Experience with Access to Data Across Hybrid, Cross-Cloud, Salesforce, and Data Warehouses

April 5, 2022

At Starburst, we’re partnering with Google Cloud to allow hybrid and cross-cloud federation of data and analytics. With this new offering, Starburst supports data...

Simplifying Policy Enforcement for Your Data Mesh with Starburst Enterprise and Immuta

March 23, 2022

This blog was co-authored by Alex Breshears, Product Manager at Starburst In today’s global economy, it’s impossible to understate the importance of being able...

New Study Reveals Key Trends Including a Shift Towards Decentralized Data Architecture

March 22, 2022

Data sprawl grows in complexity. Data science emerges as the top analytical workload. Data pipeline is fraught with challenges. Data access is now more...

What A SQL Query Engine Can Do For Big Data

February 16, 2022

Nod with me if you’ve suffered from the following problems with processing and analyzing Big Data via a centralized approach: different query languages, niche...

Top 6 Reasons to Migrate to the Cloud

January 25, 2022

Starburst released the 2021 State of Data market research report, conducted by Enterprise Management Associates (EMA), in collaboration with Red Hat, early last year....

What does fully managed mean in the cloud

January 14, 2022

Starburst solves the problem of sprawling data ecosystems by providing a mechanism for you to directly query your data. Starburst was built upon the...

The Right Way to Query Across Data Sources in Tableau (or, The Cross-Database Join Is Not Always Your Friend)

January 13, 2022

Summary Use the right tool for the right job. Not doing so means the difference between your Tableau viz rendering in seconds vs. minutes...

Starburst Stargate: One Cluster to Rule Them All

December 9, 2021

I think of Starburst Stargate as the Lord of the Rings feature. Or the galactic empire feature. In a prior blog post, I introduced...

Data warehouse vs Lake vs Lakehouse architecture

December 6, 2021

As companies shift their analytical ecosystems from on-premise to cloud and try to avoid “data lock-in”, we’re noticing some very interesting data patterns. This...

Tableau is Just Better with Starburst

November 15, 2021

I’m one of those strange people who has always enjoyed doing performance testing. The thought of spinning up lots of machines to do my...

The Analytics Engine for Distributed Data

October 1, 2021

The idea of a single source of truth has been around since the beginning of big data. However, over the years, through the data...

Dynamic Filtering: Supporting High Speed Access to Data

September 20, 2021

Analysts are often tasked with deriving insights for business units where the data can span multiple locations. This is increasingly true today when the...

Does the data mesh make data integration harder?

September 17, 2021

Every five years, a small group of leaders in the data management research community get together to do a self assessment --- what are...

The Intelligent Edge

September 13, 2021

Today’s digital world is an expanding frontier of emerging technologies. There are endless innovations, inspired by data, informed by data, enabled by data, and...

What Data Mesh Means for Data Analysts and Data Scientists

September 7, 2021

Get early access to free early release chapters (including the newly released chapter!) of the O’Reilly book, Data Mesh: Delivering Data-Driven Value at Scale,...

Hybrid Distributed Data Store and RDBMS

August 12, 2021

As companies shift their analytical ecosystems from on-premise to cloud and try to avoid “data lock-in”, we’re noticing some very interesting data patterns. This...

The data mesh approach to analytics at scale

July 29, 2021

Get early access to free early release chapters (including the newly released chapter!) of the O’Reilly book, Data Mesh: Delivering Data-Driven Value at Scale,...

Why Data Lakehouse Architecture Now?

July 20, 2021

Recently I presented with my colleagues, Justin Borgman, CEO of Starburst, and Daniel Abadi, Chief Scientist, at The Data and AI Summit by Databricks...

Starburst Elements: What is Trino?

July 15, 2021

Data engineers are struggling to keep up with the demands of their data consumers. Every team has their favorite database, object storage, or other...

What is Data Literacy?

July 6, 2021

As I walked out of my lecture on data literacy in New York a few years back, a member of the audience told me,...

Starburst Elements: A Holistic View of Your Cluster and Query Environment with Starburst Insights

July 1, 2021

This is the fifth episode in our video series, Starburst Elements, focused around anything and everything Starburst. In this episode, our Product Manager Vishal...

The State of Data Analysts

June 28, 2021

The world of data analysis is constantly changing and evolving, and sometimes it can be hard to keep up with. I had the pleasure...

Starburst Stargate: The Final Frontier in Analytics Anywhere

June 9, 2021

Today we announced Starburst Stargate, the industry’s first gateway for global cross-cloud analytics. I’m excited to share more behind why we built this and...

Trino on Ice IV: Deep Dive Into Iceberg Internals

June 8, 2021

Trino on ice I: A gentle introduction to Iceberg Trino on ice II: In-place table evolution and cloud compatibility with Iceberg Trino on ice...

Trino on Ice III: Iceberg Concurrency Model, Snapshots, and the Iceberg Spec

May 25, 2021

Trino on ice I: A gentle introduction to Iceberg Trino on ice II: In-place table evolution and cloud compatibility with Iceberg Trino on ice...

Starburst Elements: Start Fast with Starburst Galaxy

May 20, 2021

This is the fourth episode in our video series, Starburst Elements, focused around anything and everything Starburst. In this episode, our Product Manager Vishal...

Trino on Ice II: In-Place Table Evolution and Cloud Compatibility with Iceberg

May 11, 2021

Trino on ice I: A gentle introduction to Iceberg Trino on ice II: In-place table evolution and cloud compatibility with Iceberg Trino on ice...

Starburst Elements: Introduction to Starburst Galaxy

May 7, 2021

This is the third episode in our video series, Starburst Elements, focused around anything and everything Starburst. In this episode, our Product Manager Vishal...

Trino On Ice I: A Gentle Introduction To Iceberg

April 27, 2021

Trino on ice I: A gentle introduction to Iceberg Trino on ice II: In-place table evolution and cloud compatibility with Iceberg Trino on ice...

Trino, Data Governance, and Accelerating Data Science

April 22, 2021

Back during the Datanova conference, I had the pleasure of interviewing Martin, Dain, and David, three of the original creators of Presto. They started...

Newly Released Independent Study Uncovers Distributed Data & Analytics Trends

March 4, 2021

The recent pandemic has caused a massive digital shift and made the need for data access more critical than ever for 53% of...

Understanding the Starburst and Trino Hive Connector Architecture

February 18, 2021

After a decade of running Hive queries on their data lakes, many companies are astonished at the speeds at which they are able to...

The Future of Analytics: In Conversation With Matt Fuller

February 5, 2021

Datanova is just next week. More than 2,000 data and analytics leaders will join us to learn more about how to unlock the value...

Video: How to Migrate from EMR to Starburst Enterprise Presto

November 13, 2020

In this video, I walk you through the steps of migrating between an existing EMR Presto cluster that has existing data in S3 and...

Top 10 Reasons to Migrate from OS Presto on EMR to Starburst Enterprise Presto

November 13, 2020

In today’s data architecture economy, there are no shortages of options when it comes to choosing various distributions and deployment strategies for a given...

Starburst Secrets – Hiding Sensitive Presto Credentials

August 26, 2020

As Presto continues to rapidly become the SQL engine of choice powering the modern big data consumption layer, security is at the top of...

The Death of Apache Drill

August 6, 2020

One of the things that really drew me to and got me excited about Presto over 4 years ago was that it wasn’t tied...

The 4 Stages to Big Data Nirvana (In the Cloud)

July 18, 2019

Nirvana - a state of perfect happiness; an ideal or idyllic place. In big data “Nirvana” is a wishlist of items: The ability to...

Presto Summit 2019 Recap

June 25, 2019

*If you are looking for the 2019 NYC Presto Summit, event info and registration can be found here.* Presentation Slides Presentation Videos The...

Starburst Enterprise & Databricks Delta Lake Support

June 13, 2019

TL;DR - There is now Starburst Enterprise Databricks Delta Lake compatibility. Delta Lake The big data ecosystem has many components but the one...

The Art of Abstraction: the continuing separation of compute and storage for data analytics

December 4, 2018

We recently invited 451 Research VP, Matt Aslett to share his thoughts and observations on the practice of separating the storage and computation of...

Securing Presto with Apache Ranger

October 3, 2018

Starburst is excited to announce the general availability of Starburst Presto Enterprise 208e. This is the first Presto release to bring you Apache Ranger...

Starburst’s Presto on AWS up to 18x faster than EMR

June 26, 2018

Karol Sobczak & Anu Sudarsan, Co-Founders & Software Engineers at Starburst Introduction Last week, we announced the availability of Starburst’s Presto on AWS Marketplace. With...

Introduction to Trino Cost-Based Optimizer

April 9, 2018

The Cost-Based Optimizer (CBO) we have released just recently achieves stunning results in industry standard benchmarks (and not only in benchmarks)! The CBO makes...

Presto Cost-Based Optimizer rocks the TPC benchmarks!

April 3, 2018

Wojciech Biela, Co-founder at Starburst Introduction As mentioned in our previous blog about the Starburst Presto release and its hottest addition - the Cost...

