Starburst Enterprise 350-e LTS Release

The Starburst Enterprise 350-e LTS release includes many significant features that help Starburst customers with new and enhanced connectivity, improved performance, and more robust security. We’re also extremely proud of the availability of Starburst Insights, a web application for cluster and query reporting which provides a holistic view of performance and usage across the entire Starburst/Trino environment.

As always, this major release combines features that have been contributed back to the open source project with features curated for Starburst Enterprise customers. To experience this latest release firsthand, please download here.

The new release includes:

Insights

Seeing the entire picture is essential for organizations to make accurate and timely decisions that drive business outcomes. With numerous disparate data sources and many concurrent users running different workloads, gaining visibility into overall performance and usage is an enormous benefit to our customers.

Starburst Insights provides a visual overview of important metrics about your Starburst Enterprise cluster for all types of users, from platform administrators to data consumers. Correlating high CPU usage with the number of running queries yields vital information about the health and performance of your Starburst clusters. Easy-to-read graphs with a date filter allow you to drill into busy time frames to investigate potential issues. Real-time visibility lets users and administrators respond proactively, helping keep clusters, queries, and overall usage optimized.

From the Insights interface, you can access detailed query history, including single-query statistics, query plans, and cluster performance information for a selected date range, providing a full GDPR-level audit trail in real time.

 

Additionally, users can drill down into query history, including:

  • Most accessed tables, and most active users by CPU time and queries
  • Most and least active queries
  • Completed query history
  • Query details, including query statistics and execution plan

Data Lake Connectivity:

Trino (formerly PrestoSQL) has been widely recognized as the de facto data lake query engine. Traditional SQL engines struggle to query data lakes or do so slowly. Because of this, organizations copy and move data from their cloud data lakes into expensive, proprietary data warehouses and data mart solutions for BI to ingest. This takes time and is expensive. With Starburst, customers get enterprise-grade query speed, governance, and security directly on data lake storage.

In the 350-e LTS release, we’ve added support for a new protocol for Delta Lake as well as Hive/Ranger support for Delta Lake. This means Starburst customers stay aligned with updated protocols and can apply Hive/Ranger policies to Delta Lake.

The Starburst Hive Connector already had significant added features compared to open source Trino; you can learn more about it here. It enables access to popular distributed storage such as S3, ADLS, GCS, and more. This release includes CDH 6.3 support for Hive, nanosecond support for Hive, and Hive Views. Starburst customers can now benefit from a library that translates Hive Views to SQL compatible with Trino.

RDBMS and other connectors:

In addition to improvements to distributed storage connectivity, we are also excited to share new and enhanced RDBMS and other connectors. The 350-e LTS release adds a new Starburst Remote Connector and a Starburst Azure Synapse Connector. Furthermore, we’ve added enhancements to the Starburst Kafka Connector, Starburst Redshift Connector, and Starburst SAP HANA Connector.

Starburst Remote Connector:

To meet an emerging demand for hybrid deployments (on-prem-to-cloud, multi-cloud, multi-region, etc.), we’ve developed a Starburst Remote Connector. This connector (informally a Trino-to-Trino connector) enables access to one catalog in a remote Starburst instance. It ensures customers can keep data in specific regions or locations and minimize data movement to optimize performance and control cloud data egress costs. This way data consumers see an identical landscape on-premises and in the cloud. The Starburst Remote Connector can access any resource exposed in the remote catalog’s metastore. It also supports dynamic filtering, aggregate pushdown, and impersonation.


Starburst Azure Synapse Connector

The Synapse connector allows querying an external Azure Synapse SQL database in Starburst Enterprise. The connector also includes performance enhancement features such as table statistics, dynamic filtering, aggregate pushdown and caching table projections (see below). For security enhancements, the Synapse connector supports user impersonation. 

Other Connector Enhancements

We’ve also beefed up some existing connectors with some new bells and whistles. First, the Starburst Kafka connector now supports Confluent schema registry for Kafka’s metastore. It also now supports Protobuf serde and SSO. 

Table (and column) statistics and user impersonation support have been added to the Starburst Redshift connector. We’ve also updated the connector with aggregate pushdown functionality and dynamic filtering to improve performance. The Redshift connector now uses Amazon’s Redshift JDBC driver 2.0.

Our SAP HANA connector received an upgrade as well in this release. We’ve extended support for views, including SAP HANA calculation views. For performance enhancements, the SAP HANA connector now includes aggregate pushdown functionality, time column predicate pushdown, and table and column statistics for use with the cost-based optimizer.
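Aggregate pushdown comes up for several of the connectors above, so a small illustration may help. The Python sketch below is not Starburst code: it uses an in-memory SQLite database as a stand-in for a remote RDBMS, and the table and function names (`orders`, `sum_without_pushdown`, `sum_with_pushdown`) are made up for this example. It only shows the general idea that letting the source compute the aggregate means far fewer rows have to travel back to the query engine.

```python
import sqlite3
from collections import defaultdict

# An in-memory SQLite database stands in for the remote RDBMS (assumption).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 10), ("east", 15), ("west", 7), ("west", 3), ("west", 5)],
)


def sum_without_pushdown(conn):
    """Fetch every row from the source, then aggregate on the engine side."""
    rows = conn.execute("SELECT region, amount FROM orders").fetchall()
    totals = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals), len(rows)  # (result, rows transferred)


def sum_with_pushdown(conn):
    """Let the source aggregate and return only one row per group."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)  # (result, rows transferred)
```

Both functions return the same totals per region; the difference is the second value, which counts how many rows left the source system.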

Performance Improvements 

The Starburst Enterprise 350-e LTS release introduces a cache service that provides the ability to configure and automate the management of table scan redirections. Caching table projections connect to an existing Starburst Enterprise installation to run queries that copy data from the source RDBMS catalog to the target catalog. That target catalog is regularly synchronized with the source and then used as a cache.

For many customers this will improve performance as caching table projections enables the offload of data access to tables accessed in one catalog to equivalent tables accessed in another catalog - thereby transparently shifting data access to a more performant system. A typical use case is the redirection from a catalog configuring a relational database to a catalog using the Hive connector to access a data lake. The result is a positive impact to performance, and reduction of cost and load on underlying systems. 
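To make the redirection idea concrete, here is a minimal Python sketch of a lookup that swaps a source table for its cached copy. Everything in it is hypothetical: the `Redirection` record, the `resolve_table` function, the `max_age_seconds` parameter, and the rule of falling back to the source when the copy is stale are assumptions made for illustration, not the cache service's actual configuration or behavior.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# (catalog, schema, table) identifies a table in this toy model.
TableName = Tuple[str, str, str]


@dataclass(frozen=True)
class Redirection:
    """Hypothetical record: where the cached copy lives and when it was synced."""
    target: TableName
    refreshed_at: float  # time of the last sync, in seconds


def resolve_table(
    source: TableName,
    redirections: Dict[TableName, Redirection],
    now: float,
    max_age_seconds: float,
) -> TableName:
    """Return the cached target if one exists and is fresh, else the source."""
    entry = redirections.get(source)
    if entry is None:
        return source  # no cached copy registered for this table
    if now - entry.refreshed_at > max_age_seconds:
        return source  # assumed policy: a stale copy is not used
    return entry.target


# Example: an RDBMS table cached in a Hive catalog on the data lake.
redirections = {
    ("postgres", "sales", "orders"): Redirection(
        target=("hive", "cache", "orders"), refreshed_at=1_000.0
    )
}
```

In this model the query author keeps referring to the source table; only the table that is actually scanned changes, which is what makes the shift transparent.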

Dynamic filtering, released in 345-e LTS, has been enhanced with more performance opportunities for federated joins. In this release, dynamic filtering is now supported for inequality joins and has been added to RDBMS connectors and the Starburst JDBC connector.
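For readers new to the technique, the following Python sketch illustrates the general idea behind dynamic filtering for an inner join: values collected from the smaller (build) side are used to discard rows from the larger (probe) side before the join runs. It is a conceptual toy, not how the engine implements it, and the function names are invented for this example. The inequality case assumes a join condition of the form `probe.key < build.key`, for which only probe rows below the largest build key can ever match.

```python
from typing import Iterable, List, Tuple

Row = Tuple[int, str]  # (join key, payload)


def equality_filter(build_keys: Iterable[int], probe: Iterable[Row]) -> List[Row]:
    """For probe.key = build.key: keep probe rows whose key exists on the build side."""
    keys = set(build_keys)
    return [row for row in probe if row[0] in keys]


def less_than_filter(build_keys: Iterable[int], probe: Iterable[Row]) -> List[Row]:
    """For probe.key < build.key: keep probe rows below the largest build key."""
    build_keys = list(build_keys)
    if not build_keys:
        return []  # empty build side: no probe row can match an inner join
    upper = max(build_keys)
    return [row for row in probe if row[0] < upper]


probe_rows = [(1, "a"), (5, "b"), (9, "c"), (12, "d")]
build_side = [5, 9]
```

In a federated setting the benefit comes from applying such a filter at the remote source, so that rows which cannot match are never read or transferred.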

Security Improvements

As always with our major LTS releases, we have a focus on security. Accessibility to data is our backbone, but secure access to data is our circulatory system. The 350-e LTS release is no different, as we’ve introduced client-side OIDC support (SSO) for compliance and reduced operational complexity, Ranger 2.1.0 compatibility, and integration with Immuta.

Client-side OIDC allows organizations to centrally manage authentication through an identity provider. Starburst now integrates with such identity providers when using our JDBC driver and supports OAuth2 authentication to the Web UI. For more information on OAuth2 and OIDC please see this illustrated guide.

Ranger 2.1.0 compatibility is essential for Cloudera, as CDP 7 uses Ranger 2.1.0 for role-based access control (RBAC). The Ranger 2.0 client library with Starburst modifications is now updated to work with Ranger 2.1.

And finally, we’re excited to include integration with Immuta, a market leader in ABAC and data governance for regulated data (HIPAA, GDPR, public sector, etc.). To help data teams achieve faster, safer, more cost-efficient analytics and data science initiatives, we have formed a strategic alliance with Immuta. Joint customers using Immuta and Starburst will now be able to unlock sensitive data in the cloud with automated data access control, security, and privacy protection, allowing them to maximize the value of their data, even the most sensitive data.

Download Starburst Free

We’re extremely excited for the continued growth and development of Starburst Enterprise beyond 350-e, and we encourage everyone to benefit from the latest and greatest that Starburst has to offer with our free version, which you can download here. For licensed proprietary features, please reach out to us directly and we’ll be happy to assist you with all your data access needs.

Dan Brault

Director, Product Marketing
