Announcing Starburst’s AWS EKS Marketplace Listing

This blog was written in collaboration with Bartosz Owczarek, Software Engineer at Starburst, and Antony Prasad Thevaraj, Sr. Partner Solutions Architect at AWS.

Starburst Enterprise is a versatile and easy-to-use analytics engine, but it can sometimes be challenging to install and configure on AWS. Deploying Starburst in the AWS Cloud just became easier with the introduction of Starburst Enterprise for EKS.

Starburst has been available in the AWS Marketplace for some time as both a CloudFormation template (CFT) and an AMI deployment. With Starburst Enterprise for EKS, a user can deploy a fully optimized installation of the Starburst application running on Kubernetes, combined with native EKS features like autoscaling and Spot instance integration, in a completely self-service manner. It's a fully functioning version of the platform offered as a pay-as-you-go service, with all billing handled through your existing Amazon Web Services account.

To demonstrate how simple this can be, let’s walk through an actual deployment.

Prerequisites

To get started, you will need:

  1. An AWS account with Marketplace access
  2. Helm
  3. The AWS CLI
  4. eksctl
  5. kubectl
  6. Lens

Install all of the listed components on your local machine and configure your client connection to your AWS environment (see the AWS documentation for how to do this).
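Before moving on, it can help to confirm that the command-line prerequisites are actually on your PATH. The snippet below is a minimal sketch: `check_tools` is a hypothetical helper name (not part of any of the listed tools), and Lens is skipped because it is a desktop application rather than a CLI.

```shell
# check_tools is a hypothetical helper: it reports every named
# command that cannot be found on PATH and returns non-zero if
# at least one is missing.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (not run automatically):
#   check_tools helm aws eksctl kubectl && aws sts get-caller-identity
```

The trailing `aws sts get-caller-identity` call in the usage line is one way to confirm that the AWS CLI is pointed at the account you expect.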

Setting up your EKS cluster

There are a number of ways to set up an EKS cluster, but I think the easiest and most repeatable is the eksctl utility. This Go application leverages CloudFormation to build out all the infrastructure required to support an EKS deployment.

A convenient eksctl template that you can use to deploy your own EKS cluster capable of running Starburst is available here. It’s preconfigured to set up a simple cluster with two node pools, one of which leverages both EC2 autoscaling groups and Spot instances. This is perfect for running Starburst in a cost + performance capacity while retaining a high degree of stability . . . but more on that later.

From the template, edit the placeholder variables in bold to suit your environment. This install uses modestly sized m5/m5a/m5ad EC2 machine types, so ensure you have capacity for these in your cloud region!
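If you cannot get to the linked template, the rough shape of such a file looks something like the sketch below. This is not the author's template: the node group names, instance sizes, and capacity numbers are my own assumptions, chosen so that the `starburstpool: base` and `starburstpool: worker` labels line up with the node selectors used in the values file later in this post. It is written to `eksctl-sketch.yaml` so it does not overwrite a real `eksctl.yaml`.

```shell
# Writes a hypothetical two-pool eksctl ClusterConfig to eksctl-sketch.yaml.
# All names and sizes below are illustrative assumptions.
cat > eksctl-sketch.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: YOUR_CLUSTER_NAME
  region: YOUR_CLUSTER_REGION
nodeGroups:
  # On-demand pool for the coordinator pod.
  - name: base
    instanceType: m5.2xlarge
    desiredCapacity: 1
    labels:
      starburstpool: base
  # Autoscaling Spot pool for the worker pods.
  - name: workers
    minSize: 1
    maxSize: 4
    desiredCapacity: 1
    labels:
      starburstpool: worker
    instancesDistribution:
      instanceTypes: ["m5.xlarge", "m5a.xlarge", "m5ad.xlarge"]
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotAllocationStrategy: capacity-optimized
EOF
```

Spreading the Spot pool across several similar instance types gives the autoscaling group more capacity pools to draw from, which is the usual way to keep a Spot-backed pool stable.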

Once you have copied down the template and edited it to suit your environment, go ahead and run this command to create the cluster:

  eksctl create cluster -f eksctl.yaml

 

 

An eksctl job run on the command line 

Once the deployment is complete, your local Kubernetes configuration file will be automatically updated. At this point you can fire up Lens, see the running cluster, and start to move on to the Starburst installation.

The Lens application view of the nodes in your cluster
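If you would rather check the cluster from a terminal than from Lens, something like the following should work. The function names are hypothetical, and the sketch assumes your nodes carry the `starburstpool` label that the values file later in this post selects on.

```shell
# Print the kubectl context that eksctl wrote into your kubeconfig.
show_current_context() {
  kubectl config current-context
}

# List the nodes with an extra column showing each node's
# starburstpool label (empty if the label is not set).
list_starburst_nodes() {
  kubectl get nodes --label-columns starburstpool
}

# Usage (not run automatically):
#   show_current_context
#   list_starburst_nodes
```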

Installing Starburst

Now that you have a fully functioning EKS cluster, it's time to start installing Starburst.

  1. To begin, log in to the AWS Management Console.
  2. Type ‘marketplace’ in the search bar and select ‘AWS Marketplace Subscriptions’.
  3. Click on ‘Discover Products’ in the left navigation.
  4. Search the Marketplace listings for ‘Starburst EKS’.
  5. Select the ‘Starburst Enterprise for EKS PayGo’ option from the list and click through the subscribe and then configure pages.
  6. On the configuration page, ensure that it is set to the ‘Helm Chart CLI installation’ and the latest version of the software, and then click on ‘Continue to Launch’.
  7. The ‘Usage Instructions’ button reveals the detailed steps, which I have also included below for convenience:

 

  • Create a Kubernetes namespace on the cluster for Starburst.
  kubectl create ns starburst-enterprise
  • Associate an IAM OIDC provider for your cluster
  eksctl utils associate-iam-oidc-provider \
     --region YOUR_CLUSTER_REGION \
     --cluster YOUR_CLUSTER_NAME \
     --approve
  • Create an IAM Service Account for your cluster
  eksctl create iamserviceaccount \
     --name starburst-enterprise-sa \
     --namespace starburst-enterprise \
     --cluster YOUR_CLUSTER_NAME \
     --region YOUR_CLUSTER_REGION \
     --attach-policy-arn arn:aws:iam::aws:policy/AWSMarketplaceMeteringFullAccess \
     --approve \
     --override-existing-serviceaccounts

 

  • Create the AWS Marketplace Pull Secret (so we can download the Helm chart package)
  kubectl create secret docker-registry awsmp-registry-pull-secret \
     --docker-server=709825985650.dkr.ecr.us-east-1.amazonaws.com \
     --docker-username=AWS \
     --docker-password="$(aws ecr get-login-password --region us-east-1)" \
     --namespace starburst-enterprise
  • For older versions of Helm (i.e. before v3.8.0), you need to set this environment variable:
  export HELM_EXPERIMENTAL_OCI=1

 

  • Next, authenticate to the Container Registry (ECR):
  aws ecr get-login-password --region us-east-1 \
    | helm registry login \
        --username AWS \
        --password-stdin 709825985650.dkr.ecr.us-east-1.amazonaws.com

 

  • Then pull the Helm chart down from the ECR:
  helm pull oci://709825985650.dkr.ecr.us-east-1.amazonaws.com/starburst/starburst-enterprise-helm-chart-paygo --version 370.2.1-paygo.aws.2

 

  • At this point you could just run the basic Helm install with all the defaults, but this doesn’t give us a lot of optionality. Let’s say I’d also like to point Starburst to my AWS Glue catalog, set my pod sizes to fit the nodes in my cluster, and ensure that I have autoscaling enabled for my Starburst worker pods.

To start, I will need to set up a yaml file. I will then be able to include this values file in my Helm deployment. This is the values.yaml file that I will be using:

catalogs:
  glue: |
    connector.name=hive
    hive.metastore=glue
coordinator:
  nodeSelector:
    starburstpool: base
  resources:
    limits:
      cpu: 5
      memory: 24Gi
    requests:
      cpu: 5
      memory: 24Gi
worker:
  autoscaling:
    enabled: true
    maxReplicas: 4
    minReplicas: 1
  nodeSelector:
    starburstpool: worker
  resources:
    limits:
      cpu: 3
      memory: 12Gi
    requests:
      cpu: 3
      memory: 12Gi
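Before installing, it can be useful to see which defaults the chart ships with and how a values file changes the rendered manifests. The sketch below wraps two standard Helm subcommands; the function names are hypothetical, and the chart filename is the one pulled earlier in this post.

```shell
# Chart archive produced by the earlier `helm pull` step.
CHART="starburst-enterprise-helm-chart-paygo-370.2.1-paygo.aws.2.tgz"

# Print the chart's default values so you can see which keys
# a values file is able to override.
show_chart_defaults() {
  helm show values "${1:-$CHART}"
}

# Render the manifests locally with a values file applied.
# Nothing is sent to the cluster.
render_with_values() {
  helm template starburst-enterprise "${1:-$CHART}" \
    --namespace starburst-enterprise \
    --values "${2:-values.yaml}"
}

# Usage (not run automatically):
#   show_chart_defaults | less
#   render_with_values | less
```

Rendering locally is a cheap way to catch indentation mistakes in the values file before anything is created in the cluster.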

 

  • Install Starburst through the Helm command. Note that I do not need to extract the tar package beforehand. I can just point Helm to it. Also, note where I’ve included my custom values.yaml file.
  helm install starburst-enterprise \
    starburst-enterprise-helm-chart-paygo-370.2.1-paygo.aws.2.tgz \
    --namespace starburst-enterprise \
    --values values.yaml \
    --set 'imagePullSecrets[0].name=awsmp-registry-pull-secret' \
    --set serviceAccountName=starburst-enterprise-sa

 

  • Once the Helm deployment has completed, you should check Lens and verify that all the pods have started up correctly.

Running pods as viewed in Lens

  • Lastly, since we are running the Starburst service in Kubernetes using the default ‘ClusterIP’ service type, which doesn’t allow direct access from the internet, we’ll just use Lens to port-forward the connection to our local machine.

The Starburst service running in Kubernetes

 

Forwarding the connection to my local machine
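The last two steps can also be done with kubectl instead of Lens. This is a sketch under assumptions: the function names are hypothetical, and the service name `starburst` and port 8080 are what I would expect from the chart defaults, so confirm them with `kubectl get svc --namespace starburst-enterprise` before relying on them.

```shell
NAMESPACE="starburst-enterprise"

# Block until every pod in the namespace reports Ready,
# or give up after five minutes.
wait_for_pods() {
  kubectl wait --for=condition=Ready pods --all \
    --namespace "$NAMESPACE" --timeout=300s
}

# Forward a local port to the Starburst service.
# $1 = service name (assumed default: starburst)
# $2 = local port   (assumed default: 8080)
forward_ui() {
  kubectl port-forward --namespace "$NAMESPACE" \
    "svc/${1:-starburst}" "${2:-8080}:8080"
}

# Usage (not run automatically):
#   wait_for_pods && forward_ui
```

While `forward_ui` is running, the web UI should be reachable at http://localhost:8080 under those assumptions.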

Accessing the application

Once the application is installed and I have forwarded my connection using Lens, a browser window will pop up and I’ll be presented with a login screen. Since I am not running the application over HTTPS, I do not need to authenticate. I can just enter whatever value I like as the username and will get access to the application. Once inside, I will be able to see the Glue catalog (as long as I have granted the appropriate permissions), alongside the tpch sample data.

The Starburst login screen

The Starburst application running in my browser
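The same check can be made from a terminal. The sketch below assumes you have the Trino CLI installed as `trino` (it is not in the prerequisites list above) and that the port-forward is listening on localhost:8080; `show_catalogs` and the `demo` username are placeholders of mine.

```shell
# Address of the port-forwarded coordinator (assumed).
SERVER="http://localhost:8080"

# List the catalogs visible to the given user. Without HTTPS,
# any username is accepted, so the default here is arbitrary.
show_catalogs() {
  trino --server "$SERVER" --user "${1:-demo}" --execute "SHOW CATALOGS"
}

# Usage (not run automatically):
#   show_catalogs
```

If the Glue catalog from the values file was picked up, it should appear in the output next to tpch.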

Final thoughts

The Starburst Enterprise for EKS offering in the AWS Marketplace provides a simplified, self-service approach to setting up a Starburst cluster, and an overall better experience than the existing CFT deployment. Everything you need to run an optimized, full version of the platform, using your own data sources and with billing handled through your existing AWS account, is already in place. All you need to do is ensure that you have an EKS cluster set up and execute the deployment steps outlined above. Then you’re off and running, ready to take advantage of the analytics engine for all your data.

Shaun Van Staden

Shaun is a Partner Solutions Architect at Starburst.
