Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

DLT can't authenticate with kinesis using instance profile

israelst
New Contributor II

When I run my notebook on personal compute with an instance profile, I am able to readStream from Kinesis. But when I run it as a DLT pipeline with UC, specifying the same instance profile in the DLT pipeline settings, Kinesis throws a "MissingAuthentication" exception.

If I use the hive_metastore in the DLT pipeline, it does work! Why does it not work with UC?!

7 REPLIES

brockb
Databricks Employee

Hi,

Please review the "Limitations" section of this Unity Catalog DLT document. Do one or more of the situations described there apply to you, such as:

Existing pipelines that use the Hive metastore cannot be upgraded to use Unity Catalog. To migrate an existing pipeline that writes to Hive metastore, you must create a new pipeline and re-ingest data from the data source(s).

https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#limitations

Thanks.

Mathias_Peters
Contributor

Hi, were you able to solve this problem? If so, what was the solution?

Hello, did you fix this problem? We are having a similar issue with SQS permissions.

We switched to the preview channel and used a roleArn as param. That worked with DLT.

Babu_Krishnan
Contributor

@Mathias_Peters, thanks for the details. I am curious how you made the roleArn part work. We are only able to make it work by passing an accessKey and secret key, not with roleArn. If you are using SQL-based DLT tables, could you please share some code samples showing how you pass the roleArn info?

Mathias_Peters
Contributor

We have used the roleArn and role session name like this:

 

CREATE STREAMING TABLE table_name
AS SELECT * FROM STREAM read_kinesis (
    streamName => 'stream',
    initialPosition => 'earliest',
    roleArn => 'arn:aws:iam::ACCT_ID:role/ROLE_NAME',
    roleSessionName => 'databricks'
);

The service principal executing the pipeline has to be able to assume the role referenced by roleArn. 
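For anyone unsure what "able to assume the role" means on the AWS side: it generally comes down to the trust policy attached to the role named in roleArn. Below is a minimal sketch in Python that only builds and prints such a trust policy document. The helper name build_trust_policy and the PIPELINE_ROLE_NAME placeholder are made up for illustration, and which AWS principal your pipeline actually runs as is an assumption you need to verify in your own workspace.

```python
import json


def build_trust_policy(principal_arn: str) -> dict:
    """Return an IAM trust policy that lets principal_arn assume the role.

    Hypothetical helper for illustration; it is not part of any Databricks
    or AWS SDK. The returned dict follows the standard IAM policy layout.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder ARN (assumption): replace with the AWS identity that your
# pipeline runs as. ACCT_ID and PIPELINE_ROLE_NAME are not real values.
PIPELINE_PRINCIPAL_ARN = "arn:aws:iam::ACCT_ID:role/PIPELINE_ROLE_NAME"

# Print the JSON document so it can be reviewed before attaching it as the
# trust relationship of the role referenced by roleArn.
print(json.dumps(build_trust_policy(PIPELINE_PRINCIPAL_ARN), indent=2))
```

Depending on your account setup, the identity running the pipeline may also need an IAM permission allowing sts:AssumeRole on that role, in addition to the trust policy.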

Thanks for sharing the details, @Mathias_Peters. Let us try this.
