Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

shawnbarrick
by New Contributor III
  • 6959 Views
  • 2 replies
  • 2 kudos

Resolved! How to resolve SAT driver errors

I was able to follow the SAT setup instructions, but ran into the same error whether I ran it "manually" or via terraform. The initialization seemed to run fine. Can anyone suggest any steps to troubleshoot this?

(screenshot attached)
Latest Reply
shawnbarrick
New Contributor III
  • 2 kudos

Thanks - I also spoke with Arun, who was very helpful. Our Databricks admin users all require an Okta login, which is causing the error. We're looking into a "break glass" admin user for this purpose.

1 More Replies
xneg
by Contributor
  • 3412 Views
  • 5 replies
  • 4 kudos

Is there a way to clone job cluster or edit cluster using JSON?

I've created a workflow job (let's say job A) and set up a job cluster configuration for it. Now I want to create another workflow job (job B) but use almost the same settings for its job cluster. I can see the cluster settings in JSON (for both jobs) but I can't ed...

Latest Reply
artsheiko
Valued Contributor III
  • 4 kudos

You can also use the Terraform exporter with the -match flag to get a .tf definition for job A. Once initialized, you can define job B. Another option is to use dbx.

4 More Replies
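For readers landing on this thread: besides the Terraform exporter route in the reply above, the same clone can be done directly against the Jobs API. A minimal sketch, assuming Jobs API 2.1 and a personal access token in DATABRICKS_TOKEN; the host URL and job ID are placeholders:

import os
import requests

host = "https://<your-workspace>.cloud.databricks.com"
headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

# Fetch job A's full definition, including its job cluster configuration
job_a = requests.get(
    f"{host}/api/2.1/jobs/get",
    headers=headers,
    params={"job_id": 123},  # placeholder: job A's ID
).json()

# Reuse the settings wholesale, rename, and create job B
settings = job_a["settings"]
settings["name"] = "job B"
created = requests.post(f"{host}/api/2.1/jobs/create", headers=headers, json=settings)
print(created.json())  # expect {"job_id": <new id>}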
Leszek
by Contributor
  • 1879 Views
  • 1 reply
  • 1 kudos

IDENTITY column duplication when using BY DEFAULT parameter

Hi, I created a delta table with an identity column using this syntax: Id BIGINT GENERATED BY DEFAULT AS IDENTITY. My steps: 1) Created the table with Id using the syntax above. 2) Added two rows with Id = 1 and Id = 2 (BY DEFAULT allows that). 3) Ran Insert (wit...

(image attached)
Latest Reply
dileep_vikram
New Contributor II
  • 1 kudos

Use the ALTER command below to sync the identity column:
ALTER TABLE table_name CHANGE COLUMN col_name SYNC IDENTITY

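To make the scenario and the fix concrete, a minimal sketch of the thread above (table and column names are illustrative; run in a Databricks notebook):

# Explicit Ids inserted under GENERATED BY DEFAULT do not advance the
# identity counter, so later auto-generated Ids can collide with them.
spark.sql("""
    CREATE TABLE demo_identity (
        Id BIGINT GENERATED BY DEFAULT AS IDENTITY,
        val STRING
    ) USING DELTA
""")
spark.sql("INSERT INTO demo_identity (Id, val) VALUES (1, 'a'), (2, 'b')")

# Without a sync, an auto-generated insert here may produce Id = 1 again.
# SYNC IDENTITY moves the counter past the existing maximum:
spark.sql("ALTER TABLE demo_identity CHANGE COLUMN Id SYNC IDENTITY")
spark.sql("INSERT INTO demo_identity (val) VALUES ('c')")  # gets Id = 3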
DataEngineer92
by New Contributor II
  • 1179 Views
  • 3 replies
  • 0 kudos

databricks-connect runs from Azure DevOps Pipeline jobs not showing on remote cluster

Hi Team, I am trying to run an Azure DevOps pipeline as described in the blog below: https://benalexkeen.com/unit-testing-with-databricks-part-2/ The pipeline is running successfully, however I am not able to see any runs on the remote cluster. Does databric...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Rey Jhon, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your...

2 More Replies
thib
by New Contributor III
  • 3883 Views
  • 3 replies
  • 2 kudos

Can we use multiple git repos for a job running multiple tasks?

I have a job running multiple tasks: Task 1 runs a machine learning pipeline from git repo 1. Task 2 runs an ETL pipeline from git repo 1. Task 2 is actually a generic pipeline and should not be checked into repo 1, and will be made available in another re...

(image attached)
Latest Reply
trijit
New Contributor II
  • 2 kudos

The way to go about this would be to create Databricks Repos in the workspace and then reference them when defining each task. This way we can reference multiple repos across different tasks.

2 More Replies
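A hedged sketch of what that reply describes, as a Jobs API 2.1 job specification in which each task's notebook lives in a different workspace Repo (the /Repos paths and cluster ID are illustrative):

# Two tasks, each pointing at a notebook checked into a different Repo.
job_spec = {
    "name": "multi-repo-job",
    "tasks": [
        {
            "task_key": "ml_pipeline",
            "notebook_task": {"notebook_path": "/Repos/team/repo1/ml_pipeline"},
            "existing_cluster_id": "<cluster-id>",  # placeholder
        },
        {
            "task_key": "etl_pipeline",
            "depends_on": [{"task_key": "ml_pipeline"}],
            "notebook_task": {"notebook_path": "/Repos/team/repo2/etl_pipeline"},
            "existing_cluster_id": "<cluster-id>",  # placeholder
        },
    ],
}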
JonsData
by New Contributor II
  • 1136 Views
  • 2 replies
  • 1 kudos

Databricks Extension on Azure using SPN

Is there any extension for deploying Databricks in Azure DevOps using SPN?

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Amadin Naomi, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

1 More Replies
Anonymous
by Not applicable
  • 378 Views
  • 0 replies
  • 2 kudos

community.databricks.com

We're excited to announce the first four winners of our Raffle, and we want to thank everyone who has participated so far. If you haven't yet entered, don't worry! We still have four more tickets to give away for the world's largest Data + AI summit...

psps
by New Contributor III
  • 3074 Views
  • 3 replies
  • 4 kudos

Databricks Job run logs only shows prints/logs from driver and not executors

Hi, In Databricks Job run output, only logs from the driver are displayed. We have a function parallelized to run on executor nodes. The logs/prints from that function are not displayed in the job run output. Is there a way to configure and show those logs i...

Latest Reply
psps
New Contributor III
  • 4 kudos

Thanks @Debayan Mukherjee. This is to enable executor logging. However, the executor logs do not appear in the Databricks Job run output. Only driver logs are displayed.

2 More Replies
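Executor output never appears on the job run page, but it can be persisted with cluster log delivery. A minimal sketch of the relevant fragment of a cluster specification, assuming logs are delivered to DBFS (the destination path is illustrative):

# cluster_log_conf in the cluster spec delivers driver AND executor logs.
cluster_spec_fragment = {
    "cluster_log_conf": {
        "dbfs": {"destination": "dbfs:/cluster-logs/my-job"}  # placeholder path
    }
}

# After a run, executor stdout/stderr should land per node under:
#   dbfs:/cluster-logs/my-job/<cluster-id>/executor/
# and can be listed from a notebook with dbutils.fs.ls(...).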
Tsar
by New Contributor III
  • 9010 Views
  • 12 replies
  • 12 kudos

Limitations with UDFs wrapping modules imported via Repos files?

We have been importing custom module wheel files from our AzDevOps repository. We are pushing to use Databricks Repos arbitrary file support to simplify this, but it is breaking our Spark UDF that wraps one of the functions in the library with a ModuleNo...

Latest Reply
Scott_B
New Contributor III
  • 12 kudos

If your notebook is in the same Repo as the module, this should work without any modifications to the sys path. If your notebook is not in the same Repo as the module, you may need to ensure that the sys path is correct on all nodes in your cluster th...

11 More Replies
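A hedged sketch of the usual workaround: extend sys.path for the driver and ship the module to executors so the UDF can import it there. The repo path, module name, and function are illustrative, and the path may need a file:/ prefix depending on the runtime:

import sys
from pyspark.sql.functions import udf

# Driver-side: make the Repo importable
sys.path.append("/Workspace/Repos/team/repo1")

# Executor-side: distribute the module file so the UDF can import it
spark.sparkContext.addPyFile("/Workspace/Repos/team/repo1/helpers.py")

from helpers import normalize  # hypothetical module and function
normalize_udf = udf(normalize)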
Ojas1990
by New Contributor
  • 849 Views
  • 0 replies
  • 0 kudos

Why not choose ORC over Parquet?

Why did Spark/Delta Lake choose Parquet over ORC as the file format? I learnt that ORC is much faster when querying, is more compression-efficient than Parquet, and has most of the features that Parquet has on top of that. Why not choose ORC? Am I missing something? Ple...

LukeWarm
by New Contributor II
  • 1834 Views
  • 5 replies
  • 2 kudos

Password reset window freezes for DB community edition

Hi, I've been trying to reset my DB Community Edition password. I receive the email OK and change the password, click submit, but the window just hangs (forever). See attached screen grab.

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Jason Roche, Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so w...

4 More Replies
37319
by New Contributor II
  • 2193 Views
  • 2 replies
  • 3 kudos

Resolved! Integrating Databricks SQL with git repo

Hello, I'm using the Databricks premium version on GCP. I've integrated my git repo (Bitbucket) with Databricks successfully and I can read and write notebooks from it. I'd like to do the same thing with Databricks SQL, but when I switch to SQL mode the re...

Latest Reply
artsheiko
Valued Contributor III
  • 3 kudos

With Unified Navigation, you can see all menu tabs via a single unified menu. When you need to track a SQL query in git, simply create a new .sql file in Repos and commit it.

1 More Replies
Oliver_Angelil
by Valued Contributor II
  • 5052 Views
  • 4 replies
  • 0 kudos

Resolved! Python code linter in Databricks notebook

Is it possible to get syntax linting in a DB notebook? Say with flake8, like I do in VS code?

Latest Reply
artsheiko
Valued Contributor III
  • 0 kudos

No linting is available in a DB notebook for now. The Notebook is currently in the process of adopting Monaco as the underlying code editor, which will offer an improved code authoring experience for notebook cells. Some of the Monaco editor features enab...

3 More Replies
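As a stopgap until built-in linting lands, flake8 can still be run from a notebook cell against files checked into Repos. A sketch, assuming flake8 was installed with %pip in an earlier cell (the target path is illustrative):

import subprocess

# Lint a file in a Repo and print the findings in the notebook output
result = subprocess.run(
    ["flake8", "--max-line-length=100", "/Workspace/Repos/team/repo1/etl.py"],
    capture_output=True, text=True,
)
print(result.stdout or "No lint errors found.")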
Ryan512
by New Contributor III
  • 3905 Views
  • 2 replies
  • 5 kudos

Resolved! Does the `pathGlobFilter` option work on the entire file path or just the file name?

I'm working in the Google Cloud environment. I have an Autoloader job that uses the cloud files notifications to load data into a delta table. I want to filter the files from the PubSub topic based on the path in GCS where the files are located, not...

Latest Reply
Ryan512
New Contributor III
  • 5 kudos

Thank you for confirming what I observed that differed from the documentation.

1 More Replies
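For reference, the behavior this thread confirms, sketched with Auto Loader (bucket and paths are illustrative): pathGlobFilter matches only the file name, so filtering on the GCS directory structure has to happen via a glob in the load path itself.

# pathGlobFilter applies to file names only; the glob in the load path
# is what filters on directories.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("pathGlobFilter", "*.json")         # file-name-level filter
      .load("gs://my-bucket/landing/*/events/"))  # path-level filter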
Jason_923248
by New Contributor III
  • 2055 Views
  • 2 replies
  • 3 kudos

Resolved! In Data Explorer, how do you Refresh a table definition?

In Data Science & Engineering -> Data -> Data Explorer, if I expand the hive_metastore, then expand a schema and choose a table, and then view the "Sample Data", I receive this error: [DEFAULT_FILE_NOT_FOUND] It is possible the underlying files have b...

Latest Reply
padmajaa
New Contributor III
  • 3 kudos

Try refreshing all cached entries that are associated with the table; that might help:
REFRESH TABLE [db_name.]table_name

1 More Replies