Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

varonis_evgeniy
by Visitor
  • 20 Views
  • 2 replies
  • 0 kudos

Single task job that runs SQL notebook, can't retrieve results

Hello, We are integrating Databricks and I need to run a job with a single task that runs a notebook with a SQL query in it. I can only use a SQL warehouse and no cluster, and I need to retrieve the result of the notebook task, but I can't see the results. Is...

Data Engineering
dbutils
Notebook
sql
Latest Reply
adriennn
Contributor II
  • 0 kudos

> I need to retrieve a result of the notebook task

If you want to know whether the task run succeeded or not, you can enable the "lakeflow" system schema and you'll find the logs of jobs and task runs. You could then use the above info to execute a...

1 More Replies
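
A rough sketch of adriennn's suggestion, assuming the lakeflow system schema is enabled and using a hypothetical job ID (column names follow the system.lakeflow.job_run_timeline system table; verify them against your workspace):

job_id = 123456789  # hypothetical job ID

runs = spark.sql(f"""
    SELECT run_id, period_end_time, result_state, termination_code
    FROM system.lakeflow.job_run_timeline
    WHERE job_id = {job_id}
      AND result_state IS NOT NULL   -- only rows that record a finished run
    ORDER BY period_end_time DESC
    LIMIT 10
""")
runs.show(truncate=False)

Note this tells you whether the run succeeded, not what the SQL returned; to pass actual values downstream you would write the query output to a table (or use task values) and read it from the next task.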
databrickser
by Visitor
  • 24 Views
  • 1 reply
  • 0 kudos

Updating records with auto loader

I want to ingest JSON files from an S3 bucket into a Databricks table using Auto Loader. A job runs every few hours to write the combined JSON data to the table. Some records might be updates to existing records, identifiable by a specific key. I want...

Latest Reply
adriennn
Contributor II
  • 0 kudos

You can use foreachBatch to write to an arbitrary table. See this page for an example of running arbitrary SQL code during an Auto Loader/streaming run.

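To make that concrete, here is a minimal foreachBatch upsert sketch; the landing path, schema/checkpoint locations, target table main.bronze.records, and key column record_key are all hypothetical stand-ins:

from delta.tables import DeltaTable

def upsert_batch(batch_df, batch_id):
    # Keep one row per key within the micro-batch (hypothetical key column).
    deduped = batch_df.dropDuplicates(["record_key"])
    target = DeltaTable.forName(batch_df.sparkSession, "main.bronze.records")
    (target.alias("t")
           .merge(deduped.alias("s"), "t.record_key = s.record_key")
           .whenMatchedUpdateAll()
           .whenNotMatchedInsertAll()
           .execute())

(spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "s3://my-bucket/_schemas/records")  # hypothetical path
      .load("s3://my-bucket/landing/records/")                                 # hypothetical path
      .writeStream
      .foreachBatch(upsert_batch)
      .option("checkpointLocation", "s3://my-bucket/_checkpoints/records")     # hypothetical path
      .trigger(availableNow=True)   # matches the "runs every few hours" batch pattern
      .start())

With availableNow, each scheduled job run processes whatever files arrived since the last checkpoint and then stops.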
ismaelhenzel
by New Contributor III
  • 116 Views
  • 1 reply
  • 3 kudos

Databricks losing RLS when joining in shared clusters

Our company is encountering an unusual issue today. Some tables with Row-Level Security (RLS) applied, when joined together, are returning results that do not respect the RLS policies. This problem has never occurred before. Interestingly, when we sw...

Latest Reply
IgorFlachLima
New Contributor II
  • 3 kudos

Same problem here.

sungsoo
by New Contributor
  • 60 Views
  • 1 reply
  • 0 kudos

AWS: role of NACL outbound port 3306

When using Databricks on AWS, I need to open port 3306 in the NACL outbound rules of the subnet where the endpoint is located. I understand this is to communicate with the Databricks metastore on the instance. Am I right to understand? If not, please let me ...

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

You are correct; this port is used to connect to the Hive metastore.

Brad
by Contributor
  • 75 Views
  • 1 reply
  • 0 kudos

How Databricks assigns memory and cores

Hi team, We are using a job cluster with a 128 GB memory + 16 core node type for a workflow. From the documentation we know one worker is one node and is one executor. From the Spark UI env tab we can see that spark.executor.memory is 24G, and from metrics we can see the m...

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

Databricks allocates resources to executors on a node based on several factors, and it appears that your cluster configuration is using default settings since no specific Spark configurations were provided.

Executor Memory Allocation: The spark.exec...

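For reference, a quick way to check what was actually allocated on a given cluster (these are standard Spark properties, but the defaults Databricks picks vary by node type):

# Run in a notebook cell on the cluster in question.
print(spark.conf.get("spark.executor.memory"))              # e.g. "24g" on a 128 GB node
print(spark.conf.get("spark.executor.cores", "(not set)"))  # if unset, the executor uses the worker's cores
print(sc.defaultParallelism)                                # total task slots across the workers

The gap between the 128 GB node and the 24 GB executor heap is expected: the remainder is roughly reserved for the OS, Databricks daemons, and off-heap/overhead memory.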
mickniz
by Contributor
  • 4834 Views
  • 4 replies
  • 0 kudos

Connect to Databricks from PowerApps

Hi All, Currently I am trying to connect to Databricks Unity Catalog from a Power Apps Dataflow by using the Spark connector, specifying the HTTP URL and using a Databricks personal access token as specified in the screenshot below: I am able to connect, but the issue is when...

mickniz_0-1714487746554.png mickniz_1-1714487891958.png
Latest Reply
arunavbarooah
  • 0 kudos

I get an invalid credentials error when using this method. I have generated a token in Databricks; any idea why?

3 More Replies
AxelM
by Visitor
  • 27 Views
  • 1 reply
  • 0 kudos

Asset Bundles from Workspace for CI/CD

Hello there, I am exploring the possibilities for CI/CD from a DEV workspace to PROD. Besides the notebooks (which can easily be handled by the Git provider), I am mainly interested in the deployment of Jobs/Clusters/DDL... I cannot find a tutorial anywhere ...

Latest Reply
datastones
Contributor
  • 0 kudos

I think the DAB MLOps Stacks template is pretty helpful re: how to bundle, schedule, and trigger custom jobs: https://docs.databricks.com/en/dev-tools/bundles/mlops-stacks.html. You can bundle init it locally and it should give you the skeleton of how to bu...

balwantsingh24
by New Contributor
  • 86 Views
  • 3 replies
  • 0 kudos

Resolved! java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMeta

Guys, please help me solve this issue; I need it on a very urgent basis.

Screenshot 2024-09-27 133729.png
Latest Reply
saikumar246
Contributor
  • 0 kudos

Hi @balwantsingh24. Internal metastore: internal metastores are managed by Databricks and are typically used to store metadata about databases, tables, views, and user-defined functions (UDFs). This metadata is essential for operations like the SHOW...

2 More Replies
sticky
by Visitor
  • 15 Views
  • 0 replies
  • 0 kudos

Running a cell with R-script keeps waiting status

So, I have an R notebook with different cells and a '15.4 LTS ML (includes Apache Spark 3.5.0, Scala 2.12)' cluster. If I select 'Run all', all cells run immediately and the run finishes quickly and fine. But if I would like to run the cells one...

Frustrated_DE
by New Contributor III
  • 46 Views
  • 4 replies
  • 0 kudos

Delta live tables multiple .csv diff schemas

Hi all, I have a fairly straightforward task whereby I am looking to ingest six .csv files, all with different names, schemas, and blob locations, into individual tables in one bronze schema. I have the files in my landing zone under different fol...

Latest Reply
Frustrated_DE
New Contributor III
  • 0 kudos

The code follows a similar pattern to the below to load the different tables.

import dlt
import re
import pyspark.sql.functions as F

landing_zone = '/Volumes/bronze_dev/landing_zone/'
source = 'addresses'

@dlt.table(comment="addresses snapshot", name="addresses")
de...

3 More Replies
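
Since the six feeds only differ by folder, schema, and table name, one common pattern is to generate the DLT tables in a loop. A minimal sketch, with hypothetical folder names standing in for the real ones:

import dlt

landing_zone = "/Volumes/bronze_dev/landing_zone/"
sources = ["addresses", "customers", "orders", "products", "stores", "suppliers"]  # hypothetical folders

def make_table(source):
    # Factory function so each @dlt.table closure captures its own source name.
    @dlt.table(name=source, comment=f"{source} snapshot")
    def ingest():
        return (
            spark.readStream.format("cloudFiles")
                 .option("cloudFiles.format", "csv")
                 .option("cloudFiles.inferColumnTypes", "true")  # each feed keeps its own inferred schema
                 .load(f"{landing_zone}{source}/")
        )

for source in sources:
    make_table(source)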
sazreyruk
by Visitor
  • 20 Views
  • 0 replies
  • 0 kudos

Does It Really Work?

We'll cover this in a minute. That fell on me like a ton of bricks as though doing that just wouldn't be the same without this. I have noticed over the last year a  that empowers a space for a . Like counterparts say, "In for a penny, in for a pound....

Braxx
by Contributor II
  • 5513 Views
  • 4 replies
  • 3 kudos

Resolved! cluster creation - access mode option

I am a bit lazy and trying to manually recreate a cluster I have in one workspace into another one. The cluster was created some time ago. Looking at the configuration, the access mode field is "custom": When trying to create a new cluster, I do not...

Captureaa Capturebb
Latest Reply
khushboo20
Visitor
  • 3 kudos

Hi All - I am new to Databricks and trying to create my first workflow. For some reason, the cluster created is of type "custom". I have not mentioned it anywhere in my asset bundle. Due to this, I cannot get the Unity Catalog features. Could ...

3 More Replies
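
For what it's worth, the "access mode" shown in the UI corresponds to the data_security_mode field in the Clusters API and in asset bundle cluster definitions; leaving it unset (or using a legacy combination) can surface as "custom". A minimal sketch of setting it explicitly via the REST API, with hypothetical host, token, and node type:

import requests

HOST = "https://<workspace-url>"    # hypothetical
TOKEN = "<personal-access-token>"   # hypothetical

cluster_spec = {
    "cluster_name": "uc-single-user",
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "i3.xlarge",          # hypothetical node type
    "num_workers": 1,
    # Set the access mode explicitly so it is not "custom":
    # "SINGLE_USER" = Single user, "USER_ISOLATION" = Shared.
    "data_security_mode": "SINGLE_USER",
}

resp = requests.post(
    f"{HOST}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=cluster_spec,
)
print(resp.status_code, resp.json())

The same data_security_mode key can be set on a job cluster's new_cluster block in an asset bundle.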
tonyd
by New Contributor II
  • 70 Views
  • 1 reply
  • 0 kudos

Getting error "Serverless Generic Compute Cluster Not Supported For External Creators."

Getting the above-mentioned error while creating serverless compute. This is the request:

curl --location 'https://adb.azuredatabricks.net/api/2.0/clusters/create' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: ••••••' \
  --data '{...

Latest Reply
saikumar246
Contributor
  • 0 kudos

Hi @tonyd, thank you for reaching out to the Databricks Community. You are trying to create a serverless generic compute cluster, which is not supported: you cannot create a serverless compute cluster through this API. As per the link below, if you observe, there is no...

FedericoRaimond
by New Contributor II
  • 714 Views
  • 9 replies
  • 3 kudos

Azure Databricks Workflows with Git Integration

Hello, I receive a very weird error when attempting to connect my workflow tasks to a remote Git repo on Azure DevOps. As per the documentation: "For a Git repository, the path relative to the repository root." So I directly use the name of the notebook file...

Latest Reply
madams
New Contributor II
  • 3 kudos

Ah, I have had that same error before when cloning from Git. I'm guessing you got the repo URL by hitting the "Clone" button in ADO and copy/pasting it into Databricks. One thing I've done in the past is, in the Repository URL, remove the "org@" pa...

8 More Replies
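
To illustrate madams' fix: the URL copied from the ADO "Clone" dialog embeds the organization as user info before the host, which can be stripped before pasting into Databricks. A tiny sketch with hypothetical org/project/repo names:

# Hypothetical URL as copied from the Azure DevOps "Clone" dialog:
url = "https://myorg@dev.azure.com/myorg/myproject/_git/myrepo"

# Drop the "org@" user-info prefix that precedes the host:
clean = url.replace("myorg@", "", 1)
print(clean)  # https://dev.azure.com/myorg/myproject/_git/myrepo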
