Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

varonis_evgeniy
by Visitor
  • 20 Views
  • 2 replies
  • 0 kudos

Single task job that runs SQL notebook, can't retrieve results

Hello, We are integrating Databricks and I need to run a job with a single task that runs a notebook with a SQL query in it. I can only use a SQL warehouse and no cluster, and I need to retrieve the result of the notebook task, but I can't see the results. Is...

Data Engineering
dbutils
Notebook
sql
Latest Reply
adriennn
Contributor II
  • 0 kudos

> I need to retrieve a result of the notebook task

If you want to know whether the task run succeeded or not, you can enable the "lakeflow" system schema and you'll find the logs of jobs and task runs. You could then use the above info to execute a...

1 More Replies
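
A rough sketch of adriennn's suggestion, assuming the lakeflow system schema is enabled and using a hypothetical job ID (column names follow the system.lakeflow.job_run_timeline system table; verify them against your workspace):

job_id = 123456789  # hypothetical job ID

runs = spark.sql(f"""
    SELECT run_id, period_end_time, result_state, termination_code
    FROM system.lakeflow.job_run_timeline
    WHERE job_id = {job_id}
      AND result_state IS NOT NULL   -- only rows that record a finished run
    ORDER BY period_end_time DESC
    LIMIT 10
""")
runs.show(truncate=False)

Note this tells you whether the run succeeded, not what the SQL returned; to pass actual values downstream you would write the query output to a table (or use task values) and read it from the next task.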
databrickser
by Visitor
  • 24 Views
  • 1 reply
  • 0 kudos

Updating records with auto loader

I want to ingest JSON files from an S3 bucket into a Databricks table using Auto Loader. A job runs every few hours to write the combined JSON data to the table. Some records might be updates to existing records, identifiable by a specific key. I want...

Latest Reply
adriennn
Contributor II
  • 0 kudos

You can use foreachBatch to write to an arbitrary table. See this page for an example of running arbitrary SQL code during an Auto Loader/streaming run.

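To make that concrete, here is a minimal foreachBatch upsert sketch; the landing path, schema/checkpoint locations, target table main.bronze.records, and key column record_key are all hypothetical stand-ins:

from delta.tables import DeltaTable

def upsert_batch(batch_df, batch_id):
    # Keep one row per key within the micro-batch (hypothetical key column).
    deduped = batch_df.dropDuplicates(["record_key"])
    target = DeltaTable.forName(batch_df.sparkSession, "main.bronze.records")
    (target.alias("t")
           .merge(deduped.alias("s"), "t.record_key = s.record_key")
           .whenMatchedUpdateAll()
           .whenNotMatchedInsertAll()
           .execute())

(spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "s3://my-bucket/_schemas/records")  # hypothetical path
      .load("s3://my-bucket/landing/records/")                                 # hypothetical path
      .writeStream
      .foreachBatch(upsert_batch)
      .option("checkpointLocation", "s3://my-bucket/_checkpoints/records")     # hypothetical path
      .trigger(availableNow=True)   # matches the "runs every few hours" batch pattern
      .start())

With availableNow, each scheduled job run processes whatever files arrived since the last checkpoint and then stops.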
ismaelhenzel
by New Contributor III
  • 116 Views
  • 1 reply
  • 3 kudos

Databricks losing RLS when joining in shared clusters

Our company is encountering an unusual issue today. Some tables with Row-Level Security (RLS) applied, when joined together, are returning results that do not respect the RLS policies. This problem has never occurred before. Interestingly, when we sw...

Latest Reply
IgorFlachLima
New Contributor II
  • 3 kudos

Same problem here.

sungsoo
by New Contributor
  • 60 Views
  • 1 reply
  • 0 kudos

AWS: role of NACL outbound port 3306

When using Databricks on AWS, I need to open port 3306 in the NACL outbound rules of the subnet where the endpoint is located. I understand this is to communicate with the Databricks metastore on the instance. Am I right to understand? If not, please let me ...

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

You are correct; this port is used to connect to the Hive metastore.

Brad
by Contributor
  • 75 Views
  • 1 reply
  • 0 kudos

How Databricks assigns memory and cores

Hi team, We are using a job cluster with a 128 GB memory + 16 core node type for a workflow. From the documentation we know one worker is one node and is one executor. From the Spark UI env tab we can see that spark.executor.memory is 24G, and from metrics we can see the m...

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

Databricks allocates resources to executors on a node based on several factors, and it appears that your cluster configuration is using default settings since no specific Spark configurations were provided.

Executor Memory Allocation: The spark.exec...

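For reference, a quick way to check what was actually allocated on a given cluster (these are standard Spark properties, but the defaults Databricks picks vary by node type):

# Run in a notebook cell on the cluster in question.
print(spark.conf.get("spark.executor.memory"))              # e.g. "24g" on a 128 GB node
print(spark.conf.get("spark.executor.cores", "(not set)"))  # if unset, the executor uses the worker's cores
print(sc.defaultParallelism)                                # total task slots across the workers

The gap between the 128 GB node and the 24 GB executor heap is expected: the remainder is roughly reserved for the OS, Databricks daemons, and off-heap/overhead memory.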
mickniz
by Contributor
  • 4834 Views
  • 4 replies
  • 0 kudos

Connect to Databricks from PowerApps

Hi All, Currently I am trying to connect to Databricks Unity Catalog from a Power Apps Dataflow by using the Spark connector, specifying the HTTP URL and using a Databricks personal access token as specified in the screenshot below: I am able to connect, but the issue is when...

mickniz_0-1714487746554.png mickniz_1-1714487891958.png
Latest Reply
arunavbarooah
  • 0 kudos

I get an invalid credentials error when using this method. I have generated a token in Databricks; any idea why?

3 More Replies
AxelM
by Visitor
  • 27 Views
  • 1 reply
  • 0 kudos

Asset Bundles from Workspace for CI/CD

Hello there, I am exploring the possibilities for CI/CD from a DEV workspace to PROD. Besides the notebooks (which can easily be handled by the Git provider), I am mainly interested in the deployment of Jobs/Clusters/DDL... I cannot find a tutorial anywhere ...

Latest Reply
datastones
Contributor
  • 0 kudos

I think the DAB MLOps Stacks template is pretty helpful re: how to bundle, schedule, and trigger custom jobs: https://docs.databricks.com/en/dev-tools/bundles/mlops-stacks.html. You can bundle init it locally and it should give you the skeleton of how to bu...

balwantsingh24
by New Contributor
  • 86 Views
  • 3 replies
  • 0 kudos

Resolved! java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMeta

Guys, please help me solve this issue; I need it on a very urgent basis.

Screenshot 2024-09-27 133729.png
Latest Reply
saikumar246
Contributor
  • 0 kudos

Hi @balwantsingh24. Internal metastore: internal metastores are managed by Databricks and are typically used to store metadata about databases, tables, views, and user-defined functions (UDFs). This metadata is essential for operations like the SHOW...

2 More Replies
sticky
by Visitor
  • 15 Views
  • 0 replies
  • 0 kudos

Running a cell with R-script keeps waiting status

So, I have an R notebook with different cells and a '15.4 LTS ML (includes Apache Spark 3.5.0, Scala 2.12)' cluster. If I select 'Run all', all cells run immediately and the run finishes quickly and fine. But if I would like to run the cells one...

Frustrated_DE
by New Contributor III
  • 46 Views
  • 4 replies
  • 0 kudos

Delta live tables multiple .csv diff schemas

Hi all, I have a fairly straightforward task whereby I am looking to ingest six .csv files, all with different names, schemas, and blob locations, into individual tables in one bronze schema. I have the files in my landing zone under different fol...

Latest Reply
Frustrated_DE
New Contributor III
  • 0 kudos

The code follows a similar pattern to the below to load the different tables.

import dlt
import re
import pyspark.sql.functions as F

landing_zone = '/Volumes/bronze_dev/landing_zone/'
source = 'addresses'

@dlt.table(comment="addresses snapshot", name="addresses")
de...

3 More Replies
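
Since the six feeds only differ by folder, schema, and table name, one common pattern is to generate the DLT tables in a loop. A minimal sketch, with hypothetical folder names standing in for the real ones:

import dlt

landing_zone = "/Volumes/bronze_dev/landing_zone/"
sources = ["addresses", "customers", "orders", "products", "stores", "suppliers"]  # hypothetical folders

def make_table(source):
    # Factory function so each @dlt.table closure captures its own source name.
    @dlt.table(name=source, comment=f"{source} snapshot")
    def ingest():
        return (
            spark.readStream.format("cloudFiles")
                 .option("cloudFiles.format", "csv")
                 .option("cloudFiles.inferColumnTypes", "true")  # each feed keeps its own inferred schema
                 .load(f"{landing_zone}{source}/")
        )

for source in sources:
    make_table(source)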
sazreyruk
by Visitor
  • 20 Views
  • 0 replies
  • 0 kudos

Does It Really Work?

We'll cover this in a minute. That fell on me like a ton of bricks as though doing that just wouldn't be the same without this. I have noticed over the last year a  that empowers a space for a . Like counterparts say, "In for a penny, in for a pound....

Braxx
by Contributor II
  • 5513 Views
  • 4 replies
  • 3 kudos

Resolved! cluster creation - access mode option

I am a bit lazy and trying to manually recreate a cluster I have in one workspace into another one. The cluster was created some time ago. Looking at the configuration, the access mode field is "custom": When trying to create a new cluster, I do not...

Captureaa Capturebb
Latest Reply
khushboo20
Visitor
  • 3 kudos

Hi All - I am new to Databricks and trying to create my first workflow. For some reason, the cluster created is of type "custom". I have not mentioned it anywhere in my asset bundle. Due to this, I cannot get the Unity Catalog features. Could ...

3 More Replies
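
For what it's worth, the "access mode" shown in the UI corresponds to the data_security_mode field in the Clusters API and in asset bundle cluster definitions; leaving it unset (or using a legacy combination) can surface as "custom". A minimal sketch of setting it explicitly via the REST API, with hypothetical host, token, and node type:

import requests

HOST = "https://<workspace-url>"    # hypothetical
TOKEN = "<personal-access-token>"   # hypothetical

cluster_spec = {
    "cluster_name": "uc-single-user",
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "i3.xlarge",          # hypothetical node type
    "num_workers": 1,
    # Set the access mode explicitly so it is not "custom":
    # "SINGLE_USER" = Single user, "USER_ISOLATION" = Shared.
    "data_security_mode": "SINGLE_USER",
}

resp = requests.post(
    f"{HOST}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=cluster_spec,
)
print(resp.status_code, resp.json())

The same data_security_mode key can be set on a job cluster's new_cluster block in an asset bundle.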
tonyd
by New Contributor II
  • 70 Views
  • 1 reply
  • 0 kudos

Getting error "Serverless Generic Compute Cluster Not Supported For External Creators."

Getting the above-mentioned error while creating serverless compute. This is the request:

curl --location 'https://adb.azuredatabricks.net/api/2.0/clusters/create' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: ••••••' \
  --data '{...

Latest Reply
saikumar246
Contributor
  • 0 kudos

Hi @tonyd, thank you for reaching out to the Databricks Community. You are trying to create a serverless generic compute cluster, which is not supported: you cannot create a serverless compute cluster through this API. As per the link below, if you observe, there is no...

FedericoRaimond
by New Contributor II
  • 714 Views
  • 9 replies
  • 3 kudos

Azure Databricks Workflows with Git Integration

Hello, I receive a very weird error when attempting to connect my workflow tasks to a remote Git repo on Azure DevOps. As per the documentation: "For a Git repository, the path relative to the repository root." So I directly use the name of the notebook file...

Latest Reply
madams
New Contributor II
  • 3 kudos

Ah, I have had that same error before when cloning from Git. I'm guessing you got the repo URL by hitting the "Clone" button in ADO and copy/pasting it into Databricks. One thing I've done in the past is, in the Repository URL, remove the "org@" pa...

8 More Replies
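
To illustrate madams' fix: the URL copied from the ADO "Clone" dialog embeds the organization as user info before the host, which can be stripped before pasting into Databricks. A tiny sketch with hypothetical org/project/repo names:

# Hypothetical URL as copied from the Azure DevOps "Clone" dialog:
url = "https://myorg@dev.azure.com/myorg/myproject/_git/myrepo"

# Drop the "org@" user-info prefix that precedes the host:
clean = url.replace("myorg@", "", 1)
print(clean)  # https://dev.azure.com/myorg/myproject/_git/myrepo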
