Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

eager_to_learn
by New Contributor III
  • 6856 Views
  • 7 replies
  • 5 kudos

Resolved! Databricks pool - 2 instances are in running state without any job running in the system

We are using Azure Databricks pools, configured with a maximum of 16 instances. Of those 16, 2 instances are in a running state even though no job is running. How and where can I check the usage of these instances? P.S. The SQL pool is also not running, so no chances o...

Latest Reply
eager_to_learn
New Contributor III
  • 5 kudos

@Kaniz Fatma / @Prabakar Ammeappin Any idea how we can queue jobs in the resource pools? Is there a setting we need to switch on so that jobs are queued until instances are available, or can you point to some documentation for this?

  • 5 kudos
6 More Replies
ABAGRI
by New Contributor II
  • 2819 Views
  • 2 replies
  • 2 kudos

Resolved! Having Issues with extracting records from complex JSON

Hi Team, we are using Delta Live Tables to ingest data from Kafka. The JSON file we receive has a complex structure, and we are trying to explode it into its necessary columns and transactions. Thank you. Please see the attached sample file: { "Table...

Latest Reply
User16753725469
Databricks Employee
  • 2 kudos

Hi @Lantis Pillay, could you please try to parse the JSON records in the way below?

  • 2 kudos
1 More Replies
MattM
by New Contributor III
  • 2959 Views
  • 0 replies
  • 0 kudos

Unstructured Data - PDF and a semi-structured data

I have a scenario where one source is unstructured PDF files and another source is semi-structured JSON files. I get files from these two sources on a daily basis into ADLS storage. What is the best way to load this into a medallion structure by s...

Antoine_De_A
by New Contributor III
  • 3805 Views
  • 1 replies
  • 3 kudos

Resolved! Streaming data to CosmosDB

Hello everyone, here is the problem I am facing. I'm currently working on streaming data to Databricks. My goal is to create a data stream in a first notebook, and then in a second notebook to read this data stream and add all the new rows to a DataFrame...

Latest Reply
Antoine_De_A
New Contributor III
  • 3 kudos

Problem solved! Instead of trying to do everything directly with the .writeStream options, I used the .foreachBatch() function, which lets me call a function defined outside .writeStream. In this function I get a DataFrame as a parameter, which is my str...
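
For anyone landing here, the pattern described above could look roughly like the sketch below. It assumes the Azure Cosmos DB Spark connector is attached to the cluster and registered under the format name `cosmos.oltp`. The names `write_batch`, `start_stream`, and `COSMOS_OPTIONS`, as well as every placeholder value, are hypothetical and not taken from the original post.

```python
# Hypothetical connector settings; replace the placeholders with real values.
COSMOS_OPTIONS = {
    "spark.cosmos.accountEndpoint": "<account-endpoint>",
    "spark.cosmos.accountKey": "<account-key>",
    "spark.cosmos.database": "<database>",
    "spark.cosmos.container": "<container>",
}


def write_batch(batch_df, batch_id):
    """Write one micro-batch; Spark passes a regular (non-streaming) DataFrame."""
    (
        batch_df.write
        .format("cosmos.oltp")  # assumption: Cosmos DB Spark connector is installed
        .options(**COSMOS_OPTIONS)
        .mode("append")
        .save()
    )


def start_stream(stream_df, checkpoint_path):
    """Attach write_batch to a streaming DataFrame and start the query."""
    return (
        stream_df.writeStream
        .foreachBatch(write_batch)
        .option("checkpointLocation", checkpoint_path)
        .start()
    )
```

Because the batch function receives an ordinary DataFrame, any batch writer can be used inside it, which is usually why this approach is simpler than configuring everything on `.writeStream` directly.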

  • 3 kudos
curious-case-of
by New Contributor II
  • 12201 Views
  • 1 replies
  • 4 kudos

Databricks notebook taking too long to run as a job compared to when triggered from within the notebook

I don't know if this question has been covered earlier, but here it goes: I have a notebook that I can run manually using the 'Run' button in the notebook or as a job. The runtime when I run it from within the notebook directly is roughly 2 hours. But w...

Latest Reply
wvl
New Contributor II
  • 4 kudos

We're seeing the same behavior: good performance on an interactive cluster, but bad performance on an identically sized job cluster. Any ideas?

  • 4 kudos
data_engineer_0
by New Contributor II
  • 16058 Views
  • 3 replies
  • 2 kudos

How to run the .py file in databricks cluster

Hi team, I want to run the command below in Databricks and also need to capture the error and success messages. Please help me out here. Thanks in advance. Ex: python3 /mnt/users/code/x.py --arguments
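
One possible way to do this from a notebook cell is sketched below, using only the Python standard library. The helper name `run_script` is hypothetical, and the sketch assumes the script is reachable as a local file on the driver (a DBFS mount is usually exposed under `/dbfs/mnt/...` rather than `/mnt/...`, so the path may need adjusting).

```python
import subprocess
import sys


def run_script(script_path, *args):
    """Run a Python script in a child process and capture its output.

    Returns a tuple (succeeded, stdout, stderr).
    """
    result = subprocess.run(
        [sys.executable, script_path, *args],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the captured bytes to str
    )
    return result.returncode == 0, result.stdout, result.stderr


# Example call, using the path from the question:
# ok, out, err = run_script("/mnt/users/code/x.py", "--arguments")
# print("SUCCESS" if ok else "FAILED")
# print(out if ok else err)
```

A zero exit code is treated as success here; anything the script writes to stderr is returned separately so it can be logged or raised.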

Latest Reply
User16764241763
Databricks Employee
  • 2 kudos

Hello @Piper Wilson, would this task not help? https://docs.databricks.com/dev-tools/api/latest/examples.html#jobs-api-examples

  • 2 kudos
2 More Replies
User15787040559
by Databricks Employee
  • 3683 Views
  • 1 replies
  • 0 kudos

MicrosoftTeams-image

ERROR Max retries exceeded with url: /api/2.0/jobs/runs/get?run_id= Failed to establish a new connection. This error can happen when exceeding the rate limits for all REST API calls as documented here. In the image shown, for example, we're using the Jobs...
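
A common client-side mitigation for rate-limit errors like this is to retry with exponential backoff. The sketch below is a generic illustration, not something from the original post: `call_with_backoff` and its parameters are hypothetical names, and `request_fn` stands for whatever function performs the actual REST call.

```python
import random
import time


def call_with_backoff(request_fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request_fn, retrying with exponential backoff plus jitter on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except Exception:  # in real code, narrow this to connection / HTTP 429 errors
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Wait 1x, 2x, 4x, ... the base delay, plus up to 10% random jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter is injectable only so the waiting can be stubbed out in tests. Polling less often, for example when checking run status, reduces the chance of hitting the limit in the first place.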

Latest Reply
User16764241763
Databricks Employee
  • 0 kudos

Hi @Carlos Morillo, are you facing this issue consistently or only when you run a lot of jobs? We are internally tracking a similar issue. Could you please file a support request with Microsoft Support? Databricks and MSFT will collaborate and provide upd...

  • 0 kudos
chandan_a_v
by Valued Contributor
  • 31143 Views
  • 7 replies
  • 3 kudos
Latest Reply
Prabakar
Databricks Employee
  • 3 kudos

By any chance, was the cluster restarted after installing the libraries or was it detached and reattached from/to the notebook? Notebook-scoped libraries do not persist across sessions. You must reinstall notebook-scoped libraries at the beginning of...

  • 3 kudos
6 More Replies
Gopal_Sir
by New Contributor III
  • 41249 Views
  • 5 replies
  • 7 kudos

Resolved! How to convert a string column to Array of Struct ?

I have a nested struct where one of the fields is a string. It looks something like this: string = "[{\"to_loc\":\"6183\",\"to_loc_type\":\"S\",\"qty_allocated\":\"18\"},{\"to_loc\":\"6137\",\"to_loc_type\":\"S\",\"qty_allocated\":\"9\"},{\"to_lo...
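
One way this is often handled in PySpark is `from_json` with an array-of-struct schema. The sketch below is an assumption about the setup rather than the accepted answer from the thread: the schema is inferred from the three field names visible in the sample, all typed as strings, and `parse_allocations`, `ALLOCATION_SCHEMA`, and the column names are hypothetical.

```python
import json

# DDL schema for the JSON string; field names come from the sample in the question.
ALLOCATION_SCHEMA = (
    "array<struct<to_loc:string,to_loc_type:string,qty_allocated:string>>"
)


def parse_allocations(df, string_col, target_col="allocations"):
    """Add target_col holding string_col parsed as an array of structs (PySpark)."""
    # Imported lazily so the rest of this snippet runs without a Spark runtime.
    from pyspark.sql import functions as F

    return df.withColumn(target_col, F.from_json(F.col(string_col), ALLOCATION_SCHEMA))


# Plain-Python view of the same shape, using the two complete elements of the sample.
sample = (
    '[{"to_loc":"6183","to_loc_type":"S","qty_allocated":"18"},'
    '{"to_loc":"6137","to_loc_type":"S","qty_allocated":"9"}]'
)
parsed = json.loads(sample)  # a list of dicts, one per struct
```

Since the string sits inside a nested struct, `string_col` would be a dotted path such as `parent.field`; the parsed array lands in a new top-level column that can then be exploded if one row per element is needed.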

Latest Reply
-werners-
Esteemed Contributor III
  • 7 kudos

Can you mark the question as answered so others can find the solution?

  • 7 kudos
4 More Replies
kerala_tourism
by New Contributor
  • 841 Views
  • 0 replies
  • 0 kudos

Tourism attractions in Kerala are described here. Kerala has a rich tourism background, which contributes much to the economy. Tourism is the way of i...

Tourism attractions in Kerala are described here. Kerala has a rich tourism background, which contributes much to the economy. Tourism is the way of income for a large number of people in Kerala. National parks, wildlife sanctuaries, etc. are the ma...

LorenzoRovere
by New Contributor II
  • 2658 Views
  • 2 replies
  • 0 kudos

Hi all, my organization has changed our email domain and now none of our Databricks users can log in. We can only log in to the Azure portal with our new dom...

Hi all, my organization has changed our email domain and now none of our Databricks users can log in. We can only log in to the Azure portal with our new domain email. The message is the following (using the new domain). I wonder if there is a way to upload all us...

Latest Reply
LorenzoRovere
New Contributor II
  • 0 kudos

Hi @Prabakar Ammeappin, thanks for your response. I wanted to know if the domain name change is transparent within the same workspace. We don't need to migrate data, only replace the old domain with the new domain. Do you think this is possible?

  • 0 kudos
1 More Replies
Sunny
by New Contributor III
  • 15038 Views
  • 1 replies
  • 1 kudos

Resolved! Maximum duration of the Databricks job before it times out

May I know the maximum duration a job is allowed to run if Timeout is not set? https://docs.databricks.com/data-engineering/jobs/jobs.html

Latest Reply
Sivaprasad1
Databricks Employee
  • 1 kudos

This is part of the configuration of the task itself, so if no timeout is specified, it can theoretically run forever (e.g. streaming use case). Please refer to the timeout section in the link below. https://docs.databricks.com/dev-tools/api/latest/jobs.html#ope...
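
To illustrate where the timeout lives, here is a rough sketch of a task-level `timeout_seconds` in a Jobs API style payload. It is a partial payload only (no cluster settings), the helper `job_settings` and the example paths are hypothetical, and it assumes that a value of 0 is treated as "no timeout", which is worth confirming against the Jobs API reference.

```python
import json


def job_settings(name, notebook_path, timeout_seconds=0):
    """Build a partial job settings payload; 0 is assumed to mean no timeout."""
    return {
        "name": name,
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {"notebook_path": notebook_path},
                "timeout_seconds": timeout_seconds,  # per-task limit in seconds
            }
        ],
    }


# Example: cap a batch task at two hours and print the JSON body.
payload = job_settings("nightly-etl", "/Repos/demo/etl", timeout_seconds=7200)
body = json.dumps(payload, indent=2)
```

Leaving the default of 0 matches the streaming case mentioned above, where the task is expected to run indefinitely.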

  • 1 kudos
mihai
by New Contributor III
  • 10205 Views
  • 7 replies
  • 31 kudos

Resolved! Workspace deployment on AWS - CloudFormation Issue

Hello, I have been trying to deploy a workspace on AWS using the quickstart feature, and I have been running into a problem where the stack fails when trying to create a resource. The following resource(s) failed to create: [CopyZips]. From the CloudWat...

Latest Reply
GarethGraphy
New Contributor III
  • 31 kudos

Dropping by with my experience in case anyone lands here via Google. Note that the databricks-prod-public-cfts bucket is located in us-west-2. If your AWS organisation has an SCP which whitelists specific regions (such as this example) and us-west-2 is...

  • 31 kudos
6 More Replies
