Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

cmilligan
by Contributor II
  • 877 Views
  • 1 reply
  • 1 kudos

Resolved! Database CICD Pipelines

My team has a shared codebase and we are running into issues as we migrate to Databricks when two people are doing development on connected sections of our codebase. For example, if I add a column to a table for changes on my branch, other members on m...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 1 kudos

@Coleman Milligan It's really hard to create something like this without basic knowledge of how CICD should work, or even Terraform. You can start here to understand some basics: https://servian.dev/how-to-hardening-azure-databricks-using-terraform...

kilaki
by New Contributor II
  • 2502 Views
  • 3 replies
  • 0 kudos

Query fails with 'Error occurred while deserializing arrow data' on Databricks SQL with Channel set to Preview

Noticed that a query based on inline selects and joins fails on the client with 'Error occurred while deserializing arrow data', i.e. the query succeeds on Databricks but the client (DBeaver, AtScale) receives an error. The error is only noticed with Databri...

Latest Reply
franco_patano
New Contributor III
  • 0 kudos

Opened an ES on this, looks like an issue with the Preview channel. Thanks for your help!

2 More Replies
rakeshprasad1
by New Contributor III
  • 2248 Views
  • 3 replies
  • 4 kudos

databricks autoloader not updating table immediately

I have a simple autoloader job which looks like this: df_dwu_limit = spark.readStream.format("cloudFiles") \ .option("cloudFiles.format", "JSON") \ .schema(schemaFromJson) \ .load("abfss://synapse-usage@xxxxx.dfs.core.windows.net/synapse-us...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

Can you share the whole code, including the counts you mentioned?

2 More Replies
Vsleg
by Contributor
  • 1801 Views
  • 2 replies
  • 1 kudos

Resolved! Deploying Databricks Workflows and Delta Live Table pipelines across workspaces

Hello, I was wondering if there is a way to deploy Databricks Workflows and Delta Live Tables pipelines across workspaces (DEV/UAT/PROD) using Azure DevOps.

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Yes, for sure, using REST API calls: https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-api-guide.html. You can create the DLT pipeline manually from the GUI, take its JSON representation, and tweak it (so it uses your env variables, for examp...
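
A minimal sketch of the approach described in this reply, assuming the pipeline JSON copied from the GUI has been saved as a template with `${...}` placeholders. The pipeline name, target, notebook path, workspace host, and token below are all hypothetical, and the `/api/2.0/pipelines` path is taken on the assumption that it matches the API guide linked in the reply. The request is only built here, not sent.

```python
import json
import urllib.request
from string import Template

# Hypothetical template: JSON copied from the pipeline settings in the GUI,
# with environment-specific values replaced by ${...} placeholders.
PIPELINE_TEMPLATE = """
{
  "name": "sales_pipeline_${env}",
  "target": "sales_${env}",
  "libraries": [{"notebook": {"path": "${notebook_path}"}}],
  "continuous": false
}
"""


def render_pipeline_settings(template, env_vars):
    """Fill the placeholders and parse the result as JSON.

    Template.substitute raises KeyError if a placeholder has no value,
    so a missing environment variable fails early.
    """
    rendered = Template(template).substitute(env_vars)
    return json.loads(rendered)


def build_create_request(host, token, settings):
    """Build (but do not send) a POST request that would create the pipeline."""
    return urllib.request.Request(
        url=host.rstrip("/") + "/api/2.0/pipelines",  # assumed endpoint path
        data=json.dumps(settings).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values for a UAT workspace; in Azure DevOps these would
# typically come from pipeline variables rather than literals.
settings = render_pipeline_settings(
    PIPELINE_TEMPLATE,
    {"env": "uat", "notebook_path": "/Repos/uat/project/dlt_notebook"},
)
request = build_create_request("https://uat-workspace.example.com", "<token>", settings)
print(request.get_method(), request.full_url)
```

A deployment step would send the request with `urllib.request.urlopen(request)` once per target workspace, each with its own host, token, and variable values.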

1 More Replies
rammy
by Contributor III
  • 1834 Views
  • 2 replies
  • 3 kudos

How can we save a data frame in Docx format using pyspark?

I am trying to save a data frame into a document, but it returns the error below: java.lang.ClassNotFoundException: Failed to find data source: docx. Please find packages at http://spark.apache.org/third-party-projects.htm   #f_d...

Latest Reply
jose_gonzalez
Moderator
  • 3 kudos

Hi, you cannot do it from PySpark, but you can try using pandas to save to Excel. There is no Docx data source.

1 More Replies
Kv1
by New Contributor III
  • 2486 Views
  • 5 replies
  • 6 kudos

Databricks Professional Exam - No Transcript provided when you fail

When I passed the associate exam, I was provided with a transcript which showcases strengths and weaknesses by learning area, like the one below, which helps us understand gaps in our skill set and which areas to concentrate on to improve further. Yesterday I failed i...

Latest Reply
jose_gonzalez
Moderator
  • 6 kudos

Adding @Vidula Khanna​ and @Kaniz Fatma​ for visibility

4 More Replies
SDG_Peter
by New Contributor III
  • 3925 Views
  • 7 replies
  • 7 kudos

Spark image failed to download or does not exist

Good morning, and thank you for the support. In our scheduled job one cluster failed to start with the following error: ```Run result unavailable: job failed with error message Unexpected failure while waiting for the cluster to be ready. Cause Unexpected...

Latest Reply
LandanG
Honored Contributor
  • 7 kudos

@Rita Fernandes​ @Kajetan Gęgotek​ @Yoshi Coppens​ @Viktor Fulop​ @Andrius Vitkauskas​ @Pietro Maria Nobili​ It looks like an issue with Azure account limits, the Databricks eng team is looking into it. Apart from retries, I'd suggest running jobs no...

6 More Replies
pawank9
by New Contributor II
  • 1698 Views
  • 2 replies
  • 0 kudos

SQL dashboard report: Select table column dynamically

I am using a SQL Dashboard report. There are around 400 columns in a table, and I want to display only some of the columns based on the data, not all 400. E.g., a table has columns named c1, c2, c3, c4 and rows r1, r2, r3, r4. The report shou...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

You can build your own logic using query parameters, and/or just select the required columns based on the case.
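
One possible reading of this suggestion, sketched under the assumption that the query text is generated outside the dashboard: validate a list of requested columns against the known columns and build the SELECT from it. The table name `my_table` is hypothetical, and the column names are taken from the example in the question.

```python
# Column names from the example in the question; a real table would have ~400.
KNOWN_COLUMNS = ("c1", "c2", "c3", "c4")


def build_select(table, requested, known=KNOWN_COLUMNS):
    """Return a SELECT statement limited to the requested, known columns."""
    unknown = [c for c in requested if c not in known]
    if unknown:
        # Reject anything that is not a real column instead of passing it to SQL.
        raise ValueError("Unknown columns: %s" % unknown)
    # Keep the table's own column order and drop duplicate requests.
    cols = [c for c in known if c in requested]
    if not cols:
        raise ValueError("No columns selected")
    column_list = ", ".join("`%s`" % c for c in cols)
    return "SELECT %s FROM %s" % (column_list, table)


# Hypothetical table name; prints: SELECT `c1`, `c3` FROM my_table
print(build_select("my_table", ["c3", "c1"]))
```

Checking the requested names against a fixed list keeps arbitrary parameter text out of the generated SQL.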

1 More Replies
KVNARK
by Honored Contributor II
  • 1738 Views
  • 4 replies
  • 3 kudos

Error while reading a file from ADLS 2.0 in Databricks platform

Error while reading a file from ADLS 2.0 in the Databricks platform; below is the error. Has anyone faced a similar issue and found a solution to fix this?

Latest Reply
Vivian_Wilfred
Honored Contributor
  • 3 kudos

Hey @KVNARK, did you run nslookup against the storage endpoint to confirm that this isn't a DNS issue? If not, run this from the same notebook:
%sh nslookup <storage account>.dfs.core.windows.net
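
The same DNS check can also be sketched in Python with the standard `socket` module, which may be convenient inside a Python notebook cell. The storage account name below is a hypothetical placeholder, and the helper names are made up for illustration.

```python
import socket


def storage_endpoint(account):
    """Build the ADLS Gen2 endpoint host name for a storage account."""
    return account + ".dfs.core.windows.net"


def resolve(hostname):
    """Return the sorted IP addresses the host resolves to, or [] if DNS lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Hypothetical storage account name, used only to show the host name format.
host = storage_endpoint("mystorageaccount")
print(host)
```

Calling `resolve(host)` from the notebook and getting an empty list back would point to a DNS problem, much like a failed `nslookup`.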

3 More Replies
Yoshe1101
by New Contributor III
  • 2435 Views
  • 2 replies
  • 0 kudos

Resolved! HTTP Response code: 302

When trying to access a Databricks Warehouse from an external workspace I get the following error. The Warehouse and the workspace are hosted on different AWS subscriptions. But curiously, with the same script I can access the Databricks Warehous...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

It is a network-related issue. Please check the private link, VPC peering, DNS names resolving to IP (from AWS and Azure), and the routing of that IP inside AWS. You can also open a support ticket with AWS.

1 More Replies
abhi_1825
by New Contributor III
  • 2384 Views
  • 6 replies
  • 1 kudos

Resolved! Databricks certified Data Engineer Associate V3 Exam - Voucher code not working.

So I have a voucher code which I received after completing the Lakehouse Fundamentals exam. However, I am not able to use it for the Databricks certified Data Engineer Associate V3 Exam in the public section. It's working for the V2 version along with other exams...

Latest Reply
abhi_1825
New Contributor III
  • 1 kudos

Thanks guys. My voucher code problem was resolved after raising a ticket with the Training team.

5 More Replies
Toy
by New Contributor II
  • 1891 Views
  • 3 replies
  • 0 kudos

Pipeline Error [Py4JJavaError] com.databricks.WorkflowException: com.databricks.NotebookExecutionException: FAILED

I have a pipeline that used to run successfully and now all of a sudden is returning this error that I cannot resolve: [Py4JJavaError]

Latest Reply
Toy
New Contributor II
  • 0 kudos

Hi guys, you're right, the problem is with the child notebook. All my notebooks are failing at this point. I can't seem to be winning at solving this error.

2 More Replies
Phani1
by Valued Contributor
  • 935 Views
  • 1 reply
  • 0 kudos

Databricks issue with returning results to PowerBI

While returning results to PowerBI, Databricks completed the session (in 9 mins), but PowerBI is still waiting for the results (more than 7 hrs. for 20 GB of data). Could you please help us with this?

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

@Janga Reddy It's really hard to find a solution here without further investigating PBI Gateway performance, data models, etc. If Databricks completed the session in 9 mins, then I assume the issue could be with the performance of the PBI datasets.
