Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

killjoy
by New Contributor III
  • 4900 Views
  • 2 replies
  • 0 kudos

Unexpected failure while fetching notebook - What can we do from our side?

Hello! We have some pipelines running in Azure Data Factory that call Databricks notebooks to run data transformations. This morning at 6:21 AM (UTC) we got an error "Unexpected failure while fetching notebook" inside a notebook that calls another one...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Rita Fernandes: Based on the error message you provided, the issue might be related to a version mismatch between the ANTLR tool used for code generation and the current runtime version. Additionally, the error message suggests that...

1 More Replies
d_meaker
by New Contributor II
  • 1687 Views
  • 3 replies
  • 0 kudos

map_keys() returns an empty array in Delta Live Table pipeline.

We are exploding a map type column into multiple columns based on the keys of the map column. Part of this process is to extract the keys of a map type column called json_map as illustrated in the snippet below. The code executes as expected when run...

Latest Reply
d_meaker
New Contributor II
  • 0 kudos

Hi @Suteja Kanuri, Thank you for your response and explanation. The code I have shown above is not the exact snippet we are using. Please find the exact snippet below. We are dynamically extracting the keys of the map and then using getitem() to mak...

2 More Replies
Neerajkirola
by New Contributor
  • 1105 Views
  • 0 replies
  • 0 kudos

Types of RAM: An In-Depth Overview

Random Access Memory (RAM) is an essential component of any computer system, responsible for temporarily storing data that the CPU (Central Processing Unit) needs to access quickly. It allows for faster data retrieva...

burhanudinera20
by New Contributor II
  • 10157 Views
  • 3 replies
  • 0 kudos

Cannot import name 'Test' from partially initialized module 'databricks_test_helper'

I installed the package with the command 'pip install databricks_test_helper'. Next, I get ImportError messages when I try running this code on cloud Databricks:
from databricks_test_helper import *
expected = set([(s, 'double') for s in ('AP', 'AT', 'PE'...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Burhanudin Badiuzaman: The error message suggests that there may be a circular import within the databricks_test_helper module, which prevents the Test class from being imported properly. One possible solution is to import the Test cl...

2 More Replies
rsamant07
by New Contributor III
  • 4707 Views
  • 11 replies
  • 2 kudos

Resolved! DBT Job Type Authenticating to Azure Devops for git_source

We are trying to execute Databricks jobs with the dbt task type, but they fail to authenticate to Git. The problem is that the job is created using a service principal, but the service principal doesn't seem to have access to the repo. A few questions we have: 1) can we giv...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Rahul Samant, I'm sorry you could not find a solution to your problem in the answers provided. Our community strives to provide helpful and accurate information, but an immediate solution may not be available for every issue. I suggest p...

10 More Replies
sensanjoy
by Contributor
  • 4316 Views
  • 7 replies
  • 3 kudos

Authenticate Databricks REST API and access delta tables from external web service.

Hi All, We have a requirement to access Delta tables from an external web service (web UI). So far we have tested this through a JDBC connection, authenticated using a PAT. Ex.: jdbc:spark://[DATABRICKS_HOST]:443/default;transportMode=http;ssl=1;httpPath...

Latest Reply
sensanjoy
Contributor
  • 3 kudos

Hi @Suteja Kanuri, could you please help me with the above queries?

6 More Replies
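
For readers comparing the JDBC route with the REST API, here is a minimal sketch of how a personal access token (PAT) is typically attached to a Databricks REST call as a Bearer token. The host name, the token value, and the helper name `build_request` are placeholders invented for illustration, and `/api/2.0/clusters/list` is only an example endpoint; the request is built but not sent.

```python
import urllib.request


def build_request(host: str, token: str,
                  path: str = "/api/2.0/clusters/list") -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request.

    `host`, `token`, and the default `path` are illustrative assumptions;
    substitute your own workspace host, PAT, and endpoint.
    """
    url = f"https://{host}{path}"
    # The PAT travels in the Authorization header as a Bearer token.
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the returned object to `urllib.request.urlopen` would perform the call. Whether a PAT or another credential type suits an external web service is a separate design decision that this sketch does not settle.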
Anonymous
by Not applicable
  • 6722 Views
  • 0 replies
  • 0 kudos

As companies grow and evolve, a Chief Technology Officer (CTO) becomes crucial in shaping the organization's technical direction and driving innovation. When filling this critical leadership position, companies decide to either promote an existi...

Pien
by New Contributor II
  • 9506 Views
  • 5 replies
  • 0 kudos

Resolved! Getting date out of year and week

Hi all, I'm trying to get a date out of the columns year and week. The week format is not recognized.
df_loaded = df_loaded.withColumn("week_year", F.concat(F.lit("3"), F.col('Week'), F.col('Jaar')))
df_loaded = df_loaded.withColumn("date", F.to_date(F...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Pien Derkx, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

4 More Replies
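
To illustrate the year-and-week arithmetic outside Spark, here is a small plain-Python sketch. It assumes the week numbers follow ISO 8601 and that day 3 of the week (Wednesday, matching the literal "3" in the question) is the one wanted; neither assumption is confirmed by the thread. The function name `date_from_year_week` is hypothetical.

```python
from datetime import date


def date_from_year_week(year: int, week: int, weekday: int = 3) -> date:
    """Return the calendar date for an ISO year, ISO week, and ISO weekday.

    ISO weekdays run from 1 (Monday) to 7 (Sunday). Treating the input
    as ISO weeks is an assumption made for this sketch.
    """
    return date.fromisocalendar(year, week, weekday)


# Wednesday of ISO week 15 in 2023
print(date_from_year_week(2023, 15))  # 2023-04-12
```

`date.fromisocalendar` needs Python 3.8 or later. If the source data uses a different week-numbering convention, the resulting dates will differ.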
QuicKick
by New Contributor
  • 4126 Views
  • 2 replies
  • 0 kudos

How do I search for all the columns/field names starting with "XYZ"

I would like to do a big search on all field/column names that contain "XYZ". I tried the SQL below but it's giving me an error.
SELECT table_name, column_name
FROM information_schema.columns
WHERE column_name like '%<account>%'
order by table_name, column_na...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Ian Fox, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your ...

1 More Replies
kaileena
by New Contributor
  • 1272 Views
  • 2 replies
  • 0 kudos

Cannot install RMySQL: "there is no package called ‘RMySQL’"

I cannot install RMySQL on Databricks. I tried: install.packages("RMySQL") and got the error: Installing package into ‘/local_disk0/.ephemeral_nfs/envs/rEnv-c677bc4c-e6a3-40df-a5ab-bfd5d277e0c0’ (as ‘lib’ is unspecified) Warning: unable to access index for ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @miru miro, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...

1 More Replies
Merchiv
by New Contributor III
  • 2991 Views
  • 4 replies
  • 0 kudos

Difference between Databricks and local pyspark split.

I have noticed some inconsistent behavior between calling the 'split' function on Databricks and on my local installation. Running it in a Databricks notebook gives
spark.sql("SELECT split('abc', ''), size(split('abc',''))").show()
So the string is split...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Ivo Merchiers: The behavior you are seeing is likely due to differences in the underlying version of Apache Spark between your local installation and Databricks. split() is a function provided by Spark's SQL functions, and different versions of Spa...

3 More Replies
arw1070
by New Contributor II
  • 2119 Views
  • 3 replies
  • 0 kudos

Databricks extension is not configuring in VScode

I am trying to install and work with the Databricks VS Code extension. I installed it a few weeks ago, and it initially worked, but I mistyped some of the configuration, so I tried to restart. Since then it has not worked. Whenever I install the exten...

Latest Reply
karthik_p
Esteemed Contributor
  • 0 kudos

@Anna Wuest, I tried it and am not seeing any issues. Which version of VS Code are you using? Can you please update to the latest Visual Studio Code version (1.77.1), then install the Databricks plugin and test? If you are using Windows --> pleas...

2 More Replies
Nandini
by New Contributor II
  • 10161 Views
  • 10 replies
  • 7 kudos

Pyspark: You cannot use dbutils within a spark job

I am trying to parallelise the execution of file copy in Databricks. Making use of multiple executors is one way. So, this is the piece of code that I wrote in PySpark.
def parallel_copy_execution(src_path: str, target_path: str):
    files_in_path = db...

Latest Reply
Etyr
Contributor
  • 7 kudos

If you have a Spark session, you can use Spark's hidden file system:
# Get FileSystem from SparkSession
fs = spark._jvm.org.apache.hadoop.fs.FileSystem.get(spark._jsc.hadoopConfiguration())
# Get Path class to convert string path to FS path
path = spark._...

9 More Replies
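
Since dbutils is not available inside a Spark job, another option that is sometimes used is to parallelise the copy on the driver with a thread pool instead of on the executors. The sketch below is not taken from the thread: it uses `shutil.copy` on local paths as a stand-in for a driver-side copy call such as `dbutils.fs.cp`, and the function name `parallel_copy` and the default worker count are assumptions chosen for illustration.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List


def parallel_copy(src_dir: str, dst_dir: str, max_workers: int = 8) -> List[str]:
    """Copy every file in src_dir to dst_dir using driver-side threads.

    shutil.copy is a local stand-in; on Databricks the per-file call
    would be replaced by whatever driver-side copy you use.
    """
    sources = sorted(p for p in Path(src_dir).iterdir() if p.is_file())
    target = Path(dst_dir)
    target.mkdir(parents=True, exist_ok=True)

    def copy_one(src: Path) -> str:
        # Each call copies a single file; threads overlap the I/O waits.
        return str(shutil.copy(src, target / src.name))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and re-raises any worker exception.
        return list(pool.map(copy_one, sources))
```

Threads suit this workload because file copies are I/O-bound. The right worker count depends on the storage backend and is best found by measurement.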
GuMart
by New Contributor III
  • 1772 Views
  • 2 replies
  • 1 kudos

Delta Live Tables - RETRY_ON_FAILURE

Hi, Is it possible to set up the RETRY_ON_FAILURE property for DLTs through the API? I'm not finding it in the docs (although it seems to exist in a response payload). https://docs.databricks.com/delta-live-tables/api-guide.html

Latest Reply
GuMart
New Contributor III
  • 1 kudos

Hi @Suteja Kanuri, Thank you so much for the quick and complete answer! Regards,

1 More Replies
alm
by New Contributor III
  • 3813 Views
  • 2 replies
  • 2 kudos

Resolved! Vectorized reading of parquet file containing decimal type column(s)

I was trying to read a Parquet file that contains decimal type columns and write it to a Delta table. I encountered a problem that is pretty neatly described by this kb.databricks article, and which I solved by disabling the vector...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Alberte Mørk: The behavior you observed is due to a known issue in Apache Spark when vectorized reading is used with Parquet files that contain decimal type columns. As you mentioned, the issue can be resolved by disabling vectorized reading for th...

1 More Replies
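
As a rough sketch of what disabling vectorized Parquet reading can look like in a notebook: `spark.sql.parquet.enableVectorizedReader` is a standard Spark SQL option, but the post is truncated, so treating it as the exact setting the poster changed is an assumption. The helper name is hypothetical, and `spark` is expected to be an existing SparkSession.

```python
def disable_parquet_vectorized_reader(spark) -> None:
    """Turn off Spark's vectorized Parquet reader for the given session.

    `spark` is assumed to be an active SparkSession (or any object that
    exposes conf.set). The change applies to the current session only.
    """
    spark.conf.set("spark.sql.parquet.enableVectorizedReader", "false")
```

Disabling the vectorized reader can slow Parquet scans, so it may be worth setting the option back to "true" once the affected read has finished.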
