cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

Neerajkirola
by New Contributor
  • 2008 Views
  • 0 replies
  • 0 kudos

Types of RAM: An In-Depth OverviewRandom Access Memory (RAM) is an essential component of any computer system, responsible for temporarily storing dat...

Types of RAM: An In-Depth OverviewRandom Access Memory (RAM) is an essential component of any computer system, responsible for temporarily storing data that the CPU (Central Processing Unit) needs to access quickly. It allows for faster data retrieva...

Head to Head Table
  • 2008 Views
  • 0 replies
  • 0 kudos
burhanudinera20
by New Contributor II
  • 13393 Views
  • 3 replies
  • 0 kudos

Cannot import name 'Test' from partially initialized module 'databricks_test_helper'

I have done install, with this command ' pip install databricks_test_helper 'next get ImportError messages when i try running this code on cloud databricks ;from databricks_test_helper import *expected = set([(s, 'double') for s in ('AP', 'AT', 'PE'...

  • 13393 Views
  • 3 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Burhanudin Badiuzaman​ :The error message suggests that there may be a circular import happening within the databricks_test_helper module, which is preventing the Test class from being properly imported.One possible solution is to import the Test cl...

  • 0 kudos
2 More Replies
rsamant07
by New Contributor III
  • 6088 Views
  • 11 replies
  • 2 kudos

Resolved! DBT Job Type Authenticating to Azure Devops for git_source

we are trying to execute the databricks jobs for dbt task type but it is failing to autheticate to git. Problem is job is created using service principal but service principal don't seem to have access to the repo. few questions we have:1) can we giv...

  • 6088 Views
  • 11 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Rahul Samant​ I'm sorry you could not find a solution to your problem in the answers provided.Our community strives to provide helpful and accurate information, but sometimes an immediate solution may only be available for some issues.I suggest p...

  • 2 kudos
10 More Replies
sensanjoy
by Contributor
  • 6716 Views
  • 7 replies
  • 3 kudos

Authenticate Databricks REST API and access delta tables from external web service.

Hi All,We do have a requirement to access delta tables from external web service(Web UI). Presently we have tested it through jdbc connection and authenticated using PAT:Ex. jdbc:spark://[DATABRICKS_HOST]:443/default;transportMode=http;ssl=1;httpPath...

  • 6716 Views
  • 7 replies
  • 3 kudos
Latest Reply
sensanjoy
Contributor
  • 3 kudos

Hi @Suteja Kanuri​ , could you please help me with above queries.

  • 3 kudos
6 More Replies
Anonymous
by Not applicable
  • 6954 Views
  • 0 replies
  • 0 kudos

As companies grow and evolve, a Chief Technology Officer (CTO) becomes crucial in shaping the organization's technical direction and driving innov...

As companies grow and evolve, a Chief Technology Officer (CTO) becomes crucial in shaping the organization's technical direction and driving innovation. Regarding filling this critical leadership position, companies decide to either promote an existi...

  • 6954 Views
  • 0 replies
  • 0 kudos
Pien
by New Contributor II
  • 11183 Views
  • 5 replies
  • 0 kudos

Resolved! Getting date out of year and week

Hi all,I'm trying to get a date out of the columns year and week. The week format is not recognized.  df_loaded = df_loaded.withColumn("week_year", F.concat(F.lit("3"),F.col('Week'), F.col('Jaar')))df_loaded = df_loaded.withColumn("date", F.to_date(F...

  • 11183 Views
  • 5 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Pien Derkx​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

  • 0 kudos
4 More Replies
QuicKick
by New Contributor
  • 7691 Views
  • 2 replies
  • 0 kudos

How do I search for all the columns/field names starting with "XYZ"

I would like to do a big search on all field/columns names that contain "XYZ".I tried below sql but it's giving me an error.SELECT table_name,column_nameFROM information_schema.columnsWHERE column_name like '%<account>%'order by table_name, column_na...

  • 7691 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Ian Fox​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your ...

  • 0 kudos
1 More Replies
kaileena
by New Contributor
  • 1539 Views
  • 2 replies
  • 0 kudos

cannot install RMySQL "there is no package called ‘RMySQL’

cannot install RMySQL on databricks. i tried:install.packages("RMySQL")i got the error:Installing package into ‘/local_disk0/.ephemeral_nfs/envs/rEnv-c677bc4c-e6a3-40df-a5ab-bfd5d277e0c0’ (as ‘lib’ is unspecified) Warning: unable to access index for ...

  • 1539 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @miru miro​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...

  • 0 kudos
1 More Replies
Merchiv
by New Contributor III
  • 7612 Views
  • 4 replies
  • 0 kudos

Difference between Databricks and local pyspark split.

I have noticed some inconsistent behavior between calling the 'split' fuction on databricks and on my local installation.Running it in a databricks notebook givesspark.sql("SELECT split('abc', ''), size(split('abc',''))").show()So the string is split...

image.png
  • 7612 Views
  • 4 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Ivo Merchiers​ :The behavior you are seeing is likely due to differences in the underlying version of Apache Spark between your local installation and Databricks. split() is a function provided by Spark's SQL functions, and different versions of Spa...

  • 0 kudos
3 More Replies
arw1070
by New Contributor II
  • 2626 Views
  • 3 replies
  • 0 kudos

Databricks extension is not configuring in VScode

I am trying to install and work with the Databricks vscode extensions. I installed it a few weeks ago, and it initially worked, but I mistyped some of the configuration so I tried to restart, since then it has not worked. Whenever I install the exten...

  • 2626 Views
  • 3 replies
  • 0 kudos
Latest Reply
karthik_p
Esteemed Contributor
  • 0 kudos

@Anna Wuest​ I have Tried and not seeing any issues, which version of Vs code you are using. can you please try to update to latest Visual Studio Code version 1.77.1 and try to Install databricks plugin version and test .if you using windows--> pleas...

  • 0 kudos
2 More Replies
Nandini
by New Contributor II
  • 12603 Views
  • 10 replies
  • 7 kudos

Pyspark: You cannot use dbutils within a spark job

I am trying to parallelise the execution of file copy in Databricks. Making use of multiple executors is one way. So, this is the piece of code that I wrote in pyspark.def parallel_copy_execution(src_path: str, target_path: str): files_in_path = db...

  • 12603 Views
  • 10 replies
  • 7 kudos
Latest Reply
Etyr
Contributor
  • 7 kudos

If you have spark session, you can use Spark hidden File System:# Get FileSystem from SparkSession fs = spark._jvm.org.apache.hadoop.fs.FileSystem.get(spark._jsc.hadoopConfiguration()) # Get Path class to convert string path to FS path path = spark._...

  • 7 kudos
9 More Replies
GuMart
by New Contributor III
  • 2402 Views
  • 2 replies
  • 1 kudos

Delta Live Tables - RETRY_ON_FAILURE

Hi,Is it possible to set it up the RETRY_ON_FAILURE property for DLTs through the API?I'm not finding in the Docs (although it seems to exist in a response payload).https://docs.databricks.com/delta-live-tables/api-guide.html

  • 2402 Views
  • 2 replies
  • 1 kudos
Latest Reply
GuMart
New Contributor III
  • 1 kudos

Hi @Suteja Kanuri​ ,Thank you so much for the quick and complete answer!Regards,

  • 1 kudos
1 More Replies
alm
by New Contributor III
  • 5302 Views
  • 2 replies
  • 2 kudos

Resolved! Vectorized reading of parquet file containing decimal type column(s)

I was trying to read a parquet file, and write to a delta table, with a parquet file that contains decimal type columns. I encountered a problem that is pretty neatly described by this kb.databricks article, and which I solved by disabling the vector...

  • 5302 Views
  • 2 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Alberte Mørk​ :The behavior you observed is due to a known issue in Apache Spark when vectorized reading is used with Parquet files that contain decimal type columns. As you mentioned, the issue can be resolved by disabling vectorized reading for th...

  • 2 kudos
1 More Replies
Anonymous
by Not applicable
  • 2215 Views
  • 2 replies
  • 2 kudos

Hello Everyone, I&#39;m interested to learn about the certifications you&#39;re pursuing to enhance your skills. Sharing your goals can inspire those ...

Hello Everyone,I'm interested to learn about the certifications you're pursuing to enhance your skills. Sharing your goals can inspire those who may have started their certification journey but struggled with motivation. Personally, I recently comple...

  • 2215 Views
  • 2 replies
  • 2 kudos
Latest Reply
FJ
Contributor III
  • 2 kudos

I'm trying the Data Engineering professional exam at the end of the month. It's like a shot in the dark because no practice exams stop are available and from what I've seen online from people who already passed it, the Advanced Data Engineering with ...

  • 2 kudos
1 More Replies
Anonymous
by Not applicable
  • 8080 Views
  • 8 replies
  • 0 kudos

Not able to connect to On-Prem Oracle from Databricks cluster

Hi Everyone,I was trying to connect to Oracle Instance from Databricks cluster and it is giving below error:java.sql.SQLTimeoutException: ORA-12170: Cannot connect. TCP connect timeout of 30000ms for host xx.x.x.*** port 1521. (CONNECTION_ID=CgM7V7UB...

  • 8080 Views
  • 8 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Satya89:The error message you received indicates that the TCP connection to the Oracle database timed out. This could be caused by a number of factors such as network issues, firewall restrictions, or the database being overloaded.Here are a few ste...

  • 0 kudos
7 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group
Labels