Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

vvt1976
by New Contributor
  • 2991 Views
  • 1 reply
  • 0 kudos

Create table using a location

Hi, Databricks newbie here. I have copied Delta files from my Synapse workspace into DBFS. To add them as a table, I executed: create table audit_payload using delta location '/dbfs/FileStore/data/general/audit_payload'. The command executed properly. Ho...

Data Engineering
data engineering
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

Can you read the Delta Lake files using spark.read.format("delta").load("path/to/delta/table")? If not, it is not a valid Delta Lake table, which is my guess, as creating a table from Delta Lake is nothing more than a semantic wrapper around the actual...
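
A minimal sketch of that check, using the DBFS path from the question; note that Spark reads use the dbfs:/ scheme rather than the /dbfs/ local mount, which is a common source of this problem:

# Path taken from the question; adjust to your workspace.
# Spark APIs expect dbfs:/..., not the /dbfs/... fuse-mount form.
path = "dbfs:/FileStore/data/general/audit_payload"

# If this read fails, the directory is not a valid Delta table,
# e.g. the _delta_log folder was not copied over from Synapse.
df = spark.read.format("delta").load(path)
df.printSchema()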

PankajMendi
by New Contributor
  • 806 Views
  • 1 reply
  • 0 kudos

Error accessing Azure SQL from Azure Databricks using JDBC authentication=ActiveDirectoryInteractive

Getting the below error while accessing Azure SQL using JDBC from an Azure Databricks notebook: com.microsoft.sqlserver.jdbc.SQLServerException: Failed to authenticate the user p***** in Active Directory (Authentication=ActiveDirectoryInteractive). Unable to...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

It seems you are trying to do MFA authentication over JDBC. The driver in use might not support that. It could also be an OS issue (if you are not using Windows, for example) or a browser issue (the browser will have to open a window/tab). Can you try to authent...
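
A sketch of one non-interactive alternative, assuming the Microsoft SQL Server JDBC driver is available on the cluster; the server, table, user, and secret names are placeholders:

# ActiveDirectoryPassword avoids the browser round-trip that
# ActiveDirectoryInteractive requires (which a notebook cannot open).
jdbc_url = (
    "jdbc:sqlserver://yourserver.database.windows.net:1433;"
    "database=yourdb;"
    "authentication=ActiveDirectoryPassword"
)

df = (spark.read.format("jdbc")
      .option("url", jdbc_url)
      .option("dbtable", "dbo.your_table")                      # hypothetical table
      .option("user", "user@yourtenant.com")                    # AAD account
      .option("password", dbutils.secrets.get("scope", "key"))  # placeholder secret
      .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
      .load())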

anuintuceo
by New Contributor
  • 973 Views
  • 1 reply
  • 0 kudos

Unzip a password-protected file using a Synapse notebook

I have a zipped file containing 3 CSV files. It is password protected; when I tried extracting it manually, it would only extract with 7-Zip. I moved the zipped file to ADLS automatically and want to extract it with the password. How to unzip the file and ...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

It is most probably possible. If you use Python, the zipfile library can do it, something like this: with zipfile.ZipFile(zip_file_path, 'r') as zip_ref: zip_ref.extractall(path=extract_to, pwd=bytes(password, 'utf-8')). In Scala there is f....
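
A fuller, runnable version of that snippet, as a sketch; the paths and secret names are placeholders. Note that Python's zipfile only supports the legacy ZipCrypto scheme, so if the archive was encrypted with AES (an option commonly used in 7-Zip) a library such as pyzipper would be needed instead:

import zipfile

zip_file_path = "/dbfs/tmp/protected.zip"   # hypothetical path via the DBFS fuse mount
extract_to = "/dbfs/tmp/extracted"          # hypothetical target folder
password = dbutils.secrets.get("scope", "zip-password")  # avoid hardcoding passwords

with zipfile.ZipFile(zip_file_path, "r") as zip_ref:
    zip_ref.extractall(path=extract_to, pwd=bytes(password, "utf-8"))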

Sushmg
by New Contributor
  • 1323 Views
  • 1 reply
  • 0 kudos

REST API call

There is a requirement to create a pipeline that calls a REST API, and we have to store the data in a data warehouse. Which is the best way to do this operation?

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

There are several ways to do this. You could use Python (or Scala or ...) to do the call, then transform the response and write it to the DWH. Or you could do the call, write the raw data, and process it later on. Or you could use an ETL/ELT tool that can do the r...
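
A minimal sketch of the first option, with a hypothetical endpoint and target table:

import requests

# Call the API (placeholder URL).
resp = requests.get("https://api.example.com/v1/records", timeout=30)
resp.raise_for_status()

# Land the payload as a DataFrame and write it to the warehouse layer.
# Assumes the response body is a JSON array of flat objects.
df = spark.createDataFrame(resp.json())
df.write.mode("append").saveAsTable("bronze.api_records")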

amitkmaurya
by Contributor
  • 2722 Views
  • 2 replies
  • 4 kudos

Resolved! How to increase executor memory in Databricks jobs

Maybe because I am new to Databricks I have some confusion. Suppose I have worker memory of 64 GB in a Databricks job with max 12 nodes... and my job is failing due to Executor Lost with exit code 137 (OOM, from what I found on the internet). So, to fix this I need to increase execut...

Latest Reply
amitkmaurya
Contributor
  • 4 kudos

Hi @raphaelblg, I have solved this issue. Yes, in my case data skewness was causing the executor OOM, so adding a repartition just before writing resolved the skew. I didn't change any worker or driver memory. Thanks for your h...
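
A sketch of the fix described, with hypothetical table and column names; a plain repartition does a round-robin shuffle, which evens out skewed input partitions before the write:

# Tune the partition count to your data volume and cluster size.
df = df.repartition(400)

(df.write
   .mode("overwrite")
   .partitionBy("col_a", "col_b")   # hypothetical partition columns
   .saveAsTable("my_schema.my_table"))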

1 More Reply
amitkmaurya
by Contributor
  • 3756 Views
  • 1 reply
  • 1 kudos

Resolved! Databricks job keeps failing due to executor lost.

Getting the following error while saving a dataframe partitioned by two columns: Job aborted due to stage failure: Task 5774 in stage 33.0 failed 4 times, most recent failure: Lost task 5774.3 in stage 33.0 (TID 7736) (13.2.96.110 executor 7): ExecutorLos...

Data Engineering
databricks jobs
spark
Latest Reply
amitkmaurya
Contributor
  • 1 kudos

Hi, I have solved the problem with the same workers and driver. In my case data skewness was the problem. Adding a repartition to the dataframe just before writing evenly distributed the data across the nodes, and the stage failure was resolved. Thanks @Reti...

Sushmg
by New Contributor
  • 2343 Views
  • 0 replies
  • 0 kudos

Call REST API

Hi, there is a requirement to create a pipeline that calls an API and stores that data in a data warehouse. Can you suggest the best way to do this?

Mirza1
by New Contributor
  • 837 Views
  • 1 reply
  • 0 kudos

Error while Running a Table

Hi All, I am trying to run a table schema change and am facing the below error. Error - AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. com.databricks.backend.common.rpc.SparkDriverExceptions$SQLExecutionException: org.apache...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @Mirza1, greetings! Can you please confirm whether it is an ADLS Gen2 table? If yes, can you please try running the table schema change after setting the Spark configs for Gen2 at the cluster level? You can refer to this document to set the Spark co...
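
For reference, a sketch of the ADLS Gen2 service-principal configs this points to, shown with spark.conf.set for illustration (the same key/value pairs can go in the cluster's Spark config); the storage account, tenant, and secret names are placeholders:

storage_account = "mystorageaccount"
suffix = f"{storage_account}.dfs.core.windows.net"

spark.conf.set(f"fs.azure.account.auth.type.{suffix}", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{suffix}",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{suffix}",
               dbutils.secrets.get("scope", "sp-client-id"))
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{suffix}",
               dbutils.secrets.get("scope", "sp-client-secret"))
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{suffix}",
               "https://login.microsoftonline.com/<tenant-id>/oauth2/token")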

Silabs
by New Contributor II
  • 6985 Views
  • 3 replies
  • 4 kudos

Resolved! Set up connection to on prem sql server

I've just set up our Databricks environment, hosted in AWS. We have an on-prem SQL Server and would like to connect. How can I do that?

Latest Reply
Yeshwanth
Databricks Employee
  • 4 kudos

@Silabs good day! To connect your Databricks environment (hosted on AWS) to your on-premise SQL server, follow these steps: 1. Network Setup: Establish a connection between your SQL server and the Databricks virtual private cloud (VPC) using VPN or A...
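
Once the network path is in place, a quick connectivity test from a notebook might look like this sketch; the host, database, table, and secret names are placeholders:

jdbc_url = "jdbc:sqlserver://onprem-sql.internal.example.com:1433;database=mydb"

df = (spark.read.format("jdbc")
      .option("url", jdbc_url)
      .option("dbtable", "dbo.some_table")
      .option("user", "svc_databricks")
      .option("password", dbutils.secrets.get("scope", "sql-password"))
      .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
      .load())

display(df.limit(10))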

2 More Replies
dbengineer516
by New Contributor III
  • 4672 Views
  • 4 replies
  • 2 kudos

Resolved! Git Integration with Databricks Query Files and Azure DevOps

I’ve been trying to develop a solution for our team to be able to have Git integration between Databricks and Azure DevOps. However, the “query” file type/workspace item on Databricks can’t be committed and pushed to a Git repo, only regular file typ...

Latest Reply
Yeshwanth
Databricks Employee
  • 2 kudos

@dbengineer516 Good day! As per the Databricks documentation, only certain Databricks asset types are supported by Git folders. These include Files, Notebooks, and Folders. Databricks asset types that are currently not supported in Git folders includ...

3 More Replies
SreeG
by New Contributor II
  • 1304 Views
  • 2 replies
  • 0 kudos

Error Reading Kafka message into Azure Databricks

Team, I am trying to test the connection to a Kafka broker from Azure Databricks. Telnet and IP checks are successful. When I try to read the data, I am getting "Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid cer...

Data Engineering
Azure DB
kafka
Latest Reply
SreeG
New Contributor II
  • 0 kudos

Had to put this testing on hold, but when I get a chance to work on this, I will update my findings. Thank you!

1 More Reply
PerformanceTest
by New Contributor
  • 845 Views
  • 1 reply
  • 0 kudos

Databricks to JMeter connectivity issue

Hi All, we are conducting a Databricks performance test with Apache JMeter. After configuring the JDBC config element, we are getting the below error: Cannot create PoolableConnectionFactory ([Databricks][JDBCDriver](700120) Host xxxxx-xxxxx-xxxx.cloud.databricks.com c...

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@PerformanceTest - can you please check your Databricks workspace Terraform script to see if there is a different CNAME defined for your workspace host?

ByteForge
by New Contributor
  • 1174 Views
  • 1 reply
  • 0 kudos

How to import .dbc files above size limit?

Above is the screenshot of the error. Is there any other way of processing .dbc files? I do not have access/backup to the previous workspace this code was imported from.

[screenshot: ByteForge_2-1715687151524.png]
Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@ByteForge - Kindly raise a support case with Databricks to work with Engineering to increase the limits for your workspace.

ac0
by Contributor
  • 1254 Views
  • 2 replies
  • 0 kudos

Resolved! Is it more performant to run optimize table commands on a serverless SQL warehouse or elsewhere?

Is it more performant to run optimize table commands on a serverless SQL warehouse or in a job or all-purpose compute cluster? I would presume a serverless warehouse would be faster, but I don't know how to test this.

Latest Reply
Yeshwanth
Databricks Employee
  • 0 kudos

@ac0 Good day! Serverless SQL warehouses are likely to execute "optimize table" commands faster than job or all-purpose compute clusters due to their rapid startup time, quick upscaling for low latency, and efficient handling of varying query demand....
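
For reference, the command itself is identical wherever it runs, so a rough test is to time the same statement in each environment; the table and column names here are hypothetical:

import time

start = time.time()
spark.sql("OPTIMIZE my_schema.my_table ZORDER BY (event_date)")
print(f"OPTIMIZE took {time.time() - start:.1f}s")

Note that OPTIMIZE is incremental, so a fair comparison needs a comparable amount of unoptimized data before each run.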

1 More Reply
NTRT
by New Contributor III
  • 1020 Views
  • 1 reply
  • 0 kudos

How to transform a json-stat 2 file to a Spark DataFrame? How to keep order in a MapType structure?

Hi, I am using different JSON files of type json-stat2. This kind of JSON file is quite commonly used by national statistics bureaus. It is multi-dimensional with many arrays. In a Python environment we can use the pyjstat package to easily transform json...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

MapType does not maintain order (JSON itself doesn't either). Can you apply the ordering yourself afterwards?
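
A sketch of applying the ordering afterwards in PySpark, with a hypothetical map column name:

from pyspark.sql import functions as F

# Explode the MapType column into (key, value) rows, then impose an
# explicit sort; an explicit orderBy is the only reliable way to get
# a stable order out of a map.
ordered = (df
           .select(F.explode("dimensions").alias("key", "value"))
           .orderBy("key"))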

