Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

boitumelodikoko
by Contributor II
  • 4412 Views
  • 6 replies
  • 1 kudos

[RETRIES_EXCEEDED] Error When Displaying DataFrame in Databricks Using Serverless Compute

Hi Databricks Community, I am encountering an issue when trying to display a DataFrame in a Python notebook using serverless compute. The operation seems to fail after several retries, and I get the following error message: [RETRIES_EXCEEDED] The maxim...

Latest Reply
sridharplv
Valued Contributor
  • 1 kudos

Hi @arjunraja_azure, below is a better version of your code, which will avoid the failure:
from pyspark.sql.functions import max
df = spark.read.table('workspace.default.emp')
df1 = df.agg(max('sal')) # Aggregate in a separate step and also avoid caching bef...
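For reference, a minimal runnable sketch of that pattern, assuming a Databricks notebook where spark and display() are provided by the runtime (the table and column names come from the reply above):

from pyspark.sql.functions import max as max_  # alias so Python's builtin max is not shadowed

df = spark.read.table("workspace.default.emp")   # read the source table
df_max = df.agg(max_("sal").alias("max_sal"))    # aggregate in a separate step; no caching beforehand
display(df_max)                                  # render the single-row result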

5 More Replies
JameDavi_51481
by Contributor
  • 44 Views
  • 1 reply
  • 0 kudos

Making REORG TABLE (to enable Iceberg Uniform) faster and more efficient

I am upgrading a large number of tables for Iceberg / Uniform compatibility by running REORG TABLE <tablename> APPLY (UPGRADE UNIFORM(ICEBERG_COMPAT_VERSION=2)); and finding that some tables take several hours to upgrade, presumably because they are ...

Latest Reply
sridharplv
Valued Contributor
  • 0 kudos

Hi @JameDavi_51481, hope you tried this approach for enabling Iceberg metadata along with the Delta format:
ALTER TABLE internal_poc_iceberg.iceberg_poc.clickstream_gold_sink_dlt
SET TBLPROPERTIES ('delta.columnMapping.mode' = 'name', 'delta.enableIceberg...
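For completeness, a hedged sketch of that statement run from a notebook; the table name comes from the reply above, and the truncated property list is completed with the documented UniForm settings (verify the exact values against your DBR version):

spark.sql("""
    ALTER TABLE internal_poc_iceberg.iceberg_poc.clickstream_gold_sink_dlt
    SET TBLPROPERTIES (
        'delta.columnMapping.mode' = 'name',
        'delta.enableIcebergCompatV2' = 'true',             -- assumed completion of the truncated reply
        'delta.universalFormat.enabledFormats' = 'iceberg'  -- generate Iceberg metadata alongside Delta
    )
""")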

vsam
by New Contributor
  • 244 Views
  • 5 replies
  • 2 kudos

OPTIMIZE FULL taking a long time on clustered table

Hi Everyone, currently we are facing an issue with the OPTIMIZE table_name FULL operation. The dataset consists of 150 billion rows of data and it takes 8 hours to optimize the reloaded clustered table. The table is refreshed every month and it needs cluste...

Latest Reply
sridharplv
Valued Contributor
  • 2 kudos

Hi @vsam, have you tried automatic liquid clustering with predictive optimization enabled? With it you don't need to specify the CLUSTER BY columns, and optimization is handled in the background by predictive optimization. http...
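A minimal sketch of that suggestion, assuming a Unity Catalog managed table; CLUSTER BY AUTO and the predictive-optimization DDL are the documented forms, while the catalog, schema, and table names are placeholders:

# Let Databricks choose and evolve the clustering keys automatically
spark.sql("ALTER TABLE my_catalog.my_schema.big_table CLUSTER BY AUTO")

# Have OPTIMIZE / maintenance runs scheduled in the background
spark.sql("ALTER SCHEMA my_catalog.my_schema ENABLE PREDICTIVE OPTIMIZATION")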

4 More Replies
ShivangiB1
by New Contributor II
  • 270 Views
  • 7 replies
  • 3 kudos

AI/BI dashboard integration in HTML page

Hey Team, I have the queries below: I want to test AI/BI dashboard embedding in an HTML page I have created, but the dashboard is not getting loaded; can you please help me understand the process? And I also want users to have access even if they ...

Latest Reply
v-kzaffer
New Contributor II
  • 3 kudos

Yes, you can and should use a service principal to validate and embed your Databricks AI/BI dashboard in an external website like SharePoint. This is a more secure and robust method than using your personal credentials, especially in a production env...
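As an illustration of that flow, a hedged sketch that obtains a bearer token for a service principal via the documented OAuth machine-to-machine (client credentials) endpoint; the workspace host, client ID, and secret are placeholders:

import requests

WORKSPACE = "https://<your-workspace>.azuredatabricks.net"  # placeholder host

resp = requests.post(
    f"{WORKSPACE}/oidc/v1/token",
    auth=("<client-id>", "<client-secret>"),                 # service principal credentials
    data={"grant_type": "client_credentials", "scope": "all-apis"},
)
resp.raise_for_status()
token = resp.json()["access_token"]                          # use as a Bearer token in REST calls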

6 More Replies
james698henry
by Visitor
  • 94 Views
  • 1 reply
  • 0 kudos

The Rise of Citizen Science: Empowering Public Participation in Research

Hello, Citizen science is transforming how research is conducted by inviting everyday people to contribute to scientific discovery. From tracking wildlife migrations to analyzing space data, volunteers are helping scientists gather and interpret massi...

Latest Reply
BS_THE_ANALYST
Contributor III
  • 0 kudos

@james698henry, this feels more like self-promotion using AI-generated content rather than anything actionable or useful to the Data Engineering discussions forum. Please elaborate on how the information you've provided lends itself to Data Engineer...

Sneeze7432
by New Contributor
  • 276 Views
  • 13 replies
  • 2 kudos

File Trigger Not Triggering Multiple Runs

I have a job with one task, which is to run a notebook. The job is set up with a file arrival trigger with my blob storage as the location. The trigger works and the job starts when a new file arrives in the location, but it does not run for ...

Latest Reply
nayan_wylde
Valued Contributor III
  • 2 kudos

@Sneeze7432 you can also try editing the max concurrent runs in the workflow. 
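For reference, a hedged sketch of changing that limit through the Jobs API (the same setting appears as "Maximum concurrent runs" in the job UI; workspace host, token, and job ID are placeholders):

import requests

resp = requests.post(
    "https://<your-workspace>/api/2.2/jobs/update",
    headers={"Authorization": "Bearer <token>"},
    json={"job_id": 123, "new_settings": {"max_concurrent_runs": 5}},  # allow overlapping file-arrival runs
)
resp.raise_for_status()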

12 More Replies
glevin1
by New Contributor
  • 567 Views
  • 1 reply
  • 0 kudos

API response code when running a new job

We are attempting to use the POST /api/2.2/jobs/run-now endpoint using OAuth 2.0 client credentials authentication. We are finding that when sending a request with an expired token, we receive an HTTP code of 400. This contradicts the documentation on ...

Latest Reply
v-kzaffer
New Contributor II
  • 0 kudos

Hello @glevin1, please raise a ticket using this link: https://help.databricks.com/s/contact-us?ReqType=training. Please explain the issue clearly so that it will be easy for the support team to help.

manish1987c
by New Contributor III
  • 1654 Views
  • 4 replies
  • 1 kudos

Delta Live Table - Flow detected an update or delete to one or more rows in the source table

I have created a pipeline where I am ingesting the data from bronze to silver using SCD 1; however, when I try to create the gold table as a DLT table, it gives me the error "Flow 'user_silver' has FAILED fatally. An error occurred because we detected ...

[screenshots attached]
Latest Reply
Pat
Esteemed Contributor
  • 1 kudos

Streaming tables in Delta Live Tables (DLT) only support append-only operations in the SOURCE. The error occurs because:
1. Your silver table uses SCD Type 1, which performs UPDATE and DELETE operations on existing records.
2. Your gold table is defined ...
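One documented way around this, if the gold logic can tolerate skipping those change commits, is to read the SCD 1 source with the skipChangeCommits option; a hedged sketch, with table names following the thread:

import dlt

@dlt.table(name="user_gold")
def user_gold():
    # skipChangeCommits ignores UPDATE/DELETE commits in the streaming source,
    # so the append-only requirement is no longer violated
    return (
        spark.readStream
             .option("skipChangeCommits", "true")
             .table("LIVE.user_silver")
    )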

3 More Replies
allyallen
by New Contributor
  • 167 Views
  • 1 reply
  • 0 kudos

Variable Compute clusters within a Job

We have 3 possible compute clusters that we can run a notebook against. They are of varying sizes, and the one that the notebook uses will depend on the size of the data being processed. We "t-shirt size" each tenant based on their data size (S, M, L) and c...

Latest Reply
eniwoke
New Contributor III
  • 0 kudos

Hi @allyallen, just to clarify your use case to see if I can provide a solution:Are you saying you have a single job with multiple tasks, and each of those tasks runs the same notebook (e.g., notebook_1), but you'd like the compute cluster to vary de...
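If the goal is a single notebook whose compute depends on tenant size, one hedged option is a one-time run via the Jobs runs/submit API, picking the cluster spec from a size map; everything here (node types, notebook path, host, token) is a placeholder:

import requests

CLUSTERS = {  # hypothetical t-shirt sizing
    "S": {"spark_version": "15.4.x-scala2.12", "node_type_id": "Standard_DS3_v2", "num_workers": 2},
    "M": {"spark_version": "15.4.x-scala2.12", "node_type_id": "Standard_DS4_v2", "num_workers": 4},
    "L": {"spark_version": "15.4.x-scala2.12", "node_type_id": "Standard_DS5_v2", "num_workers": 8},
}

def run_for_tenant(size: str) -> int:
    resp = requests.post(
        "https://<your-workspace>/api/2.2/jobs/runs/submit",
        headers={"Authorization": "Bearer <token>"},
        json={
            "run_name": f"tenant-run-{size}",
            "tasks": [{
                "task_key": "notebook_1",
                "notebook_task": {"notebook_path": "/Workspace/notebooks/notebook_1"},
                "new_cluster": CLUSTERS[size],   # compute sized to the tenant
            }],
        },
    )
    resp.raise_for_status()
    return resp.json()["run_id"]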

Dimitry
by New Contributor III
  • 78 Views
  • 3 replies
  • 0 kudos

Problem activating File Events for External Location / ADLS V2

Hi all, I've followed the book for creating an external location for Azure Data Lake Storage (ADLS Gen2) using the access connector. I've granted all required permissions to the connector. I've created a "stock" container on that above-mentioned "devtyremeshare" ...

[screenshots attached]
Latest Reply
v-kzaffer
New Contributor II
  • 0 kudos

I am 1000% sure that your Event Grid is not registered.
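File events on Azure require the Microsoft.EventGrid resource provider to be registered in the storage account's subscription. A hedged sketch for checking (and registering) it with the azure-mgmt-resource SDK; the subscription ID is a placeholder:

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

client = ResourceManagementClient(DefaultAzureCredential(), "<subscription-id>")
state = client.providers.get("Microsoft.EventGrid").registration_state
print(state)  # expect "Registered"
if state != "Registered":
    client.providers.register("Microsoft.EventGrid")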

2 More Replies
carolregatt
by New Contributor
  • 253 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks Asset Bundle wrongfully deleting job

Hey, so I've just started to use DAB to automatically manage job configs via CI/CD. I had a previously existing job (let's say ID 123) which was created manually and had this config:
resources:
  jobs:
    My_Job_A:
      name: My Job A
And I wanted to automat...

Latest Reply
carolregatt
New Contributor
  • 1 kudos

Thanks so much for the response @Advika! That makes sense! Can you explain why the remote config had a different key when compared to the local one? I guess that was what threw me off and made me want to change the local key to match the remote.

1 More Replies
Hoviedo
by New Contributor III
  • 865 Views
  • 4 replies
  • 0 kudos

Apply expectations only if column exists

Hi, is there any way to apply an expectation only if that column exists? I am creating multiple DLT tables with the same Python function, so I would like to create different expectations based on the table name; currently I can only create expectations...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

To apply expectations only if a column exists in Delta Live Tables (DLT), you can apply the @dlt.expect decorator conditionally within your Python function. Here is a step-by-step approach to achieve this: Check if the column exists: before applying th...
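A minimal sketch of that approach, assuming the source table can be read up front so its schema is known before the decorator is applied (all table and column names are placeholders):

import dlt

def make_table(table_name: str, source_table: str):
    expectations = {}
    if "email" in spark.read.table(source_table).columns:  # add the rule only when the column exists
        expectations["valid_email"] = "email IS NOT NULL"

    @dlt.table(name=table_name)
    @dlt.expect_all(expectations)                          # empty dict means no expectations
    def _build():
        return spark.read.table(source_table)

make_table("users_clean", "bronze.users")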

3 More Replies
