Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mr_robot
by New Contributor
  • 2793 Views
  • 3 replies
  • 3 kudos

Update datatype of a column in a table

I have a table in Databricks with fields name: string, id: string, orgId: bigint, metadata: struct. Now I want to rename one of the columns and change its type. In my case I want to rename orgId to orgIds and change its type to map<string, string>. One...

Data Engineering
tables delta-tables
Latest Reply
jacovangelder
Honored Contributor
  • 3 kudos

You can use REPLACE COLUMNS:
ALTER TABLE your_table_name REPLACE COLUMNS ( name STRING, id BIGINT, orgIds MAP<STRING, STRING>, metadata STRUCT<...> );
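As a rough illustration of the statement in this reply, the sketch below assembles the same `ALTER TABLE ... REPLACE COLUMNS` text from a column mapping. The helper name `build_replace_columns` is hypothetical, `your_table_name` is the reply's placeholder, and the `metadata` struct fields stay elided as in the reply. The question lists `id` as a string, so the sketch assumes `STRING` there. The code only builds a string; whether a given table accepts this schema change is not verified here.

```python
def build_replace_columns(table: str, columns: dict[str, str]) -> str:
    """Return an ALTER TABLE ... REPLACE COLUMNS statement as text."""
    if not columns:
        raise ValueError("at least one column is required")
    # Render each entry as "<name> <type>" in the order given.
    column_list = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return f"ALTER TABLE {table} REPLACE COLUMNS ({column_list});"


# Assumed column set taken from the question; struct fields are left elided.
statement = build_replace_columns(
    "your_table_name",
    {
        "name": "STRING",
        "id": "STRING",
        "orgIds": "MAP<STRING, STRING>",
        "metadata": "STRUCT<...>",
    },
)
print(statement)
```

Keeping the column list in one mapping makes it easier to review the target schema before running the statement against a real table.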

  • 3 kudos
2 More Replies
ashraf1395
by Honored Contributor
  • 839 Views
  • 1 replies
  • 1 kudos

Resolved! Querying MySQL DB from Azure Databricks where public access is disabled

Hi there, We are trying to set up infrastructure that ingests data from MySQL hosted on an AWS EC2 instance with PySpark and Azure Databricks and writes it to ADLS storage. Since the database has public accessibility disabled, how can I interact with MySQL from azu...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

You will need some kind of tunnel that opens the DB server to external access. Perhaps a VPN is an option? If not, it won't be possible. An alternative would be to have some local LAN tool extract the data and then move it to S3/... and afterwards let ...

  • 1 kudos
vvzadvor
by New Contributor III
  • 2288 Views
  • 4 replies
  • 1 kudos

Resolved! Debugging Python code outside of Notebooks

Hi experts, Does anyone know if there's a way of properly debugging Python code outside of notebooks? We have a complicated Python-based framework for loading files, transforming them according to the business specification and saving the results into ...

Latest Reply
vvzadvor
New Contributor III
  • 1 kudos

OK, I can now confirm that remote debugging with stepping into your own libraries installed on the cluster is possible and is actually pretty convenient using a combination of databricks-connect Python library and a Databricks extension for VSCode. S...

  • 1 kudos
3 More Replies
ashraf1395
by Honored Contributor
  • 1024 Views
  • 1 replies
  • 2 kudos

Resolved! Reading a materialised view locally or using the Databricks API

Hi there, this was my previous approach: I had a Databricks notebook with a bronze-level streaming table reading data from volumes, which fed two downstream tables: first, a gold-level materialised view; second, a table for storing ingestion_meta...

Latest Reply
ashraf1395
Honored Contributor
  • 2 kudos

I used this approach: querying the materialised view using a Databricks serverless SQL endpoint by connecting to it with SQL connect. It's working right now. If I face any issues, I will write it into a normal table and Delta Share it. Thanks for your repl...

  • 2 kudos
MyTrh
by New Contributor III
  • 2875 Views
  • 7 replies
  • 3 kudos

Resolved! Delta table with unique columns incremental refresh

Hi Team, We have one huge streaming table from which we want to create another streaming table containing a few columns picked from the original. But in this new table the rows must be unique. Can someone please help me with the imple...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 3 kudos

Hi @MyTrh, OK, I think I created a similar use case to yours. I have a streaming table with a column structure based on your example:
CREATE OR REFRESH STREAMING TABLE clicks_raw AS SELECT *, current_timestamp() as load_time FROM cloud_files('/Volumes/dev/d...

  • 3 kudos
6 More Replies
Soma
by Valued Contributor
  • 1914 Views
  • 3 replies
  • 1 kudos

Resolved! Where does custom state store the data

There are a couple of custom state functions, such as mapGroupsWithState and applyInPandasWithState, which maintain internal state. Is that state kept in the same state store (RocksDB) as the aggregation state?

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @somanath Sankaran, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best ans...

  • 1 kudos
2 More Replies
Box_clown
by New Contributor II
  • 1999 Views
  • 3 replies
  • 3 kudos

Set Not null changes Data type

Hello, Just found this issue this week and thought I would ask. An ALTER TABLE ... ALTER COLUMN ... SET NOT NULL is changing a varchar(x) data type to string type. I believe this should happen in most environments so I wouldn't need to supply code... Create a ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 3 kudos

Hi @Box_clown, To be precise, the Delta Lake format is based on Parquet files. For strings, Parquet only has one data type: StringType. So, basically, the varchar(n) data type is represented under the hood as a string with a check constraint on the length of the st...
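The idea in this reply can be pictured with a small plain-Python analogy. This is a toy model, not Delta Lake's actual implementation: the stored value is an ordinary string, and the `varchar(n)` limit is a separate length check applied on write. The class name `VarcharColumn` and its methods are hypothetical.

```python
class VarcharColumn:
    """Toy model: values are plain strings, the limit is a separate check."""

    def __init__(self, max_length: int) -> None:
        if max_length < 1:
            raise ValueError("max_length must be positive")
        self.max_length = max_length
        self.values: list[str] = []

    def append(self, value: str) -> None:
        # The length constraint is enforced on write; the stored type is still str.
        if len(value) > self.max_length:
            raise ValueError(
                f"value of length {len(value)} exceeds varchar({self.max_length})"
            )
        self.values.append(value)


column = VarcharColumn(5)
column.append("abc")  # fits within the limit, stored as a plain str
try:
    column.append("too long")  # 8 characters, violates the length check
    rejected = False
except ValueError:
    rejected = True
```

In this picture, the type a reader sees is always "string"; the `n` survives only as the check, which matches the behaviour described in the question.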

  • 3 kudos
2 More Replies
acegerace
by New Contributor II
  • 1083 Views
  • 1 replies
  • 1 kudos

RLS

When applying a function to a table for RLS, do users require SELECT privileges on the table used for RLS? And do users also require EXECUTE privileges on the function? This is not clear from the documentation.

Latest Reply
mahfooz_iiitian
New Contributor III
  • 1 kudos

Yes, you require SELECT permission for the table. For functions, if it is a built-in function (such as is_account_group_member), then you do not require permission. However, if it is a custom function, you must have access to execute it. You can refer ...

  • 1 kudos
ayush25091995
by New Contributor III
  • 584 Views
  • 1 replies
  • 0 kudos

Get history of queries run on a UC-enabled interactive cluster

Hi Team, I want to derive a couple of KPIs, such as most frequent queries, top queries, and query type (select, insert, or update), on a UC-enabled interactive cluster. I know we can do this for a SQL warehouse, but what is the way we can do this interactive clust...
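Once query texts are available from some source, the KPIs mentioned here are simple counts. The sketch below assumes the history is already a list of SQL strings; how to obtain that list for an interactive cluster is exactly the open question in this thread and is not addressed. The function names and the sample `history` list are hypothetical, and the normalization is deliberately crude (it also lower-cases string literals).

```python
from collections import Counter


def query_type(sql: str) -> str:
    """Return the leading keyword of a statement, upper-cased."""
    stripped = sql.strip()
    return stripped.split(None, 1)[0].upper() if stripped else "UNKNOWN"


def normalize(sql: str) -> str:
    """Collapse whitespace and case so near-identical statements compare equal."""
    return " ".join(sql.split()).lower()


# Made-up sample history, for illustration only.
history = [
    "SELECT * FROM sales",
    "select *  from sales",
    "INSERT INTO sales VALUES (1)",
    "UPDATE sales SET amount = 2",
]

# Count statements by type (SELECT, INSERT, UPDATE, ...).
type_counts = Counter(query_type(q) for q in history)
# Find the single most frequent statement after normalization.
most_frequent = Counter(normalize(q) for q in history).most_common(1)
```

A real implementation would likely need a proper SQL parser to handle comments, CTEs, and literals.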

Latest Reply
ayush25091995
New Contributor III
  • 0 kudos

@Retired_mod, this table will only have the query history for the SQL warehouse cluster; I need it for a UC interactive/all-purpose cluster.

  • 0 kudos
Mathias_Peters
by Contributor
  • 1607 Views
  • 2 replies
  • 3 kudos

Resolved! Service principal seemingly cannot access its own workspace folder

We have implemented an asset bundle (DAB) that creates a wheel. During DAB deployment, the wheel is built and stored in the folder of the service principal running the deployment via GH workflow. The full path is /Workspace/Users/SERVICE-PRINCIPAL-ID/...

Latest Reply
Rishabh_Tiwari
Databricks Employee
  • 3 kudos

Thank you for sharing the solution that worked for you. I am sure it will help other community members. Thanks, Rishabh

  • 3 kudos
1 More Replies
Littlesheep_
by New Contributor
  • 4454 Views
  • 3 replies
  • 0 kudos

How to run a notebook in a .py file in databricks

The situation is that my colleague was using PyCharm and now needs to adapt to Databricks. They are now doing their job by connecting VS Code to Databricks and running the .py file using Databricks clusters. The problem is they want to call a notebook in d...

Latest Reply
Rishabh_Tiwari
Databricks Employee
  • 0 kudos

Hi @Littlesheep_ , Thank you for reaching out to our community! We're here to help you.  To ensure we provide you with the best support, could you please take a moment to review the response and choose the one that best answers your question? Your fe...

  • 0 kudos
2 More Replies
EdwardLui
by New Contributor
  • 802 Views
  • 1 replies
  • 0 kudos

How to extend the retention duration on a streaming table created by DLT

The default retention duration of a streaming table created by DLT is 7 days. We would like to extend it to 60 days. Since we cannot alter the table properties, how can I achieve this change?

Latest Reply
Rishabh_Tiwari
Databricks Employee
  • 0 kudos

Hi @EdwardLui , Thank you for reaching out to our community! We're here to help you.  To ensure we provide you with the best support, could you please take a moment to review the response and choose the one that best answers your question? Your feedb...

  • 0 kudos
georgecalvert
by New Contributor
  • 1540 Views
  • 2 replies
  • 0 kudos

ConcurrentAppendException Liquid Clustered Table Different Row Concurrent Writes

I have multiple Databricks jobs performing a MERGE command simultaneously into the same liquid clustered table, but for different rows of data, and I am receiving the following error message: [DELTA_CONCURRENT_APPEND] ConcurrentAppendException: Files w...
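One generic way to cope with write conflicts like this is to retry the conflicting operation with a jittered backoff. The sketch below is a plain-Python illustration of that pattern and is not the answer given in this thread. `ConcurrentAppendError` is a stand-in exception class, and `run_with_retry` and `flaky_merge` are hypothetical names; a real job would catch whatever exception its Delta client actually raises.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ConcurrentAppendError(Exception):
    """Stand-in for the conflict error raised by a concurrent MERGE."""


def run_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on ConcurrentAppendError with jittered backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return operation()
        except ConcurrentAppendError:
            if attempt >= max_attempts:
                raise  # give up and surface the last conflict
            # Exponential backoff with jitter so competing jobs do not retry in lockstep.
            sleep(base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.5))
            attempt += 1


# Simulated MERGE that conflicts twice before succeeding.
calls = {"count": 0}


def flaky_merge() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConcurrentAppendError("simulated conflict")
    return "merged"


result = run_with_retry(flaky_merge, sleep=lambda seconds: None)
```

Retrying only hides the conflict; it does not remove the underlying contention between the jobs.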

Latest Reply
Rishabh_Tiwari
Databricks Employee
  • 0 kudos

Hi @georgecalvert , Thank you for reaching out to our community! We're here to help you.  To ensure we provide you with the best support, could you please take a moment to review the response and choose the one that best answers your question? Your f...

  • 0 kudos
1 More Replies
ibrar_aslam
by New Contributor
  • 1044 Views
  • 1 replies
  • 0 kudos

Delta live table not refreshing - window function

We have a list of streaming tables populated by Autoloader from files on S3, which serve as sources for our live tables. After the Autoloader Delta pipeline completes, we trigger a second Delta Live Tables (DLT) pipeline to perform a deduplication op...
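For readers unfamiliar with window-function deduplication, the plain-Python sketch below shows the usual idea: keep one row per key, chosen by an ordering column, much like filtering on `ROW_NUMBER() OVER (PARTITION BY key ORDER BY ts DESC) = 1`. The poster's actual query is not shown in the excerpt, so the function name `latest_per_key` and the column names `id` and `updated_at` are assumptions made for illustration.

```python
from typing import Iterable


def latest_per_key(rows: Iterable[dict], key: str, order_by: str) -> list[dict]:
    """Keep, for each key, the row with the greatest order_by value."""
    latest: dict = {}
    for row in rows:
        current = latest.get(row[key])
        # On ties the first row seen is kept.
        if current is None or row[order_by] > current[order_by]:
            latest[row[key]] = row
    return list(latest.values())


# Made-up sample rows: two versions of id 1, one of id 2.
rows = [
    {"id": 1, "updated_at": 10, "value": "a"},
    {"id": 1, "updated_at": 20, "value": "b"},
    {"id": 2, "updated_at": 5, "value": "c"},
]
deduplicated = latest_per_key(rows, key="id", order_by="updated_at")
```

Note that the result depends on all rows seen so far for a key, which is why this kind of deduplication typically needs the full history of each key rather than only the newest batch.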

Latest Reply
Rishabh_Tiwari
Databricks Employee
  • 0 kudos

Hi @ibrar_aslam , Thank you for reaching out to our community! We're here to help you.  To ensure we provide you with the best support, could you please take a moment to review the response and choose the one that best answers your question? Your fee...

  • 0 kudos
