Data Engineering

Forum Posts

BWong
by New Contributor III
  • 2451 Views
  • 2 replies
  • 1 kudos

Overwriting schema in Delta Live Tables

Hi all, I have a table created by DLT. Initially I set cloudFiles.inferColumnTypes to false, so all columns are stored as strings. However, I now want to use cloudFiles.inferColumnTypes=true. I dropped the table and re-ran the pipeline, which fai...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Billy Wong, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

1 More Replies
pt-jake
by New Contributor II
  • 2059 Views
  • 2 replies
  • 2 kudos

Arrays of complex type always evaluate to ARRAY<STRING>?

Arrays of complex types seemingly always evaluate to ARRAY<STRING>. Therefore, casting or attempting to load JSON data with empty array values fails. For example, attempting to cast a JSON value of {"likes": []...} on load to the following table sche...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Jake Neyer, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

1 More Replies
akc
by New Contributor III
  • 1137 Views
  • 3 replies
  • 5 kudos

Resolved! Training models on big or small clusters

I have a workflow with a model that trains every Sunday in Azure Databricks. Sometimes the workflow fails because the max wait time is exceeded (currently I am using 1200 seconds). To solve the problem, I was thinking of either increasing the wait time or...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hi @Andreas Kaae, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

2 More Replies
Raghav_597352
by New Contributor II
  • 1068 Views
  • 2 replies
  • 4 kudos

Resolved! Workspace not getting created

Hey guys, I tried to create a workspace and got an error I haven't encountered before. I provided everything correctly but don't know why I'm getting this. I tried using a different Databricks ID and AWS ID, and also tried from the AWS root account.

Latest Reply
Priyag1
Honored Contributor II
  • 4 kudos

https://docs.gcp.databricks.com/administration-guide/workspace/create-workspace.html

1 More Replies
Prank
by New Contributor III
  • 689 Views
  • 1 replies
  • 1 kudos

Access DBUs used per cluster from within Databricks clusters

Is it possible to retrieve DBUs on a per-cluster basis from within a Databricks notebook itself? This information is shown in the Compute tab in Databricks for each cluster as Active DBU/hr.

Latest Reply
Anonymous
Not applicable
  • 1 kudos

It won't be possible to access the DBUs used per cluster from within Databricks clusters.

Chinu
by New Contributor III
  • 324 Views
  • 0 replies
  • 0 kudos

Pulling query history only for the last 5 mins using the "/api/2.0/sql/history/queries" API

I know the query history API provides a filter_by option with start and end times in ms, but I was wondering if I can get only the last 5 mins of query data every time I run the API call (using telegraf to call the API). Is it possible I can use relative dat...
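
One possible approach, since the post only mentions absolute millisecond bounds, is to compute those bounds at call time. The sketch below builds such a request body. The nested field names (`query_start_time_range`, `start_time_ms`, `end_time_ms`) reflect my understanding of the Query History API and should be checked against the current API reference; `last_n_minutes_filter` is a hypothetical helper name, not part of any Databricks library.

```python
import json
import time


def last_n_minutes_filter(minutes=5, now_ms=None):
    """Build a filter_by body covering the last `minutes` minutes.

    now_ms is injectable for testing; by default it is the current
    wall-clock time in epoch milliseconds.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    start_ms = now_ms - minutes * 60 * 1000
    return {
        "filter_by": {
            # Field names are an assumption; verify against the API docs.
            "query_start_time_range": {
                "start_time_ms": start_ms,
                "end_time_ms": now_ms,
            }
        }
    }


if __name__ == "__main__":
    # Body to send with the request to /api/2.0/sql/history/queries.
    print(json.dumps(last_n_minutes_filter()))
```

A caller that can only send a fixed request would need to run a small wrapper like this before each call so the window moves with the clock.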

Enthusiastic_Da
by New Contributor II
  • 1418 Views
  • 0 replies
  • 0 kudos

How to read columns dynamically using PySpark

I have a table called MetaData, and the columns needed in the select are stored in MetaData.columns. I would like to read columns dynamically from MetaData.columns and create a view based on that. csv_values = "col1, col2, col3, col4" df = spark.crea...
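
A minimal sketch of one way to do this, assuming the column list arrives as a comma-separated string like the `csv_values` in the post. The helper names `parse_columns` and `build_view_sql`, and the names `selected_columns_view` and `source_table`, are hypothetical and chosen only for illustration.

```python
def parse_columns(csv_values):
    """Split a comma-separated column string into clean column names."""
    return [name.strip() for name in csv_values.split(",") if name.strip()]


def build_view_sql(view_name, table_name, columns):
    """Build a CREATE VIEW statement selecting only the given columns."""
    if not columns:
        raise ValueError("at least one column is required")
    # Backtick-quote each identifier so unusual names stay valid in Spark SQL.
    select_list = ", ".join(f"`{c}`" for c in columns)
    return (
        f"CREATE OR REPLACE TEMP VIEW {view_name} AS "
        f"SELECT {select_list} FROM {table_name}"
    )


csv_values = "col1, col2, col3, col4"
columns = parse_columns(csv_values)
sql = build_view_sql("selected_columns_view", "source_table", columns)
print(sql)
# In a notebook this could be run with spark.sql(sql), or the same list
# could be passed to a DataFrame with df.select(*columns).
```

Keeping the parsing separate from the Spark call makes the column list easy to validate before any query runs.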

drewtoby
by New Contributor II
  • 2248 Views
  • 2 replies
  • 1 kudos

Resolved! How to Pull Cached SQL Table into Python Dictionary?

Hello, I have been working on this issue as a proof of concept; it would be extremely helpful to iterate through tables via loops in a few scenarios. I have a simple three-column dimension that I added to a cached table. cache lazy table hedis_cache s...

Latest Reply
drewtoby
New Contributor II
  • 1 kudos

Got it to work, thank you for the tip! I needed to convert the DataFrame to a pandas DataFrame: https://www.geeksforgeeks.org/convert-pyspark-dataframe-to-dictionary-in-python/
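
For a small dimension table, an alternative to the pandas route described above is to collect the rows and build the dictionary directly. This is only a sketch under stated assumptions: it relies on PySpark `Row` objects exposing `asDict()`, and the column names `code`, `label`, and `weight` are made up for illustration, as is the helper name `rows_to_lookup`.

```python
def rows_to_lookup(rows, key_column):
    """Turn collected rows into {key: {other columns}}."""
    lookup = {}
    for row in rows:
        # PySpark Row objects expose asDict(); plain dicts are copied as-is.
        record = row.asDict() if hasattr(row, "asDict") else dict(row)
        key = record.pop(key_column)
        lookup[key] = record
    return lookup


# Stand-in rows so the sketch runs without Spark; in a notebook these
# could come from spark.table("hedis_cache").collect().
sample_rows = [
    {"code": "A", "label": "first", "weight": 1},
    {"code": "B", "label": "second", "weight": 2},
]
lookup = rows_to_lookup(sample_rows, "code")
print(lookup["B"]["label"])
```

Collecting pulls every row to the driver, so this only suits tables small enough to fit comfortably in driver memory.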

1 More Replies
AkasBala
by New Contributor III
  • 1144 Views
  • 4 replies
  • 3 kudos

Unity Catalog Primary key column taking duplicates

I have updated a Hive metastore from a Unity Catalog. I have set up primary keys on the table. When I try to insert duplicates, the inserts succeed, so it seems the PK is not working. Is anyone else seeing such behaviour?

Latest Reply
AkasBala
New Contributor III
  • 3 kudos

@Debayan Mukherjee Any info on the above, please?

3 More Replies
Anonymous
by Not applicable
  • 241 Views
  • 0 replies
  • 0 kudos

docs.databricks.com

What serverless features are you using on Databricks? I am curious to know. Is it Databricks SQL Serverless or Model Serving? Proceed here to compare serverless compute to other Databricks architectures: https://docs.databricks.com/serverless-compute/ind...

Anuj93
by New Contributor III
  • 475 Views
  • 0 replies
  • 0 kudos

Change Azure Databricks cluster owner

I wanted to add secrets to the Spark conf of the cluster, but I am not able to because I am not the cluster owner. How can we change the cluster owner?

Ryu1
by New Contributor
  • 472 Views
  • 0 replies
  • 0 kudos

Other than the "account admin" permission, is there a small permission or role to collect only catalog information?

I am going to use an open-source tool called "datahub" to collect and share metadata from Databricks (https://datahubproject.io/). Recently, however, there has been a big challenge: to collect the Unity Catalog information of Databricks,...

Dean_Lovelace
by New Contributor III
  • 2387 Views
  • 1 replies
  • 1 kudos

Resolved! Efficiently move multiple files with dbutils.fs.mv command on abfs storage

As part of my batch processing, I archive a large number of small files received from the source system each day using the dbutils.fs.mv command. This takes hours because dbutils.fs.mv moves the files one at a time. How can I speed this up?

Latest Reply
daniel_sahal
Esteemed Contributor
  • 1 kudos

@Dean Lovelace You can use multithreading. See an example here: https://nealanalytics.com/blog/databricks-spark-jobs-optimization-techniques-multi-threading/
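
The multithreading suggestion can be sketched with the standard library's `ThreadPoolExecutor`. This is not the code from the linked article, just a minimal illustration: `move_files` is a hypothetical helper, the single-file move is passed in as `move_fn` (in a Databricks notebook that would be `dbutils.fs.mv`), and the paths and worker count are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def move_files(pairs, move_fn, max_workers=16):
    """Move (source, destination) pairs concurrently.

    move_fn is the single-file move call, e.g. dbutils.fs.mv in a
    Databricks notebook. Returns a list of (source, error) for failures.
    """
    failures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(move_fn, src, dst): src for src, dst in pairs}
        for future in as_completed(futures):
            try:
                future.result()
            except Exception as exc:  # keep going; report failures at the end
                failures.append((futures[future], exc))
    return failures


if __name__ == "__main__":
    # Stand-in mover so the sketch runs outside Databricks.
    moved = []
    demo_pairs = [(f"/landing/f{i}.csv", f"/archive/f{i}.csv") for i in range(5)]
    errors = move_files(demo_pairs, lambda s, d: moved.append((s, d)), max_workers=4)
    print(len(moved), "moved,", len(errors), "failed")
```

Threads fit here because each move spends its time waiting on storage rather than on CPU; the best `max_workers` value depends on the storage account's limits and is worth tuning.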

Phani1
by Valued Contributor
  • 5887 Views
  • 2 replies
  • 2 kudos

Resolved! Web application integrated with Gradio or streamlit on Databricks

We are trying to run a web application integrated with Gradio on Databricks. We have configured the launch parameter with (share="True"). The app executes and gives us output, but it keeps running and no public URL is generated: o/p: Running on...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Janga Reddy, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

1 More Replies