Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Nathant93
by New Contributor III
  • 7268 Views
  • 1 replies
  • 0 kudos

Resolved! Date formatting

Does anyone know how to change the format of a date like this, Dec 17 2016 8:22PM, into yyyy-MM-dd hh:mm:ss? Thanks

Latest Reply
Krishnamatta
Contributor
  • 0 kudos

Convert to timestamp first and then format to string:
select date_format(to_timestamp('Dec 17 2016 8:22PM', 'MMM dd yyyy h:ma'), "yyyy-MM-dd HH:mm:ss")
Here is the documentation for this: https://docs.databricks.com/en/sql/language-manual/sql-ref-datet...

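For readers who want to sanity-check the conversion outside a notebook, the same parse-then-format round trip can be sketched in plain Python with the standard library (purely illustrative; the pattern letters differ from Spark's, but the mapping is one-to-one):

```python
from datetime import datetime

# Parse "Dec 17 2016 8:22PM" and re-emit it as yyyy-MM-dd HH:mm:ss.
# %b = abbreviated month, %I = 12-hour clock, %p = AM/PM marker.
raw = "Dec 17 2016 8:22PM"
dt = datetime.strptime(raw, "%b %d %Y %I:%M%p")
formatted = dt.strftime("%Y-%m-%d %H:%M:%S")
print(formatted)  # → 2016-12-17 20:22:00
```

Note the accepted answer formats with HH (24-hour), which is usually what's wanted; the hh in the question would give a 12-hour clock without an AM/PM marker.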
Chris_sh
by Databricks Partner
  • 1626 Views
  • 0 replies
  • 0 kudos

Enhancement Request: DLT: Infer Schema Logic/Merge Logic

Currently, when DLT runs it observes NULL values in a column and infers that the column should be a string by default. The next time that table runs, numeric values are added and it infers that it is now a numeric column. DLT tries to merge these two ...

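One common workaround for this inference trap (a hedged sketch only; it runs solely inside a Databricks DLT pipeline, and the table name and path here are made up) is to pin the ambiguous columns with schema hints so that early NULL-only files never drive type inference:

```python
import dlt  # available only inside a Delta Live Tables pipeline

@dlt.table(name="bronze_events")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        # Pin the columns that arrive as all-NULL at first, so Auto Loader
        # does not infer STRING now and a numeric type on a later run.
        .option("cloudFiles.schemaHints", "amount DOUBLE, quantity INT")
        .load("/mnt/raw/events")  # hypothetical source path
    )
```

This does not change DLT's default behavior (the enhancement request stands); it just sidesteps the string-vs-numeric merge conflict for columns you already know the type of.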
Randy
by New Contributor III
  • 2651 Views
  • 1 replies
  • 0 kudos

Resolved! Unable to Write Table to Synapse 'x' has a data type that cannot participate in a columnstore index.

We have a process that creates a table in Synapse and then attempts to write the data generated in Databricks to it. We are able to create the table with no problem, but when we go to copy the data we keep getting an error that the column has a data type that ...

Latest Reply
Randy
New Contributor III
  • 0 kudos

Resolved

learnerbricks
by New Contributor II
  • 8533 Views
  • 4 replies
  • 0 kudos

Unable to save file in DBFS

I took the Azure datasets that are available for practice. I got the 10 days of data from that dataset and now I want to save this data into DBFS in CSV format. I am facing an error: "No such file or directory: 'No such file or directory: '/dbfs...

Latest Reply
pardosa
New Contributor II
  • 0 kudos

Hi, after some experimenting, be aware that a folder created with dbutils.fs.mkdirs("/dbfs/tmp/myfolder") actually ends up in /dbfs/dbfs/tmp/myfolder. If you want to access the path to_csv("/dbfs/tmp/myfolder/mytest.csv"), you should create it with this script: dbutils.fs...

3 More Replies
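Another way to avoid the /dbfs/dbfs doubling is to skip dbutils for directory creation and make the parent folder with plain Python just before writing (a generic sketch; the /dbfs path shown is a placeholder for the driver-side FUSE mount on a Databricks cluster):

```python
import os
import pandas as pd

def save_csv(df: pd.DataFrame, path: str) -> None:
    # pandas will not create missing parent directories, so make them first
    os.makedirs(os.path.dirname(path), exist_ok=True)
    df.to_csv(path, index=False)

# On a Databricks driver you would then call, e.g.:
# save_csv(my_df, "/dbfs/tmp/myfolder/mytest.csv")
```

Because os.makedirs works on the same literal path that to_csv receives, there is no mismatch between a dbutils-style path ("/tmp/...") and a FUSE-style path ("/dbfs/tmp/...").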
MarcintheCloud
by New Contributor II
  • 1712 Views
  • 0 replies
  • 1 kudos

Is it possible to clone/read an existing external Iceberg table in Databricks?

Hello, I've been experimenting with trying to read and/or clone an existing Iceberg table into Databricks/Delta. I have an Azure Blob Storage container (configured to use absf for access) that contains an existing Iceberg table structure (data in par...

ilarsen
by Contributor
  • 3194 Views
  • 1 replies
  • 0 kudos

Auto Loader and source file structure optimisation

Hi. I have a question, and I've not been able to find an answer. I'm sure there is one... I just haven't found it through searching and browsing the docs. How much does it matter (if it is indeed that simple) if source files read by Auto Loader are ...

Rubini_MJ
by New Contributor
  • 11135 Views
  • 1 replies
  • 0 kudos

Resolved! Other memory of the driver is high even in a newly spun cluster

Hi Team Experts, I am experiencing high memory consumption in the "other" category of the memory utilization chart in the metrics tab. Right now I am not running any jobs, but out of 8 GB of driver memory almost 6 GB is filled by "other" and only 1.5 GB is t...

Latest Reply
User16539034020
Databricks Employee
  • 0 kudos

Hello, thanks for contacting Databricks Support. It seems you are concerned about high memory consumption in the "other" category on the driver node of a Spark cluster. As no logs or detailed information were provided, I can only address several potentia...

saiprasadambati
by New Contributor III
  • 6914 Views
  • 7 replies
  • 1 kudos

Resolved! examples on python sdk for install libraries

Hi everyone, I'm planning to use the Databricks Python CLI "install_libraries". Can someone please post examples of the function install_libraries? https://github.com/databricks/databricks-cli/blob/main/databricks_cli/libraries/api.py

Latest Reply
Loop-Insist
New Contributor II
  • 1 kudos

Here you go, using the Python SDK:
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute
w = WorkspaceClient(host="yourhost", token="yourtoken")
# Create an array of Library objects to be installed
libraries_to_install = [compute...

6 More Replies
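The truncated reply above appears to continue along these lines (a sketch, not the replier's verbatim code; host, token, cluster id, and the package name are placeholders, and the exact Library fields should be confirmed against the Databricks SDK for Python documentation):

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute

w = WorkspaceClient(host="https://<your-workspace-url>", token="<your-token>")

# Create an array of Library objects to be installed
libraries_to_install = [
    compute.Library(pypi=compute.PythonPyPiLibrary(package="requests")),
]

# Install them on a running cluster
w.libraries.install(cluster_id="<your-cluster-id>", libraries=libraries_to_install)
```

Note this uses the newer databricks-sdk package rather than the legacy databricks-cli library the question links to; the SDK is the currently maintained path for this kind of automation.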
JVesely
by New Contributor III
  • 2379 Views
  • 1 replies
  • 0 kudos

Resolved! DLT CDC SCD-1 pipeline not showing stats when reading from parquet file

Hi, I followed the tutorial here: https://docs.databricks.com/en/delta-live-tables/cdc.html#how-is-cdc-implemented-with-delta-live-tables The only change I made is that data is not appended to a table but is read from a parquet file. In practice this me...

Latest Reply
JVesely
New Contributor III
  • 0 kudos

My bad - waiting a bit and doing a proper screen refresh does show the numbers. 

Anonymous
by Not applicable
  • 9516 Views
  • 8 replies
  • 2 kudos
Latest Reply
djhs
New Contributor III
  • 2 kudos

I also tried to leverage this endpoint (inferred from devtools): https://<workspace_id>.cloud.databricks.com/sql/api/dashboards/import with the exported dashboard (the dbdash file) in the request payload. It returns a 200 but nothing happens. Maybe s...

7 More Replies
KiranKondamadug
by New Contributor II
  • 1756 Views
  • 0 replies
  • 0 kudos

Databricks Mosaic's grid_polyfill() is taking longer to explode the index when run using PySpark

Pyspark Configuration: pyspark --packages io.delta:delta-core_2.12:2.4.0,org.apache.hadoop:hadoop-aws:3.3.4,io.delta:delta-storage-s3-dynamodb:2.4.0 --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" --conf "spark.sql.catalog.spark...

Data Engineering
Delta Lake
Explode
mosaic
spark
burusam
by New Contributor
  • 7860 Views
  • 1 replies
  • 0 kudos

Row INSERT into table does not persist

I have this script:
from databricks import sql
import os
import pandas as pd
databricksToken = os.environ.get('DATABRICKS_TOKEN')
connection = sql.connect(server_hostname = "", http_path = "", access_token ...

Latest Reply
Emil_Kaminski
Contributor II
  • 0 kudos

Hi, are you closing the connection at the end?
cursor.close()
connection.close()
Also, can you elaborate on what you mean by saying "when I reload the table"?
Cheers

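The close-your-handles advice generalizes beyond Databricks. Using stdlib sqlite3 as a stand-in for the Databricks SQL connector (purely illustrative; the Databricks connector executes DML directly without client-side transactions, but drivers that do buffer transactions will silently drop uncommitted rows), the safe insert pattern looks like:

```python
import sqlite3

def insert_row(db_path: str, value: str) -> None:
    connection = sqlite3.connect(db_path)
    cursor = connection.cursor()
    try:
        cursor.execute("CREATE TABLE IF NOT EXISTS t (v TEXT)")
        cursor.execute("INSERT INTO t VALUES (?)", (value,))
        connection.commit()  # without commit, the insert is lost on close
    finally:
        cursor.close()
        connection.close()
```

Wrapping the work in try/finally (or a context manager) guarantees the cursor and connection are released even when the INSERT raises, which is often the real reason "the row disappeared on reload".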
giuseppegrieco
by New Contributor III
  • 22628 Views
  • 5 replies
  • 6 kudos

Workflow owned by a service principal can't check out git repository

I am trying to deploy a workflow where the owner is a service principal, and I am using git integration (backed by Azure DevOps). When I run the workflow, it says that it doesn't have permissions to check out the repo. run failed with error message F...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Giuseppe Grieco, hope everything is going great. Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us ...

4 More Replies
js54123875
by New Contributor III
  • 13097 Views
  • 2 replies
  • 0 kudos

Power BI - Databricks Connection using Service Principal PAT Refresh

What is the best practice for automatically refreshing a service principal PAT in Power BI for a connection to a Databricks dataset? Ideally, when the PAT is updated it will automatically be stored in Azure Key Vault; is there a way that Power BI can pick it...

Data Engineering
Azure Key Vault
Personal Access Token
Power BI
Service Principal
Latest Reply
SSundaram
Databricks Partner
  • 0 kudos

Maybe not the recommended way, but try creating a token that does not expire for such use cases. Ideally you would need something like a custom PowerShell solution to automate it completely.

1 More Replies