Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

SOlivero
by New Contributor III
  • 1244 Views
  • 1 reply
  • 0 kudos

Scheduling Jobs with Multiple Git Repos on a Single Job Cluster

Hi, I'm trying to create a scheduled job that runs notebooks from three different repos. However, since a job can only be associated with one repo, I've had to create three separate jobs and a master job that triggers them sequentially. This setup work...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi @SOlivero, try configuring a shared all-purpose cluster and set each job to use this existing cluster rather than creating new job-specific clusters, ensuring the cluster stays warm and avoiding startup delays. Another option is to restructure you...

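The reply's first suggestion can be sketched as a Jobs API payload: each of the three jobs points at the same existing all-purpose cluster instead of spinning up its own. This is a minimal sketch; the cluster ID, job name, and notebook path are placeholders, not values from the thread.

```python
# Hedged sketch: reuse one shared all-purpose cluster across jobs via
# "existing_cluster_id" instead of a per-job "new_cluster". All IDs and
# paths below are hypothetical.
job_settings = {
    "name": "repo-a-notebooks",  # one such job per repo
    "tasks": [
        {
            "task_key": "run_notebook",
            "existing_cluster_id": "0123-456789-abcdefgh",  # the shared cluster
            "notebook_task": {"notebook_path": "/Repos/me/repo-a/main"},
        }
    ],
}
```

Because every job attaches to the same warm cluster, the master job's sequential triggers avoid repeated cluster startup delays.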
Erik
by Valued Contributor III
  • 6968 Views
  • 6 replies
  • 4 kudos

Resolved! Powerbi databricks connector should import column description

I posted this idea on ideas.powerbi.com as well, but it is quite unclear to me whether the Power BI Databricks connector is in fact made by MS or Databricks, so I'm posting it here as well! It is possible to add comments/descriptions to Databricks database ...

Latest Reply
capstone
New Contributor II
  • 4 kudos

You can use this C# script in Tabular Editor to achieve this. Basically, all the comments can be accessed via the information_schema in Databricks. Import the relevant columns from the schema using this query: select * from samples.information_schem...

5 More Replies
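The reply's information_schema approach can be sketched in a notebook as well: build the query that pulls column comments, then run it with Spark. This is a hedged sketch assuming the `samples` catalog mentioned in the reply; swap in your own catalog and schema.

```python
# Hedged sketch: read column comments from Unity Catalog's information_schema
# so they can be copied into a Power BI model (e.g. via Tabular Editor).
query = """
SELECT table_schema, table_name, column_name, comment
FROM samples.information_schema.columns
WHERE comment IS NOT NULL
ORDER BY table_name, ordinal_position
""".strip()

# In a Databricks notebook this would run as:
# comments = {(r.table_name, r.column_name): r.comment
#             for r in spark.sql(query).collect()}
```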
NaeemS
by New Contributor III
  • 2634 Views
  • 2 replies
  • 0 kudos

Handling Aggregations in Feature Function

Hi, is it possible to handle aggregations using Feature Functions somehow? As we know, the logic defined in a Feature Function is applied to a single row when a join is performed. But do we have any mechanism to handle aggregations too someho...

Data Engineering
Feature Functions
Feature Store
Latest Reply
rafaelsass
New Contributor II
  • 0 kudos

Hi @NaeemS! Have you managed to achieve this by any means? I'm facing the same question right now.

1 More Replies
Paul92S
by New Contributor III
  • 14819 Views
  • 6 replies
  • 5 kudos

Resolved! DELTA_EXCEED_CHAR_VARCHAR_LIMIT

Hi, I am having an issue loading source data into a Delta table/Unity Catalog. The error we are receiving is the following: grpc_message:"[DELTA_EXCEED_CHAR_VARCHAR_LIMIT] Exceeds char/varchar type length limitation. Failed check: (isnull(\'metric_...

Latest Reply
willflwrs
New Contributor III
  • 5 kudos

Setting this config before the write command solved it for us: spark.conf.set("spark.sql.legacy.charVarcharAsString", True)

5 More Replies
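The accepted fix can be sketched in context: set the legacy flag so CHAR/VARCHAR columns are treated as plain STRING, then perform the write. A minimal sketch; the table name is a placeholder, not from the thread.

```python
# Hedged sketch: spark.sql.legacy.charVarcharAsString makes Spark treat
# CHAR/VARCHAR as STRING, so the length check behind
# DELTA_EXCEED_CHAR_VARCHAR_LIMIT no longer rejects the write.
conf_key = "spark.sql.legacy.charVarcharAsString"

# In a Databricks notebook, before the write:
# spark.conf.set(conf_key, True)
# df.write.format("delta").mode("append").saveAsTable("main.bronze.metrics")
```

Note this is a legacy escape hatch: the sized-type metadata is dropped rather than enforced, so overlong values are stored as-is.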
DataEnginerrOO1
by New Contributor II
  • 3583 Views
  • 5 replies
  • 0 kudos

Access for delta lake with serverless

I have an issue when trying to use the command display(dbutils.fs.ls("abfss://test@test.dfs.core.windows.net")). When I execute the command on my personal cluster, it works and I can see the files. Before that, I set the following configurations: spa...

Latest Reply
Rjdudley
Honored Contributor
  • 0 kudos

Can your serverless compute access any storage in that storage account?  Something else to check is if your NCC is configured correctly: Configure private connectivity from serverless compute - Azure Databricks | Microsoft Learn.  However, if your se...

4 More Replies
akshay716
by New Contributor III
  • 3598 Views
  • 7 replies
  • 1 kudos

Resolved! How to create Service Principal and access APIs like clusters list without adding to admin group

I have created a Databricks managed service principal and am trying to access APIs like cluster lists, job lists, and pipelines, but without adding it to the admin group I get an empty list in response. There are other ways to get clusters by adding polic...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Unfortunately, only admin access can be granted through the account console, not read-only access.

6 More Replies
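For context, the behavior the question describes can be sketched as a plain Clusters API call. This is a hedged sketch: the workspace host is a placeholder, and the commented request assumes the service principal already has an OAuth bearer token; for a non-admin principal without permissions on any cluster, the API returns an empty list rather than an error.

```python
# Hedged sketch: list clusters with a service principal's bearer token.
# Host and token are hypothetical placeholders.
host = "https://adb-1234567890123456.7.azuredatabricks.net"
endpoint = f"{host}/api/2.0/clusters/list"

# import requests
# resp = requests.get(endpoint, headers={"Authorization": f"Bearer {token}"})
# clusters = resp.json().get("clusters", [])  # empty for a non-admin SP
```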
sachamourier
by Contributor
  • 3279 Views
  • 4 replies
  • 0 kudos

Use init script for Databricks job cluster via Azure Data Factory

Hello, I would like to install some libraries (both public and private) on a job cluster. I am using Azure Data Factory to run my Databricks notebooks and hence would like to use job clusters to run these jobs. I have passed my init script to the job c...

(attachments: adf_init_script_config.png, init_script.png)
Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @sachamourier, have you considered using cluster libraries? The behavior you are observing requires additional debugging; since the init script is installed successfully, can you enable cluster logging and search through the logs: https://docs.dat...

3 More Replies
staskh
by Contributor
  • 5655 Views
  • 1 reply
  • 1 kudos

Resolved! TIMESTAMP(NANOS,false) error

Hi, I'm getting an Illegal Parquet type: INT64 (TIMESTAMP(NANOS,false)) error while trying to read a Parquet file (generated outside of Databricks). Unfortunately, due to security configuration, I do not have the ability to read it with pandas or similar...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @staskh, that error happens because the nanosecond timestamp type is not supported natively. You can try the Spark setting below: spark.conf.set("spark.sql.legacy.parquet.nanosAsLong", "true")

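The suggested setting can be sketched in context. A hedged sketch, assuming a notebook with an active `spark` session; the file path is a placeholder. With the flag on, the nanosecond column arrives as a plain long (nanoseconds since epoch) rather than failing the read.

```python
# Hedged sketch: read INT64 TIMESTAMP(NANOS) Parquet columns as longs
# instead of raising "Illegal Parquet type".
conf_key = "spark.sql.legacy.parquet.nanosAsLong"

# In a Databricks notebook:
# spark.conf.set(conf_key, "true")
# df = spark.read.parquet("/Volumes/main/raw/events.parquet")  # placeholder
# The column is then LongType; convert if a timestamp is needed, e.g.:
# df.withColumn("ts", (F.col("ts") / 1_000_000_000).cast("timestamp"))
```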
LearningDatabri
by Contributor II
  • 16913 Views
  • 11 replies
  • 6 kudos

Resolved! Bootstrap Timeout during cluster start

Sometimes while starting a cluster I face a bootstrap timeout error. What is the reason? When I try again, the cluster starts.

Latest Reply
Amine8089
New Contributor II
  • 6 kudos

I have this issue, can someone help with that? Instance bootstrap failed command: Bootstrap_e2e. Instance bootstrap inferred timeout reason: Command_UpdateWorker_Slow. Failure message (may be truncated): Bootstrap is terminated spontaneously by VM becaus...

10 More Replies
narvinya
by New Contributor
  • 4078 Views
  • 1 reply
  • 0 kudos

What is the best approach to use Delta tables without Unity Catalog enabled?

Hello! I would like to work with Delta tables outside of a Databricks UI notebook. I know the best option would be databricks-connect, but I don't have Unity Catalog enabled. What would be the most effective way to do so? I know that via JDBC ...

Latest Reply
NanthakumarYoga
New Contributor II
  • 0 kudos

Programmatically, you can use DeltaTable.forPath (not forName, which requires Unity Catalog). This works.

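The path-based addressing can be sketched as follows. A hedged sketch: the storage path is a placeholder, and the commented calls assume the delta-spark package and an active `spark` session, as on a Databricks cluster.

```python
# Hedged sketch: address a Delta table by storage path, avoiding the
# Unity Catalog lookup that DeltaTable.forName requires.
table_path = "/mnt/datalake/events"  # hypothetical path

# from delta.tables import DeltaTable
# dt = DeltaTable.forPath(spark, table_path)
# dt.toDF().show()        # query the table
# dt.history(5).show()    # time travel / audit still works by path
```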
KSB
by Databricks Partner
  • 1622 Views
  • 1 reply
  • 0 kudos

Databricks

Hi team, I have an Excel file in a SharePoint folder and need to insert the Excel data into a SQL table from a Databricks notebook. Can I have clear steps for this? I don't have access to Azure Active Directory. Can anyone give a solution without using Azure Active Dir...

Latest Reply
Stefan-Koch
Databricks Partner
  • 0 kudos

Hi KSB, you could read the Excel file in SharePoint directly from Databricks with the Graph API. Here is one possible way: https://community.databricks.com/t5/data-engineering/load-data-from-sharepoint-site-to-delta-table-in-databricks/td-p/16410. However, you need t...

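The Graph API route the reply links to can be sketched as building the drive-item download URL and reading the bytes into pandas. A hedged sketch: the site ID, file path, and token acquisition are placeholders, and Graph still requires an app registration with client-credentials, even if you cannot browse Azure Active Directory yourself.

```python
# Hedged sketch: download an Excel file from SharePoint via Microsoft Graph's
# drive-item /content endpoint, then parse it. All identifiers are placeholders.
site_id = "contoso.sharepoint.com,GUID1,GUID2"  # hypothetical site id
file_url = (
    "https://graph.microsoft.com/v1.0/"
    f"sites/{site_id}/drive/root:/Shared Documents/report.xlsx:/content"
)

# import io, requests, pandas as pd
# raw = requests.get(file_url, headers={"Authorization": f"Bearer {token}"}).content
# df = pd.read_excel(io.BytesIO(raw))
# spark.createDataFrame(df).write.mode("append").saveAsTable("main.staging.report")
```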
Sergio_Linares
by Databricks Partner
  • 1983 Views
  • 1 reply
  • 0 kudos

When Sign in databricks partner-academy i can not see the courses

Dear partner academy team, I am writing to report an issue I am experiencing when trying to access the partner academy courses. Despite using my credentials, I am unable to view any of the courses. Could you please look into this and assist me in res...

Latest Reply
Advika_
Databricks Employee
  • 0 kudos

Hello @Sergio_Linares! Please file a ticket with the Databricks support team to get assistance with this issue.

BillBishop
by New Contributor III
  • 916 Views
  • 2 replies
  • 0 kudos

DAB for_each_task python wheel fail

Using python_wheel_wrapper: experimental true allows me to use a python_wheel_task on an older cluster. However, if I embed the python_wheel_task in a for_each_task, it fails at runtime with: "Library installation failed for library due to user error. Er...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @BillBishop, I will check on this internally, as the outcome does not seem correct. If possible, upgrade your cluster to DBR 14.1 or later; this would resolve the issue without relying on the experimental feature.

1 More Replies
rushi29
by New Contributor III
  • 5671 Views
  • 5 replies
  • 0 kudos

sparkContext in Runtime 15.3

Hello all, our Azure Databricks cluster is running under the "Legacy Shared Compute" policy with the 15.3 runtime. One of the Python notebooks is used to connect to an Azure SQL database to read/insert data. The following snippet of code is responsible for r...

Latest Reply
jayct
New Contributor II
  • 0 kudos

@rushi29 @Gangster I ended up implementing pyodbc with the MSSQL driver using init scripts. The Spark context is no longer usable on shared compute, so that was the only approach we could take.

4 More Replies
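The pyodbc approach from the last reply can be sketched as follows. A hedged sketch: it assumes the Microsoft ODBC driver was installed on the cluster via an init script, and the server, database, and credentials are placeholders.

```python
# Hedged sketch: connect to Azure SQL with pyodbc + the Microsoft ODBC driver,
# replacing sparkContext-based JDBC access that shared compute disallows.
def build_conn_str(server: str, database: str, user: str, password: str) -> str:
    """Assemble an ODBC connection string for Azure SQL (placeholder values)."""
    return (
        "DRIVER={ODBC Driver 18 for SQL Server};"
        f"SERVER={server},1433;DATABASE={database};"
        f"UID={user};PWD={password};Encrypt=yes;"
    )

conn_str = build_conn_str("myserver.database.windows.net", "mydb", "etl_user", "***")

# import pyodbc
# with pyodbc.connect(conn_str) as conn:
#     rows = conn.cursor().execute("SELECT TOP 5 * FROM dbo.my_table").fetchall()
```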
ila-de
by New Contributor III
  • 5196 Views
  • 7 replies
  • 1 kudos

Resolved! databricks workspace import_dir not working without any failure message

Morning everyone! I'm trying to copy all the notebooks from the repo into the Databricks workspace. I'm using the command databricks workspace import_dir . /Shared/Notebooks, but it just prints all the info regarding the Workspace API. If I launch dat...

Latest Reply
ila-de
New Contributor III
  • 1 kudos

Hi all, I've uninstalled and reinstalled databricks-cli and now it works. It's not a real solution, but still, it worked after one week...

6 More Replies