Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ls
by New Contributor III
  • 672 Views
  • 2 replies
  • 1 kudos

Resolved! Are lambda functions considered bad practice?

As the title suggests, I have a bunch of lambda functions within my notebooks and I wanted to know if it is considered "bad" to have them in there. output_list = json_files.mapPartitions(lambda partition: iter([process_partition(partition)])) \.f...

Latest Reply
Satyadeepak
Databricks Employee
  • 1 kudos

Using lambda functions within notebooks is not inherently "bad," but there are some considerations to keep in mind. While this code is functional, chaining multiple lambda functions can reduce readability and debugging capabilities in Databricks note...
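For illustration, a minimal sketch of the alternative: replacing the inline lambda with a named function, reusing the process_partition and json_files names from the question (the RDD contents here are stand-ins).

    # Stand-in for the question's RDD of parsed JSON files
    json_files = spark.sparkContext.parallelize([{"a": 1}, {"a": 2}], 2)

    def process_partition(partition):
        # Placeholder for the per-partition processing from the question
        return list(partition)

    # A named function instead of an inline lambda: it shows up by name in
    # stack traces and can be unit-tested on its own.
    def summarize_partition(partition):
        return iter([process_partition(partition)])

    output_list = json_files.mapPartitions(summarize_partition).collect()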

1 More Replies
lauraxyz
by Contributor
  • 383 Views
  • 1 replies
  • 0 kudos

Is there a way to analyze/monitor WRITE operations in a Notebook

I have user input as a Notebook, which processes data and saves it to a global temp view. Now I have my caller notebook execute the input Notebook with the dbutils.notebook API. Since the user can do anything in their notebook, I would like to analyze...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @lauraxyz, I think you can use the system tables and audit logs to achieve that monitoring: https://docs.databricks.com/en/admin/account-settings/audit-logs.html
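For example, recent activity can be pulled from the audit system table (a sketch; assumes system tables are enabled on the workspace, and the seven-day filter is illustrative):

    # Pull a week of audit events to see what the executed notebook did
    events = spark.sql("""
        SELECT event_time, user_identity.email AS user, service_name, action_name
        FROM system.access.audit
        WHERE event_date >= current_date() - INTERVAL 7 DAYS
        ORDER BY event_time DESC
    """)
    events.show(truncate=False)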

greenned
by New Contributor
  • 2936 Views
  • 1 replies
  • 0 kudos

Resolved! Not using defined clusters when deploying workflows in development mode with Asset Bundles

Hi, I'm using Databricks Asset Bundles to deploy workflows, but when I deploy in development mode the workflows do not use the new clusters, just the existing clusters. Can I deploy with the defined new clusters in development mode?

Latest Reply
Satyadeepak
Databricks Employee
  • 0 kudos

You could use mode: development and then deploy with --compute-id, specifying the ID of your personal compute cluster to replace the existing clusters. Only with mode: development will the compute ID replace existing or per-task cluster specs.
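In practice the deploy command looks something like this (a sketch; the cluster ID is a placeholder, and newer CLI releases may rename the flag):

    databricks bundle deploy --target dev --compute-id <your-cluster-id>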

manuel-barreiro
by New Contributor II
  • 951 Views
  • 5 replies
  • 0 kudos

Unable to view hive_metastore schemas although I have the same permissions as co-workers who can

Hello! I'm having trouble accessing the schemas of the hive_metastore. I have the same level of permissions as my fellow coworkers, who don't have any trouble viewing the schemas. I would really appreciate it if you could help me with this beca...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Were you able to get this issue resolved after looking at the permission levels on your schema and tables?
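For anyone comparing permissions side by side, grants on a schema can be inspected directly (a sketch; the schema name is hypothetical, and SHOW GRANTS on hive_metastore objects may require table access control to be enabled):

    # List grants on a hive_metastore schema to compare with a coworker's access
    spark.sql("SHOW GRANTS ON SCHEMA hive_metastore.sales").show(truncate=False)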

4 More Replies
yevsh
by New Contributor II
  • 1628 Views
  • 4 replies
  • 0 kudos

Java UDF can't access files in Unity Catalog - Operation not permitted

I am using Databricks on Azure. In PySpark I register a Java UDF: spark.udf.registerJavaFunction("foo", "com.foo.Foo", T.StringType()). Foo tries to load a file, using Files.readAllLines(), located in the Databricks Unity Catalog. stderr log: Tue J...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

To address the issue of needing to run initialization code that reads file content during the load of a UDF (User Defined Function) in Databricks, you should avoid performing file operations in the constructor due to security restrictions. Instead, y...
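The reply's advice concerns the Java class itself, but the lazy-initialization idea is the same in any language; sketched here in Python for brevity (the volume path is hypothetical):

    from pyspark.sql import functions as F, types as T

    _config_lines = None  # cached file content; loaded lazily on the executor

    def lookup(value):
        global _config_lines
        if _config_lines is None:
            # First call on this executor: read the file here, inside the UDF,
            # rather than in a constructor/initializer.
            with open("/Volumes/main/default/configs/lookup.txt") as f:  # hypothetical path
                _config_lines = f.read().splitlines()
        return value if value in _config_lines else None

    lookup_udf = F.udf(lookup, T.StringType())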

3 More Replies
Michael_Appiah
by Contributor
  • 11717 Views
  • 14 replies
  • 8 kudos

Parameterized spark.sql() not working

Spark 3.4 introduced parameterized SQL queries and Databricks also discussed this new functionality in a recent blog post (https://www.databricks.com/blog/parameterized-queries-pyspark). Problem: I cannot run any of the examples provided in the PySpark...
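For reference, the parameterized style from the blog looks like this; it needs a runtime shipping Spark 3.4 or later, and older DBRs will reject it (a minimal sketch):

    # Named parameter markers bound through the args dict (Spark 3.4+)
    df = spark.sql(
        "SELECT * FROM range(100) WHERE id > :threshold",
        args={"threshold": 90},
    )
    df.show()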

Latest Reply
adriennn
Valued Contributor
  • 8 kudos

Option 2 can be done with TEMPORARY LIVE VIEWs (or a TEMPORARY STREAMING TABLE) over a Unity Catalog table, so not "permanent" I guess. > for the gold layer is to save that spark SQL code into .py files for each table and import them in the DLT pipeline...

13 More Replies
jeremy98
by Honored Contributor
  • 2706 Views
  • 7 replies
  • 6 kudos

Migrating logic from Airflow DAGs to Databricks Workflow

Hello community, I'm planning to migrate some logic from Airflow DAGs to Databricks Workflows, but I have some doubts about how to map the logic of my current DAG code onto Workflows. There are two ...

Latest Reply
Walter_C
Databricks Employee
  • 6 kudos

You can use Asset Bundles https://docs.databricks.com/en/dev-tools/bundles/index.html 

6 More Replies
Paul92S
by New Contributor III
  • 2149 Views
  • 12 replies
  • 5 kudos

Delta Sharing service issue making requests to Unity Catalog system access tables

Hi all, we have been having an issue since yesterday which I believe is related to queries against system.access.table_lineage in Unity Catalog. The issue still persists today. We get the following error: AnalysisException: [RequestId= ErrorClass=B...
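For reference, a minimal query against the table in question (a sketch; assumes the system schema is enabled, and the table name is illustrative):

    # Recent lineage rows for one target table
    spark.sql("""
        SELECT source_table_full_name, target_table_full_name, event_time
        FROM system.access.table_lineage
        WHERE target_table_full_name = 'main.sales.orders'
        ORDER BY event_time DESC
        LIMIT 10
    """).show(truncate=False)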

Latest Reply
Alberto_Umana
Databricks Employee
  • 5 kudos

Thanks team, please let me know if you need any other help!

11 More Replies
jar
by Contributor
  • 1073 Views
  • 8 replies
  • 1 kudos

Databricks single user compute cannot write to storage

I've deployed unrestricted single-user compute for each developer in our dev workspace, and everything works fine except for writing to storage, where the cell runs continuously but seemingly doesn't execute anything. If I switch to an unrestricted sha...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Adding to @saurabh18cs's comments, also check whether any instance profile is attached to the cluster. What is the difference between the clusters, only the access mode?

7 More Replies
Anirudh077
by New Contributor III
  • 829 Views
  • 1 replies
  • 0 kudos

Resolved! Cannot create serverless sql warehouse, only classic and pro option available

Hey team, I am using Databricks on Azure (East US region) and I have enabled serverless compute in Settings -> Feature Enablement. When I click on create SQL warehouse, I do not see the serverless option. Is there any setting I am missing?

Latest Reply
Anirudh077
New Contributor III
  • 0 kudos

I found the root cause of this issue: in Security and Compliance we had PCI-DSS selected, and according to this doc we cannot have that; instead we can select HIPAA.

eballinger
by Contributor
  • 1842 Views
  • 4 replies
  • 2 kudos

Resolved! DLT notebook dynamic declaration

Hi guys, we have a DLT pipeline that reads data from landing to raw (CSV files into tables) for approximately 80 tables. In our first attempt at this we declared each table separately in a Python notebook, one @dlt.table declared per cell. Then w...
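For context, the usual way to avoid one cell per table is to generate the @dlt.table definitions in a loop (a sketch; the table names and landing path are hypothetical):

    import dlt

    TABLES = ["customers", "orders", "products"]  # stand-in for the ~80 tables

    def declare_raw_table(name):
        @dlt.table(name=f"raw_{name}")
        def _raw():
            return (
                spark.read.format("csv")
                .option("header", "true")
                .load(f"/Volumes/landing/raw/{name}/")  # hypothetical landing path
            )

    for table_name in TABLES:
        declare_raw_table(table_name)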

Latest Reply
VZLA
Databricks Employee
  • 2 kudos

Good catch and glad to hear you've identified the source of delay!

3 More Replies
shubham_007
by Contributor III
  • 3264 Views
  • 5 replies
  • 3 kudos

Resolved! What are powerful data quality tools/libraries to build a data quality framework in Databricks?

Dear Community Experts, I need your expert advice and suggestions on developing a data quality framework. What powerful data quality tools or libraries are good for building a data quality framework in Databricks? Please guide, team. R...

Latest Reply
Takuya-Omi
Valued Contributor III
  • 3 kudos

> Any short guidance on how to implement a data quality framework in Databricks? With dbdemos, you can learn a practical architecture for data quality testing using the expectations feature of DLT. I hope this helps! (Please note that some DLT syntax mig...
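As a starting point, DLT expectations look like this (a minimal sketch; the rules and the raw_orders source table are illustrative):

    import dlt

    @dlt.table
    @dlt.expect("valid_id", "id IS NOT NULL")             # warn: keep rows, record violations
    @dlt.expect_or_drop("positive_amount", "amount > 0")  # drop rows that fail the rule
    def clean_orders():
        return spark.read.table("raw_orders")  # hypothetical source table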

4 More Replies
Dean_Lovelace
by New Contributor III
  • 27256 Views
  • 13 replies
  • 2 kudos

How can I deploy workflow jobs to another databricks workspace?

I have created a number of workflows in the Databricks UI. I now need to deploy them to a different workspace. How can I do that? Code can be deployed via Git, but the job definitions are stored in the workspace only.

Latest Reply
Walter_C
Databricks Employee
  • 2 kudos

@itacdonev great option provided. @Dean_Lovelace, you can also select the View JSON option on the workflow and switch to the create option; with this JSON you can use the API https://docs.databricks.com/api/workspace/jobs/create and create the job in th...
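A minimal sketch of that flow (the host, token, and job spec here are placeholders; in practice you paste the full JSON copied from View JSON > Create):

    import requests

    HOST = "https://<target-workspace>.azuredatabricks.net"  # placeholder target workspace
    TOKEN = "<personal-access-token>"                        # placeholder PAT for that workspace

    # Trimmed example spec; replace with the JSON from the source workspace
    job_spec = {
        "name": "my-migrated-job",
        "tasks": [{
            "task_key": "main",
            "notebook_task": {"notebook_path": "/Workspace/Shared/my_notebook"},
        }],
    }

    resp = requests.post(
        f"{HOST}/api/2.1/jobs/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=job_spec,
    )
    resp.raise_for_status()
    print("Created job:", resp.json()["job_id"])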

12 More Replies
