Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

somedeveloper
by New Contributor III
  • 374 Views
  • 1 reply
  • 0 kudos

Modifying size of /var/lib/lxc

Good morning, When running a library (Sparkling Water) for a very large dataset, I've noticed that during an export procedure the /var/lib/lxc storage is being used. Since the storage seems to be at a static 130GB, this is a problem because ...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Unfortunately, this is a setting that cannot be increased on the customer side.

ChristianRRL
by Valued Contributor
  • 364 Views
  • 1 reply
  • 0 kudos

Databricks Workflows - Generate Tasks Programmatically

Hi there, I've used Databricks Workflows to explicitly create tasks with known input parameters (either user input or default parameters). But I'm wondering: what if I want the output of one task to be a list of specific IDs (e.g. id = [7,8,10,13,27]...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

This sounds like a great fit for the For Each task type! Here is the blog, and the documentation

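For readers landing here, this is roughly what the suggested For Each task could look like in an asset bundle job definition. A hedged sketch: the job name, task keys, notebook paths, and the `ids` task value are all invented for illustration.

```yaml
# Hypothetical job: an upstream task emits a list of IDs via task values,
# and a For Each task fans out over them. All names are illustrative.
resources:
  jobs:
    fan_out_job:
      tasks:
        - task_key: produce_ids
          notebook_task:
            notebook_path: ./produce_ids.ipynb
        - task_key: process_each_id
          depends_on:
            - task_key: produce_ids
          for_each_task:
            # Reference the upstream task's output (set with
            # dbutils.jobs.taskValues.set in the producing notebook).
            inputs: "{{tasks.produce_ids.values.ids}}"
            task:
              task_key: process_one_id
              notebook_task:
                notebook_path: ./process_one_id.ipynb
                base_parameters:
                  id: "{{input}}"
```

`{{input}}` resolves to the current element of the list on each iteration.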
jeroaranda
by New Contributor II
  • 1302 Views
  • 1 reply
  • 0 kudos

How to pass task name as parameter in scheduled job that will be used as a schema name in query

I want to run a parametrized SQL query in a task. Query: select * from {{client}}.catalog.table, with the client value being {{task.name}}. If client is a string parameter, it is replaced with quotes, which throws an error. If table is a dropdown list parame...

Latest Reply
Zach_Jacobson23
Databricks Employee
  • 0 kudos

Try this: select * from identifier(:catalog||'.schema.table'). The :catalog is a parameter within DBSQL. Replace schema and table with the actual names.

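A slightly fuller sketch of the IDENTIFIER approach from the reply. The parameter and object names here are illustrative, not taken from the original question:

```sql
-- :client is a named parameter; IDENTIFIER() lets its value be used where
-- an object name is expected, so it is not quoted as a string literal.
SELECT *
FROM IDENTIFIER(:client || '.some_schema.some_table');
```

In a job task, the parameter value can be wired to the task name by passing {{task.name}} as the value of the parameter in the task configuration, which avoids the string-quoting problem described in the question.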
VicS
by New Contributor III
  • 765 Views
  • 3 replies
  • 2 kudos

How to use custom whl file + pypi repo with a job cluster in asset bundles?

I tried looking through the documentation but it is confusing at best and misses important parts at worst.  Is there any place where the entire syntax and ALL options for asset bundle YAMLs are described? I found this https://docs.databricks.com/en/d...

Latest Reply
VicS
New Contributor III
  • 2 kudos

It took me a while to realize the distinction between the keys inside the task, so for anyone else looking into this: only one of the following keys can exist in a task definition:    tasks: - task_key: ingestion_delta # existing_c...

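To make the point in that reply concrete, here is a hedged sketch (task keys, paths, and cluster settings invented) showing the three mutually exclusive compute options a bundle task can use — a task picks exactly one:

```yaml
tasks:
  # Option 1: reuse an existing all-purpose cluster by ID
  - task_key: ingestion_delta
    existing_cluster_id: "0923-164208-abcd1234"
    notebook_task:
      notebook_path: ./ingest.ipynb

  # Option 2: reference a shared job cluster defined under job_clusters
  - task_key: transform
    job_cluster_key: main_cluster
    notebook_task:
      notebook_path: ./transform.ipynb

  # Option 3: define a per-task new cluster inline
  - task_key: export
    new_cluster:
      spark_version: 15.4.x-scala2.12
      node_type_id: Standard_DS3_v2
      num_workers: 2
    notebook_task:
      notebook_path: ./export.ipynb
```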
2 More Replies
ns_casper
by New Contributor II
  • 1114 Views
  • 4 replies
  • 1 kudos

Databricks Excel ODBC driver bug

Hello! I might have experienced a bug with the ODBC driver. We have an issue where, given certain privileges in Databricks, the ODBC driver is unable to show any schemas/tables. When we click the 'expand' button on any catalog in the list (of which we ...

Latest Reply
jbibs
New Contributor II
  • 1 kudos

Following this post - we are also facing the same issue. @KTheJoker, when I'm connecting and trying to expand a catalog, I do see the query fire off in the SQL Warehouse query history, but in Excel nothing is returned. I can see the schemas/tables...

3 More Replies
JissMathew
by Contributor III
  • 780 Views
  • 6 replies
  • 1 kudos

Structured streaming in Databricks using delta table

Hi everyone, I’m new to Databricks and exploring its features. I’m trying to implement Change Data Capture (CDC) from the bronze layer to the silver layer using streaming. Could anyone share sample code or reference materials for implementing CDC wit...

Latest Reply
Mike_Szklarczyk
Contributor
  • 1 kudos

You can also look at https://www.databricks.com/resources/demos#tutorials 

5 More Replies
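For the CDC question above, a common pattern is to stream from the bronze Delta table and MERGE each micro-batch into silver via foreachBatch. This is a hedged sketch, not the thread's official answer: the table names, key column, and checkpoint path are assumptions, and it is meant to run on a Databricks/Spark cluster (where `spark` exists), not locally.

```python
from delta.tables import DeltaTable

def upsert_to_silver(batch_df, batch_id):
    # Deduplicate the micro-batch so each key appears once, then MERGE
    # the changes into the silver table (assumed names throughout).
    latest = batch_df.dropDuplicates(["id"])
    silver = DeltaTable.forName(spark, "main.demo.silver_table")
    (silver.alias("t")
        .merge(latest.alias("s"), "t.id = s.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())

(spark.readStream
    .table("main.demo.bronze_table")
    .writeStream
    .foreachBatch(upsert_to_silver)
    .option("checkpointLocation", "/Volumes/main/demo/checkpoints/silver")
    .trigger(availableNow=True)
    .start())
```

With `availableNow=True` the stream processes all pending bronze changes and stops, which suits a scheduled job; drop the trigger for a continuously running stream.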
guangyi
by Contributor III
  • 375 Views
  • 2 replies
  • 0 kudos

DLT pipeline observability questions (and maybe suggestions)

All my questions are around this code block: @dlt.append_flow(target="target_table") def flow_01(): df = spark.readStream.table("table_01") @dlt.append_flow(target="target_table") def flow_02(): df = spark.readStream.table("table_02") The first qu...

Latest Reply
Nam_Nguyen
Databricks Employee
  • 0 kudos

Hello @guangyi, I am getting back to you with some insights. Regarding your first question about checkpointing: you can manually check the checkpoint location of your streaming table. The checkpoints of your Delta Live Tables are under the Storage locatio...

1 More Replies
madams
by Contributor
  • 484 Views
  • 1 reply
  • 0 kudos

Databricks Asset Bundle - variables for job trigger

I'm using the Databricks CLI to deploy an asset bundle for a job. I'm attempting to set up the configuration such that the "dev" target does not have a trigger on it, and the "prod" target does. Essentially the dev job is not scheduled to run and th...

Latest Reply
SigaEd
New Contributor II
  • 0 kudos

In your dev target, you can add mode to pause all triggers: targets: dev: mode: development. DAB also has a new update: you can use presets to handle different target settings: targets: dev: presets: name_prefix: "testing_" # pre...

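A sketch combining both suggestions from the reply. Target names, the job name, and the schedule are illustrative; in development mode, schedules and triggers are paused automatically, and the trigger's pause_status can also be overridden explicitly per target.

```yaml
targets:
  dev:
    # Development mode pauses all schedules/triggers and prefixes
    # resource names, so the dev copy of the job never runs on its own.
    mode: development
  prod:
    mode: production

resources:
  jobs:
    my_job:
      trigger:
        periodic:
          interval: 1
          unit: DAYS
        # Alternatively, override this in targets that should not run:
        # pause_status: PAUSED
```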
AWS1567
by New Contributor III
  • 20958 Views
  • 10 replies
  • 6 kudos

We've encountered an error logging you in.

I've been trying to log in for the past two days and I'm still facing this error: "We've encountered an error logging you in." I've tried to reset the password multiple times and nothing happened. My friend is also not able to log in. I request you to resolve t...

Databricks_login_issue
Latest Reply
rmutili
New Contributor II
  • 6 kudos

Hey, I am not able to log in to my work Databricks account. I am getting the above errors.

9 More Replies
JensV
by New Contributor II
  • 280 Views
  • 1 reply
  • 0 kudos

Resolved! Using sql inside Notebook using question marks

Hi all, I have a very quick question that I hope someone can help with. I want to execute a very simple SQL statement like %sql select * from json.`/Volumes/adfmeta/Objects.json` where ObjectName like '%SGm$RITWebsader$911a%' However, the SQL does not ...

Latest Reply
JAHNAVI
Databricks Employee
  • 0 kudos

Hi @JensV, question marks are used as parameter placeholders, so could you please try to escape the question mark using backslashes? select * from json.`/Volumes/adfmeta/Objects.json` where ObjectName like '%SGm\\$RITWebsader\\$911a%' Alternatively, w...

angel_ba
by New Contributor II
  • 1713 Views
  • 1 reply
  • 0 kudos

use_cached_result

I am trying to execute the same query on 3 different platforms - DBeaver, a Python notebook, and a SQL workflow. I was expecting that after the first execution of the query, irrespective of the platform, subsequent executions of the same query should NOT re-compute. However ...

Latest Reply
SenthilRT
New Contributor III
  • 0 kudos

I don't think it's possible unless the results are written into a table and that table is used in the queries across the clients. Please refer to https://docs.databricks.com/en/sql/user/queries/query-caching.html

filipjankovic
by New Contributor
  • 3957 Views
  • 1 reply
  • 0 kudos

JSON string object with nested Array and Struct column to dataframe in pyspark

I am trying to convert a JSON string stored in a variable into a Spark dataframe without specifying a schema, because I have a big number of different tables, so it has to be dynamic. I managed to do it with sc.parallelize, but since we are moving to Uni...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

Hi filipjankovic, SparkContext sc is a Spark 1.0 API and is deprecated on Standard and Serverless compute. However, your input data is a list of dictionaries, which is supported by spark.createDataFrame. This should give you identical output witho...

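A minimal sketch of the approach in the reply, assuming the JSON arrives as a string and that the records sit under a known key (both assumptions here, since the original payload isn't shown). The Spark lines are commented out because they need a Databricks/Spark session:

```python
import json

# Hypothetical payload with a nested struct and array per record.
raw = '{"items": [{"id": 1, "meta": {"tags": ["a", "b"]}}, {"id": 2, "meta": {"tags": []}}]}'

# json.loads turns the string into plain Python objects; nested arrays and
# structs survive as lists and dicts, so no schema needs to be declared.
rows = json.loads(raw)["items"]

# On a cluster, spark.createDataFrame infers the schema from these dicts:
# df = spark.createDataFrame(rows)
# df.printSchema()
```

Unlike sc.parallelize, spark.createDataFrame works on Unity Catalog shared/serverless compute, which is the constraint the question raises.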
Hubert-Dudek
by Esteemed Contributor III
  • 19124 Views
  • 12 replies
  • 12 kudos

Resolved! dbutils or other magic way to get notebook name or cell title inside notebook cell

Not sure it exists, but maybe there is some trick to get directly from Python code: NotebookName, CellTitle. Just working on some logger script shared between notebooks, and it could make my life a bit easier.

Latest Reply
rtullis
New Contributor II
  • 12 kudos

I got the solution to work in terms of printing the notebook that I was running; however, what if you have notebook A that calls a function that prints the notebook name, and you run notebook B that %runs notebook A?  I get the notebook B's name when...

11 More Replies
rushi29
by New Contributor III
  • 922 Views
  • 2 replies
  • 3 kudos

Using Managed Identity Authentication in Unity Catalog using pyodbc

Hello, I am having trouble using Managed Identity authentication in Unity Catalog using pyodbc in Azure Databricks. The same code works on a "Legacy Shared Compute". The code snippet is below: import pyodbc jdbc_url = (    "DRIVER={ODBC 17 DRIVER PATH...

Latest Reply
mbenavent
New Contributor II
  • 3 kudos

Thank you very much! I have spent an enormous number of hours fighting with this, and in the end it was the type of cluster... I hope this problem will be solved in the future, because it affects development when you use databricks-connect and s...

1 More Replies
Nes_Hdr
by New Contributor III
  • 1073 Views
  • 1 replies
  • 0 kudos

Path based access not supported for tables with row filters?

Hello, I have encountered an issue recently and have not been able to find a solution yet. I have a job on Databricks that creates a table using dbt (dbt-databricks>=1.0.0,<2.0.0). I am setting the location_root configuration so that this table is externa...

Data Engineering
dbt
row_filter
Latest Reply
Nes_Hdr
New Contributor III
  • 0 kudos

To recreate the issue (good to know: using dbt to create materialized tables is equivalent to running "create or replace table table_name"): the following code will create an external table with row security: create or replace table table_name using d...

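For context on the behavior described above, this is roughly how a row filter ends up attached to a Unity Catalog table. The function, schema, and column names here are invented for illustration:

```sql
-- A row filter is a SQL UDF returning BOOLEAN, evaluated per row.
CREATE OR REPLACE FUNCTION main.security.us_only(region STRING)
RETURN region = 'US' OR is_account_group_member('admins');

-- Attach it to a table. Once a row filter is set, path-based reads of the
-- table's files are rejected, which is the limitation hit in this post.
ALTER TABLE main.demo.table_name
  SET ROW FILTER main.security.us_only ON (region);
```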

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.
