Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16826994223
by Databricks Employee
  • 2380 Views
  • 2 replies
  • 0 kudos

Don't want checkpoints in Delta

Suppose I am not interested in checkpoints; how can I disable checkpoint writes in Delta?

Latest Reply
sajith_appukutt
Databricks Employee

Writing statistics in a checkpoint has a cost that is usually visible only for very large tables. However, it is worth mentioning that these statistics are very useful for data skipping, which speeds up subsequent operations. In Databricks Runti...

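As a rough illustration of the trade-off the reply describes: checkpoints themselves cannot simply be switched off, but their frequency and the statistics they carry can be tuned through Delta table properties. A hedged sketch (`my_table` is a placeholder, and exact defaults vary by runtime version):

```sql
-- Write checkpoints less frequently, and collect statistics for no columns,
-- which removes the per-checkpoint statistics cost at the expense of data skipping.
ALTER TABLE my_table SET TBLPROPERTIES (
  'delta.checkpointInterval' = '100',
  'delta.dataSkippingNumIndexedCols' = '0'
);
```

Note that setting `delta.dataSkippingNumIndexedCols` to 0 gives up the data-skipping benefit the reply mentions, so it is usually better to limit it to the columns actually used in filters rather than disable it entirely.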
Digan_Parikh
by Databricks Employee
  • 2164 Views
  • 1 replies
  • 0 kudos

Resolved! Delta Live Table - landing database?

Where do you specify what database the DLT tables land in?

Latest Reply
Digan_Parikh
Databricks Employee

The target key, specified when creating the pipeline, determines the database that the tables get published to. Documented here - https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-user-guide.html#publish-tables

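For illustration, a minimal pipeline-settings sketch showing where the target key sits (pipeline name, database name, and notebook path are made up):

```json
{
  "name": "my_pipeline",
  "target": "my_database",
  "libraries": [
    { "notebook": { "path": "/Repos/me/dlt_notebook" } }
  ]
}
```

With this configuration, tables defined in the pipeline's notebooks are published into `my_database`.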
Anonymous
by Not applicable
  • 3066 Views
  • 1 replies
  • 0 kudos

Resolved! Questions on using Docker image with Databricks Container Service

Specifically, we have in mind:
  • Create a Databricks job for testing API changes (the API library is built in a custom Jar file)
  • When we want to test an API change, build a Docker image with the relevant changes in a Jar file
  • Update the job configur...

Latest Reply
sajith_appukutt
Databricks Employee

> Where do we put custom Jar files when building the Docker image?
/databricks/jars
> How do we update the job configuration so that the job's cluster will be built with this new Docker image, and how long do we expect this re-configuring process to tak...

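A hypothetical Dockerfile sketch for the jar placement the reply describes (the base image tag and jar name are assumptions; check the Databricks Container Services docs for supported base images):

```dockerfile
FROM databricksruntime/standard:latest

# Jars placed under /databricks/jars are picked up on the cluster's classpath.
COPY target/my-api-changes.jar /databricks/jars/
```

Rebuilding this image with a new jar and pointing the job cluster's container settings at the new image tag is the update loop the question asks about.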
brickster_2018
by Databricks Employee
  • 3249 Views
  • 1 replies
  • 0 kudos

Resolved! Z-order or Partitioning? Which is better for Data skipping?

For Delta tables, between Z-ordering and partitioning, which is the recommended technique for efficient data skipping?

Latest Reply
brickster_2018
Databricks Employee

Partition pruning is the most efficient way to ensure data skipping. However, choosing the right column for partitioning is very important. It's common to see that choosing the wrong column for partitioning causes a large number of small-file problems ...

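A sketch contrasting the two techniques on a hypothetical events table (table and column names are illustrative):

```sql
-- Partitioning: chosen at creation time; best for low-cardinality columns
-- such as a date, so partition pruning can skip whole directories.
CREATE TABLE events (id BIGINT, event_date DATE, user_id BIGINT)
USING DELTA
PARTITIONED BY (event_date);

-- Z-ordering: applied afterwards with OPTIMIZE; better suited to
-- high-cardinality columns queried with selective filters.
OPTIMIZE events ZORDER BY (user_id);
```

The two are complementary: partition on the coarse column, then Z-order within partitions on the column used in point lookups.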
Srikanth_Gupta_
by Databricks Employee
  • 2097 Views
  • 2 replies
  • 0 kudos

I have several thousand Delta tables in production; what is the best way to get counts?

I might need a dashboard to see the increase in the number of rows on a day-to-day basis, and also a dashboard that shows the size of the Parquet/Delta files in my lake.

Latest Reply
brickster_2018
Databricks Employee

val db = "database_name"
spark.sessionState.catalog.listTables(db)
  .map(table => spark.sessionState.catalog.externalCatalog.getTable(table.database.get, table.table))
  .filter(x => x.provider.toString().toLowerCase.contains("delta"))

The above code snippet wi...

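A hedged sketch of how the snippet above might be extended to produce the per-table row counts the question asks for (assumes the same Spark session; the database name and output formatting are illustrative, and counting thousands of tables this way is slow):

```scala
// List the catalog's tables, keep only Delta ones, then count rows in each
// so the results can feed a dashboard.
val db = "database_name"
val deltaTables = spark.sessionState.catalog.listTables(db)
  .map(t => spark.sessionState.catalog.externalCatalog.getTable(t.database.get, t.table))
  .filter(t => t.provider.toString().toLowerCase.contains("delta"))

deltaTables.foreach { t =>
  val name = s"${t.database}.${t.identifier.table}"
  val rows = spark.table(name).count()
  println(s"$name: $rows rows")
}
```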
User16826992666
by Databricks Employee
  • 7520 Views
  • 2 replies
  • 0 kudos
Latest Reply
sajith_appukutt
Databricks Employee

If the read stream definition has something similar to

val df = spark
  .read
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:port1,host2:port2")
  .option("subscribePattern", "topic.*")
  .option("startingOffsets", "earliest")

resettin...

Anonymous
by Not applicable
  • 2194 Views
  • 2 replies
  • 0 kudos

Changing default Delta behavior in DBR 8.x for writes

Is there any way to add a Spark config that reverts the default behavior for table writes in DBR 8.0+ from Delta back to Parquet? I know you can simply specify .format("parquet"), but that could involve a decent amount of code change for some client...

Latest Reply
Anonymous
Not applicable

Thanks @Ryan Chynoweth!

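The accepted answer is not shown in this excerpt, but a plausible sketch of the kind of configuration being discussed is the default-source setting, since DBR 8.x made Delta the default table format. Hedged: verify this on your runtime version before relying on it.

```sql
-- Revert the default data source for writes that do not specify a format,
-- restoring the pre-DBR-8.0 Parquet behavior without code changes.
SET spark.sql.sources.default = parquet;
```

The same setting can be applied cluster-wide in the cluster's Spark config instead of per-session.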
User15761966159
by Databricks Employee
  • 1554 Views
  • 1 replies
  • 0 kudos

Does removing a user from the workspace automatically invalidate their tokens?

If you have a user that is removed from the workspace, are the tokens they've created automatically invalidated?

Latest Reply
Ryan_Chynoweth
Databricks Employee

Yes, PAT tokens will be invalidated if a user is removed, since those tokens are attached to their current credentials and access.

Digan_Parikh
by Databricks Employee
  • 2472 Views
  • 1 replies
  • 0 kudos

Resolved! Package cells for Python notebooks

Do we have an analogous concept to package cells for Python notebooks?

Latest Reply
Digan_Parikh
Databricks Employee

You can just declare your classes in one cell and use them in the others. It is recommended to put all your classes in one notebook and use %run in the others to "import" those classes. The one thing you cannot do is literally import a folder/...

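A minimal sketch of the %run pattern the reply describes (notebook paths and the class name are made up):

```
# Notebook /Shared/Classes -- define the classes in one place
class Greeter:
    def greet(self):
        return "hello"

# Notebook /Shared/Main, cell 1 -- "import" the classes notebook
%run /Shared/Classes

# Notebook /Shared/Main, cell 2 -- the classes are now in scope
g = Greeter()
print(g.greet())
```

Note that %run must be the only code in its cell, which is the main practical difference from a real Python import.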
User16826987838
by Databricks Employee
  • 1597 Views
  • 1 replies
  • 1 kudos
Latest Reply
Digan_Parikh
Databricks Employee

@Rathna Sundaralingam Yes, in the visualization editor select the following:
  • Type: Map
  • Under General: Map: USA
  • Key Column: you need a state column here (for example: CA, NY)
  • Target Field: USPS Abbreviation
  • Value Column: your desired value for the heatmap.

Digan_Parikh
by Databricks Employee
  • 1952 Views
  • 1 replies
  • 0 kudos

Resolved! %run in R?

Is the %run magic command supported in R notebooks?

Latest Reply
Digan_Parikh
Databricks Employee

The % magic commands are notebook commands and are not tied to any language, so R notebooks also support %run.

User16826992666
by Databricks Employee
  • 2192 Views
  • 1 replies
  • 0 kudos

Resolved! In Databricks SQL how can I tell if my query is using Photon?

I have turned Photon on in my endpoint, but I don't know if it's actually being used in my queries. Is there some way I can see this other than manually testing queries with Photon turned on and off?

Latest Reply
Digan_Parikh
Databricks Employee

@Trevor Bishop​ If you go to the History tab in DBSQL, click on the specific query and look at the execution details. At the bottom, you will see "Task time in Photon".

Srikanth_Gupta_
by Databricks Employee
  • 1846 Views
  • 1 replies
  • 1 kudos
Latest Reply
User15787040559
Databricks Employee

Only Delta Sharing will initially be open source; see here. DLT and Unity Catalog will be Databricks-only.
