Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

User16826994223
by Honored Contributor III
  • 506 Views
  • 0 replies
  • 0 kudos

Spark 3.0 Pandas UDF: old vs. new Pandas UDF interface

This slide shows the difference between the old and the new interface. The same here. The new interface can also be used for the existing Grouped Aggregate Pandas UDFs. In addition, the old Pandas U...

User16826994223
by Honored Contributor III
  • 665 Views
  • 0 replies
  • 0 kudos

Cluster sizes on Databricks SQL

Cluster size    Driver size    Worker count
2X-Small        i3.2xlarge     1
X-Small         i3.2xlarge     2
Small           i3.4xlarge     4
Medium          i3.8xlarge     8
Large           i3.8xlarge     16
X-Large         i3.16xlarge    32
2X-Large        i3.16xlarge    ...

User16826994223
by Honored Contributor III
  • 796 Views
  • 0 replies
  • 0 kudos

Multi-cluster load balancing

Multi-cluster Load Balancing: the minimum and maximum number of clusters over which queries sent to the endpoint are distributed. The default is Off with a maximum of 1 cluster. When set to On, the default is minimum 1 clus...

User16826992666
by Valued Contributor
  • 1921 Views
  • 1 replies
  • 0 kudos

Can you restrict the type of clusters users are allowed to create?

I would like to make it so users can only create job clusters and not interactive clusters. Is it possible to do this in a workspace?

Latest Reply
User16826992666
Valued Contributor
  • 0 kudos

This can be accomplished with cluster policies. You can use a policy similar to this example to restrict certain users or groups to only have permission to create job clusters.
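The example policy the reply links to is not reproduced here. As a rough sketch of what such a policy could look like, the snippet below builds a policy definition that pins the `cluster_type` attribute to `job`. The assumption is that this attribute is the one that separates job clusters from all-purpose (interactive) clusters; the variable and function names are made up for illustration.

```python
import json

# Hypothetical policy definition: fixing cluster_type to "job" means the
# policy can only be used for job clusters, not all-purpose clusters.
job_only_policy = {
    "cluster_type": {
        "type": "fixed",
        "value": "job",
    }
}


def policy_definition_json(policy: dict) -> str:
    """Serialize a policy definition dict to a JSON string."""
    return json.dumps(policy, indent=2)


print(policy_definition_json(job_only_policy))
```

For the restriction to take effect, the affected users or groups would presumably need permission on this policy only, without a separate entitlement to create unrestricted clusters.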

jose_gonzalez
by Databricks Employee
  • 26031 Views
  • 1 replies
  • 0 kudos

Resolved! What's the difference between mode("append") and mode("overwrite") on my Delta table?

I would like to know the difference between .mode("append") and .mode("overwrite") when writing my Delta table.

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Mode "append" atomically adds new data to an existing Delta table and "overwrite" atomically replaces all of the data in a table.
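A minimal sketch of the two modes with the DataFrame writer follows. It assumes `df` is a Spark DataFrame and `table_path` is the location of an existing Delta table; both helper names are illustrative, not part of any API.

```python
def append_rows(df, table_path: str) -> None:
    """Add the rows of `df` to the Delta table; existing rows are kept."""
    df.write.format("delta").mode("append").save(table_path)


def replace_table_data(df, table_path: str) -> None:
    """Replace all data in the Delta table with the rows of `df`."""
    df.write.format("delta").mode("overwrite").save(table_path)
```

Because both writes are atomic, a reader sees either the table as it was before the write or the table after it, never a partial result.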

jose_gonzalez
by Databricks Employee
  • 2700 Views
  • 1 replies
  • 0 kudos

Resolved! Where does the schema for a Delta table set reside?

I would like to know where I can find the current schema information for my Delta table.

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

The table name, path, and database info are stored in the Hive metastore; the actual schema is stored in the "_delta_log" directory, which should be in the root path where your Delta table is stored.
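To make the "_delta_log" part concrete, here is a small sketch that scans the JSON commit files of a table on a local filesystem and returns the schema from the most recent `metaData` action. It is an illustration under simplifying assumptions: it ignores Parquet checkpoints, so it can miss the schema on tables whose older JSON commits have been cleaned up, and the function name is hypothetical.

```python
import json
import re
from pathlib import Path

# Commit files are named with a zero-padded 20-digit version, e.g.
# 00000000000000000000.json, so sorting by name sorts by version.
COMMIT_FILE = re.compile(r"^\d{20}\.json$")


def latest_schema_from_log(table_root: str):
    """Return the schema (as a dict) from the newest metaData action found
    in the JSON commit files, or None if no such action exists."""
    log_dir = Path(table_root) / "_delta_log"
    schema = None
    commits = sorted(p for p in log_dir.iterdir() if COMMIT_FILE.match(p.name))
    for commit in commits:
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)  # one JSON action per line
            if "metaData" in action:
                # schemaString is itself a JSON-encoded struct type.
                schema = json.loads(action["metaData"]["schemaString"])
    return schema
```

In a notebook, `DESCRIBE TABLE` or `df.printSchema()` is usually the simpler way to see the same information.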

jose_gonzalez
by Databricks Employee
  • 6335 Views
  • 1 replies
  • 0 kudos

Resolved! How can I read a specific Delta table part file?

Is there a way to read a specific part of a Delta table? When I try to read the Parquet file as Parquet, I get an error in the notebook saying that I'm using the incorrect format because it's part of a Delta table. I just want to read a single Parquet file, not ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

To disable the Delta format check and read the file as Parquet, set the following Spark setting to false:
>> SET spark.databricks.delta.formatCheck.enabled=false
OR
>> spark.conf.set("spark.databricks.delta.formatCheck.enabled", "false")
It's not recommended to re...
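Putting the setting and the read together might look like the sketch below. It assumes `spark` is the active SparkSession in a Databricks notebook and that `part_file_path` points at one Parquet file inside the table directory; the helper name is hypothetical.

```python
def read_part_file_as_parquet(spark, part_file_path: str):
    """Turn off the Delta format check for this session, then read a single
    data file directly as Parquet."""
    spark.conf.set("spark.databricks.delta.formatCheck.enabled", "false")
    return spark.read.format("parquet").load(part_file_path)
```

A single part file bypasses the transaction log, so it may contain rows that are no longer part of the current table version.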

jose_gonzalez
by Databricks Employee
  • 1966 Views
  • 1 replies
  • 0 kudos

Resolved! Should I run ANALYZE TABLE on Delta tables?

I would like to know whether it is recommended to run ANALYZE TABLE on Delta tables. If not, why?

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

You can run ANALYZE TABLE on Delta tables only on Databricks Runtime 8.3 and above. For more details, please refer to the docs: https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-aux-analyze-table.html
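For reference, a small helper that builds the statement could look like this. The table and column names passed in are placeholders, and the helper itself is only an illustration; the resulting string would be run with `spark.sql(...)` on a supported runtime.

```python
def analyze_table_sql(table_name: str, columns=None) -> str:
    """Build an ANALYZE TABLE statement, optionally limited to some columns."""
    statement = f"ANALYZE TABLE {table_name} COMPUTE STATISTICS"
    if columns:
        statement += " FOR COLUMNS " + ", ".join(columns)
    return statement


print(analyze_table_sql("events", ["event_date", "user_id"]))
```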

User16753724663
by Valued Contributor
  • 1774 Views
  • 1 replies
  • 1 kudos

Download private repo from GitHub Enterprise in Databricks notebook

We are trying to download our repository, which is hosted on GitHub Enterprise, to use its Python libraries in our notebooks. Earlier we had issues with downloading our repository using the Repos feature in the Databricks platform, since only notebooks can b...

Latest Reply
User16753724663
Valued Contributor
  • 1 kudos

To fix the issue, we need to pass the token in the clone URL itself:
git clone https://<token>:x-oauth-basic@github.com/owner/repo.git
Example:
%sh
git clone https://<token>@github.com/darshanbargal4747/databricks.git
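The same clone can be driven from a Python cell. The sketch below builds the URL in the `<token>:x-oauth-basic@` form shown in the reply and hands it to `git clone`; the function names and the `host` parameter (for a GitHub Enterprise hostname) are assumptions added for illustration.

```python
import subprocess
from urllib.parse import quote


def authenticated_clone_url(token: str, owner: str, repo: str,
                            host: str = "github.com") -> str:
    """Build an HTTPS clone URL that carries the token as the user name."""
    # Percent-encode the token so special characters cannot break the URL.
    return f"https://{quote(token, safe='')}:x-oauth-basic@{host}/{owner}/{repo}.git"


def clone_repo(token: str, owner: str, repo: str, dest: str,
               host: str = "github.com") -> None:
    """Clone the repository into `dest`; raises CalledProcessError on failure."""
    url = authenticated_clone_url(token, owner, repo, host)
    subprocess.run(["git", "clone", url, dest], check=True)
```

A token embedded in the URL is stored in the clone's `.git/config` and can show up in command output, so it is safer to load it from a secret store than to hard-code it in the notebook.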

User16753724663
by Valued Contributor
  • 1092 Views
  • 1 replies
  • 0 kudos

Unable to use on-prem MySQL server as we are not able to resolve the hostname

While connecting from a notebook, it returns the error "unable to resolve name".

Latest Reply
User16753724663
Valued Contributor
  • 0 kudos

Since we are unable to resolve the hostname, this points to a DNS issue. We can configure a custom DNS using an init script and add it to the cluster:
%scala
dbutils.fs.put("/databricks/<directory>/dns-masq.sh",""" #!/bin/bash #####################################...

User16783853906
by Contributor III
  • 904 Views
  • 0 replies
  • 0 kudos

Verify auto-optimize from delta history

How can I verify if auto-optimize is activated from Delta history for the two scenarios below? Will DESC HISTORY show the details in both cases?
1) Auto-optimize set in the table properties
2) Auto-optimize enabled in the Spark session
P.S. - I'm...

User16753724663
by Valued Contributor
  • 1373 Views
  • 1 replies
  • 0 kudos

Resolved! Unable to create a token while deploying the workspace using terraform

We have automated our deployment with Python APIs; however, we have been caught in a situation which we cannot yet solve. We are looking to collect a token during the first deployment within the environment. Currently our API requires a token. Is there...

Latest Reply
User16753724663
Valued Contributor
  • 0 kudos

We can use the API below to create a token, authenticating with the username and password:
curl -X POST -u "admin_email":"xxxx" https://host/api/2.0/token/create -d '{ "lifetime_seconds": 100, "comment": "this is an example token" }'
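Since the question mentions a Python-based deployment, the same call can be sketched with the standard library. The snippet only builds the request (same endpoint, basic auth, and payload as the curl command); `host`, `username`, and `password` are placeholders, and the function name is hypothetical.

```python
import base64
import json
import urllib.request


def build_token_create_request(host: str, username: str, password: str,
                               lifetime_seconds: int = 100,
                               comment: str = "this is an example token"):
    """Build a POST request to /api/2.0/token/create using HTTP basic auth."""
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    body = json.dumps({"lifetime_seconds": lifetime_seconds,
                       "comment": comment}).encode()
    return urllib.request.Request(
        url=f"https://{host}/api/2.0/token/create",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; this only works where username/password authentication is enabled for the workspace.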

User16826992666
by Valued Contributor
  • 8577 Views
  • 1 replies
  • 1 kudos

Resolved! Can you import a Jupyter notebook to a Databricks workspace?

Also curious if you can export a notebook created in Databricks as a Jupyter notebook

Latest Reply
User16826992666
Valued Contributor
  • 1 kudos

Yes, the .ipynb format is a supported file type which can be imported to a Databricks workspace. Note that some special configurations may need to be adjusted to work in the Databricks environment. Additional accepted file formats which can be import...

