Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16826992666
by Databricks Employee
  • 1775 Views
  • 2 replies
  • 0 kudos

Can I query tables I have created in my Databricks workspace using Tableau?

I have created Delta tables in my Databricks workspace and would like to access them from Tableau. Is this possible?

Latest Reply
sajith_appukutt
Databricks Employee
  • 0 kudos

Yeah - here is the link with details on how to integrate with Tableau's different products: https://docs.databricks.com/integrations/bi/tableau.html

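As a rough sketch of what Tableau prompts for when you point it at a Databricks cluster or SQL endpoint (field names and every value below are illustrative placeholders, not authoritative -- the linked docs are the source of truth):

```python
# Hedged sketch: the three values Tableau's built-in Databricks connector
# asks for. All values here are placeholders -- substitute your own
# workspace hostname, HTTP path, and personal access token (PAT).
def tableau_connection_fields(host, http_path, token):
    """Collect the values Tableau prompts for when connecting to Databricks."""
    return {
        "server_hostname": host,      # workspace URL without the https:// prefix
        "http_path": http_path,       # from the cluster/endpoint's JDBC/ODBC connection tab
        "authentication": "Personal Access Token",
        "password": token,            # the PAT itself goes in the password field
    }

fields = tableau_connection_fields(
    "adb-1234567890123456.7.azuredatabricks.net",
    "/sql/1.0/endpoints/abc123",
    "dapiXXXXXXXX",
)
```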
User16826987838
by Databricks Employee
  • 2263 Views
  • 1 reply
  • 0 kudos

Is there a way to change the default cluster setting after a notebook has been created?

When you create a notebook, you are prompted to specify a default cluster that it will connect to. Is there a way to change that setting after the notebook is created?

Latest Reply
Mooune_DBU
Databricks Employee
  • 0 kudos

Yes, of course. Notebooks are not exclusively tied to a specific cluster, so you can pick any available/visible cluster to attach the notebook to when you want to run it. Also, please keep in mind that by doing this half-way through executing a notebook,...

User16826992783
by Databricks Employee
  • 3908 Views
  • 1 reply
  • 0 kudos

Find Databricks SQL endpoints runtime

Is there a way to find out which SQL endpoints are currently running?

Latest Reply
Ryan_Chynoweth
Databricks Employee
  • 0 kudos

In the UI, Databricks lists the running endpoints on top. Programmatically, you can get information about the endpoints using the REST APIs. You will likely need to use a combination of the list endpoint to get all the endpoints, then for each endpoint u...

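A minimal sketch of the list-then-filter approach described above. The endpoint path and response shape are assumptions based on the SQL Endpoints REST API docs; verify them against your workspace's API version before relying on this.

```python
import urllib.request

# Assumed path per the SQL Endpoints REST API docs of the time.
API_PATH = "/api/2.0/sql/endpoints"

def list_endpoints_request(host, token):
    """Build (but do not send) the authenticated GET request for listing endpoints."""
    req = urllib.request.Request(f"https://{host}{API_PATH}")
    req.add_header("Authorization", f"Bearer {token}")
    return req

def running_endpoints(payload):
    """Filter a parsed list response down to the endpoints in RUNNING state."""
    return [e for e in payload.get("endpoints", []) if e.get("state") == "RUNNING"]

# Example against a canned payload (no network call):
sample = {"endpoints": [{"name": "bi", "state": "RUNNING"},
                        {"name": "ad-hoc", "state": "STOPPED"}]}
print(running_endpoints(sample))
```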
craig_ng
by Databricks Employee
  • 1718 Views
  • 1 reply
  • 1 kudos
Latest Reply
craig_ng
Databricks Employee
  • 1 kudos

Yes, you can use the SCIM API integration to provision both users and groups. We have examples for Okta, Azure AD and OneLogin, but any SCIM-enabled IdP should suffice.

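A hedged sketch of what a SCIM user-provisioning request body looks like. The `schemas` URN and `userName` field come from the SCIM 2.0 core spec; the path and the group shape follow the Databricks SCIM API docs, and the email/group values are placeholders.

```python
import json

# Assumed path per the Databricks SCIM API (preview at the time of this thread).
SCIM_USERS_PATH = "/api/2.0/preview/scim/v2/Users"

def scim_user_payload(email, groups=()):
    """Build the JSON body for provisioning one user, optionally into groups."""
    return {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
        "userName": email,
        "groups": [{"value": g} for g in groups],
    }

body = json.dumps(scim_user_payload("jane@example.com", groups=("data-eng",)))
```

An IdP such as Okta or Azure AD sends equivalent payloads automatically once the SCIM integration is configured; building one by hand is mostly useful for scripting or debugging.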
sajith_appukutt
by Databricks Employee
  • 2123 Views
  • 1 reply
  • 0 kudos

Resolved! Can I schedule Databricks pools to have different minimum idle instance counts at different times of the day

I have a few jobs configured to run against a pool at 10 PM every night. After running some tests, I found that increasing the minimum idle instance count improves job latencies. However, it wouldn't be needed to have so many VMs idle at other times...

Latest Reply
Ryan_Chynoweth
Databricks Employee
  • 0 kudos

Yes, you can do so programmatically using the REST APIs. You can edit the settings of a Databricks pool by using the Instance Pool Edit endpoint and provide the minimum idle instance count that you desire. This cannot be done via the web UI.

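A sketch of the edit payload, assuming the field names from the Instance Pools API (`POST /api/2.0/instance-pools/edit`); the pool id, name, and node type below are placeholders. A scheduled job or external cron could send this twice a day with different counts.

```python
# Hedged sketch: field names follow the Instance Pools API edit endpoint;
# all values are placeholders for illustration.
def pool_edit_payload(pool_id, pool_name, node_type_id, min_idle):
    """Build the body for editing a pool's minimum idle instance count."""
    return {
        "instance_pool_id": pool_id,
        "instance_pool_name": pool_name,
        "node_type_id": node_type_id,
        "min_idle_instances": min_idle,
    }

# Warm the pool before the 10 PM jobs, shrink it again afterwards.
night = pool_edit_payload("pool-123", "etl-pool", "i3.xlarge", min_idle=20)
day = pool_edit_payload("pool-123", "etl-pool", "i3.xlarge", min_idle=2)
```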
User16826992666
by Databricks Employee
  • 2621 Views
  • 1 reply
  • 0 kudos

Resolved! When should I turn on multi-cluster load balancing on SQL Endpoints?

I see the option to enable multi-cluster load balancing when creating a SQL Endpoint, but I don't know if I should be using it or not. How do I know when I should enable it?

Latest Reply
Ryan_Chynoweth
Databricks Employee
  • 0 kudos

It is best to enable multi-cluster load balancing on SQL endpoints when a lot of users will be running queries concurrently. Load balancing helps isolate the queries and ensure the best performance for all users. If you only have a few users runnin...

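For concreteness, a sketch of the scaling settings involved. The `min_num_clusters`/`max_num_clusters` field names are assumptions based on the SQL Endpoints API of the time; setting the maximum above 1 is what turns on multi-cluster load balancing.

```python
# Hedged sketch: field names assumed from the SQL Endpoints API;
# the endpoint name is a placeholder.
def endpoint_scaling_payload(name, min_clusters=1, max_clusters=1):
    """Build the cluster-count settings for a SQL endpoint."""
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    return {
        "name": name,
        "min_num_clusters": min_clusters,
        "max_num_clusters": max_clusters,  # > 1 enables load balancing
    }

# Scale out to three clusters during periods of high concurrency.
payload = endpoint_scaling_payload("bi-endpoint", min_clusters=1, max_clusters=3)
```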
User16856693631
by Databricks Employee
  • 8191 Views
  • 1 reply
  • 0 kudos
Latest Reply
User16856693631
Databricks Employee
  • 0 kudos

Yes you can. Databricks maintains a history of your job runs for up to 60 days. If you need to preserve job runs, Databricks recommends that you export results before they expire. For more information, see https://docs.databricks.com/jobs.html#export...

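A sketch of the export call mentioned in the reply, assuming the `GET /api/2.0/jobs/runs/export` shape from the Jobs API docs of the time; the hostname and run id are placeholders.

```python
from urllib.parse import urlencode

# Hedged sketch: path and query parameters assumed from the Jobs API
# runs/export endpoint; verify against your workspace's API version.
def export_run_url(host, run_id, views="ALL"):
    """Build the URL for exporting one job run before it ages out of history."""
    qs = urlencode({"run_id": run_id, "views_to_export": views})
    return f"https://{host}/api/2.0/jobs/runs/export?{qs}"

url = export_run_url("example.cloud.databricks.com", 42)
```

Looping this over the run ids from the runs/list endpoint is one way to archive everything before the 60-day window closes.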
User16826992666
by Databricks Employee
  • 2257 Views
  • 1 reply
  • 0 kudos

Resolved! How much space does the metadata for a Delta table take up?

If you have a lot of transactions in a table it seems like the Delta log keeping track of all those transactions would get pretty large. Does the size of the metadata become a problem over time?

Latest Reply
Ryan_Chynoweth
Databricks Employee
  • 0 kudos

Yes, the size of the metadata can become a problem over time, though due to storage costs rather than performance. Delta performance will not degrade due to the size of the metadata, but your cloud storage bill can increase. By default Delta h...

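If log growth becomes a cost concern, the documented `delta.logRetentionDuration` table property controls how long transaction-log entries are kept (default 30 days). A minimal sketch building the statement; the table name is a placeholder.

```python
# Hedged sketch: delta.logRetentionDuration is the documented Delta table
# property for log retention; "events" is a placeholder table name.
def set_log_retention_sql(table, duration="interval 30 days"):
    """Build the ALTER TABLE statement shortening or lengthening log retention."""
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        f"('delta.logRetentionDuration' = '{duration}')"
    )

stmt = set_log_retention_sql("events", "interval 7 days")
# On Databricks you would run this via spark.sql(stmt).
```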
Anonymous
by Not applicable
  • 1740 Views
  • 1 reply
  • 0 kudos

Resolved! Delta Sharing internally?

If we don't have any datasets to be shared with external companies, does that mean Delta Sharing is not valid for our org? Is there any use case to use it internally?

Latest Reply
Ryan_Chynoweth
Databricks Employee
  • 0 kudos

Delta sharing can be done externally and internally. One use case for sharing internally would be if two separate business units would like to share data with each other without exposing their Lakehouse to the other unit.

User16830818524
by Databricks Employee
  • 1725 Views
  • 1 reply
  • 0 kudos

Is it possible to read a Delta table directly using Koalas?

Can I read a Delta table directly using Koalas or do I need to read using Spark and then convert the Spark dataframe to a Koalas dataframe?

Latest Reply
Ryan_Chynoweth
Databricks Employee
  • 0 kudos

Yes, you can use the "read_delta" function (see the Koalas documentation).

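A minimal sketch of the call, wrapped in a function so the snippet stays importable without pyspark/koalas installed; the path argument is a placeholder.

```python
def read_delta_as_koalas(path):
    """Read a Delta table straight into a Koalas DataFrame.

    Requires a Databricks runtime (or pyspark plus the koalas package);
    the import sits inside the function so this sketch loads without
    those dependencies. On newer runtimes, pyspark.pandas supersedes koalas.
    """
    import databricks.koalas as ks
    return ks.read_delta(path)  # e.g. path = "/delta/events"
```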
sajith_appukutt
by Databricks Employee
  • 2164 Views
  • 1 reply
  • 2 kudos

Resolved! Unable to get mlflow central model registry to work with dbconnect.

I'm working on setting up tooling to allow team members to easily register and load models from a central mlflow model registry via dbconnect. However, after following the instructions in the public docs, I'm hitting this error: raise _NoDbutilsError mlfl...

Latest Reply
sajith_appukutt
Databricks Employee
  • 2 kudos

You could monkey-patch MLflow's _get_dbutils() with something similar to this to get it working while connecting from dbconnect: spark = SparkSession.builder.getOrCreate() # monkey-patch MLFlow's _get_dbutils() def _get_dbutils(): return DBUtils(...

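A hedged sketch of what the truncated monkey-patch above could look like in full. The `mlflow.utils.databricks_utils` module path matches MLflow versions of that era; check yours before relying on it.

```python
def patch_mlflow_dbutils():
    """Monkey-patch MLflow's _get_dbutils so model-registry calls can
    resolve dbutils when running over databricks-connect.

    Imports live inside the function so this sketch loads without
    pyspark/mlflow installed.
    """
    from pyspark.sql import SparkSession
    from pyspark.dbutils import DBUtils
    import mlflow.utils.databricks_utils as databricks_utils

    spark = SparkSession.builder.getOrCreate()
    # Replace the lookup MLflow uses internally with one backed by this session.
    databricks_utils._get_dbutils = lambda: DBUtils(spark)
```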
aladda
by Databricks Employee
  • 2195 Views
  • 1 reply
  • 0 kudos
Latest Reply
Ryan_Chynoweth
Databricks Employee
  • 0 kudos

Generally, interactive clusters and jobs are better suited for data engineering and transformations, as they support more than just SQL. However, if you are using pure SQL, then endpoints can be used for data transformations. All of the Spark SQL fun...

aladda
by Databricks Employee
  • 1613 Views
  • 1 reply
  • 0 kudos

Resolved! Does the Jobs API allow executing an older version of a Notebook using version history?

I see the revision_timestamp parameter on NotebookTask: https://docs.databricks.com/dev-tools/api/latest/jobs.html#jobsnotebooktask. An example of how to invoke it would be helpful.

Latest Reply
aladda
Databricks Employee
  • 0 kudos

You can use the Databricks built-in version control feature, coupled with the NotebookTask Jobs API, to specify a specific version of the notebook based on the timestamp of the save, defined in unix timestamp format: curl -n -X POST -H 'Content-Type: app...

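To complement the truncated curl above, a sketch of the request body itself. The `notebook_task.revision_timestamp` field is the documented way to pin a saved revision; the cluster id, notebook path, and timestamp values are placeholders.

```python
import json

# Hedged sketch: body shape targets a Jobs API run submission carrying a
# NotebookTask; every value here is a placeholder.
def notebook_run_body(notebook_path, revision_timestamp, cluster_id):
    """Build a run body that pins a specific saved notebook revision."""
    return {
        "run_name": "pinned-revision-run",
        "existing_cluster_id": cluster_id,
        "notebook_task": {
            "notebook_path": notebook_path,
            "revision_timestamp": revision_timestamp,  # unix seconds of the saved revision
        },
    }

body = json.dumps(notebook_run_body("/Users/jane@example.com/etl", 1625060460, "0923-abc123"))
```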
