Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

nskiran
by New Contributor III
  • 1738 Views
  • 3 replies
  • 0 kudos

How to bring in databricks dbacademy courseware

I have created an account on dbacademy and signed up for the Advanced Data Engineering with Databricks course. I have also subscribed to the Vocareum lab. During the demo, the tutor/trainer opened 'ADE 1.1 - Follow Along Demo - Reading from a Streaming ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

So, it appears that we no longer make the notebooks available with self-paced training.  They are not available for download.

  • 0 kudos
2 More Replies
jiteshraut20
by New Contributor III
  • 3028 Views
  • 2 replies
  • 0 kudos

Deploying Overwatch on Databricks (AWS) with System Tables as the Data Source

Introduction: Overwatch is a powerful tool for monitoring and analyzing your Databricks environment, providing insights into resource utilization, cost management, and system performance. By leveraging system tables as the data source, you can gain a c...

Latest Reply
raghu2
Databricks Partner
  • 0 kudos

Hi @jiteshraut20, thanks for your post. In my setup, validation seems to work. Wrote 32 bytes. Validation report has been saved to dbfs:/mnt/overwatch_global/multi_ow_dep/report/validationReport Validation report details Total validation count: 35 ...

  • 0 kudos
1 More Replies
johnnwanosike
by New Contributor III
  • 2493 Views
  • 6 replies
  • 0 kudos

Hive metastore federation, internal and external unable to connect

I enabled the internal Hive metastore for metastore federation using this query: CREATE CONNECTION IF NOT EXISTS internal-hive TYPE hive_metastore OPTIONS (builtin true); But I can't get a username or password to access the JDBC URL.

Latest Reply
johnnwanosike
New Contributor III
  • 0 kudos

Not really. What I want to achieve is connecting to an external Hive, and I want to configure the external Hive on our server so that it can interact with the Databricks cluster and give me access to the Thrift protocol.

  • 0 kudos
5 More Replies
_deepak_
by New Contributor II
  • 4140 Views
  • 4 replies
  • 0 kudos

Databricks regression test suite

Hi, I am new to Databricks and am setting up a non-prod environment. I wanted to know: is there any way I can run a regression suite so that the existing setup does not break when a feature is added? Also, how can I make available ...

Latest Reply
grkseo7
New Contributor II
  • 0 kudos

Regression testing after code changes can be automated easily. Once you’ve created test cases with Pytest or Great Expectations, you can set up a CI/CD pipeline using tools like Jenkins or GitHub Actions. For a non-prod setup, Docker is great for rep...
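As a rough illustration of the kind of Pytest test case mentioned in this reply, here is a minimal sketch. The function `normalize_amounts` and the baseline values are hypothetical stand-ins for real pipeline logic and a known-good result. Nothing here comes from the original thread.

```python
# Hypothetical transformation under test; it stands in for real pipeline logic.
def normalize_amounts(rows):
    """Convert integer cent amounts into decimal currency amounts."""
    return [
        {"id": row["id"], "amount": round(row["amount_cents"] / 100, 2)}
        for row in rows
    ]


# Illustrative baseline, as if captured from a known-good run.
BASELINE_INPUT = [
    {"id": 1, "amount_cents": 1050},
    {"id": 2, "amount_cents": 99},
]
BASELINE_OUTPUT = [
    {"id": 1, "amount": 10.5},
    {"id": 2, "amount": 0.99},
]


def test_normalize_amounts_matches_baseline():
    # Pytest collects functions named test_* and reports any failing assert.
    assert normalize_amounts(BASELINE_INPUT) == BASELINE_OUTPUT
```

Saved in a file such as `test_regression.py` (an assumed name), this would be picked up by running `pytest` in a CI job, so that a change altering the baseline output fails the build.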

  • 0 kudos
3 More Replies
hari-prasad
by Valued Contributor II
  • 1220 Views
  • 3 replies
  • 1 kudos

Optimize Cluster Uptime by Avoiding Unwanted Library or Jar Installations

Whenever we discuss clusters or nodes in any service, we need to address the cluster bootstrap process. Traditionally, this involves configuring each node using a startup script (startup.sh). In this context, installing libraries in the cluster is par...

Labels: Data Engineering, cluster, job, jobs, Nodes
Latest Reply
hari-prasad
Valued Contributor II
  • 1 kudos

I'm sharing my experience here. Thank you for following up!

  • 1 kudos
2 More Replies
korijn
by New Contributor II
  • 1198 Views
  • 1 reply
  • 0 kudos

How to set environment (client) on notebook via API/Terraform provider?

I am deploying a job with a notebook task via the Terraform provider. I want to set the client version to 2. I do NOT need to install any dependencies. I just want to use the new client version for the serverless compute. How do I do this with the Te...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Unfortunately, there is no direct way to set the client version for a notebook task via the Terraform provider or the API without using the UI. The error message suggests that the %pip magic command is the recommended approach for installing dependen...

  • 0 kudos
Binnisb
by Databricks Employee
  • 1289 Views
  • 4 replies
  • 2 kudos

model_serving_endpoints in DAB updates every time

Love the model_serving_endpoints in the DAB, but now it takes over 6 minutes to deploy resources when they already exist. It says (updating) in the serving tab in the sidebar even if nothing has changed. Is there a way to not update the endpoints aft...

Latest Reply
Alberto_Umana
Databricks Employee
  • 2 kudos

I have created an internal feature request for this behavior: DB-I-13108

  • 2 kudos
3 More Replies
hari-prasad
by Valued Contributor II
  • 1964 Views
  • 0 replies
  • 2 kudos

Databricks UniForm - Bridging Delta Lake and Iceberg

Databricks UniForm enables seamless integration between the Delta Lake and Iceberg formats. Key features of Databricks UniForm include: Interoperability: Read Delta tables with Iceberg clients without rewriting data. Automatic Metadata Generation: Asynchrono...

martindlarsson
by New Contributor III
  • 5007 Views
  • 2 replies
  • 0 kudos

Jobs indefinitely pending with libraries install

I think I found a bug where jobs stay Pending indefinitely when the job has a library requirement and the job's user does not have Manage permission on the cluster. In my case, I was trying to start a dbt job with dbt-databricks=1.8.5 as a library. Th...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Thanks for your feedback! Just checking: is this still an issue for you? Could you share more details, for example the steps I would need to reproduce this?

  • 0 kudos
1 More Replies
ls
by New Contributor III
  • 9827 Views
  • 10 replies
  • 0 kudos

Resolved! Py4JJavaError: An error occurred while calling o552.count()

Hey! I'm new to the forums but not to Databricks, and I'm trying to get some help with this question: The error is also fickle, since it appears seemingly at random. For example, the same code works on one run and then, on the next run with a new set of dat...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@ls Agree, it doesn't seem to be fixed. Maybe on DBR 16 memory management is better optimized, hence I'd like to suggest going through the methods mentioned earlier in this post: Memory Profiling: Try freezing the dataset that reproduces the problem...

  • 0 kudos
9 More Replies
Maatari
by New Contributor III
  • 1887 Views
  • 1 reply
  • 0 kudos

How does Dedicated Access mode work?

I have a question about the dedicated access mode: https://docs.databricks.com/en/compute/group-access.html It is stated that: "Dedicated access mode is the latest version of single user access mode. With dedicated access, a compute resource can be assig...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The dedicated access mode allows multiple users within the assigned group to access and use the compute resource simultaneously. This is different from the traditional single-user access mode, as it enables secure sharing of the resource among group ...

  • 0 kudos
ashraf1395
by Honored Contributor
  • 3464 Views
  • 1 reply
  • 1 kudos

Resolved! referencing external locations in python notebooks

How can I reference external locations in a Python notebook? I found the docs for referencing them in SQL: https://docs.databricks.com/en/sql/language-manual/sql-ref-external-locations.html. But I can't work out how to do it in Python. Do we ...

Latest Reply
fmadeiro
Contributor II
  • 1 kudos

@ashraf1395, referencing external locations in a Databricks Python notebook, particularly for environments like Azure DevOps with different paths for development (dev) and production (prod), can be effectively managed using parameterized variables. H...
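The parameterized-variable idea in this reply could be sketched roughly as follows. This is an assumption-laden illustration, not the reply's actual code: the storage account names, container names, and the `resolve_path` helper are all hypothetical.

```python
# Hypothetical mapping from environment name to an external location root.
# The container and storage account names are placeholders.
EXTERNAL_LOCATIONS = {
    "dev": "abfss://dev-container@devstorage.dfs.core.windows.net/data",
    "prod": "abfss://prod-container@prodstorage.dfs.core.windows.net/data",
}


def resolve_path(env: str, relative_path: str) -> str:
    """Build a full path under the external location for the given environment."""
    if env not in EXTERNAL_LOCATIONS:
        raise ValueError(f"Unknown environment: {env!r}")
    root = EXTERNAL_LOCATIONS[env].rstrip("/")
    return f"{root}/{relative_path.lstrip('/')}"


# Example: the same notebook code yields a different path per environment.
orders_path = resolve_path("dev", "sales/orders")
```

In a notebook, the `env` value might come from a job parameter or a widget (for example via `dbutils.widgets.get`), and the resolved path would then be passed to a reader such as `spark.read.load`. That wiring is left out here so the sketch runs without a cluster.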

  • 1 kudos
Avinash_Narala
by Databricks Partner
  • 916 Views
  • 1 reply
  • 1 kudos

Resolved! which type of cluster to use

Hi, recently I had some logic that collects the DataFrame and processes it row by row. I am using a 128 GB driver node, but it is taking a very long time (about 2 hours for just 700 rows of data). May I know which type of cluster I should use and the driver...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 1 kudos

Hi @Avinash_Narala, good day! For right-sizing the cluster, the recommended approach is a hybrid approach for node provisioning in the cluster along with autoscaling. This involves defining the number of on-demand instances and spot instances for t...

  • 1 kudos
Michael_Galli
by Databricks Partner
  • 4322 Views
  • 5 replies
  • 1 kudos

Resolved! Importing data into Excel from Databricks over ODBC OAuth / Simba Spark Driver

Hi all, I am referring to this article: Connect to Azure Databricks from Microsoft Excel - Azure Databricks | Microsoft Learn. I use the latest SimbaSparkODBC-2.8.2.1013-Windows-64bit driver and configured it as described in that documentation. In Databricks I use ...

Latest Reply
Aydin
New Contributor II
  • 1 kudos

Hi @Michael_Galli, we're currently experiencing the same issue. I've just asked our internal support team to raise a ticket with Microsoft but thought it would be worth reaching out to you. Have you had any luck resolving this issue?

  • 1 kudos
4 More Replies
sgannavaram
by New Contributor III
  • 4586 Views
  • 3 replies
  • 1 kudos

How to connect to IBM MQ from Databricks notebook?

We are trying to connect to IBM MQ and post messages to MQ, which are eventually consumed by a mainframe application. Which IBM MQ client .jars / libraries should be installed on the cluster? If you have any sample code for connectivity, that would be helpful.

Latest Reply
none_ranjeet
New Contributor III
  • 1 kudos

Were you able to make this connection by any means other than the REST API, which has problems reading binary messages? Please suggest.

  • 1 kudos
2 More Replies