Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

ADBQueries
by New Contributor
  • 2778 Views
  • 1 reply
  • 0 kudos

DBeaver Connection to SQL Warehouse in Databricks

I'm trying to connect to a SQL warehouse in Azure Databricks with the DBeaver application. I'm creating a JDBC connection string as described here: https://docs.databricks.com/en/integrations/jdbc/authentication.html Here is a sample connection link I have c...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @ADBQueries, good day! Could you please try running the code again to generate another access token and, once generated, check it on this page, https://jwt.ms, to confirm that the token has not expired? Also, if not done yet, please review the f...
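For readers hitting the same issue, a JDBC URL for a Databricks SQL warehouse with personal-access-token authentication generally follows the shape below; the hostname, HTTP path, and token are placeholders to fill in from the warehouse's Connection details tab:

```
jdbc:databricks://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=<warehouse-http-path>;AuthMech=3;UID=token;PWD=<personal-access-token>
```

`AuthMech=3` with `UID=token` tells the driver to treat `PWD` as a personal access token, so an expired token in `PWD` makes DBeaver fail to connect even when the URL is otherwise correct.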

acagatayyilmaz
by New Contributor
  • 2600 Views
  • 1 reply
  • 0 kudos

How to find consumed DBU

Hi All, I'm trying to understand my Databricks consumption in order to purchase a reservation. However, I couldn't find the consumed DBU in either the Azure Portal or the Databricks workspace. I'm also exporting and processing Azure Cost data daily. When I check the reso...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @acagatayyilmaz, hope you are doing well! You can refer to the billable usage system table to find the records of consumed DBU. You can go through the below document to understand more about the system tables: https://learn.microsoft.com/en-us/...
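As a concrete starting point, a query against the billable usage system table might look like the sketch below. This assumes a Unity Catalog-enabled workspace with access granted to the `system` catalog, and it runs in a Databricks notebook where `spark` is predefined; verify the column names against the system-tables documentation for your cloud.

```python
# Sketch: aggregate consumed DBUs per day and SKU from the billable usage
# system table. Assumes access to the `system` catalog in a Unity Catalog
# workspace; runs in a Databricks notebook where `spark` is predefined.
daily_dbus = spark.sql("""
    SELECT usage_date, sku_name, SUM(usage_quantity) AS dbus
    FROM system.billing.usage
    WHERE usage_unit = 'DBU'
    GROUP BY usage_date, sku_name
    ORDER BY usage_date
""")
display(daily_dbus)
```

Summing `usage_quantity` per SKU over a representative month is usually enough to size a DBU reservation.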

vanepet
by New Contributor II
  • 18281 Views
  • 5 replies
  • 2 kudos

Is it possible to use multiprocessing or threads to submit multiple queries to a database from Databricks in parallel?

We are trying to improve our overall runtime by running queries in parallel using either multiprocessing or threads. What I am seeing, though, is that when the function that runs this code is run on a separate process it doesn't return a DataFrame with...

Latest Reply
BapsDBS
New Contributor II
  • 2 kudos

Thanks for the links mentioned above. But both of them use raw Python to achieve parallelism. Does this mean Spark (read: PySpark) makes no provision for parallel execution of functions or even notebooks? We used a wrapper notebook with ThreadP...

4 More Replies
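Since several replies touch on this, a minimal thread-based sketch may help. A `ThreadPoolExecutor` on the driver is the usual approach: each thread can submit an independent Spark action, whereas `multiprocessing` cannot ship the (non-picklable) SparkSession to child processes. The `run_query` function below is a local stand-in for a call such as `spark.sql(q).collect()`:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a Spark call like spark.sql(q).collect(); in a notebook,
# each thread would submit an independent Spark job from the driver.
def run_query(q):
    return f"result of {q}"

queries = ["SELECT 1", "SELECT 2", "SELECT 3"]

# Threads (not multiprocessing) are the usual choice on the driver:
# the SparkSession is not picklable, so child processes cannot reuse it.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_query, queries))

print(results)
```

Each submitted query still runs as a distributed Spark job; the threads only overlap the driver-side waiting, which is why this pattern helps when many small, independent queries are issued sequentially.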
RIDBX
by New Contributor II
  • 2263 Views
  • 1 replies
  • 0 kudos

What is the best way to handle a huge gzipped file dropped to S3?

What is the best way to handle a huge gzipped file dropped to S3? I find some interesting suggestions for posted questions. Thanks for reviewing my threads. Here is the situation we have. We are getting dat...

Data Engineering
bulkload
S3
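One detail worth noting for this thread: gzip is not a splittable codec, so Spark reads a single huge `.gz` file on one core. A common pattern is to stream-decompress once and rewrite the data in a splittable format (many smaller files, or Parquet/Delta). The in-memory sketch below shows chunked decompression using only the standard library; the payload and chunk size are illustrative:

```python
import gzip
import io

# gzip is not splittable: Spark assigns one whole .gz file to one core.
# A common workaround is to stream-decompress once and rewrite the data
# in a splittable format (e.g. many smaller files, or Parquet/Delta).
def decompress_in_chunks(gz_bytes, chunk_size=64 * 1024):
    """Yield decompressed chunks without holding the whole file in memory."""
    with gzip.open(io.BytesIO(gz_bytes), "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

payload = b"line1\nline2\n" * 1000
gz = gzip.compress(payload)
restored = b"".join(decompress_in_chunks(gz))
assert restored == payload
```

On S3 the same idea applies with a streaming download instead of `io.BytesIO`: decompress once on a single worker, write out splittable files, then let the cluster parallelize from there.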
zerodarkzone
by New Contributor III
  • 1814 Views
  • 1 reply
  • 1 kudos

Cannot create VNET peering on Azure Databricks

Hi, I'm trying to create a VNET peering to SAP HANA using the default VNET created by Databricks, but it is not possible. I'm getting the following error (translated from Spanish): Could not add virtual network peering "PeeringSAP" to "workers-vnet". Error: The c...

Data Engineering
Azure Databricks
peering
vnet
jx1226
by New Contributor III
  • 2816 Views
  • 2 replies
  • 0 kudos

Connect Workspace EnableNoPublicIP=No and VnetInject=No to storage account with Private Endpoint.

We know that Databricks with VNET injection (our own VNET) allows us to connect to blob storage / ADLS Gen2 over private endpoints and peering. This is what we typically do. We have a client who created Databricks with EnableNoPublicIP=No (secure clust...

Latest Reply
User16539034020
Databricks Employee
  • 0 kudos

Hello, thanks for contacting Databricks Support. You need to enable EnableNoPublicIP; otherwise, you will get the error message "cannot be deployed on subnet containing Basic SKU Public IP addresses or Basic SKU Load Balancer. NIC", it was usually t...

1 More Replies
User16752240150
by New Contributor II
  • 1440 Views
  • 1 reply
  • 1 kudos
Latest Reply
holly
Databricks Employee
  • 1 kudos

Hi there! Appreciate this reply is 3 years later than it was originally asked, but people might be coming across it still. A few things: Koalas was deprecated in Spark 3.2 (runtime 10.4). Instead, the recommendation is to use pandas on Spark with `im...

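To make the migration concrete, a minimal sketch of the pandas API on Spark (the Koalas replacement) is below. It assumes Databricks Runtime 10.4+ or any environment with `pyspark` installed; the DataFrame contents are illustrative:

```python
# Sketch: pandas API on Spark, the replacement for the deprecated Koalas.
# Requires pyspark (DBR 10.4+ ships it); contents are illustrative.
import pyspark.pandas as ps

psdf = ps.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})
print(psdf.mean())      # pandas-style API, executed by Spark
sdf = psdf.to_spark()   # interoperate with regular Spark DataFrames
```

Existing Koalas code usually ports by swapping `import databricks.koalas as ks` for `import pyspark.pandas as ps`, since the API surface was carried over largely intact.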
manish1987c
by New Contributor III
  • 1855 Views
  • 1 reply
  • 0 kudos

Delta Live Tables expectations

I am able to use the expectations feature in Delta Live Tables by creating the expectations as below:

checks = {}
checks["validate circuitId col for null values"] = "(circuitId IS not NULL)"
checks["validate name col for null values"] = "(name IS not ...

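A dictionary of checks like the one in the post can be applied in one shot with `@dlt.expect_all` (or `expect_all_or_drop` / `expect_all_or_fail` to change the enforcement action). The sketch below runs only inside a Delta Live Tables pipeline; the table and source names are illustrative:

```python
# Sketch: applying a dictionary of expectations in a DLT pipeline.
# Runs only inside a Delta Live Tables pipeline; the table and source
# names below are illustrative.
import dlt

checks = {}
checks["validate circuitId col for null values"] = "(circuitId IS NOT NULL)"
checks["validate name col for null values"] = "(name IS NOT NULL)"

@dlt.table(name="circuits_validated")
@dlt.expect_all(checks)          # record violations in pipeline metrics
def circuits_validated():
    return spark.read.table("raw_circuits")
```

`expect_all` only records violations in the pipeline's data quality metrics; switch to `expect_all_or_drop` to filter failing rows, or `expect_all_or_fail` to stop the update.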
Spenyo
by New Contributor II
  • 1271 Views
  • 0 replies
  • 1 kudos

Delta table size not shrinking after Vacuum

Hi team. Every day we overwrite the last X months of data in our tables, so a large amount of history accumulates daily. We don't use time travel, so we don't need it. What we have done:

SET spark.databricks.delta.retentionDurationCheck.enabled = false
ALT...

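For anyone with the same symptom: `VACUUM` only deletes data files older than the retention window, so the table directory shrinks only after that window has elapsed and the command is run again. A sketch of the usual sequence is below, for a Databricks notebook where `spark` is predefined; the table name and intervals are illustrative, and disabling the retention check is safe only if nothing relies on time travel or concurrent reads of old files:

```python
# Sketch for a Databricks notebook (`spark` predefined); table name and
# intervals are illustrative. VACUUM removes only files older than the
# retention window, so disk usage drops once that window has elapsed.
spark.sql("SET spark.databricks.delta.retentionDurationCheck.enabled = false")
spark.sql("""
    ALTER TABLE my_table SET TBLPROPERTIES (
        'delta.deletedFileRetentionDuration' = 'interval 1 days',
        'delta.logRetentionDuration'         = 'interval 1 days')
""")
spark.sql("VACUUM my_table RETAIN 24 HOURS")
```

If the directory still looks large right after running this, re-check the file timestamps: files written by today's overwrite are not yet older than the retention interval and will only be removed by a later `VACUUM`.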
Gilg
by Contributor II
  • 1126 Views
  • 0 replies
  • 0 kudos

Best Practices Near Real-time Processing

Hi All, We are ingesting 1000 files in JSON format and of different sizes per minute. DLT is in continuous mode. The workspace is Unity Catalog enabled. We are using the default setting of Autoloader (Directory Listing) and Silver has CDC as well. We aim to ...

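At roughly 1000 files per minute, repeatedly listing the landing directory tends to become the bottleneck; Auto Loader's file notification mode (`cloudFiles.useNotifications`) is generally recommended at this volume. A minimal sketch, with an illustrative path, for a Databricks notebook:

```python
# Sketch (path illustrative): Auto Loader in file notification mode,
# which avoids re-listing a hot directory at high file-arrival rates.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.useNotifications", "true")
      .load("/mnt/landing/"))
```

Notification mode subscribes to the cloud storage event queue instead of scanning the directory, so discovery latency stays flat as the folder grows.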
RajeshRK
by Contributor II
  • 8543 Views
  • 3 replies
  • 0 kudos

Need help analyzing Databricks logs for a long-running job.

Hi Team, We have a job that completes in 3 minutes on one Databricks cluster; if we run the same job on another Databricks cluster, it takes 3 hours to complete. I am quite new to Databricks and need your guidance on how to find out where Databricks s...

Latest Reply
AmitKP
New Contributor II
  • 0 kudos

Hi @Retired_mod, I am saving logs of my Databricks job compute from ADF. How can I open those files present in the DBFS location?

2 More Replies
Gilg
by Contributor II
  • 2506 Views
  • 1 reply
  • 0 kudos

Move files

Hi, I am using DLT with Autoloader. The DLT pipeline is running in Continuous mode. Autoloader is in Directory Listing mode (the default). Question: I want to move files that have been processed by the DLT to another folder (archived) and am planning to have another no...

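One caution before moving files: Auto Loader tracks processed files in its checkpoint by path, so relocating files out from under a continuous pipeline is risky. A common compromise is a separate, scheduled notebook that archives only files older than a safe cutoff. A sketch using `dbutils` (available in Databricks notebooks; the paths and one-hour cutoff are illustrative):

```python
# Sketch for a separate scheduled notebook (paths and cutoff illustrative).
# Only touch files old enough that the continuous DLT pipeline has
# certainly processed them.
import time

cutoff_ms = (time.time() - 3600) * 1000   # older than one hour
for f in dbutils.fs.ls("/mnt/landing/"):
    # FileInfo.modificationTime (ms) is available on recent DBR versions
    if f.modificationTime < cutoff_ms:
        dbutils.fs.mv(f.path, "/mnt/archived/" + f.name)
```

Keeping the archive step outside the DLT pipeline means the pipeline's own dependency graph stays purely about data, and a mis-timed move can never starve a running update.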
