cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

de-hru
by New Contributor III
  • 31754 Views
  • 4 replies
  • 1 kudos

Resolved! How to add pre-commit hook to the Git Client on Databricks Cluster?

I'd like to add a Git pre-commit hook to the Databricks Cluster.This pre-commit hook should be executed when pushing to GitHub.Why would I need a pre-commit hook on a Databricks Cluster?My goal is to run blackbricks and format all notebooks automatic...

  • 31754 Views
  • 4 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Dejan Hrubenja​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Tha...

  • 1 kudos
3 More Replies
sanjay
by Valued Contributor II
  • 8252 Views
  • 0 replies
  • 0 kudos

autoloader with real time and batch processing concurrently

Hi,I have data pipeline which is running continuously, processes the micro batch data and store data in delta lake. This is taking care of any new data.But at times, I need to process historical data without disturbing real time data processing.Is th...

  • 8252 Views
  • 0 replies
  • 0 kudos
ahs
by New Contributor
  • 2699 Views
  • 2 replies
  • 1 kudos

DNS resolution on E2 workspaces?

I am trying to find documents/flows that show Databricks' network setup for e2 workspaces. More specifically, I'm interested in how DNS is resolved on AWS. All the pages I could find were regarding using route53 and privatelink for custom dns. But pl...

  • 2699 Views
  • 2 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @A H​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thanks!

  • 1 kudos
1 More Replies
Aj2
by Databricks Partner
  • 16847 Views
  • 4 replies
  • 1 kudos

Resolved! How to connect to DB2-AS400?

What are the steps needed to connect to a DB2-AS400 source to pull data to lake using Databricks? I believe it requires establishing a jdbc connection, but I couldnot find much details online

  • 16847 Views
  • 4 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Ajay Menon​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thanks!

  • 1 kudos
3 More Replies
vichus1995
by New Contributor
  • 7747 Views
  • 2 replies
  • 0 kudos

Mounted Azure Storage shows mount.err inside folder while reading from Azure Databricks

I'm using Azure Databricks notebook to read a excel file from a folder inside a mounted Azure blob storage. The mounted excel location is like : "/mnt/2023-project/dashboard/ext/Marks.xlsx". 2023-project is the mount point and dashboard is the name o...

  • 7747 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @vichus1995​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thanks!

  • 0 kudos
1 More Replies
Jits
by New Contributor II
  • 2734 Views
  • 2 replies
  • 3 kudos

Getting Error when Inserting data into table with the column as bigint

Hi All,I am creating table using Databricks SQL editor. The table definition isDROP TABLE IF EXISTS [database].***_test;CREATE TABLE [database].***_jitu_test(  id bigint)USING deltaLOCATION 'test/raw/***_jitu_test'TBLPROPERTIES ('delta.minReaderVersi...

  • 2734 Views
  • 2 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @jitendra goswami​ We haven't heard from you since the last response from @Werner Stinckens​ r​, and I was checking back to see if her suggestions helped you.Or else, If you have any solution, please share it with the community, as it can be helpf...

  • 3 kudos
1 More Replies
jhgorse
by New Contributor III
  • 3756 Views
  • 2 replies
  • 2 kudos

Resolved! autoloader break on migration from community to trial premium with s3 mount

in dbx community edition, the autoloader works using the s3 mount. s3 mount, autoloader:dbutils.fs.mount(f"s3a://{access_key}:{encoded_secret_key}@{aws_bucket_name}", f"/mnt/{mount_name}from pyspark.sql import SparkSession from pyspark.sql.functions ...

  • 3756 Views
  • 2 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Joe Gorse​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...

  • 2 kudos
1 More Replies
signo
by New Contributor II
  • 10050 Views
  • 2 replies
  • 2 kudos

Delta lake schema enforcement allows datatype mismatch on write using MERGE-operation [python]

Databricks Runtime: 12.2 LTS, Spark: 3.3.2, Delta Lake: 2.2.0A target table with schema ([c1: integer, c2: integer]), allows us to write into target table using data with schema ([c1: integer, c2: double]). I expected it to throw an exception (same a...

  • 10050 Views
  • 2 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Sigrun Nordli​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers...

  • 2 kudos
1 More Replies
kll
by New Contributor III
  • 13326 Views
  • 2 replies
  • 3 kudos

Nested struct type not supported pyspark error

I am attempting to apply a function to a pyspark DataFrame and save the API response to a new column and then parse using `json_normalize`. This works fine in pandas, however, I run into an exception with `pyspark`.  import pyspark.pandas as ps   i...

  • 13326 Views
  • 2 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Keval Shah​ Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

  • 3 kudos
1 More Replies
Divya_Bhadauria
by New Contributor III
  • 7490 Views
  • 2 replies
  • 2 kudos

Running databricks job with different parameter automatically

I have a python script running as databricks job. Is there a way I can run this job with different set of parameters automatically or programmatically without using run with different parameter option available in UI ?

  • 7490 Views
  • 2 replies
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Divya Bhadauria​ We haven't heard from you since the last response from @Lakshay Goel​ â€‹, and I was checking back to see if her suggestions helped you.Or else, If you have any solution, please share it with the community, as it can be helpful to ...

  • 2 kudos
1 More Replies
Khalil
by Contributor
  • 3315 Views
  • 0 replies
  • 0 kudos

Snowpark vs Spark on Databricks

Why / When should we choose Spark on Databricks over Snowpark if the data we are processing is underlying in Snowflake?

  • 3315 Views
  • 0 replies
  • 0 kudos
Gilg
by Contributor II
  • 2373 Views
  • 0 replies
  • 0 kudos

Databricks Runtime 12.1 spins VM in Ubuntu 18.04 LTS

Hi Team,Our cluster is currently in DBR 12.1 but it spins up a VMs with Ubuntu 18.04 LTS. 18.04 will be EOL soon. According to this https://learn.microsoft.com/en-us/azure/databricks/release-notes/runtime/12.1 OS version should be 20.04 and now a bit...

  • 2373 Views
  • 0 replies
  • 0 kudos
Rajaniesh
by New Contributor III
  • 3043 Views
  • 0 replies
  • 0 kudos

Resource Not Found

Hi,I am trying to write a simple code in databricks using langchain. But it is throwing this error: Resource not %pip install --upgrade openai%pip install langchain --upgrade%pip install pymssql --upgrade%pip install SQLAlchemy%pip install pyodbcimpo...

  • 3043 Views
  • 0 replies
  • 0 kudos
David_K93
by Contributor
  • 7245 Views
  • 1 replies
  • 2 kudos

Resolved! Building a Document Store on Databricks

Hello,I am somewhat new to Databricks and am trying to build a Q&A application based on a collection of documents. I need to move .pdf and .docx files from my local machine to storage in Databricks and eventually a document store. My questions are:Wh...

  • 7245 Views
  • 1 replies
  • 2 kudos
Latest Reply
David_K93
Contributor
  • 2 kudos

Hi all,I took an initial stab at task one with some success using the Databricks CLI. Here are the steps below:Open Command/Anaconda prompt and enter: pip install databricks-cliGo to your Databricks console and under settings find "User Settings" and...

  • 2 kudos
Labels