Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

BradSheridan
by Valued Contributor
  • 4157 Views
  • 9 replies
  • 4 kudos

Resolved! How to use cloudFiles to completely overwrite the target

Hey there Community!! I have a client that will produce a CSV file daily that needs to be moved from Bronze -> Silver. Unfortunately, this source file will always be a full set of data... not incremental. I was thinking of using AutoLoader/cloudFil...

Latest Reply
BradSheridan
Valued Contributor
  • 4 kudos

I "upvoted" all of @werners suggestions because they are all very valid ways of addressing my need (the true power/flexibility of the Databricks UDAP!!!). However, it turns out I'm going to end up getting incremental data after all :). So now the flow wi...

8 More Replies
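Since the daily file is a full snapshot rather than incremental data, one simple alternative to Auto Loader is a plain batch read followed by an overwrite of the Silver table. A minimal sketch under stated assumptions — the path, table name, and function name are hypothetical, and `spark` is passed in rather than created here:

```python
def overwrite_silver_from_snapshot(spark, source_path, target_table):
    # Read the full daily CSV snapshot from Bronze.
    df = (spark.read
          .option("header", "true")
          .csv(source_path))
    # The file is always the complete data set, so replace the target
    # instead of appending; overwriteSchema tolerates column changes.
    (df.write
       .format("delta")
       .mode("overwrite")
       .option("overwriteSchema", "true")
       .saveAsTable(target_table))

# Hypothetical call:
# overwrite_silver_from_snapshot(spark, "/mnt/bronze/client_daily/", "silver.client_data")
```

If the source later becomes incremental (as it did for the poster), the same skeleton switches to Auto Loader with `mode("append")`.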
Deepak_Goldwyn
by New Contributor III
  • 860 Views
  • 0 replies
  • 0 kudos

Pass parameter value from Job to DLT pipeline

We are investigating how to pass parameter from Databricks Job to DLT pipeline. Our process orchestrator is Azure Data Factory from where we trigger the Databricks Job using Jobs API. As part of the 'run-now' request, we would like to pass a paramete...

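For the run-now call itself, the request body can carry parameters. A sketch of building that body with Python — the host, token, job_id, and parameter names are placeholders, and `notebook_params` is the key consumed by a notebook task; how the value then reaches the DLT pipeline depends on how the job's task forwards it (e.g. into the pipeline's configuration):

```python
import json

def build_run_now_payload(job_id, params):
    # notebook_params is read by a notebook task; other task types use
    # jar_params / python_params etc. (see the Jobs API reference).
    return {"job_id": job_id, "notebook_params": params}

payload = build_run_now_payload(42, {"process_date": "2022-08-01"})
body = json.dumps(payload)

# A hypothetical call from the orchestrator would then be:
# requests.post(f"https://{host}/api/2.1/jobs/run-now",
#               headers={"Authorization": f"Bearer {token}"}, data=body)
```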
BkP
by Contributor
  • 735 Views
  • 0 replies
  • 0 kudos

Hi, I am getting an error while creating a cluster and trying to open a notebook to run. How to overcome this error ? I have sent an email to databric...

Hi, I am getting an error while creating a cluster and trying to open a notebook to run. How can I overcome this error? I have sent an email to Databricks support but have not received any response till now. Please help and guide.

databricks error in community edition
explore
by New Contributor
  • 1500 Views
  • 0 replies
  • 0 kudos

Hi, can we connect to the Teradata Vantage installed in a VM via the Community Edition notebook? I am working on a POC to fetch data from Teradata Vantage (plain Teradata, since it uses JDBC) and process it in a Community Edition notebook. I downloaded the terajdbc4.jar.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def load_data(driver, jdbc_url, sql, user, password):
    return spark.read \
        .format('jdbc') \
        .option('driver', driver) \
        .option('url', jdbc_url) \
        .option('dbt...

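For reference, a possible completion of the truncated snippet above — the `dbtable` option can wrap the query as a derived table. The alias `q` and the idea of passing `spark` in as an argument are illustrative assumptions:

```python
def load_data(spark, driver, jdbc_url, sql, user, password):
    return (spark.read
            .format("jdbc")
            .option("driver", driver)          # e.g. com.teradata.jdbc.TeraDriver
            .option("url", jdbc_url)
            .option("dbtable", f"({sql}) q")   # query wrapped as a derived table
            .option("user", user)
            .option("password", password)
            .load())
```

Note that the terajdbc4.jar still has to be attached to the cluster (or placed on the driver classpath) for the driver class to resolve.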
youngchef
by New Contributor
  • 2024 Views
  • 3 replies
  • 3 kudos

Resolved! AWS Instance Profiles and DLT Pipelines

Hey everyone! I'm building a DLT pipeline that reads files from S3 (or tries to) and then writes them into different directories in my s3 bucket. The problem is I usually access S3 with an instance profile attached to a cluster, but DLT does not give...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

{
  "clusters": [
    {
      "label": "default",
      "aws_attributes": { "instance_profile_arn": "arn:aws:..." }
    },
    {
      "label": "maintenance",
      "aws_attributes": { "instance_profile_arn": "arn:aws:..." }
    }
  ]
}

2 More Replies
ricard98
by New Contributor II
  • 4220 Views
  • 3 replies
  • 5 kudos

How do you connect a folder path from your desktop to DB notebook?

I have a folder with multiple Excel files that contain information from different cost centers. These files get updated every week. I'm trying to upload all these files to the DB notebook. Is there a way to connect the path directly to the DBFS to...

Latest Reply
User16873043099
Contributor
  • 5 kudos

Hello, thanks for your question. You can mount cloud object storage to DBFS and use it in a notebook; please refer here. It is not possible to mount a local folder from your desktop to DBFS, but you should be able to use the Databricks CLI to copy the e...

2 More Replies
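The CLI copy mentioned in the reply can be scripted for the weekly refresh. A sketch that builds a `databricks fs cp` invocation — the local and DBFS paths are placeholders, and running it assumes a configured Databricks CLI:

```python
import subprocess

def dbfs_copy_command(local_dir, dbfs_dir):
    # --recursive copies every file in the folder; --overwrite refreshes
    # the weekly files in place.
    return ["databricks", "fs", "cp", "--recursive", "--overwrite",
            local_dir, dbfs_dir]

cmd = dbfs_copy_command("./cost_centers", "dbfs:/FileStore/cost_centers")

# To actually run it (requires a configured CLI):
# subprocess.run(cmd, check=True)
```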
StephanieAlba
by Databricks Employee
  • 2193 Views
  • 3 replies
  • 6 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 6 kudos

Hi @Stephanie Rivera, just a friendly follow-up. Did any of the responses help you to resolve your question? If one did, please mark it as best. Otherwise, please let us know if you still need help.

2 More Replies
Karl
by New Contributor II
  • 16698 Views
  • 2 replies
  • 3 kudos

PySpark column object not callable using "when otherwise" transformation

The very first "when" function results in the posted error message (see image). The print statement of the count of df_td_amm works. A printSchema of the "df_td_amm" data frame confirms that "AGE" is a column. A select statement is also successful, s...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

The syntax is when(...).otherwise(...), not when(...).other(...). And there are some backslashes missing.

1 More Replies
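The corrected pattern from the reply, sketched as a small helper. The column name `AGE` comes from the post; the output column and bucket labels are made up for illustration, and pyspark is imported inside the function so the sketch stays self-contained:

```python
def add_age_bucket(df):
    from pyspark.sql import functions as F
    return df.withColumn(
        "age_bucket",
        F.when(F.col("AGE") < 18, "minor")
         .when(F.col("AGE") < 65, "adult")
         .otherwise("senior"))   # .otherwise(...), not .other(...)
```

Chaining `.when(...)` calls and finishing with `.otherwise(...)` is what makes the Column object callable-error go away: `other` does not exist on a Column, so Python falls back to treating the Column itself as a callable.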
vk217
by Contributor
  • 834 Views
  • 0 replies
  • 0 kudos

Databricks migration from 7.3 LTS to X.x

We are currently on the 7.3 LTS version with Python 3.7. I see that we are several versions off the latest 11.1 release (https://docs.databricks.com/release-notes/runtime/releases.html). I see that the end of support for 11.1 is earlier than 10.4 LTS. What ...

antoniodavideca
by New Contributor III
  • 3362 Views
  • 5 replies
  • 1 kudos

Resolved! Jobs REST Api - Run a Job that is connected to a git_source

With the Jobs REST API it is possible to create a new Job, specifying a git_source. My question is about triggering the job. Still via the Jobs REST API, it is possible to trigger a job using the job_id, but I can't find a way to tell Databricks what the en...

Latest Reply
Prabakar
Databricks Employee
  • 1 kudos

Ah, got it. So is your issue resolved, or are you looking for further information?

4 More Replies
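One way to make the checkout reference explicit is to bake it into the job definition itself, since `git_source` accepts a branch (or tag/commit). A hedged sketch of a Jobs 2.1 create payload — the repository URL, branch, notebook path, and cluster id are placeholders:

```json
{
  "name": "job-from-git",
  "git_source": {
    "git_url": "https://github.com/org/repo",
    "git_provider": "gitHub",
    "git_branch": "main"
  },
  "tasks": [
    {
      "task_key": "run_notebook",
      "notebook_task": {
        "notebook_path": "notebooks/main",
        "source": "GIT"
      },
      "existing_cluster_id": "1234-567890-abcde123"
    }
  ]
}
```

With the branch pinned in the job definition, a plain run-now with the job_id checks out that reference; changing the branch per run would instead mean updating the job (or creating a one-time run) before triggering.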
Gabriel0007
by New Contributor III
  • 2221 Views
  • 0 replies
  • 3 kudos

How to save json data to Delta Table: ParseError on Insert

I'm trying to save the returned JSON data from a requests API call to a Delta table. I get a ParseError when I INSERT the response object, which is in JSON format. The error shows the JSON data and a marker that states a ' or } or ) is missing. I v...

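One way to sidestep quoting problems in a SQL INSERT is not to inline the JSON at all: parse the response into Python objects and let Spark build the rows. A sketch under stated assumptions — the table name is a placeholder, and `spark` is passed in so the snippet stays self-contained:

```python
def save_json_to_delta(spark, records, table_name):
    # records: a list of dicts, e.g. the result of response.json().
    # Building a DataFrame avoids hand-quoting JSON text inside an
    # INSERT statement, which is where the ParseError comes from.
    df = spark.createDataFrame(records)
    df.write.format("delta").mode("append").saveAsTable(table_name)
```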
Kit
by New Contributor III
  • 4471 Views
  • 7 replies
  • 1 kudos

Resolved! Can't run a job that use GitHub as source

I have a list of jobs that are using the code in GitHub as source. Everything worked fine until yesterday. Yesterday, I noticed that all the jobs that were using GitHub as source were failing because of the following error: ``` Run result unavailable:...

Latest Reply
User16766737456
Databricks Employee
  • 1 kudos

Just an update, to round this out. We investigated further internally, and found that although we have a cleanup process in place to remove the internal repos that are being checked out for workflows, it was failing to catch up due to the sheer volum...

6 More Replies
antoniodavideca
by New Contributor III
  • 2391 Views
  • 2 replies
  • 0 kudos

Jobs REST Api - Create new Job with a new Cluster, and install a Maven Library on the Cluster

I need to use the Jobs REST API to create a Job on our Databricks cluster. At job creation, it is possible to specify an existing cluster or create a new one. I can forward a lot of information to the cluster, but what I would like to specify is ...

Latest Reply
Prabakar
Databricks Employee
  • 0 kudos

@Antonio Davide Cali You can reference the existing cluster in your JSON to use it for the job. To update or push libraries to the job, you can use the JobsUpdate API. As you want to push libraries to the cluster, you can push them using the new setting an...

1 More Replies
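For the new-cluster case, the Jobs API accepts a `libraries` array on the task, so the Maven dependency can be declared in the same create request. A hedged sketch of the relevant fragment — the coordinates, node type, and notebook path are placeholders:

```json
{
  "tasks": [
    {
      "task_key": "main",
      "new_cluster": {
        "spark_version": "10.4.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 2
      },
      "libraries": [
        { "maven": { "coordinates": "com.example:library:1.0.0" } }
      ],
      "notebook_task": { "notebook_path": "/Repos/team/project/main" }
    }
  ]
}
```

The library is then installed on the job cluster each time it is created for a run, so no separate install step is needed.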
Lazloo
by New Contributor III
  • 1062 Views
  • 0 replies
  • 2 kudos

Cannot load spark-avro jars with databricksversion 10.4

Currently, I am facing an issue since the `databricks-connect` runtime on our cluster was updated to 10.4. Since then, I cannot load the jars for spark-avro anymore, by running the following code:

from pyspark.sql import SparkSession

spark = SparkSe...

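If the jars stopped resolving after the upgrade, one thing worth checking is the package coordinate: runtime 10.4 corresponds to Spark 3.2.x on Scala 2.12, so the matching artifact would be along the lines of `org.apache.spark:spark-avro_2.12:3.2.1`. A sketch of setting it via `spark.jars.packages` — the exact coordinate, and whether your databricks-connect setup honors this config, are assumptions to verify against your environment:

```python
def build_session_with_avro(coordinate="org.apache.spark:spark-avro_2.12:3.2.1"):
    # The coordinate must match the cluster's Spark and Scala versions;
    # a mismatch is a common cause of jars failing to load after an upgrade.
    from pyspark.sql import SparkSession
    return (SparkSession.builder
            .config("spark.jars.packages", coordinate)
            .getOrCreate())
```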
Anonymous
by Not applicable
  • 547 Views
  • 0 replies
  • 5 kudos

www.databricks.com

New and Exciting! Databricks and Jupyter: Announcing ipywidgets in the Databricks Notebook. Bringing the interactivity of the Jupyter ecosystem into the Lakehouse. We are excited to announce a deeper integration between the Databricks Notebook and the e...

