Data Engineering

Forum Posts

BradSheridan
by Valued Contributor
  • 1962 Views
  • 9 replies
  • 4 kudos

Resolved! How to use cloudFiles to completely overwrite the target

Hey there, Community!! I have a client that will produce a CSV file daily that needs to be moved from Bronze -> Silver. Unfortunately, this source file will always be a full set of data... not incremental. I was thinking of using Auto Loader/cloudFil...

Latest Reply
BradSheridan
Valued Contributor
  • 4 kudos

I "up voted'" all of @werners suggestions b/c they are all very valid ways of addressing my need (the true power/flexibility of the Databricks UDAP!!!). However, turns out I'm going to end up getting incremental data afterall :). So now the flow wi...

8 More Replies
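As a footnote to the thread above: a minimal Auto Loader sketch of that incremental Bronze -> Silver flow. Paths, schema location, and table names are hypothetical, and it assumes a Databricks notebook where spark is predefined.

(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("cloudFiles.schemaLocation", "/mnt/silver/_schemas/daily_feed")  # where Auto Loader tracks the inferred schema
    .option("header", "true")
    .load("/mnt/bronze/daily_feed/")          # Auto Loader picks up only files it has not seen before
    .writeStream
    .option("checkpointLocation", "/mnt/silver/_checkpoints/daily_feed")
    .trigger(once=True)                       # process whatever is new, then stop
    .toTable("silver.daily_feed"))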
Deepak_Goldwyn
by New Contributor III
  • 454 Views
  • 0 replies
  • 0 kudos

Pass parameter value from Job to DLT pipeline

We are investigating how to pass a parameter from a Databricks Job to a DLT pipeline. Our process orchestrator is Azure Data Factory, from which we trigger the Databricks Job using the Jobs API. As part of the 'run-now' request, we would like to pass a paramete...

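No replies were posted to the thread above; one commonly used pattern, sketched below with hypothetical key and table names, is to set the value in the pipeline's configuration map (editable via the pipeline settings or the Pipelines API) before triggering it, and read it inside the pipeline through spark.conf.

import dlt

# Values placed in the pipeline settings under "configuration", e.g.
# {"configuration": {"mypipeline.run_date": "2022-06-01"}}, surface
# through spark.conf inside the pipeline (spark is predefined there).
run_date = spark.conf.get("mypipeline.run_date", "1900-01-01")

@dlt.table
def filtered_events():
    return spark.read.table("bronze.events").where(f"event_date = '{run_date}'")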
BkP
by Contributor
  • 448 Views
  • 0 replies
  • 0 kudos

Hi, I am getting an error while creating a cluster and trying to open a notebook to run. How to overcome this error? I have sent an email to databric...

Hi, I am getting an error while creating a cluster and trying to open a notebook to run. How can I overcome this error? I have sent an email to Databricks support but have not received any response so far. Please help and guide.

databricks error in community edition
explore
by New Contributor
  • 903 Views
  • 0 replies
  • 0 kudos

Hi, can we connect to Teradata Vantage installed in a VM via the Community Edition notebook? I am working on a POC to fetch data from Teradata Vantage (it uses JDBC) and process it in a Community Edition notebook. I downloaded the terajdbc4.jar.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def load_data(driver, jdbc_url, sql, user, password):
    return spark.read \
        .format('jdbc') \
        .option('driver', driver) \
        .option('url', jdbc_url) \
        .option('dbtable', sql) \
        .option('user', user) \
        .option('password', password) \
        .load()

chandan_a_v
by Valued Contributor
  • 918 Views
  • 1 reply
  • 1 kudos

Can't import local files under repo

I have a YAML file inside one of the subdirectories of a repo in Databricks. I have appended the repo path to sys.path, but I still can't access the file. https://docs.databricks.com/_static/notebooks/files-in-repos.html

Latest Reply
chandan_a_v
Valued Contributor
  • 1 kudos

@Kaniz Fatma, could you please help me out here?

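A minimal sketch for the thread above, with a hypothetical repo path and file name, assuming PyYAML is available on the cluster: sys.path only affects Python module imports, so a data file under a repo is better opened by its full workspace path.

import os
import yaml

repo_root = "/Workspace/Repos/user@example.com/my-repo"   # hypothetical; adjust to your user and repo

# Open the YAML file directly by its workspace path instead of relying on sys.path.
with open(os.path.join(repo_root, "conf/settings.yaml")) as f:
    settings = yaml.safe_load(f)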
youngchef
by New Contributor
  • 1001 Views
  • 3 replies
  • 3 kudos

Resolved! AWS Instance Profiles and DLT Pipelines

Hey everyone! I'm building a DLT pipeline that reads files from S3 (or tries to) and then writes them into different directories in my S3 bucket. The problem is that I usually access S3 with an instance profile attached to a cluster, but DLT does not give...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

{ "clusters": [ { "label": "default", "aws_attributes": { "instance_profile_arn": "arn:aws:..." } }, { "label": "maintenance", "aws_attributes": { "instance_profile_arn": "arn:aws:..." ...

2 More Replies
ricard98
by New Contributor II
  • 2497 Views
  • 3 replies
  • 5 kudos

How do you connect a folder path from your desktop to DB notebook?

I have a folder with multiple Excel files that contain information from different cost centers. These files get updated every week. I'm trying to upload all these files to the DB notebook. Is there a way to connect the path directly to the DBFS to...

Latest Reply
User16873043099
Contributor
  • 5 kudos

Hello, thanks for your question. You can mount cloud object storage to DBFS and use it in a notebook; please refer here. It is not possible to mount a local folder from the desktop to DBFS, but you should be able to use the Databricks CLI to copy the e...

2 More Replies
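A minimal sketch of the CLI route mentioned in the reply above, with hypothetical local and DBFS paths; it assumes the legacy Databricks CLI is installed and configured on the desktop (pip install databricks-cli, then databricks configure --token).

import subprocess

local_dir = "C:/reports/cost_centers"       # folder on the desktop
dbfs_dir = "dbfs:/FileStore/cost_centers"   # target location in DBFS

# "databricks fs cp --recursive" uploads the whole folder to DBFS.
subprocess.run(
    ["databricks", "fs", "cp", "--recursive", local_dir, dbfs_dir],
    check=True,
)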
gazzyjuruj
by Contributor II
  • 5900 Views
  • 4 replies
  • 9 kudos

Cluster start is currently disabled?

Hi, I'm trying to run the notebooks but nothing happens. I had to create a cluster in order to start my code. Pressing the play button inside the notebook does nothing at all, and in 'Compute', pressing play on the cluster gives the e...

Latest Reply
jose_gonzalez
Moderator
  • 9 kudos

Hi @Ghazanfar Uruj, just a friendly follow-up: did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

3 More Replies
StephanieRivera
by Valued Contributor II
  • 1065 Views
  • 3 replies
  • 6 kudos
Latest Reply
jose_gonzalez
Moderator
  • 6 kudos

Hi @Stephanie Rivera, just a friendly follow-up: did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

2 More Replies
Karl
by New Contributor II
  • 13216 Views
  • 2 replies
  • 3 kudos

PySpark column object not callable using "when otherwise" transformation

The very first "when" function results in the posted error message (see image). The print statement of the count of df_td_amm works. A printSchema of the "df_td_amm" data frame confirms that "AGE" is a column. A select statement is also successful, s...

[attached image: Error]
Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

The syntax is when(...).otherwise(...), not other(...). And there are some backslashes missing.

1 More Replies
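A minimal sketch of the corrected pattern from the reply above; the AGE column comes from the post, while the new column and labels are hypothetical. Wrapping the expression in parentheses (or ending each continued line with a backslash) avoids the missing-backslash problem.

from pyspark.sql import functions as F

df_td_amm = df_td_amm.withColumn(
    "age_group",
    F.when(F.col("AGE") < 18, "minor")
     .when(F.col("AGE") < 65, "adult")
     .otherwise("senior"),   # .otherwise() chains onto when(); there is no other()
)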
antoniodavideca
by New Contributor III
  • 1685 Views
  • 5 replies
  • 1 kudos

Resolved! Jobs REST API - Run a Job that is connected to a git_source

With the Jobs REST API it is possible to create a new job, specifying a git_source. My question is about triggering the job. Still with the Jobs REST API, it is possible to trigger a job using the job_id, but I can't find a way to tell Databricks what the en...

Latest Reply
Prabakar
Esteemed Contributor III
  • 1 kudos

Ah, got it. So is your issue resolved, or are you looking for further information?

4 More Replies
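A sketch of the shape implied by the thread above (URL, IDs, branch, and paths are all hypothetical): the entry point is declared at job-creation time, since notebook_task.path is resolved relative to the repo root given in git_source, so a later run-now call only needs the job_id.

import requests

payload = {
    "name": "git-sourced-job",
    "git_source": {
        "git_url": "https://github.com/my-org/my-repo",
        "git_provider": "gitHub",
        "git_branch": "main",
    },
    "tasks": [{
        "task_key": "main",
        "notebook_task": {"path": "notebooks/etl"},   # relative to the repo root
        "existing_cluster_id": "1234-567890-abcde123",
    }],
}

requests.post(
    "https://<workspace>.cloud.databricks.com/api/2.1/jobs/create",
    headers={"Authorization": "Bearer <token>"},
    json=payload,
)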
Gabriel0007
by New Contributor III
  • 1518 Views
  • 0 replies
  • 3 kudos

How to save json data to Delta Table: ParseError on Insert

I'm trying to save the JSON data returned from a requests API call to a Delta table. I get a ParseError when I INSERT the response object, which is in JSON format. The error shows the JSON data and a marker that states a ' or } or ) is missing. I v...

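No replies yet; a minimal sketch of one way around the ParseError (URL and table name are hypothetical): load the response into a DataFrame instead of splicing raw JSON into an INSERT statement, where quotes and braces in the payload break the SQL parser. Assumes a Databricks notebook where spark is predefined.

import requests

records = requests.get("https://api.example.com/data").json()   # assumes a list of flat dicts

df = spark.createDataFrame(records)
df.write.format("delta").mode("append").saveAsTable("my_delta_table")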
Kit
by New Contributor III
  • 2296 Views
  • 7 replies
  • 1 kudos

Resolved! Can't run a job that use GitHub as source

I have a list of jobs that use code in GitHub as their source. Everything worked fine until yesterday, when I noticed that all the jobs using GitHub as source were failing with the following error: ``` Run result unavailable:...

Latest Reply
User16766737456
New Contributor III
  • 1 kudos

Just an update, to round this out. We investigated further internally and found that although we have a cleanup process in place to remove the internal repos that are checked out for workflows, it was failing to catch up due to the sheer volum...

6 More Replies
antoniodavideca
by New Contributor III
  • 1270 Views
  • 2 replies
  • 0 kudos

Jobs REST API - Create a new Job with a new Cluster and install a Maven Library on the Cluster

I need to use the Jobs REST API to create a job on our Databricks cluster. At job creation, it is possible to specify an existing cluster or create a new one. I can forward a lot of information to the cluster, but what I would like to specify is ...

Latest Reply
Prabakar
Esteemed Contributor III
  • 0 kudos

@Antonio Davide Cali, you can reference the existing cluster in your JSON to use it for the job. To update or push libraries to the job, you can use the Jobs Update API. Since you want to push libraries to the cluster, you can push them using the new setting an...

1 More Replies
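A sketch of the jobs/create payload shape implied by the reply above (coordinates, Spark version, node type, and paths are hypothetical): libraries are declared per task, alongside new_cluster.

payload = {
    "name": "job-with-maven-lib",
    "tasks": [{
        "task_key": "main",
        "new_cluster": {
            "spark_version": "10.4.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 2,
        },
        "libraries": [
            {"maven": {"coordinates": "com.example:my-lib:1.0.0"}}   # installed on the job cluster at start-up
        ],
        "notebook_task": {"path": "/Users/me/my_notebook"},
    }],
}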
Lazloo
by New Contributor III
  • 629 Views
  • 0 replies
  • 2 kudos

Cannot load spark-avro jars with Databricks version 10.4

Currently, I am facing an issue since the `databricks-connect` runtime on our cluster was updated to 10.4. Since then, I cannot load the jars for spark-avro anymore, by running the following code:

from pyspark.sql import SparkSession

spark = SparkSe...

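No replies yet; a sketch of one workaround, assuming the cluster runs Spark 3.2 with Scala 2.12 (match the artifact version to your runtime): pull spark-avro in through spark.jars.packages when the session is built, instead of loading the jar by hand.

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .config("spark.jars.packages", "org.apache.spark:spark-avro_2.12:3.2.1")
    .getOrCreate()
)

df = spark.read.format("avro").load("/path/to/files")   # hypothetical path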