Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Vishalkhode1206
by New Contributor III
  • 449 Views
  • 1 reply
  • 0 kudos

Certification discount voucher

Hi, could you please provide the information on how to get a free voucher for the Data Engineer Associate exam that I am planning to take this week? Thanks in advance.

Latest Reply
florence023
New Contributor III
  • 0 kudos

@Vishalkhode1206 wrote: Hi, could you please provide the information on how to get a free voucher for the Data Engineer Associate exam that I am planning to take this week? Thanks in advance. Hello, I'm also looking for information on how to get a free v...

Gubbanoa
by New Contributor II
  • 658 Views
  • 3 replies
  • 0 kudos

Pycharm and Unit Testing

What is currently the best way of doing unit testing from PyCharm against Databricks? I have previously used Databricks Connect. However, after upgrades, and now that even Unity Catalog has become a requirement, it appears quirky. Is it possible to use th...

Latest Reply
florence023
New Contributor III
  • 0 kudos

@Gubbanoa wrote: What is currently the best way of doing unit testing from PyCharm against Databricks? I have previously used Databricks Connect. However, after upgrades, and now that even Unity Catalog has become a requirement, it appears quirky. Is it po...

2 More Replies
guangyi
by Contributor III
  • 958 Views
  • 1 reply
  • 0 kudos

Creating a job under the all-purpose cluster type policy always fails

Here is the policy I just created: { "node_type_id": { "defaultValue": "Standard_D8s_v3", "type": "allowlist", "values": [ "Standard_D8s_v3", "Standard_D16s_v3" ] }, "num_workers": {...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @guangyi, this may be related to the fact that DAB does not support this type of cluster. Unfortunately, this is not very well documented, but look at the thread below. This feature has been requested and should be available in the future: Creating All Pu...

luiz_santana
by New Contributor
  • 573 Views
  • 1 reply
  • 0 kudos

Create external location referencing another cloud.

I have an Azure Databricks workspace, but my data lake is in another cloud (AWS). Is it possible to create an external location in Azure Databricks pointing to the container in the S3 bucket?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @luiz_santana, this is currently not supported. However, you can reference data in an S3 bucket using the method below: https://learn.microsoft.com/en-us/azure/databricks/connect/storage/amazon-s3

jg
by New Contributor
  • 2474 Views
  • 1 reply
  • 0 kudos

build failed

Hi everyone, I'm trying to deploy my first Databricks project and I get this error when I deploy with the command databricks bundle deploy: $ Building my_project...Error: build failed my_project, error: exit status 1, output: usage: setup.py [glob...

Latest Reply
Brahmareddy
Honored Contributor
  • 0 kudos

Hi @jg, how are you doing today? Try installing the wheel package with pip install wheel to resolve the missing bdist_wheel command. Make sure your environment has setuptools and wheel installed and up to date by running pip install --upgrade setuptoo...
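The fix above can be sanity-checked locally before re-running the deploy. A minimal sketch (standard library only; the package names are the ones from the reply) that reports whether setuptools and wheel are importable:

```python
# Check for the packages the bundle build step needs; a missing "wheel"
# means setup.py has no bdist_wheel command, matching the error in the post.
import importlib.util

for pkg in ("setuptools", "wheel"):
    found = importlib.util.find_spec(pkg) is not None
    print(f"{pkg}: {'ok' if found else 'missing - pip install --upgrade ' + pkg}")
```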

xecel
by New Contributor II
  • 2540 Views
  • 1 reply
  • 0 kudos

Import error with typing_extensions; issue with pyiceberg and pydantic

Hello all, I am currently working in a Databricks environment where I am trying to use the `pyiceberg` library to interact with Iceberg table metadata directly, with Unity Catalog enabled. However, I'm encountering an issue with package compatibility rel...

Latest Reply
Brahmareddy
Honored Contributor
  • 0 kudos

Hi @xecel, how are you doing today? As per my understanding, ensure you're using a compatible version of typing_extensions by installing a specific version like 4.4.0 that might work with pyiceberg. Try reinstalling the libraries (pyiceberg and typing...
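Before pinning a version (e.g. pip install typing_extensions==4.4.0, the version suggested above), it helps to see what is currently installed. A small stdlib-only sketch:

```python
# Print the installed versions of the two packages named in the reply,
# so a compatible typing_extensions pin can be chosen.
from importlib import metadata

for pkg in ("typing_extensions", "pyiceberg"):
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        print(pkg, "not installed")
```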

Avinash_Narala
by Valued Contributor
  • 2746 Views
  • 3 replies
  • 2 kudos

Create notebook programmatically

Hello, I have the JSON content of a notebook. Is there a way to create a notebook with that content using Python?

Latest Reply
rtreves
Contributor
  • 2 kudos

@Avinash_Narala @renancy Check out the Databricks SDK documentation on `databricks.sdk.WorkspaceClient.workspace`: https://databricks-sdk-py.readthedocs.io/en/latest/workspace/workspace/workspace.html Something like the following may be helpful: w.wor...

2 More Replies
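The `w.workspace` call suggested in the reply can be fleshed out a little. A hedged sketch: the base64 step below runs anywhere, while the commented-out upload assumes the databricks-sdk package, workspace authentication, and a hypothetical target path:

```python
# Encode notebook source for the Workspace import API, which expects
# base64-encoded content.
import base64

source = "# Databricks notebook source\nprint('hello from generated notebook')\n"
payload = base64.b64encode(source.encode("utf-8")).decode("ascii")
print(payload[:12])  # -> IyBEYXRhYnJp

# The upload itself (assumes databricks-sdk and a configured workspace):
# from databricks.sdk import WorkspaceClient
# from databricks.sdk.service.workspace import ImportFormat, Language
# w = WorkspaceClient()
# w.workspace.import_("/Users/me/generated_nb", content=payload,
#                     format=ImportFormat.SOURCE, language=Language.PYTHON,
#                     overwrite=True)
```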
brickster
by New Contributor II
  • 1387 Views
  • 7 replies
  • 0 kudos

Cannot up cast sizeInBytes from string to bigint

I am creating a basic Delta table using a CREATE TABLE SQL query: CREATE TABLE test_transact (transaction_id string, post_date date). Running this query throws the error "Cannot up cast sizeInBytes from string to bigint". Even if I try to create a dataframe and...

Latest Reply
filipniziol
Contributor III
  • 0 kudos

Hi @brickster, the error message in the screenshot indicates that there is an issue with casting sizeInBytes from STRING to BIGINT, related to the SnapshotState in Delta Lake. This is not caused by the columns you are trying to create in your Delta table but ...

6 More Replies
jayj_us
by New Contributor
  • 531 Views
  • 1 reply
  • 0 kudos

IntelliSense doesn't work most of the time

I have noticed that in the Databricks SQL editor, IntelliSense doesn't work most of the time. Is there a setting to make it work always? It's very counterproductive to have to look up table columns manually.

Latest Reply
florence023
New Contributor III
  • 0 kudos

@jayj_us wrote: I have noticed that in the Databricks SQL editor, IntelliSense doesn't work most of the time. Is there a setting to make it work always? It's very counterproductive to have to look up table columns manually. Hello, I understand...

talenik
by New Contributor III
  • 714 Views
  • 1 reply
  • 0 kudos

Not able to access DBFS in init script on GCP Databricks

Hi everyone, I am trying to access DBFS files from an init script while the cluster is starting on GCP Databricks, but I am not able to list the files on DBFS. I tried to download files from a GCS bucket as well, but the init script throws timeout errors...

Data Engineering
Databricks
GCP databricks
spark
Latest Reply
jason34
New Contributor II
  • 0 kudos

Hello, to access DBFS files or download from a GCS bucket within a Databricks cluster's init script, consider the following approaches: Install Databricks Connect on your local machine. Connect to your Databricks cluster using Databricks Connect. Use the...

ggsmith
by New Contributor III
  • 729 Views
  • 2 replies
  • 0 kudos

Resolved! DLT Streaming Schema and Select

I am reading JSON files written to ADLS from Kafka using DLT and spark.readStream to create a streaming table for my raw ingest data. My schema is two arrays at the top level: a NewRecord array and an OldRecord array. I pass the schema and I run a select on Ne...

Data Engineering
dlt
streaming
Latest Reply
ggsmith
New Contributor III
  • 0 kudos

I did a full refresh from the delta tables pipeline and that fixed it. I guess it was remembering the first run where I just had the top level arrays as two columns in the table. 

1 More Replies
wesg2
by New Contributor
  • 502 Views
  • 1 reply
  • 0 kudos

Programmatically create Databricks Notebook

I am creating a Databricks notebook via string concatenation (sample below): Notebook_Head = """# Databricks notebook source# from pyspark.sql.types import StringType# from pyspark.sql.functions import split# COMMAND ----------""" Full_NB = Notebook_Head + Mi...

Latest Reply
filipniziol
Contributor III
  • 0 kudos

Hi @wesg2, one needs to be very precise when building this. The code below works: # Define the content of the .py file with cell separators (Works!) notebook_content = """# Databricks notebook source # This is the header of the notebook # You can add i...
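The same structure can be generated rather than concatenated by hand. A small sketch, assuming the SOURCE-format markers shown in the reply (the header line and the cell separator); the cell bodies are placeholders:

```python
# Join placeholder cells with the "# COMMAND ----------" separator and
# prepend the "# Databricks notebook source" header line.
cells = ["print('cell 1')", "print('cell 2')"]
notebook_content = (
    "# Databricks notebook source\n"
    + "\n\n# COMMAND ----------\n\n".join(cells)
    + "\n"
)
print(notebook_content.count("# COMMAND ----------"))  # -> 1
```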

DBUser2
by New Contributor III
  • 819 Views
  • 2 replies
  • 0 kudos

How to use transaction when connecting to Databricks using Simba ODBC driver

I'm connecting to a Databricks instance using the Simba ODBC driver (version 2.8.0.1002), and I am able to perform reads and writes on the Delta tables. But if I want to do some INSERT/UPDATE/DELETE operations within a transaction, I get the below error, an...

Latest Reply
florence023
New Contributor III
  • 0 kudos

@DBUser2 wrote: I'm connecting to a Databricks instance using the Simba ODBC driver (version 2.8.0.1002), and I am able to perform reads and writes on the Delta tables. But if I want to do some INSERT/UPDATE/DELETE operations within a transaction, I get the ...

1 More Replies
FabriceDeseyn
by Contributor
  • 8168 Views
  • 6 replies
  • 6 kudos

Resolved! What does autoloader's cloudfiles.backfillInterval do?

I'm using Auto Loader directory listing mode (without incremental file listing) and sometimes new files are not picked up and found in the cloud_files-listing. I have found that using the 'cloudFiles.backfillInterval' option can resolve the detection ...

Latest Reply
822025
New Contributor II
  • 6 kudos

If we set the backfill to 1 week, will it run only once a week, or will it look for old files not processed on every trigger? For example, if we set it to 1 day and the job runs every hour, will it look for files from the past 24 hours on a sliding ...
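For reference, here is the option from this thread as it might be passed to the stream reader. A Spark session is not available here, so this sketch only builds the option dict; "1 day" is the example interval from the question:

```python
# Auto Loader options, usable on a real cluster as
#   spark.readStream.format("cloudFiles").options(**autoloader_options).load(path)
autoloader_options = {
    "cloudFiles.format": "json",
    # Per the Auto Loader docs, backfills run asynchronously at this interval
    # to pick up files that directory listing missed.
    "cloudFiles.backfillInterval": "1 day",
}
for key, value in sorted(autoloader_options.items()):
    print(f"{key}={value}")
```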

5 More Replies
jlanglois98
by New Contributor II
  • 2119 Views
  • 2 replies
  • 0 kudos

Bootstrap timeout during cluster start

Hi all, I am getting the following error when I try to start a cluster in our Databricks workspace for East US 2: Bootstrap Timeout: Compute terminated. Reason: Bootstrap Timeout. Please try again later. Instance bootstrap failed c...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @jlanglois98, take a look at the thread below; similar issue: Solved: Re: Problem with spinning up a cluster on a new wo... - Databricks Community - 29996

1 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group