Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

priyansh
by New Contributor III
  • 200 Views
  • 3 replies
  • 0 kudos

How does Photon acceleration actually work?

Hey folks! I would like to know how Photon acceleration actually works. I tested it on samples of 219 MB, 513 MB, 2.7 GB, and 4.1 GB of data, and the difference in seconds between normal and Photon-accelerated compute was small. So my questi...

Latest Reply
arch_db
New Contributor II
  • 0 kudos

Try checking the merge operation on tables over 200 GB.

2 More Replies
Jorge3
by New Contributor III
  • 304 Views
  • 2 replies
  • 2 kudos

How to Upload Python Wheel Artifacts to a Volume from a DAB Run?

Hello, I'm currently working on a Databricks Asset Bundle (DAB) that builds and deploys a Python wheel package. My goal is to deploy this package to a Volume so that other DAB jobs can use this common library. I followed the documentation and successf...

Latest Reply
dataeng42io
New Contributor III
  • 2 kudos

Hi @Jorge3, I hope I am not too late to answer, but here is my suggestion. If you refer to the docs on consuming a wheel stored in a volume, you can configure your job to reference the wheel in your volume. Documentation: https://learn.microsoft.com/e...
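
As a rough illustration of that suggestion, a job task can list the wheel as a library by its Volume path. The sketch below only builds the `libraries` entry as a Python dict; the catalog, schema, volume, task key, and wheel file name are hypothetical placeholders, not values from the original thread.

```python
import json


def wheel_library(volume_path: str) -> dict:
    """Return a library entry pointing a job task at a wheel stored in a Volume."""
    if not (volume_path.startswith("/Volumes/") and volume_path.endswith(".whl")):
        raise ValueError("expected a /Volumes/... path ending in .whl")
    return {"whl": volume_path}


# Hypothetical catalog/schema/volume/package names, for illustration only.
task = {
    "task_key": "use_common_lib",
    "libraries": [
        wheel_library("/Volumes/main/shared/libs/common_lib-0.1.0-py3-none-any.whl")
    ],
}

print(json.dumps(task, indent=2))
```

The same `whl` entry should translate directly to the `libraries` list of a task in a bundle's YAML, but check the linked documentation for the exact layout your bundle version expects.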

1 More Replies
EricCournarie
by New Contributor II
  • 165 Views
  • 2 replies
  • 0 kudos

Metadata on a prepared statement return upper case column names

Hello, Using the JDBC driver, when I check the metadata of a prepared statement, the column names are all uppercase. This does not happen when running a DESCRIBE on the same SELECT. Are there any properties to set, or is it a known issue? Or a workaro...

Latest Reply
gchandra
Valued Contributor II
  • 0 kudos

Looks like a bug. Can you try using double quotes (SELECT "ColumnName") instead of backticks?

1 More Replies
camilo_s
by Contributor
  • 361 Views
  • 3 replies
  • 0 kudos

Spark SQL vs serverless SQL

Are there any benchmarks showing performance and cost differences between running SQL workloads on Spark SQL vs Databricks SQL (especially serverless SQL)? Our customer is hesitant about getting locked into Databricks SQL as opposed to being able to ru...

Latest Reply
robinhood555
New Contributor II
  • 0 kudos

@camilo_s wrote: Are there any benchmarks showing performance and cost differences between running SQL workloads on Spark SQL vs Databricks SQL (especially serverless SQL)? Our customer is hesitant about getting locked into Databricks SQL ...

2 More Replies
radix
by New Contributor II
  • 77 Views
  • 0 replies
  • 0 kudos

Pool clusters and init scripts

Hey, I'm just trying out pool clusters and providing the instance_pool_type and driver_instance_pool_id configuration in the Airflow new_cluster field. I also pass the init_scripts field with an S3 link as usual, but in this case of pool clusters it doesn't...

shsalami
by New Contributor III
  • 227 Views
  • 2 replies
  • 0 kudos

Sample streaming table fails

Running the following Databricks sample code in the pipeline: CREATE OR REFRESH STREAMING TABLE customers AS SELECT * FROM cloud_files("/databricks-datasets/retail-org/customers/", "csv") I got the error: org.apache.spark.sql.catalyst.ExtendedAnalysisExcep...

Latest Reply
shsalami
New Contributor III
  • 0 kudos

There is no table with that name. Also, that folder contains only the following file: dbfs:/databricks-datasets/retail-org/customers/customers.csv

1 More Replies
shsalami
by New Contributor III
  • 218 Views
  • 2 replies
  • 1 kudos

Resolved! Materialized view creation fails

I have 'ALL_PRIVILEGES' and 'USE_SCHEMA' on the lhdev.gld_sbx schema, but the following command failed with the error: DriverException: Unable to process statement for Table 'customermvx' create materialized view customermvx as select * from lhdev.gl...

Latest Reply
szymon_dybczak
Contributor III
  • 1 kudos

Hi @shsalami, According to the documentation snippet below, you also need the USE CATALOG privilege on the parent catalog. "The user who creates a materialized view (MV) is the MV owner and needs to have the following permissions: SELECT privilege over the ba...
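
For readers who want to see what that missing grant might look like, here is a minimal sketch that builds the statement as a string. The catalog name `lhdev` comes from the question; the principal is a hypothetical placeholder, and the helper itself is only an illustration, not part of any Databricks API.

```python
def grant_use_catalog(catalog: str, principal: str) -> str:
    """Build a GRANT USE CATALOG statement with backtick-quoted identifiers."""

    def quote(name: str) -> str:
        # Escape embedded backticks by doubling them, then wrap in backticks.
        return "`" + name.replace("`", "``") + "`"

    return f"GRANT USE CATALOG ON CATALOG {quote(catalog)} TO {quote(principal)}"


# "someone@example.com" is a placeholder principal, not from the original post.
stmt = grant_use_catalog("lhdev", "someone@example.com")
print(stmt)
```

The resulting statement would need to be run by a user who is allowed to grant privileges on the catalog, for example its owner.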

1 More Replies
pedrojunqueira
by New Contributor II
  • 1158 Views
  • 4 replies
  • 2 kudos

Resolved! Generating a personal access token for a service principal with the Databricks CLI

Hi, I am having issues generating a personal access token for my service principal. I followed the steps from here; my `~/.databrickscfg` has the following: ```[my-profile-name] host = <account-console-url> account_id = <account-id> azure_tenant_id = <azure-ser...

Latest Reply
PabloCSD
Contributor
  • 2 kudos

I want something similar: to use a service principal token instead of a PAT. Have you ever done this? https://community.databricks.com/t5/administration-architecture/use-a-service-principal-token-instead-of-personal-access-token/m-p/91629

3 More Replies
shinaushin
by New Contributor II
  • 2406 Views
  • 14 replies
  • 3 kudos

Session expired, cannot log back in to Community Edition

Whenever I am logged into my Community Edition account and leave it idle for a bit, it says that my session has expired, which is understandable. However, when I try logging back in with the exact same credentials, I receive an error saying that a Co...

Latest Reply
prajwalreddy
New Contributor II
  • 3 kudos

I have been facing the same issue for 2 weeks: the message "Session expired, please log in again" is shown every 2-3 minutes.

13 More Replies
TCK
by New Contributor II
  • 282 Views
  • 2 replies
  • 0 kudos

Embedding external content and videos via IFrame

Hello there, I'm currently creating a notebook which contains a training course for data engineering. For certain topics it would be nice to embed external resources like YouTube videos so participants do not have to leave the notebook to watch the vid...

Latest Reply
gchandra
Valued Contributor II
  • 0 kudos

Did you try using displayHTML()? https://docs.databricks.com/en/visualizations/html-d3-and-svg.html
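
A minimal sketch of what that could look like, assuming the workspace allows iframes pointing at external sites. The `youtube_iframe` helper and the `VIDEO_ID` placeholder are made up for illustration; `displayHTML` is only available inside a Databricks notebook, so the snippet falls back to `print` elsewhere.

```python
from html import escape


def youtube_iframe(video_id: str, width: int = 640, height: int = 360) -> str:
    """Build an <iframe> tag for a YouTube embed URL."""
    src = f"https://www.youtube.com/embed/{escape(video_id, quote=True)}"
    return (
        f'<iframe width="{width}" height="{height}" src="{src}" '
        'frameborder="0" allowfullscreen></iframe>'
    )


# "VIDEO_ID" is a placeholder; substitute the ID of the video to embed.
html = youtube_iframe("VIDEO_ID")

# Use displayHTML when the notebook environment provides it, else just print the markup.
render = globals().get("displayHTML", print)
render(html)
```

Whether the embedded player actually loads depends on the workspace's content security settings, so treat this as something to try rather than a guaranteed fix.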

1 More Replies
Rik
by New Contributor III
  • 4566 Views
  • 9 replies
  • 9 kudos

Resolved! File information is not passed to trigger job on file arrival

We are using the UC mechanism for triggering jobs on file arrival, as described here: https://learn.microsoft.com/en-us/azure/databricks/workflows/jobs/file-arrival-triggers. Unfortunately, the trigger doesn't actually pass the file-path that is gener...

Data Engineering
file arrival
trigger file
Unity Catalog
Latest Reply
artemich
New Contributor II
  • 9 kudos

Same here! Additionally, it would be great to enhance it to support not just the path to a directory, but also the prefix of the file name (or regex for bonus points). Right now if you have 10 types of files arriving in the same folder, it would be much c...

8 More Replies
rcostanza
by New Contributor II
  • 120 Views
  • 0 replies
  • 1 kudos

Changing a Delta Live Table's schema

I have a Delta Live Table whose source is a Kafka stream. One of the columns is a Decimal and I need to change its precision. What's the correct approach to changing the DLT's schema? Just changing the column's precision in the DLT definition will resu...

johnb1
by Contributor
  • 397 Views
  • 5 replies
  • 0 kudos

Resolved! SQL UDF vs. Python UDF, SQL UDF vs. Pandas UDF

I would like to understand how (1) SQL UDFs compare to Python UDFs and (2) SQL UDFs compare to Pandas UDFs, especially in terms of performance. I cannot find any documentation on these topics, not even in the official Databricks documentation (which unfortunate...

Latest Reply
gchandra
Valued Contributor II
  • 0 kudos

The first sublink has SQL UDFs where you can write your SQL UDF using SQL or Python. This Python implementation is different from the one mentioned above. https://docs.databricks.com/en/udf/unity-catalog.html

4 More Replies
techie001
by New Contributor
  • 166 Views
  • 1 reply
  • 0 kudos

Delta Live Tables vs Azure SQL DB for a read-intensive application

Hi, I am looking for some advice comparing cost and performance between Delta Live Tables and Azure SQL DB files in Azure Blob for building the backend for a web application. There would be very frequent read operations (multiple searches every second) an...

Latest Reply
gchandra
Valued Contributor II
  • 0 kudos

As far as I know, Azure SQL DB is an RDBMS, whereas DLT is for building data pipelines.

