Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Mr__E
by Contributor II
  • 4240 Views
  • 5 replies
  • 5 kudos

Resolved! Using shared python wheels for job compute clusters

We have a GitHub workflow that generates a python wheel and uploads to a shared S3 available to our Databricks workspaces. When I install the Python wheel to a normal compute cluster using the path approach, it correctly installs the Python wheel and...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

You can mount the S3 bucket as a DBFS folder, then set that library in the "cluster" -> "libraries" tab -> "install new" -> "DBFS".
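When the install step itself is automated (for example from the same GitHub workflow that builds the wheel), it typically goes through the Databricks Libraries API (`POST /api/2.0/libraries/install`). A minimal sketch of the request body, where the cluster id, mount point, and wheel name are hypothetical placeholders:

```python
# Sketch: build the JSON body for the Databricks Libraries API
# (POST /api/2.0/libraries/install). Cluster id and wheel path below
# are hypothetical placeholders for illustration.

def build_wheel_install_payload(cluster_id: str, dbfs_whl_path: str) -> dict:
    """Return the request body that installs a DBFS-hosted wheel on a cluster."""
    return {
        "cluster_id": cluster_id,
        "libraries": [{"whl": dbfs_whl_path}],
    }

payload = build_wheel_install_payload(
    "0123-456789-abcde",  # hypothetical cluster id
    "dbfs:/mnt/shared-wheels/my_pkg-0.1.0-py3-none-any.whl",  # hypothetical mount + wheel
)
# Sending would look like:
# requests.post(f"{host}/api/2.0/libraries/install", headers=auth, json=payload)
```

The same payload shape works from CI once the S3 bucket is mounted under `/mnt`, since the wheel is then addressable by a stable `dbfs:/` path.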
4 More Replies
snarfed
by New Contributor II
  • 2141 Views
  • 4 replies
  • 5 kudos

Serverless SQL endpoints on Azure?

Serverless SQL Endpoints sound exciting! It sounds like they've been in preview on AWS for a couple of months. Any idea if/when they're coming to Azure?

Latest Reply
Kaniz_Fatma
Community Manager
  • 5 kudos

Hi @snarfed​ , A SQL endpoint is a computation resource that lets you run SQL commands on data objects within Databricks SQL. This article introduces SQL endpoints and describes how to work with them using the Databricks SQL UI. A SQL endpoint is a t...

3 More Replies
ArindamHalder
by New Contributor II
  • 1896 Views
  • 3 replies
  • 3 kudos

Resolved! Is there any performance result available for DeltaLake?

Specifically, for writing and reading streaming data to HDFS or S3, etc. For an IoT-specific scenario, how does it perform on time-series transactional data? Can we consider a Delta table as a time-series table?

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @Arindam Halder​, how is it going? Were you able to resolve your problem?
2 More Replies
Anonymous
by Not applicable
  • 3061 Views
  • 3 replies
  • 4 kudos

Resolved! Play the BIG DATA GAME | By Firebolt

https://www.firebolt.io/big-data-game The most fun our Bricksters have had in a while at work is thanks to a little BIG DATA thing called The BIG DATA GAME. This game is the cure for the mid-week blues. The Big Data Game is a simple yet awesome online...

Latest Reply
Kaniz_Fatma
Community Manager
  • 4 kudos

Haha!!! @Lindsay Olson​, @Hubert Dudek​!! I kept dying at the Data Lake House.
2 More Replies
sachinmkp1
by New Contributor II
  • 41487 Views
  • 3 replies
  • 1 kudos

Resolved! org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 69 tasks (4.0 GB) is bigger than spark.driver.maxResultSize (4.0 GB)

I set spark.conf.set("spark.driver.maxResultSize", "20g"), and spark.conf.get("spark.driver.maxResultSize") // 20g returns the expected value in the notebook. I did not set it at the cluster level, and I'm still getting the 4g limit while executing the Spark job. Why? Because of th...
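The likely explanation: `spark.driver.maxResultSize` is read when the driver JVM starts, so setting it from a notebook after the cluster is already up does not change the running driver; it needs to go into the cluster's Spark config before start. The comparison behind the error message can be sketched in plain Python (the helper name is illustrative, not a Spark API):

```python
# Illustrative helper: convert Spark-style size strings ("4g", "512m")
# to bytes, mirroring the check behind the maxResultSize error.
UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}

def size_to_bytes(size: str) -> int:
    size = size.strip().lower()
    if size[-1] in UNITS:
        return int(float(size[:-1]) * UNITS[size[-1]])
    return int(size)  # a bare number is already bytes

serialized_results = size_to_bytes("4g") + 1   # "bigger than" the cap
limit = size_to_bytes("4g")                    # the still-effective 4 GB cap
assert serialized_results > limit              # -> the SparkException fires
```

To actually raise the cap, put `spark.driver.maxResultSize 20g` in the cluster's Spark config (cluster edit page) and restart, rather than calling `spark.conf.set` from the notebook.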

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @sachinmkp1@gmail.com​, does @Jose Gonzalez​'s reply answer your question?
2 More Replies
William_Scardua
by Valued Contributor
  • 2938 Views
  • 4 replies
  • 2 kudos

Resolved! Error/Exception when a read websocket with readStream

Hi guys, how are you? Can you help me? That's my situation: when I try to read a websocket with readStream, I receive an unknown-host exception, java.net.UnknownHostException. That's my code: wssocket = spark\ .readStream\ .forma...

Latest Reply
Deepak_Bhutada
Contributor III
  • 2 kudos

It will definitely create a streaming object, so don't go by the wssocket.isStreaming = True piece; because of lazy evaluation, it will create the streaming object without any issue. Now, coming to the issue: please put the IP directly, sometimes the sla...
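Since `java.net.UnknownHostException` means the driver could not resolve the host's name at all, a quick standard-library check from the same environment can confirm whether DNS is the problem before touching the stream (the hostname argument is a placeholder for your websocket host):

```python
import socket

def can_resolve(hostname: str) -> bool:
    """Return True if DNS can resolve the hostname from this machine."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        return False

# "localhost" always resolves; swap in the websocket host from your URL.
assert can_resolve("localhost")
```

If this returns False for the websocket host, the fix is on the networking side (DNS, VNet/peering, or using the IP directly, as suggested above), not in the readStream code.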
3 More Replies
SivakrishnaSunk
by New Contributor II
  • 1586 Views
  • 2 replies
  • 2 kudos

Resolved! Azure synapse writing data with Databricks using polybase error

We are writing the data from Delta tables to Azure synapse using Azure Databricks. While loading the data into the synapse staging table getting an error " HdfsBridge::CreateRecordReader - Unexpected error encountered creating the record reader: abf...

Latest Reply
jose_gonzalez
Moderator
  • 2 kudos

Hi SivakrishnaSunkara, you will find more information and examples at the following link: https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/azure/synapse-analytics. Please follow the steps from the authentication section to avoid thi...
1 More Replies
Abeeya
by New Contributor II
  • 4979 Views
  • 2 replies
  • 5 kudos

Resolved! How to overwrite using PySpark's JDBC without losing constraints on table columns

Hello, my table has a primary key constraint on a particular column. I'm losing the primary key constraint on that column each time I overwrite the table. What can I do to preserve it? Any heads up would be appreciated. Tried below: df.write.option("truncate", ...
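For context on the `truncate` option being tried here: with the Spark JDBC writer, `mode("overwrite")` by default drops and recreates the table, which is what discards the primary-key constraint; `truncate=true` asks Spark to issue `TRUNCATE TABLE` and reuse the existing DDL instead. A sketch of the options involved, where the URL and table name are hypothetical placeholders:

```python
# Writer options for a constraint-preserving JDBC overwrite. In PySpark this
# would typically be used as:
#   df.write.format("jdbc").options(**jdbc_options).mode("overwrite").save()
# URL and table below are placeholders, not a real endpoint.
jdbc_options = {
    "url": "jdbc:postgresql://db.example.com:5432/mydb",  # placeholder
    "dbtable": "public.my_table",                          # placeholder
    "truncate": "true",  # TRUNCATE TABLE instead of DROP + CREATE
}

# Note: "truncate" only takes effect together with overwrite mode, and
# only when the target database supports truncating that table.
```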

Latest Reply
Kaniz_Fatma
Community Manager
  • 5 kudos

Hi @Abeeya .​ , How are you? Did @Hubert Dudek​ 's answer help you in any way? Please let us know.

1 More Replies
OmanEvisa
by New Contributor
  • 385 Views
  • 0 replies
  • 0 kudos

PROCESS OF APPLYING FOR OMAN E-VISA The Oman e-Visa was initiated in 2018, for making the process easy. Presently, 220 countries in the world are elig...

PROCESS OF APPLYING FOR OMAN E-VISA. The Oman e-Visa was introduced in 2018 to make the process easier. Presently, travelers from 220 countries are eligible to apply for an Oman e-Visa. Tourists can apply for visas online by submitting the Oman visa applic...

JBear
by New Contributor III
  • 3654 Views
  • 8 replies
  • 4 kudos

Resolved! Cant find reason but suddenly new Jobs are getting huge job id numbers. example 945270539673815

Created job IDs have suddenly started coming out as huge numbers, and that is now causing problems in the Terraform plan, because the value is too big for int: Error: strconv.ParseInt: parsing "945270539673815": value out of range. I'm new on the board and pretty new with Databricks ...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Jere Karhu​ , In case you are using the Job/Run id in API, please be advised that you will need to change the client-side logic to process int64/long and expect a random number. In some cases, you just need to change the declared type in their so...
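The underlying arithmetic: the new job IDs still fit comfortably in a signed 64-bit integer; they only overflow 32-bit parsing (which is what a `strconv.ParseInt` call with a 32-bit size reports). A quick check, purely illustrative since Python ints are arbitrary precision:

```python
job_id = 945270539673815  # the id from the error message

INT32_MAX = 2**31 - 1  # 2147483647
INT64_MAX = 2**63 - 1  # 9223372036854775807

# Overflows int32 (hence the Terraform-side parse error), fits in int64/long.
assert job_id > INT32_MAX
assert job_id <= INT64_MAX
```

So, as the reply says, any client-side code that declares the job/run id as a 32-bit int needs to move to int64/long (or a string) and must not assume the ids are small or sequential.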

7 More Replies
Mr__E
by Contributor II
  • 2558 Views
  • 3 replies
  • 3 kudos

Resolved! Importing MongoDB with field names containing spaces

I am currently using a Python notebook with a defined schema to import fairly unstructured documents in MongoDB. Some of these documents have spaces in their field names. I define the schema for the MongoDB PySpark connector like the following: Struct...

Latest Reply
Mr__E
Contributor II
  • 3 kudos

Solution: It turns out the issue is not the schema reading in, but the fact that I am writing to Delta tables, which do not currently support spaces. So, I need to transform them prior to dumping. I've been following a pattern of reading in raw data,...
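That transform step can be sketched with a small rename rule applied to the raw column names before writing to Delta. The rule here (spaces to underscores) is one simple choice; the column names are made-up examples:

```python
def delta_safe(name: str) -> str:
    """Rewrite a column name so Delta accepts it; here we just replace spaces."""
    return name.strip().replace(" ", "_")

raw_columns = ["order id", "customer name", "total"]  # example MongoDB field names
renamed = [delta_safe(c) for c in raw_columns]

# In PySpark this would typically be applied as df.toDF(*renamed), or a chain
# of withColumnRenamed calls, before df.write.format("delta").save(...).
```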

2 More Replies
Krishscientist
by New Contributor III
  • 2290 Views
  • 1 replies
  • 2 kudos

Resolved! Issue when reading .wav file

Hi, I am developing a notebook to read .wav files and build a speech-matching scenario. I have saved files in "/FileStore/tables/doors_and_corners_kid_thats_where_they_get_you.wav". When I wrote code like this: from scipy.io import wavfile; import numpy as np...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Try prefixing it with dbfs: either dbfs:/FileStore or /dbfs/FileStore.
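The reason the prefix matters: `scipy.io.wavfile` reads through the local filesystem, and on Databricks that means going through the `/dbfs` FUSE mount rather than the bare `/FileStore/...` DBFS path. A small helper showing the mapping between the two spellings (illustrative, not a Databricks API):

```python
def to_local_dbfs_path(dbfs_path: str) -> str:
    """Map a DBFS path to the /dbfs FUSE path that plain Python IO can open."""
    if dbfs_path.startswith("dbfs:/"):
        dbfs_path = dbfs_path[len("dbfs:"):]   # dbfs:/x -> /x
    if not dbfs_path.startswith("/dbfs"):
        dbfs_path = "/dbfs" + dbfs_path        # /x -> /dbfs/x
    return dbfs_path

path = to_local_dbfs_path(
    "/FileStore/tables/doors_and_corners_kid_thats_where_they_get_you.wav"
)
# scipy.io.wavfile.read(path) would now see the file through the FUSE mount.
```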
CBull
by New Contributor III
  • 4319 Views
  • 7 replies
  • 3 kudos

Resolved! Spark Notebook to import data into Excel

Is there a way to create a notebook that runs the SQL I want, populates an Excel file daily, and sends it to a particular person?

Latest Reply
merca
Valued Contributor II
  • 3 kudos

Do I understand you correctly: you want to run a notebook or SQL query that will generate some data in the form of a table, and you need to somehow "send" this data to someone (or somebody needs this data at some point)? If this is the correct assumption, you hav...
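One standard-library-only sketch of that pattern: render the query result as a CSV (which Excel opens directly) and attach it to an email; scheduling the notebook as a daily Databricks job handles the "daily" part. The rows, addresses, and subject here are placeholders:

```python
import csv
import io
from email.message import EmailMessage

# Placeholder result set; in practice this would come from the SQL query.
rows = [("date", "revenue"), ("2022-03-01", 1250), ("2022-03-02", 980)]

# 1) Render the result set as CSV in memory.
buf = io.StringIO()
csv.writer(buf).writerows(rows)

# 2) Build the mail with the CSV attached.
msg = EmailMessage()
msg["To"] = "someone@example.com"    # placeholder recipient
msg["From"] = "reports@example.com"  # placeholder sender
msg["Subject"] = "Daily report"
msg.set_content("Daily export attached.")
msg.add_attachment(buf.getvalue().encode(), maintype="text",
                   subtype="csv", filename="report.csv")

# 3) Sending would use smtplib.SMTP(host).send_message(msg), with whatever
#    SMTP relay your organization allows from the workspace.
```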
6 More Replies
Sudeshna
by New Contributor III
  • 10675 Views
  • 7 replies
  • 8 kudos

Resolved! I am new to Databricks SQL and want to create a variable which can hold calculations either from static values or from select queries similar to SQL Server. Is there a way to do so?

I was trying to create a variable and I got the following error. Command: SET a = 5; Error: Error running query: Configuration a is not available.

Latest Reply
BilalAslamDbrx
Honored Contributor III
  • 8 kudos

@Sudeshna Bhakat​ what @Joseph Kambourakis​ described works on clusters but is restricted on Databricks SQL endpoints i.e. only a limited number of SET commands are allowed. I suggest you explore the curly-braces (e.g. {{ my_variable }}) in Databrick...
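The `{{ my_variable }}` style works because Databricks SQL substitutes the parameter's text into the query before it runs. The mechanism can be sketched in plain Python; the query, table, and variable names below are made up for illustration:

```python
import re

def substitute(query: str, params: dict) -> str:
    """Replace {{ name }} placeholders with their values, as plain text."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(params[m.group(1)]),
        query,
    )

sql = "SELECT * FROM sales WHERE amount > {{ threshold }}"
print(substitute(sql, {"threshold": 5}))
# SELECT * FROM sales WHERE amount > 5
```

Because the substitution is textual, it covers the "static value or result of a calculation" use case from the question, but it is not the same as a server-side SQL variable.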

6 More Replies
shelms
by New Contributor II
  • 12956 Views
  • 3 replies
  • 7 kudos

Resolved! SQL CONCAT returning null

Has anyone else experienced this problem? I'm attempting to SQL concat two fields, and if the second field is null, the entire string appears as null. The documentation is unclear on the expected outcome, and this is contrary to how concat_ws operates. SELECT ...
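This is standard SQL semantics rather than a bug: `concat` is null-propagating (any NULL argument makes the whole result NULL), while `concat_ws` skips NULL arguments. A plain-Python emulation of the two behaviors:

```python
def concat(*args):
    """Emulates SQL CONCAT: any NULL (None) argument makes the result NULL."""
    if any(a is None for a in args):
        return None
    return "".join(args)

def concat_ws(sep, *args):
    """Emulates SQL CONCAT_WS: NULL arguments are simply skipped."""
    return sep.join(a for a in args if a is not None)

assert concat("first", None) is None             # whole string becomes NULL
assert concat_ws(" ", "first", None) == "first"  # null field is dropped
```

On the SQL side, switching to `CONCAT_WS(' ', col1, col2)` or wrapping nullable fields in `COALESCE(col, '')` avoids the NULL result.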

Latest Reply
Kaniz_Fatma
Community Manager
  • 7 kudos

Hi @Steve Helms​ , Would you like to share with us whether you got your answer, or else do you require more help? Would you like to mark the best answer in case your problem is resolved?

2 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group