Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Erik_L
by Contributor II
  • 3436 Views
  • 2 replies
  • 1 kudos

Resolved! Pyspark read multiple Parquet type expansion failure

Problem: Reading nearly equivalent Parquet tables in a directory, where some have column X with type float and some with type double, fails. Attempts at resolving: using streaming files; removing Delta caching and vectorization; using .cache() explicitly. Notes: This...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Erik Louie, help us build a vibrant and resourceful community by recognizing and highlighting insightful contributions. Mark the best answers and show your appreciation! Regards

1 More Replies
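
A minimal sketch (not from the thread) of one workaround for the float/double mismatch described above, assuming hypothetical paths and a column named "x": read the mismatched file groups separately, cast the narrower float column to double, and union them, rather than relying on automatic schema merging.

    from pyspark.sql import functions as F

    # Column "x" is float in one group of Parquet files, double in the other.
    df_float = spark.read.parquet("/mnt/data/tables_float/")
    df_double = spark.read.parquet("/mnt/data/tables_double/")

    # Widen float -> double, then union by column name.
    df = (df_float
          .withColumn("x", F.col("x").cast("double"))
          .unionByName(df_double))
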
abi-tosh
by New Contributor III
  • 4068 Views
  • 6 replies
  • 4 kudos

Databricks Attribute Error: 'IPythonShell' object has no attribute 'kernel'

I have been getting this error repeatedly when trying to run a notebook. I have tried attaching multiple different clusters and installing some of the libraries that it wanted me to update. I have also tried to clear the state of the notebook and res...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Toshali Mohapatra, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best ans...

5 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 3489 Views
  • 2 replies
  • 8 kudos

Databricks has announced that users can now create notebooks in Jupyter format in Repos, offering a familiar experience for creating and editing noteb...

Databricks has announced that users can now create notebooks in Jupyter format in Repos, offering a familiar experience for creating and editing notebooks. This update allows users to integrate with the broader data science ecosystem, import and expo...

Latest Reply
Anonymous
Not applicable
  • 8 kudos

Hi @Hubert Dudek, thank you for helping us build a vibrant and resourceful community by recognizing and highlighting insightful contributions. Regards

1 More Replies
xhh
by New Contributor
  • 1076 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @令辉 孔, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback will he...

1 More Replies
William_Scardua
by Valued Contributor
  • 1860 Views
  • 2 replies
  • 1 kudos

Cosmos DB Connector for 12.1 Cluster and above

Hi guys, do you know which Cosmos DB connector version supports Databricks cluster versions 12.1 and above? My cluster: [image] Error: [image] Thank you

databricks-cluster cosmosdb-connector-erro
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @William Scardua, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...

1 More Replies
883702
by New Contributor III
  • 1985 Views
  • 1 replies
  • 0 kudos

Resolved! TypeError on DataFrame via spark readStream transform invocation of UDF

Our use case is to "clean up" column names (remove spaces, etc) on ingestion of CSV data using the Delta Live Table capability. We desire to use the schema inference capability during ingestion so schema specification (up front) will not be happenin...

Latest Reply
883702
New Contributor III
  • 0 kudos

The issue was erroneously believing the transform function needed a UDF decorator. With the decorator removed, the transform invokes (and works) as expected.

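
A minimal sketch of the resolved pattern, with hypothetical paths and names: the column clean-up is an ordinary DataFrame-to-DataFrame function passed to .transform(), so no @udf decorator is needed.

    # Plain Python function: takes a DataFrame, returns a DataFrame (no @udf).
    def clean_column_names(df):
        for old in df.columns:
            df = df.withColumnRenamed(old, old.strip().replace(" ", "_"))
        return df

    df = (spark.readStream
          .format("cloudFiles")                           # Auto Loader, per the DLT use case
          .option("cloudFiles.format", "csv")
          .option("cloudFiles.inferColumnTypes", "true")  # schema inference on ingest
          .load("/mnt/raw/csv/")                          # hypothetical ingestion path
          .transform(clean_column_names))
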
Phani1
by Valued Contributor II
  • 1763 Views
  • 3 replies
  • 0 kudos

Performance issue while loading bulk data into Postgres DB from Databricks.

We are facing a performance issue while loading bulk data into Postgres DB from Databricks. We are using Spark JDBC connections to move the data. However, the rate of transfer is very low, which is causing a performance bottleneck. Is there any better...

Latest Reply
User16502773013
Databricks Employee
  • 0 kudos

Hello @Janga Reddy, @Daniel Sahal and @Vidula Khanna, to enhance performance in general we need to design for more parallelism; in the Spark JDBC context this is controlled by the number of partitions for the data to be written. The example here shows how t...

2 More Replies
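
A minimal sketch of the parallelism idea in the reply above, with hypothetical connection details: repartition so several tasks write over JDBC concurrently, and raise the batch size so each round trip inserts more rows.

    (df.repartition(16)                                       # 16 concurrent JDBC writer tasks
       .write
       .format("jdbc")
       .option("url", "jdbc:postgresql://db-host:5432/mydb")  # hypothetical host/db
       .option("dbtable", "public.target_table")
       .option("user", "pg_user")                             # hypothetical credentials
       .option("password", "pg_password")
       .option("batchsize", 10000)                            # rows per INSERT batch
       .mode("append")
       .save())
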
Avvar2022
by Contributor
  • 3026 Views
  • 2 replies
  • 2 kudos

Resolved! I am new to Databricks. Setting up a workspace for a NON-prod environment: separate workspaces for DEV and QA, or just one workspace for NON-prod?

What I learned based on learning materials, documents, etc.: for Databricks it is a good practice to set up 1 non-prod workspace but separate clusters for Dev, QA, SIT, etc. Is it best practice to set up only 1 NON-PROD workspace instead of separate ...

Databricks non-prod workspace set up options
Latest Reply
Avvar2022
Contributor
  • 2 kudos

Thank you. This helps.

1 More Replies
Arnold_Souza
by New Contributor III
  • 3441 Views
  • 4 replies
  • 2 kudos

SAT - Security Analysis Tool implementation error

I want to implement SAT in my workspace account. I was able to execute the Terraform that enables the necessary infra to work on that. When I try to execute the workflow "SAT Initializer Notebook (one-time)" it fails with the error: AnalysisException: ...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Arnold Souza, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

3 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 2686 Views
  • 1 replies
  • 7 kudos

SQL cells in Databricks notebooks can now be run in parallel, which means faster query processing and analysis. This new feature is especially helpful...

SQL cells in Databricks notebooks can now be run in parallel, which means faster query processing and analysis. This new feature is especially helpful for queries that take longer to run or analyze large datasets. With parallel processing, Databricks...

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 7 kudos

Informative

oleole
by Contributor
  • 11449 Views
  • 1 replies
  • 1 kudos

Resolved! MERGE to update a column of a table using Spark SQL

Coming from an MS SQL background, I'm trying to write a query in Spark SQL that simply updates a column value of table A (source table) by INNER JOINing a new table B with a filter. The MS SQL query looks like this: UPDATE T SET T.OfferAmount = OSE.EndpointEve...

Latest Reply
oleole
Contributor
  • 1 kudos

Posting the answer to my question:

    MERGE INTO TempOffer VIEW
    USING OfferSeq OSE
    ON VIEW.OfferId = OSE.OfferID AND OSE.OfferId = 1
    WHEN MATCHED THEN
      UPDATE SET VIEW.OfferAmount = OSE.EndpointEventAmountValue;

RyanHager
by Contributor
  • 2435 Views
  • 5 replies
  • 2 kudos

Is there a stream / Kafka topic that we can connect to for monitoring all Databricks jobs/workflows (create/status update/fail/error/complete)?

Currently we are creating and monitoring jobs using the API. This results in a lot of polling of the API for job status. Is there a Kafka stream we could listen to for job updates, to significantly reduce the number of calls to the Databricks jobs...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Ryan Hager, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we...

4 More Replies
Ramana
by Contributor
  • 2635 Views
  • 3 replies
  • 3 kudos

Resolved! How do we set spark_version in cluster policies to select the latest GPU ML LTS version as defaultValue?

Currently, I use the below two different JSON snippets to choose either Standard or ML runtime. Similar to the below, what is the defaultValue for spark_version to select the latest GPU ML LTS runtime version? "spark_version": {  "type": "regex",  "p...

Latest Reply
LandanG
Databricks Employee
  • 3 kudos

Hi @Ramana Kancharana, as of right now these options are only available for non-GPU DBRs.

2 More Replies
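
For the non-GPU runtimes the reply refers to, a cluster policy can pin the default to one of the "auto:" special values; a hypothetical illustration (names and support may vary by release):

    "spark_version": {
      "type": "unlimited",
      "defaultValue": "auto:latest-lts-ml"
    }
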
irfanaziz
by Contributor II
  • 3725 Views
  • 1 replies
  • 3 kudos

TimestampFormat issue

The Databricks notebook failed yesterday due to a timestamp format issue. Error: "SparkUpgradeException: You may get a different result due to the upgrading of Spark 3.0: Fail to parse '2022-08-10 00:00:14.2760000' in the new parser. You can set spark.s...

Latest Reply
searchs
New Contributor II
  • 3 kudos

You must have solved this issue by now, but for the sake of those that encounter this again, here's the solution that worked for me: spark.sql("set spark.sql.legacy.timeParserPolicy=LEGACY")

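
A minimal sketch of the fix above, plus a non-legacy alternative with an explicit pattern (hypothetical column names; the fraction pattern matches the 7-digit value in the error):

    # Session-wide: fall back to the Spark 2.x datetime parser.
    spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")

    # Or keep the new parser and spell out the format explicitly.
    from pyspark.sql import functions as F
    df = df.withColumn(
        "ts", F.to_timestamp("ts_raw", "yyyy-MM-dd HH:mm:ss.SSSSSSS")
    )
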
