Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

YSF
by New Contributor III
  • 2182 Views
  • 2 replies
  • 1 kudos

Resolved! Issues with using Databricks-Connect and Petastorm

Has anyone successfully used Petastorm + Databricks-Connect + Delta Lake? The use case is being able to use Delta Lake as a data store regardless of whether I want to use the Databricks workspace for my training tasks. I'm using a cloud-hosted ju...

Latest Reply
YSF
New Contributor III
  • 1 kudos

Because it's janky, or why? I don't need it for customer-facing production. It's more for when I'm using my own HPC or local workstation but want to access data from Delta Lake. Figured it was easier/preferable to setting up my own Spark environment loc...

  • 1 kudos
1 More Replies
guruv
by New Contributor III
  • 17664 Views
  • 4 replies
  • 5 kudos

Resolved! parquet file to include partitioned column in file

Hi, I have a daily scheduled job which processes the data and writes it as Parquet files in a specific folder structure like root_folder/{CountryCode}/parquetfiles, where each day the job writes new data for a country code under the folder for that country code. I am...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

Most external consumers (for example Azure Data Factory or Power BI) will read the partition as a column when properly configured. The only way around it is to duplicate the column under another name (you cannot use the same name, as it will generate conf...
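
The duplicate-column workaround from this reply could look roughly like the PySpark sketch below. It assumes a DataFrame with a `CountryCode` column; the helper name, the copy's name `CountryCodeValue`, and the append mode are illustrative assumptions, not something stated in the thread.

```python
def write_with_partition_copy(df, root_folder, partition_col="CountryCode",
                              copy_col="CountryCodeValue"):
    """Write Parquet partitioned by `partition_col`, keeping a copy of its
    values inside the data files under a different (hypothetical) name."""
    # Spark encodes the partition column in the directory name and leaves it
    # out of the file contents, so duplicate it first under another name.
    with_copy = df.withColumn(copy_col, df[partition_col])
    # "append" is an assumption for a daily job that adds new data.
    with_copy.write.mode("append").partitionBy(partition_col).parquet(root_folder)
    return with_copy
```

Note that `partitionBy` produces Hive-style folders such as `root_folder/CountryCode=US/`, which differs from the `root_folder/{CountryCode}/` layout described in the question.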

  • 5 kudos
3 More Replies
Development
by New Contributor III
  • 656 Views
  • 0 replies
  • 0 kudos

Hi All, I hope you're doing well. I am facing an issue while installing a Python library on an ADB cluster. Lib: PyCaret (latest version). It's not gett...

Hi All, I hope you're doing well. I am facing an issue while installing a Python library on an ADB cluster. Lib: PyCaret (latest version). It's not getting installed and shows a 'Failed' status. It would be great if you could help here! Thanks

TimK
by New Contributor II
  • 3503 Views
  • 2 replies
  • 1 kudos

Resolved! Cannot Get Databricks SQL to read external Hive Metastore

I have followed the documentation and am using the same metastore config that is working in the Data Engineering context. When attempting to view the Databases, I get the error: Encountered an internal error. The following information failed to load: The li...

Latest Reply
TimK
New Contributor II
  • 1 kudos

@Bilal Aslam I didn't think to look there before since I hadn't tried to run any queries. I see the failed SHOW DATABASES queries in history and they identify the error: Builtin jars can only be used when hive execution version == hive metastore v...

  • 1 kudos
1 More Replies
daschl
by Contributor
  • 11243 Views
  • 18 replies
  • 8 kudos

Resolved! NoSuchMethodError: org.apache.spark.sql.catalyst.json.CreateJacksonParser on Databricks Cloud (but not on Spark Directly)

Hi, I'm working for Couchbase on the Couchbase Spark Connector and noticed something weird which I haven't been able to get to the bottom of so far. For query DataFrames we use the Datasource v2 API and we delegate the JSON parsing to the org.apache.sp...

Latest Reply
daschl
Contributor
  • 8 kudos

Since there hasn't been any progress on this for over a month, I applied a workaround and copied the classes into the connector source code so we don't have to rely on the databricks classloader. It seems to work in my testing and will be released wi...

  • 8 kudos
17 More Replies
KaushalPatidar
by New Contributor II
  • 2900 Views
  • 3 replies
  • 0 kudos

I cannot access my account, please help

Hi, when I try to log into my account, it shows "invalid email address and password". But I am sure everything is correct. I request @Kaniz Fatma, @Harikrishnan Kunhumveettil and @Prabakar Ammeappin to please look into it and resolve thi...

2 More Replies
grandsurgical
by New Contributor
  • 431 Views
  • 0 replies
  • 0 kudos

Grand Surgical, established in 2010, has been manufacturing high-quality Surgical instruments for all disciplines of surgery. Cardiac, Vascular, denta...

Grand Surgical, established in 2010, has been manufacturing high-quality Surgical instruments for all disciplines of surgery. Cardiac, Vascular, dental, ophthalmic.We develop and deliver hospitals and medical professionals worldwide with superior qua...

Azam
by New Contributor III
  • 1877 Views
  • 0 replies
  • 2 kudos

Databricks Community Edition Not able to Login Account

I am studying Databricks and have had a Community Edition account since November 19, 2021, but since December 22nd I have not been able to log in. An "Invalid email address or password" error is thrown. When the forgot password link is clicked, no email is sent to regi...

theclubprice
by New Contributor
  • 486 Views
  • 0 replies
  • 0 kudos

The Club Price is a leading supplier of high-quality, affordable products whose clientele spans all over Texas and neighboring areas since 1992. We ha...

The Club Price is a leading supplier of high-quality, affordable products whose clientele spans all over Texas and neighboring areas since 1992. We have an exemplary track record of increasing our partners’ sales with our variety in products. We have...

rednirusmart
by New Contributor
  • 516 Views
  • 0 replies
  • 0 kudos

Rednirus Mart is a Third-Party Pharma Manufacturer and Supplier. If you are looking For Pharma Contract manufacturers For Ayurvedic Medicine Manufactu...

Rednirus Mart is a Third-Party Pharma Manufacturer and Supplier. If you are looking For Pharma Contract manufacturers For Ayurvedic Medicine Manufacturer Company in your region. Rednirus Mart is one of the leading one and their products are manufactu...

Third Party Pharma Manufacturers
pjp94
by Contributor
  • 2244 Views
  • 1 replies
  • 3 kudos

Use '%sql' inside a python cmd cell?

Hi, so I essentially want to execute a SQL query if a condition is met. One of the cells in my Python notebook is a SQL query (%sql followed by the query). Is there any way to put that in an 'IF' statement, i.e. if an environment variable = some value,...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

In a Python cell, just use:
query = "SELECT 1"
spark.sql(query)
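
Putting that reply together with the condition from the question, a small sketch might look like this. The helper name and the environment variable are made up for illustration; `spark` is the session object that Databricks notebooks normally provide.

```python
import os


def run_sql_if_env(spark, var_name, expected_value, query):
    """Run `query` through spark.sql only when the environment variable matches."""
    if os.environ.get(var_name) == expected_value:
        return spark.sql(query)
    # Condition not met: nothing is executed.
    return None
```

In a notebook cell this would be called as, say, `run_sql_if_env(spark, "MY_ENV", "prod", "SELECT 1")`, where `MY_ENV` is a placeholder name.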

  • 3 kudos
RasmusOlesen
by New Contributor III
  • 7320 Views
  • 4 replies
  • 2 kudos

Upgrading from Spark 2.4 to 3.2: Recursive view errors when using

We get errors like this: Recursive view `x` detected (cycle: `x` -> `x`) in long-standing code that has worked just fine in Spark 2.4.5 (Runtime 6.4), when we run it on a Spark 3.2 cluster (Runtime 10.0). It happens whenever we have, <x is a ...

Latest Reply
arkrish
New Contributor II
  • 2 kudos

This is a breaking change introduced in Spark 3.1. From the Migration Guide: SQL, Datasets and DataFrame - Spark 3.1.1 Documentation (apache.org): In Spark 3.1, the temporary view will have same behaviors with the permanent view, i.e. capture and store runt...
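
The same section of the Spark 3.1 migration guide names a setting, `spark.sql.legacy.storeAnalyzedPlanForView`, for restoring the earlier temporary view behavior. A minimal sketch of toggling it is below; the helper name is hypothetical, and whether this resolves the error in this thread depends on how the view `x` is redefined, which the truncated post does not show.

```python
LEGACY_VIEW_CONF = "spark.sql.legacy.storeAnalyzedPlanForView"


def use_legacy_temp_view_behavior(spark, enabled=True):
    """Toggle the pre-3.1 temp view behavior (store the analyzed plan)."""
    # Session-level setting; it must be set before the views are created.
    spark.conf.set(LEGACY_VIEW_CONF, "true" if enabled else "false")
```

Giving the redefined view a new name instead of reusing `x` is another way to avoid the cycle without relying on a legacy flag.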

  • 2 kudos
3 More Replies
Ryan_Chynoweth
by Esteemed Contributor
  • 1218 Views
  • 0 replies
  • 0 kudos

Azure_DAAM

Attached to this post we have added an ADLS Gen2 access recommendation to have the ideal security and governance over your data. The best practice involves leveraging Cluster ACLs, cluster configuration, and secret ACLs to handle user access over you...

sarvesh
by Contributor III
  • 1005 Views
  • 1 replies
  • 3 kudos

Audit Vertica tables in Spark!

I am trying to use Vertica's AUDIT function from Spark and am getting the correct table size from it, but the smallest unit AUDIT reports is bytes, and we are getting data in bits, even smaller than bytes.
val size = f"select audit('table_name');"

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

Rather, everything will be in bytes. Spark SQL has built-in methods to get the table size, but also in bytes:
spark.sql("ANALYZE TABLE df COMPUTE STATISTICS NOSCAN")
spark.sql("DESCRIBE EXTENDED df").filter(col("col_name") === "Statistics").show(false)
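
For anyone doing this from Python, here is a rough PySpark equivalent of the two calls above that also extracts the byte count as a number. It assumes the Statistics row of `DESCRIBE EXTENDED` carries text such as `2048 bytes, 10 rows` in its `data_type` column; the helper names are hypothetical.

```python
import re


def parse_size_in_bytes(statistics):
    """Extract the byte count from a string like '2048 bytes, 10 rows'."""
    match = re.search(r"(\d+)\s+bytes", statistics)
    return int(match.group(1)) if match else None


def table_size_in_bytes(spark, table_name):
    """Compute table statistics and return the reported size in bytes, or None."""
    spark.sql(f"ANALYZE TABLE {table_name} COMPUTE STATISTICS NOSCAN")
    rows = (
        spark.sql(f"DESCRIBE EXTENDED {table_name}")
        .filter("col_name = 'Statistics'")
        .collect()
    )
    # The value of the Statistics row is assumed to sit in the data_type column.
    return parse_size_in_bytes(rows[0]["data_type"]) if rows else None
```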

  • 3 kudos
Hayley
by Databricks Employee
  • 3209 Views
  • 1 replies
  • 2 kudos

What is the best way to do EDA in Databricks?

Are there example notebooks to quickstart the exploratory data analysis?

Latest Reply
Hayley
Databricks Employee
  • 2 kudos

A quick way to start exploratory data analysis is to use the EDA notebook that is created when you use Databricks AutoML. Then you can use the notebook generated as is, or as a starting point for modeling. You’ll need a cluster with Databricks Runtim...

  • 2 kudos
