- 4060 Views
- 6 replies
- 0 kudos
Databricks cluster is going down after installing an external library
I have created a Databricks cluster with the following configuration: Databricks Runtime Version 13.2 ML (includes Apache Spark 3.4.0, Scala 2.12); node type i3.xlarge (30.5 GB memory, 4 cores). I created a notebook and am trying to load a MySQL table which resides i...
Hi, the error below indicates that there is an issue connecting to the host from Databricks. You can find more details about the network configuration here: https://docs.databricks.com/administration-guide/cloud-configurations/aws/customer-managed...
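To confirm basic reachability from the cluster, a quick connectivity test can be run from a notebook cell. A minimal sketch, assuming a hypothetical MySQL hostname and the default port 3306:

```python
import socket

# Hypothetical endpoint; replace with your actual MySQL host and port
host, port = "mysql-host.example.com", 3306

try:
    # Opens a plain TCP connection from the driver node, nothing MySQL-specific
    with socket.create_connection((host, port), timeout=5):
        print(f"Able to reach {host}:{port}")
except OSError as e:
    print(f"Cannot reach {host}:{port}: {e}")
```

If this fails, the problem is network configuration (security groups, peering, firewall rules) rather than the installed library itself.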
- 1311 Views
- 1 replies
- 2 kudos
Resolved! Setting cluster permissions with API
I would like to update the permissions of a cluster using the API. The documentation mentions the following: PATCH api/2.0/permissions/{request_object_type}/{request_object_id}. Which {request_object_type} should I use? 'cluster' and 'compute' are no...
@KoenZandvliet, clusters is the one you should be looking for.
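For reference, a minimal sketch of that PATCH call using the Python requests library; the workspace URL, token, cluster ID, user, and permission level below are placeholders:

```python
import requests

HOST = "https://<workspace-url>"      # placeholder
TOKEN = "<personal-access-token>"     # placeholder
CLUSTER_ID = "<cluster-id>"           # placeholder

# request_object_type is "clusters" (plural), per the answer above
resp = requests.patch(
    f"{HOST}/api/2.0/permissions/clusters/{CLUSTER_ID}",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "access_control_list": [
            {"user_name": "someone@example.com", "permission_level": "CAN_RESTART"}
        ]
    },
)
resp.raise_for_status()
print(resp.json())
```

Note that PATCH adds or updates the listed entries, while PUT on the same endpoint replaces the whole access control list.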
- 4305 Views
- 5 replies
- 3 kudos
Resolved! Databricks Certified Data Engineer Associate (Version 2) Exam got suspended
Hi Team, my Databricks Certified Data Engineer Associate (Version 2) exam got suspended today and it is still in a suspended state. I was continuously in front of the camera when the alert suddenly appeared, and the support person asked me to show the full tab...
Hi @Retired_mod, I've been in the same situation as Shifa and I've also raised a ticket with Databricks, but no feedback yet! Can you please help with that? Cheers, Rabie
- 545 Views
- 0 replies
- 0 kudos
java.net.SocketTimeoutException at java.net.SocketInputStream.socketRead
A Databricks notebook is configured with ADLS Gen2 using service principal authentication and is able to read/write files to ADLS Gen2. However, we occasionally see the error below in the production environment: java.net.SocketTimeoutException at j...
- 1106 Views
- 2 replies
- 0 kudos
Cannot access community account
Resetting the password does not work. After I enter my new password, it just keeps processing. I waited for over 10 minutes, tried different browsers, tried a VPN; nothing works. Also, this happened randomly. I didn't forget my password, just the sys...
- 2848 Views
- 2 replies
- 0 kudos
Resolved! How to generate schema with org.apache.spark.sql.functions.schema_of_csv?
Hello, I use Spark 3.4.1 with Hadoop 3 on Windows 11, and I am struggling to generate the schema of CSV data with the schema_of_csv function. Below is my Java code. Map<String, String> kafkaParams = new HashMap<>(); kafkaParams.put("kafka.bootstrap.servers"...
I used the org.apache.spark.sql.functions.lit method and solved this issue. Thank you anyway.
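That fix works because schema_of_csv requires a foldable (literal) argument, so the sample row must be wrapped in lit() rather than passed as a regular column. A minimal sketch of the same idea in PySpark (the thread's code is Java; the sample data here is made up):

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import schema_of_csv, from_csv, lit, col

spark = SparkSession.builder.appName("schema-of-csv-demo").getOrCreate()

# schema_of_csv only accepts a literal sample row, hence lit()
sample_row = "1,2023-09-01,sensor-a,23.5"
csv_schema = (spark.range(1)
              .select(schema_of_csv(lit(sample_row)).alias("schema"))
              .first()["schema"])
print(csv_schema)  # a DDL string such as STRUCT<_c0: INT, _c1: DATE, ...>

# The inferred schema can then be used with from_csv to parse a CSV column
df = spark.createDataFrame([(sample_row,)], ["value"])
df.select(from_csv(col("value"), csv_schema).alias("row")).select("row.*").show()
```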
- 7813 Views
- 4 replies
- 2 kudos
Sync the production data in environment into test environment
Hello, I have a database called sales which contains several Delta tables and views in both the production and test workspaces. But the data is not synced, because some people develop the code in the test workspace. As time passed, both the data and the tables i...
Hi @zyang, hope everything is going great. Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can...
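The resolution isn't shown in the excerpt above. One common approach to pulling production Delta tables into a test environment is a Delta DEEP CLONE, sketched below under assumed schema names (prod_sales, test_sales); views are not cloned and would need to be recreated separately:

```python
# List the tables in the production schema (schema names here are assumptions)
tables = [r.tableName
          for r in spark.sql("SHOW TABLES IN prod_sales").collect()
          if not r.isTemporary]

for t in tables:
    # DEEP CLONE copies data and metadata; re-running it syncs the clone
    # incrementally with the source table
    spark.sql(f"CREATE OR REPLACE TABLE test_sales.{t} DEEP CLONE prod_sales.{t}")
```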
- 2789 Views
- 2 replies
- 2 kudos
Resolved! Confirmation that Ingestion Time Clustering is applied
The article on Ingestion Time Clustering mentions that "Ingestion Time Clustering is enabled by default on Databricks Runtime 11.2"; however, how can I confirm it is active for my table? For example, is there a True/False "Ingestion Time Clustered" fl...
Thanks @NandiniN, that was very helpful. I have 3 follow-up questions: If I already have a table (350 GB) that has been partitioned by 3 columns (Year, Month, Day) and stored Hive-style with subdirectories Year=X/Month=Y/Day=Z, can I read it in...
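There is no explicit True/False flag exposed; since ingestion time clustering applies to unpartitioned Delta tables written on a supported runtime, one practical check is that the table has no partition columns. A small sketch, assuming a hypothetical table name:

```python
# DESCRIBE DETAIL returns one row of table metadata for a Delta table
detail = spark.sql("DESCRIBE DETAIL my_catalog.my_schema.my_table").first()

# An empty partitionColumns list is a necessary precondition for
# ingestion time clustering (partitioned tables are excluded)
print(detail["partitionColumns"])
print(detail["createdAt"])  # sanity-check when the table was created
```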
- 2046 Views
- 1 replies
- 1 kudos
Resolved! Photon and UDF efficiency
When using a JVM engine, Scala UDFs have an advantage over Python UDFs because data doesn't have to be shifted out to the Python environment for processing. If I understand the implications of using the Photon C++ engine, any processing that needs to...
Photon does not support UDFs (https://learn.microsoft.com/en-us/azure/databricks/runtime/photon#limitations), so when a UDF is used, Photon will not be used.
- 564 Views
- 0 replies
- 0 kudos
Structured Streaming and Workplace Max Jobs
From the documentation: A workspace is limited to 1000 concurrent task runs. A 429 Too Many Requests response is returned when you request a run that cannot start immediately. The number of jobs a workspace can create in an hour is limited to 10000 (i...
- 1182 Views
- 2 replies
- 0 kudos
Plot number of abandoned cart items by product
abandoned_carts_df = (email_carts_df.filter(col('converted') == False).filter(col('cart').isNotNull()))
display(abandoned_carts_df)
abandoned_items_df = (abandoned_carts_df.select(col("cart").alias("items")).groupBy("items").count())
display(abandoned_...
Hi @SSV_dataeng, try:
abandoned_items_df = (abandoned_carts_df.withColumn("items", explode("cart")).groupBy("items").count().sort("items"))
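A self-contained version of that suggestion, with the imports it needs; the sample cart data is made up, but the column names follow the thread:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode

spark = SparkSession.builder.appName("abandoned-carts-demo").getOrCreate()

# Made-up sample: each row is an email with a cart (array of items) and a converted flag
email_carts_df = spark.createDataFrame(
    [(["P1", "P2"], False), (["P2"], False), (None, False), (["P3"], True)],
    ["cart", "converted"],
)

abandoned_carts_df = (email_carts_df
                      .filter(col("converted") == False)
                      .filter(col("cart").isNotNull()))

# explode() turns the items array into one row per item, so the counts
# are per product rather than per whole cart
abandoned_items_df = (abandoned_carts_df
                      .withColumn("items", explode("cart"))
                      .groupBy("items").count()
                      .sort("items"))
abandoned_items_df.show()
```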
- 1626 Views
- 4 replies
- 0 kudos
write to Delta
spark.conf.set("spark.databricks.delta.properties.defaults.columnMapping.mode", "name")
products_output_path = DA.paths.working_dir + "/delta/products"
products_df.write.format("delta").save(products_output_path)
verify_files = dbutils.fs.ls(products_ou...
Hi @SSV_dataeng, please check with this (you would have to indent it correctly for Python):
productsOutputPath = DA.workingDir + "/delta/products"
(productsDF.write.format("delta").mode("overwrite").save(productsOutputPath))
verify_files = dbutils.fs.ls(...
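Spelled out as a sketch for a Databricks notebook; DA.workingDir comes from the course setup, so the stand-in path below is an assumption, and productsDF is assumed to exist from an earlier cell:

```python
# Stand-in for the course helper's DA.workingDir
working_dir = "dbfs:/tmp/demo"
products_output_path = working_dir + "/delta/products"

# mode("overwrite") lets the cell be re-run without "path already exists" errors
(productsDF.write
    .format("delta")
    .mode("overwrite")
    .save(products_output_path))

# Verify the write: expect a _delta_log directory plus Parquet data files
verify_files = dbutils.fs.ls(products_output_path)
display(verify_files)
```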
- 4519 Views
- 3 replies
- 1 kudos
Can I change Service Principal's OAuth token's expiration date?
Hi, since I have to read from a Databricks table through an external API, I created a Service Principal that would start a cluster and perform the operation. To authenticate the request on behalf of the Service Principal, I generate the OAuth token followi...
Hello @marchino, please check whether this is of interest: https://kb.databricks.com/en_US/security/set-an-unlimited-lifetime-for-service-principal-access-token
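The linked KB relies on the Token Management API, which can mint a personal access token on behalf of a service principal with a lifetime you choose. A minimal sketch with the Python requests library; the host, admin token, and application ID are placeholders:

```python
import requests

HOST = "https://<workspace-url>"            # placeholder
ADMIN_TOKEN = "<workspace-admin-token>"     # placeholder
SP_APPLICATION_ID = "<sp-application-id>"   # placeholder

resp = requests.post(
    f"{HOST}/api/2.0/token-management/on-behalf-of/tokens",
    headers={"Authorization": f"Bearer {ADMIN_TOKEN}"},
    json={
        "application_id": SP_APPLICATION_ID,
        "comment": "token for external API reads",
        # 90 days as an example; see the linked KB for making the lifetime unlimited
        "lifetime_seconds": 7776000,
    },
)
resp.raise_for_status()
print(resp.json()["token_value"])
```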
- 2658 Views
- 2 replies
- 1 kudos
Data lineage on views
I do not know if this is the intended behavior of data lineage, but to me it is weird. When I create a view based on two tables, the data lineage upstream looks correct. But when I replace the view to use only one of the tables, the data lineage upstream ...
After some thought, I have come to this conclusion: data lineage on views is working as one should expect. I strongly recommend that this feature be redesigned so that it shows the result of the latest view.
- 5107 Views
- 3 replies
- 0 kudos
Iterative read and writes cause java.lang.OutOfMemoryError: GC overhead limit exceeded
I have an iterative algorithm which reads and writes a DataFrame, iterating through a list of new partitions, like this:
for p in partitions_list:
df = spark.read.parquet("adls_storage/p")
df.write.format("delta").mode("overwrite").option("partitionOver...
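A reconstruction of that loop as runnable code; the storage path, the dynamic partition-overwrite option, and the target table are assumptions based on the truncated snippet:

```python
# Assumed reconstruction of the truncated loop from the question
for p in partitions_list:
    df = spark.read.parquet(f"adls_storage/{p}")
    (df.write
       .format("delta")
       .mode("overwrite")
       .option("partitionOverwriteMode", "dynamic")  # overwrite only matching partitions
       .saveAsTable("target_table"))  # hypothetical target; the original is cut off
```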
@daniel_sahal, I've attached the wrong snip. Actually it is Full GC (Ergonomics) which was bothering me. Now I am attaching the correct snip. But, as you said, I scaled up a bit. The thing I forgot to mention is that the table is wide, more than 300 column...