Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
We have about 12k writes/s and 1.5 TB/month of compressed S3 data. How can we choose between serverless and managed? And what would be a good way to project the cost? In serverless, how are the machines and hours scaled or scheduled based on the load? If there is a l...
Hi @Frank Zhang, "How can we choose between Serverless vs managed? And what will be good way to project the cost?" -- Once you enable the serverless feature on your workspace, new warehouses are created with the serverless option by default. If yo...
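For the cost-projection part of the question, a rough back-of-the-envelope model is uptime multiplied by a DBU rate and a price per DBU. The sketch below only illustrates that arithmetic: the function name `project_monthly_cost` and every number in it (DBU rate, price, active hours) are hypothetical placeholders, not Databricks pricing, and it ignores autoscaling and storage charges.

```python
def project_monthly_cost(dbu_per_hour: float, price_per_dbu: float,
                         active_hours_per_day: float, days: int = 30) -> float:
    """Estimate monthly spend as (DBUs consumed) x (price per DBU)."""
    if not 0 <= active_hours_per_day <= 24:
        raise ValueError("active_hours_per_day must be between 0 and 24")
    return dbu_per_hour * active_hours_per_day * days * price_per_dbu


# Hypothetical inputs: same warehouse size, different amounts of active time.
always_on = project_monthly_cost(dbu_per_hour=12, price_per_dbu=0.5,
                                 active_hours_per_day=24)
bursty = project_monthly_cost(dbu_per_hour=12, price_per_dbu=0.5,
                              active_hours_per_day=6)
print(f"always on: {always_on:.2f}, bursty: {bursty:.2f}")
```

Plugging in the real DBU rate for the chosen warehouse size and the measured active hours from a trial period would give a more realistic comparison.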
We tried moving our Scala script from a standalone cluster to the Databricks platform. Our script is compatible with the following versions: Spark 2.4.8, Scala 2.11.12. The Databricks cluster has the following versions: Spark 3.2.1, Scala 2.12. 1: we ...
Hi @Monika Samant, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...
Hi @Juan Afanador Thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.
Hi, I have finished the Lakehouse Fundamentals assessment and received my completion certificate, but so far I have not received a badge for it. Would you be able to assist, please?
Hi @Maciej Oleksy Thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.
@Trushna Khatri Adding some more information to Prabakar's answer. Can you please let me know the actual need for starting the cluster at a specific time? Usually, if your criterion is to use it for jobs, go with a job cluster. Here the cluster starts whenever your job ...
Hi, when I create a metastore in AWS Databricks, I always get the error in the picture below, even though I follow this link: https://docs.databricks.com/data-governance/unity-catalog/get-started.html#cloud-tenant-setup-aws
I'm running a scheduled job on job clusters. I didn't specify a log location for the cluster. Where can I find the location of the stored logs? Yes, I can see the logs in the runs, but I need the log location.
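One way to get a known log location is to set it explicitly: the Clusters API accepts a `cluster_log_conf` block in a cluster spec. The sketch below only builds such a spec as a plain dictionary. The helper name `with_log_delivery`, the destination path, and the cluster values are hypothetical placeholders, and it does not answer where logs end up when no location was configured.

```python
import copy
import json


def with_log_delivery(new_cluster: dict, destination: str) -> dict:
    """Return a copy of a cluster spec with log delivery to a DBFS path."""
    if not destination.startswith("dbfs:/"):
        raise ValueError("this sketch only handles dbfs:/ destinations")
    spec = copy.deepcopy(new_cluster)  # leave the caller's spec untouched
    spec["cluster_log_conf"] = {"dbfs": {"destination": destination}}
    return spec


# Hypothetical job cluster spec; all values are placeholders.
job_cluster = {
    "spark_version": "10.4.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
}
configured = with_log_delivery(job_cluster, "dbfs:/cluster-logs")
print(json.dumps(configured, indent=2))
```

The resulting dictionary would go into the `new_cluster` section of the job definition.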
The Account Console SAML SSO docs mention that the admin role must be specified in the identity provider response. However, it's not clear which attribute to use for passing role info via SAML. What SAML attribute should the role be assigned to? What d...
Hey there @Jameel A. Hope you are well. Just wanted to see if you were able to find an answer to your question and would you like to mark an answer as best? It would be really helpful for the other members too. Else please let us know if you need mo...
DBR 10.4 LTS is failing frequently due to GC overhead, about once every half an hour. Can anyone from the Databricks team let me know if there are existing tickets or bugs? Note: we have used the same configuration and the same DBR for almost the last 3 months. When checking ...
Hi @Vidula Khanna, we have raised a support ticket with ADB from the client side. We can close this; however, it seems that DBR version 11.2 and above has some fixes for the RocksDB memory leak, based on communication with the Databricks developer team.
The error I get when importing certain Delta tables is: The specified schema does not match the existing schema at dbfs:/mnt/mart/tablename. However, when I check the metadata table in the old workspace and the exported file, they match. However, it seems...
Below is an example error. However, in the existing metadata I still see varchar(100) as the type. Specified metadata for field Percentage is different from existing schema:\n Specified: {}\n Existing: {\"HIVE_TYPE_STRING\":\"varchar(100)\"}\n\nIf your inte...
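To see which fields trigger this kind of mismatch, it can help to compare the per-field metadata of the two schemas directly. The sketch below is a plain-Python illustration, assuming both schemas are available in the JSON form that PySpark's `df.schema.json()` produces. The helper name `metadata_mismatches`, the `Id` column, and the column types are made up for the example; only the `Percentage` field and its metadata come from the error above.

```python
import json


def metadata_mismatches(specified_json: str, existing_json: str) -> dict:
    """Map field name -> (specified metadata, existing metadata) where they differ."""
    specified = {f["name"]: f.get("metadata", {})
                 for f in json.loads(specified_json)["fields"]}
    existing = {f["name"]: f.get("metadata", {})
                for f in json.loads(existing_json)["fields"]}
    return {
        name: (specified[name], existing[name])
        for name in specified.keys() & existing.keys()
        if specified[name] != existing[name]
    }


# Hand-written schemas that mimic the error message (types are assumptions).
specified_schema = json.dumps({"type": "struct", "fields": [
    {"name": "Percentage", "type": "string", "nullable": True, "metadata": {}},
    {"name": "Id", "type": "long", "nullable": True, "metadata": {}},
]})
existing_schema = json.dumps({"type": "struct", "fields": [
    {"name": "Percentage", "type": "string", "nullable": True,
     "metadata": {"HIVE_TYPE_STRING": "varchar(100)"}},
    {"name": "Id", "type": "long", "nullable": True, "metadata": {}},
]})

diff = metadata_mismatches(specified_schema, existing_schema)
print(diff)
```

A field can have matching name and type and still differ in metadata, which is consistent with the schemas looking identical on inspection.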
I am trying to read a CSV file stored in Databricks database tables, but I am getting an error. It runs fine for DBFS, but the same format does not work for database tables.
root@387ece6d15b2:/usr/workspace# databricks --version
Version 0.17.3
root@387ece6d15b2:/usr/workspace# databricks jobs configure --version=2.1
root@387ece6d15b2:/usr/workspace# databricks jobs get --job-id 123
WARN: Your CLI is configured to use Jobs AP...
The command "databricks jobs configure --version=2.1" does not work. The workaround is to add the option "--version=2.1" to each databricks jobs/runs command, which is not very convenient.
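Until the configure command behaves as expected, the per-command workaround could be wrapped in a small helper so the flag does not have to be typed each time. This is a minimal sketch, not part of the Databricks CLI: the function name `with_jobs_version` and the set of command groups are assumptions based on the jobs/runs commands mentioned above.

```python
import shlex
from typing import List

JOBS_API_VERSION = "2.1"
VERSIONED_GROUPS = {"jobs", "runs"}  # command groups that take --version


def with_jobs_version(args: List[str], version: str = JOBS_API_VERSION) -> List[str]:
    """Build a databricks CLI argv, adding --version=<version> to jobs/runs commands."""
    argv = ["databricks", *args]
    already_set = any(a.startswith("--version") for a in args)
    if args and args[0] in VERSIONED_GROUPS and not already_set:
        argv.append(f"--version={version}")
    return argv


cmd = with_jobs_version(["jobs", "get", "--job-id", "123"])
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Commands outside the jobs and runs groups, such as `databricks --version`, pass through unchanged.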
How to Install Libraries on Databricks
You can install libraries in Databricks at the cluster level for libraries commonly used on a cluster, at the notebook level using %pip, or using global init scripts when you have libraries that should be install...
It can be risky to install libraries without any sort of oversight/security structure to ensure those libraries have no vulnerabilities. I think more caution needs to be added to the wording of these documents to express that. All of the libraries w...
Hi, could there be a difference in the DBU charge for two clusters with the exact same configuration and workload, where one is a job cluster and the other is an interactive cluster? Thanks, Sweta
Hi @Swetha Marakani, does @Prabakar Ammeappin's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!
Hi @Camila Queiroz, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...