Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

oriole
by New Contributor III
  • 8638 Views
  • 5 replies
  • 2 kudos

Resolved! Spark Driver Crash Writing Large Text

I'm working with a large text variable, converting it into single-line JSON that Spark can process beautifully. I'm using a single-node 256 GB, 32-core Standard_E32d_v4 "cluster", which should be plenty of memory for this dataset (haven't seen cluster memory u...

Latest Reply
pvignesh92
Honored Contributor
  • 2 kudos

@David Toft​ Hi, The current implementation of dbutils.fs is single-threaded, performs the initial listing on the driver and subsequently launches a Spark job to perform the per-file operations. So I guess the put operation is running on a single cor...

  • 2 kudos
4 More Replies
andrew0117
by Contributor
  • 1876 Views
  • 3 replies
  • 2 kudos

Resolved! Will a table backed by a SQL server database table automatically get updated if the base table in SQL server database is updated?

If I create a table using the code below: CREATE TABLE IF NOT EXISTS jdbcTable using org.apache.spark.sql.jdbc options ( url "sql_server_url", dbtable "sqlserverTable", user "username", password "password") will jdbcTable always be automatically sync...

Latest Reply
pvignesh92
Honored Contributor
  • 2 kudos

Hi @andrew li​ There is a feature introduced from DBR11 where you can directly ingest the data to the table from a selected list of sources. As you are creating a table, I believe this command will create a managed table by loading the data from the...

  • 2 kudos
2 More Replies
bd
by New Contributor III
  • 1477 Views
  • 2 replies
  • 3 kudos

Resolved! Documented Autoloader option not supported?

I have a function which is meant to use the `cloudFiles` source to stream file contents from s3. It is configured like this:```stream = ( spark.readStream.format("cloudFiles") .option("cloudFiles.format", "text") .option("cloudFiles.schemaLo...

Latest Reply
bd
New Contributor III
  • 3 kudos

thanks, I see how I made that error.

  • 3 kudos
1 More Replies
Moonmoon
by New Contributor III
  • 5617 Views
  • 16 replies
  • 1 kudos

Resolved! Certificate/ Badge delayed

Hi Databricks team, I completed my Databricks Certified Data Engineer Associate exam on March 17th, and after more than 48 hours I have not received the certificate yet. It looks like many other folks are facing the same issue, and I am not seeing any resolution pr...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Moonmoon Mukherjee Great! Thanks for letting me know. As a request, please mark it as the best answer. I would appreciate it. Regards

  • 1 kudos
15 More Replies
harsh_12345
by New Contributor III
  • 2838 Views
  • 7 replies
  • 2 kudos

Resolved! Passed Data Engineer Associate exam, but didn't receive any badge/certificate. Please help

Passed Data Engineer Associate exam, but didn't receive any badge/certificate. Please help

Latest Reply
sharukh_lodhi
New Contributor III
  • 2 kudos

Hi, I took the Associate Data Engineer exam on 17 March, but I haven't received the certification. I got an email right after passing the certification saying that I would receive my certificate after 48 hours. Would you please look into my issue? Thanks!...

  • 2 kudos
6 More Replies
User16665996606
by New Contributor II
  • 2821 Views
  • 4 replies
  • 2 kudos

How to access public URLs via Databricks notebooks

I am trying to run a web application integrated with Gradio on Databricks. However, currently, I have to first run on the local URL and then launch it on the public URL. Are there any potential solutions for them to deploy the app on the public URL o...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Sixuan He Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...

  • 2 kudos
3 More Replies
joao
by New Contributor II
  • 1292 Views
  • 3 replies
  • 0 kudos

Directory of Databricks certification holders, not available ?

Hi, I have previously queried the directory of Databricks certification holders at https://directory.databrickscertified.com/, but now I'm getting an error message (404). Has this directory moved somewhere else? Or is this just a temporary outage?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Joao Moreira de Sa Coutinho Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one tha...

  • 0 kudos
2 More Replies
flobib123
by New Contributor III
  • 5615 Views
  • 5 replies
  • 3 kudos

Resolved! Syntax highlight support for python multiline SQL strings

I would like to know if a tool is available in Databricks to handle SQL syntax highlighting inside Python strings, like this VSCode plugin: https://marketplace.visualstudio.com/items?itemName=ptweir.python-string-sql Thank you.

Latest Reply
flobib123
New Contributor III
  • 3 kudos

Hello @Vidula Khanna, No, no one answered my question correctly, but I'm using PySpark now, so I'm not on this topic anymore. But thank you for taking the time to answer me.

  • 3 kudos
4 More Replies
4kb_nick
by New Contributor III
  • 3587 Views
  • 3 replies
  • 5 kudos

302 Found when trying to run Unity Catalog Quickstart

Hi there, I'm helping a client of mine set up an Azure Databricks environment. The workspace is set up for private access only, and we are using Azure Firewall and Azure Private Link. We have the network environment successfully configured to the point...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hi @Nick Barretta I'm sorry you could not find a solution to your problem in the answers provided. Our community strives to provide helpful and accurate information, but sometimes an immediate solution may only be available for some issues. I suggest ...

  • 5 kudos
2 More Replies
h_aloha
by New Contributor III
  • 1356 Views
  • 2 replies
  • 2 kudos

Bugs in data-engineer-learning-path-v1-0-0-notebooks.dbc

Error message returned in DE 4.1 - DLT UI Walkthrough, Cmd 6: pipeline_language = "SQL" # pipeline_language = "Python" DA.print_pipeline_config(pipeline_language) Error: AttributeError: 'DBAcademyHelper' object has no attribute 'get_username_hash' ----------...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Helen Morgen Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback...

  • 2 kudos
1 More Replies
RaviMuna_33806
by New Contributor II
  • 2059 Views
  • 8 replies
  • 1 kudos

users/administrators counts per workspace

How can I find the number of users/admins per workspace? I'm looking for an option that can be run from a notebook.

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Ravi M I'm sorry you could not find a solution to your problem in the answers provided. Our community strives to provide helpful and accurate information, but sometimes an immediate solution may only be available for some issues. I suggest providi...

  • 1 kudos
7 More Replies
robert37201
by New Contributor II
  • 1654 Views
  • 3 replies
  • 4 kudos

Job aborted due to stage failure: Input buffer size 0 for bloom filter is not power of 2

The query works great in a notebook but fails in a Classic SQL Warehouse (Photon enabled) with that error. The tables are relatively small. I just don't know where to begin understanding that error; Google wasn't much help, and Query History doesn't give me anything...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Robert McCartney We haven't heard from you since the last response from @Lakshay Goel, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community, as it can be helpful to...

  • 4 kudos
2 More Replies
620139
by New Contributor II
  • 1983 Views
  • 3 replies
  • 3 kudos

Error when running OPTIMIZE on a Delta table with generated columns

I am seeing an error when running OPTIMIZE on a Delta table with generated columns: com.databricks.sql.transaction.tahoe.schema.DeltaInvariantViolationException: CHECK constraint Generated Column (created <=> now()) violated by row with values: - crea...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Jeff Erickson Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedbac...

  • 3 kudos
2 More Replies
billaspiel
by New Contributor II
  • 12137 Views
  • 3 replies
  • 0 kudos

Resolved! Python open function is unable to detect the file in dbfs

Hi, I'm a newbie learning Spark using Databricks. I did some investigation and searched the community forum to see whether this question had been asked before, but was unable to find anything. 1. DBFS is unable to detect the file even though it's present in it...

Latest Reply
Dflo
New Contributor II
  • 0 kudos

I am having similar issues currently. I can read or access my storage account, but when I attempt to read or access the container, it tells me the path is not found. I created the container and have full access as an owner.

  • 0 kudos
2 More Replies
pvignesh92
by Honored Contributor
  • 6156 Views
  • 8 replies
  • 0 kudos

Resolved! Multi Statement Writes from Spark to Snowflake

Does Spark support multi-statement writes to Snowflake in a single session? To elaborate, I have a requirement where I need to do a selective deletion of data from a Snowflake table and insert records into a Snowflake table (ranges from around 1 M rows)...

Latest Reply
pvignesh92
Honored Contributor
  • 0 kudos

In my analysis, I reached the following understanding: if your data is sitting in Snowflake and you have a set of DDL/DML queries that need to be wrapped into a single transaction, you can set the MULTI_STATEMENT option to 0 and use the Snowflake utils runQuery method t...

  • 0 kudos
7 More Replies
