Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Data_Engineer3
by Contributor III
  • 2478 Views
  • 5 replies
  • 0 kudos

Default maximum spark streaming chunk size in delta files in each batch?

Working with Delta files in Spark Structured Streaming, what is the default maximum chunk size in each batch? How do I identify this type of Spark configuration in Databricks? #[Databricks SQL] #[Spark streaming] #[Spark structured streaming] #Spark

Latest Reply
NandiniN
Databricks Employee

Doc: https://docs.databricks.com/en/structured-streaming/delta-lake.html  Also, what is the challenge while using foreachBatch?
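For reference, the rate-limiting options described in that doc can be set directly on the stream reader. A minimal sketch, assuming a Delta source at a placeholder path; per the Delta streaming docs, maxFilesPerTrigger defaults to 1000 files per micro-batch and maxBytesPerTrigger sets a soft byte limit.

```python
# Minimal sketch (placeholder path): bound how much data each micro-batch
# pulls from a Delta source. maxFilesPerTrigger defaults to 1000 files per
# batch; maxBytesPerTrigger is a soft upper bound on bytes per batch.
df = (
    spark.readStream.format("delta")
    .option("maxFilesPerTrigger", 100)    # cap new files considered per batch
    .option("maxBytesPerTrigger", "1g")   # soft cap on bytes processed per batch
    .load("/mnt/raw/my_delta_table")      # placeholder source path
)
```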

4 More Replies
Arpi
by New Contributor II
  • 3195 Views
  • 4 replies
  • 4 kudos

Resolved! Database creation error

I am trying to create a database with an external abfss location but am facing the error below: AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs....

Latest Reply
source2sea
Contributor

Changing the OAuth authentication configuration to the CLUSTER level helped me solve the problem. I wish the notebook AI bot could have told me the solution. Before the change, my configuration was at the notebook level, and it produced the errors below: AnalysisException: org.apac...
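For anyone hitting the same error, a minimal sketch of the cluster-level setup the reply describes: service-principal (OAuth) access to ADLS Gen2 configured in the cluster's Spark config rather than in the notebook, then the database created against the abfss location. The storage account, container, tenant, and credential values are placeholders.

```python
# These key/value pairs belong in the cluster's Spark config (cluster level,
# not notebook level); they are shown as a Python dict here only for
# readability. All bracketed values are placeholders.
adls_oauth_conf = {
    "fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net": "OAuth",
    "fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net": "<application-id>",
    "fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net": "<client-secret>",
    "fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

# With the cluster configured, the external-location database can be created:
spark.sql(
    "CREATE DATABASE IF NOT EXISTS my_db "
    "LOCATION 'abfss://<container>@<storage-account>.dfs.core.windows.net/my_db'"
)
```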

3 More Replies
verargulla
by New Contributor III
  • 10665 Views
  • 3 replies
  • 4 kudos

Azure Databricks: Error Creating Cluster

We have provisioned a new workspace in Azure using our own VNet. Upon creating the first cluster, I encounter this error: Control Plane Request Failure: Failed to get instance bootstrap steps from the Databricks Control Plane. Please check that instan...

Latest Reply
ShaneOss
New Contributor II

I'm also seeing this issue. Was there a solution?

2 More Replies
DJey
by New Contributor III
  • 12665 Views
  • 6 replies
  • 2 kudos

Resolved! MergeSchema Not Working

Hi All, I have a scenario where my existing Delta table looks like below: Now I have incremental data with an additional column, i.e. owner: Dataframe name --> scdDF. Below is the code snippet to merge the incremental dataframe into targetTable, but the new...

Latest Reply
Amin112
New Contributor II

In Databricks Runtime 15.2 and above, you can specify schema evolution in a merge statement using SQL or the Delta table APIs:
MERGE WITH SCHEMA EVOLUTION INTO target
USING source
ON source.key = target.key
WHEN MATCHED THEN
UPDATE SET *
WHEN NOT MATCHED THEN I...
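A minimal sketch of that statement run from a notebook, with the truncated clause completed using the standard INSERT * action; `target` and `source` are placeholder table names, and Databricks Runtime 15.2+ is assumed.

```python
# Minimal sketch, assuming DBR 15.2+ and existing Delta tables named
# `target` and `source` (placeholders). WITH SCHEMA EVOLUTION adds new
# columns from `source` to `target` automatically during the merge.
spark.sql("""
    MERGE WITH SCHEMA EVOLUTION INTO target
    USING source
    ON source.key = target.key
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```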

5 More Replies
Tico23
by Contributor
  • 13477 Views
  • 12 replies
  • 10 kudos

Connecting SQL Server (on-premise) to Databricks via jdbc:sqlserver

Is it possible to connect to SQL Server on-premise (not Azure) from Databricks? I tried to ping my VirtualBox VM (with Windows Server 2022) from within Databricks and the request timed out. %sh ping 122.138.0.14 This is what my connection might look l...

Latest Reply
BharathKumarS
New Contributor II

I tried to connect to a localhost SQL Server through Databricks Community Edition, but it failed. I created an IP rule on port 1433 allowing inbound connections from all public networks, but it still didn't connect. I tried locally using Python and it work...
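Assuming the network path from the Databricks workers to the on-premises server is actually open (the failing piece in this thread), a minimal sketch of the JDBC read itself; the host, database, table, and credentials are placeholders.

```python
# Minimal sketch of a JDBC read against an on-premises SQL Server. This only
# works if the workers can reach <host>:1433 over the network (VPN/firewall);
# all bracketed values are placeholders.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<host>:1433;databaseName=<database>")
    .option("dbtable", "dbo.<table>")
    .option("user", "<user>")
    .option("password", "<password>")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load()
)
```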

11 More Replies
Jyo777
by Contributor
  • 4800 Views
  • 4 replies
  • 4 kudos

Need help with Azure Databricks questions on CTE and SQL syntax within notebooks

Hi amazing community folks, feel free to share your experience or knowledge regarding the questions below: 1.) Can we pass a CTE SQL statement into Spark JDBC? I tried to do it and couldn't, but I can pass normal SQL (Select * from ) and it works. I heard th...

Latest Reply
vijaypavann
Databricks Employee

CTE expressions are supported with the `prepareQuery` option (https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html): "A prefix that will form the final query together with query. As the specified query will be parenthesized as a subquery i...
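A minimal sketch of that option in use, assuming Spark 3.4+ and a SQL Server source; the connection details, CTE, and table names are placeholders.

```python
# Minimal sketch: the CTE goes in prepareQuery, and the main SELECT (which
# Spark parenthesizes as a subquery) goes in query. All bracketed values and
# table names are placeholders; requires a Spark version with prepareQuery
# support (3.4+).
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<host>:1433;databaseName=<database>")
    .option("prepareQuery",
            "WITH recent_orders AS "
            "(SELECT * FROM dbo.orders WHERE order_date >= '2024-01-01')")
    .option("query",
            "SELECT customer_id, COUNT(*) AS n FROM recent_orders GROUP BY customer_id")
    .option("user", "<user>")
    .option("password", "<password>")
    .load()
)
```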

3 More Replies
BeardyMan
by New Contributor III
  • 5453 Views
  • 9 replies
  • 3 kudos

Resolved! MLFlow Serve Logging

When using Azure Databricks and serving a model, we have received requests to capture additional logging. In some instances, they would like to capture input and output or even some of the steps from a pipeline. Is there any way we can extend the lo...

Latest Reply
Dan_Z
Databricks Employee

Another word from a Databricks employee: """You can use the custom model approach, but configuring it is painful. Plus you have to embed every loggable model in the custom model. Another less intrusive solution would be to have a proxy server do the loggi...
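A minimal sketch of the "custom model approach" mentioned in the reply: a pyfunc wrapper that logs inputs and outputs around the wrapped model's predictions. The wrapped model, the logger destination, and the registration step are placeholders, and this does not cover the proxy-server alternative.

```python
import logging

import mlflow.pyfunc

# Minimal sketch: wrap an existing model in a pyfunc model whose predict()
# logs inputs and outputs. Where the log lines end up (and the wrapped model
# itself) are deployment-specific placeholders.
class LoggingModel(mlflow.pyfunc.PythonModel):
    def __init__(self, wrapped_model):
        self._model = wrapped_model
        self._logger = logging.getLogger(__name__)

    def predict(self, context, model_input):
        self._logger.info("serving input: %s", model_input)
        output = self._model.predict(model_input)
        self._logger.info("serving output: %s", output)
        return output

# Logged like any other pyfunc model, e.g.:
# mlflow.pyfunc.log_model("logging_model", python_model=LoggingModel(base_model))
```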

8 More Replies
Data_Engineer3
by Contributor III
  • 2457 Views
  • 2 replies
  • 6 kudos

Getting error popup in databricks

When I migrated to a new Databricks workspace, I started getting an error popup message continuously, and the indentation setting I changed keeps reverting to another value with every new login.

Latest Reply
Sivagurunathann
New Contributor II

Hi, I am facing this issue too: session-expired pop-ups appear frequently, every 3 minutes, once I start working in Databricks.

1 More Replies
Arby
by New Contributor II
  • 10590 Views
  • 4 replies
  • 0 kudos

Help With OSError: [Errno 95] Operation not supported: '/Workspace/Repos/Connectors....

Hello, I am experiencing issues with importing the schema file I created from the utils repo. This is the logic we use for all ingestion, and all other schemas live in this repo under utills/schemas. I am unable to access the file I created for a new ingestion pipe...

Latest Reply
Arby
New Contributor II

@Debayan Mukherjee Hello, thank you for your response. Please let me know if these are the correct commands to access the file from a notebook. I can see the files in the repo folder, but I just noticed this: the file I am trying to access has a size of 0 b...
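For context, a minimal sketch of the usual way a module in a Repos folder is made importable from a notebook; the repo path and module name below are hypothetical placeholders, not the poster's actual paths.

```python
import sys

# Hypothetical repo path and module name, for illustration only. Appending the
# folder that contains the module to sys.path lets the notebook import it.
sys.path.append("/Workspace/Repos/<user>/<repo>/utills/schemas")

import my_schema  # placeholder module; an empty (0-byte) file imports but exposes no definitions
```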

3 More Replies
CAN
by New Contributor
  • 853 Views
  • 1 reply
  • 0 kudos

Security Threats in Databricks for File Upload

Dear community, we are using the Azure Databricks service and wondering if uploading a file to the DBFS (or to a storage accessed directly from a notebook in Databricks) could be a potential security threat. Imagine you upload some files with 'malici...

Latest Reply
KrunalMedapara
New Contributor II

Uploading a file to the Databricks File System (DBFS) or accessing storage directly from a notebook in Azure Databricks could pose potential security risks if not managed properly. Here are some considerations: Sensitive Data Exposure: Uploading sensi...

Akshith_Rajesh
by New Contributor III
  • 10704 Views
  • 5 replies
  • 6 kudos

Resolved! Call a Stored Procedure in Azure Synapse with input and output Params

driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
connection = driver_manager.getConnection(mssql_url, mssql_user, mssql_pass)
connection.prepareCall("EXEC sys.sp_tables").execute()
connection.close()
The above code works fine; however...
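Building on that snippet, a minimal sketch of passing an input parameter and reading an output parameter through the same JVM JDBC gateway; the procedure name, parameter positions, and types are placeholders.

```python
# Minimal sketch: same DriverManager pattern as above, extended with a
# CallableStatement that binds one input parameter and registers one output
# parameter. Procedure name, parameter order, and types are placeholders.
jvm = spark._sc._gateway.jvm
connection = jvm.java.sql.DriverManager.getConnection(mssql_url, mssql_user, mssql_pass)

stmt = connection.prepareCall("{call dbo.my_proc(?, ?)}")
stmt.setInt(1, 42)                                         # input parameter
stmt.registerOutParameter(2, jvm.java.sql.Types.NVARCHAR)  # output parameter
stmt.execute()
out_value = stmt.getString(2)

stmt.close()
connection.close()
```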

Latest Reply
judyy
New Contributor III

This blog helped me with the output of the stored procedure: https://medium.com/@judy3.yang/how-to-run-sql-procedure-in-databricks-notebook-e28023555565

4 More Replies
pSdatabricks
by New Contributor II
  • 3497 Views
  • 3 replies
  • 0 kudos

Azure Databricks Monitoring & Alerting (Data Observability) Tools / Frameworks for Enterprise

I am trying to evaluate monitoring and alerting tools such as New Relic, Datadog, and Grafana with Databricks on Azure. None of them offered support when I reached out. I would like to hear from the Databricks team on the recommended tool / framework ...

Latest Reply
Sruthivika
New Contributor II

I'd recommend this new tool we've been trying out. It's really helpful for monitoring and provides good insights on how Azure Databricks clusters, pools & jobs are doing – like if they're healthy or having issues. It brings everything together, makin...

2 More Replies
SamarthJain
by New Contributor II
  • 5501 Views
  • 4 replies
  • 2 kudos

Hi All, I'm facing an issue with my Spark Streaming job. It gets stuck in the "Stream Initializing" phase for more than 3 hours. Need your...

Hi All, I'm facing an issue with my Spark Streaming job. It gets stuck in the "Stream Initializing" phase for more than 3 hours. Need your help here to understand what happens internally at the "Stream Initializing" phase of the Spark Streaming job tha...

Latest Reply
MohsenJ
Contributor

I'm facing the same issue when I try to run this example: Create a monitor using the API | Databricks on AWS (Inference Lakehouse Monitor regression example notebook). Any idea?

3 More Replies
Bas1
by New Contributor III
  • 11440 Views
  • 16 replies
  • 20 kudos

Resolved! network security for DBFS storage account

In Azure Databricks the DBFS storage account is open to all networks. Changing that to use a private endpoint or minimizing access to selected networks is not allowed. Is there any way to add network security to this storage account? Alternatively, is...

Latest Reply
Odee79
New Contributor II

How can we secure the storage account in the managed resource group which holds the DBFS with restricted network access, since access from all networks is blocked by our Azure storage account policy?

15 More Replies
AlexWeh
by New Contributor II
  • 13541 Views
  • 1 reply
  • 2 kudos

Universal Azure Credential Passthrough

At the moment, Azure Databricks has the feature to use Azure AD login for the workspace and create single-user clusters with Azure Data Lake Storage credential passthrough. But this can only be used for Data Lake Storage. Is there already a way, or are...

Latest Reply
polivbr
New Contributor II

I have exactly the same issue. I have the need to call a protected API within a notebook but have no access to the current user's access token. I've had to resort to nasty workarounds involving installing and running the Azure CLI from within the not...
