Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by Akshith_Rajesh, New Contributor III
  • 8238 Views
  • 5 replies
  • 5 kudos

Resolved! Call a Stored Procedure in Azure Synapse with input and output Params

driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
connection = driver_manager.getConnection(mssql_url, mssql_user, mssql_pass)
connection.prepareCall("EXEC sys.sp_tables").execute()
connection.close()

The above code works fine, however...

Latest Reply
judyy
New Contributor III
  • 5 kudos

This blog helped me with the output of the stored procedure: https://medium.com/@judy3.yang/how-to-run-sql-procedure-in-databricks-notebook-e28023555565

4 More Replies
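For reference, a minimal sketch of the input/output-parameter pattern this thread converges on, using the same py4j DriverManager approach as the original post; the procedure name and parameter types here are hypothetical.

# Sketch: call a stored procedure with one input and one output parameter
# via the JVM's java.sql CallableStatement. dbo.my_proc is hypothetical;
# mssql_url, mssql_user, and mssql_pass come from the original post.
jvm = spark._sc._gateway.jvm
connection = jvm.java.sql.DriverManager.getConnection(mssql_url, mssql_user, mssql_pass)
stmt = connection.prepareCall("{call dbo.my_proc(?, ?)}")
stmt.setInt(1, 42)                                        # bind the input parameter
stmt.registerOutParameter(2, jvm.java.sql.Types.VARCHAR)  # declare the output parameter
stmt.execute()
result = stmt.getString(2)                                # read the output value
stmt.close()
connection.close()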
by vk217, Contributor
  • 6590 Views
  • 3 replies
  • 1 kudos

ModuleNotFoundError: No module named 'pyspark.dbutils'

I have a class in a Python file like this:

from pyspark.sql import SparkSession
from pyspark.dbutils import DBUtils

class DatabricksUtils:
    def __init__(self):
        self.spark = SparkSession.getActiveSession()
        self.dbutils = DBUtil...

Latest Reply
Jarkrung
New Contributor II
  • 1 kudos

Hi, we are also in the same exact situation. Were you able to solve the problem? Or a workaround maybe.

2 More Replies
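A workaround often cited for this error is to resolve dbutils lazily instead of importing pyspark.dbutils at module top level; a sketch of that community pattern (not an official API) follows.

from pyspark.sql import SparkSession

def get_dbutils(spark: SparkSession):
    """Return a dbutils handle without a top-level pyspark.dbutils import."""
    try:
        from pyspark.dbutils import DBUtils  # present on Databricks clusters
        return DBUtils(spark)
    except ImportError:
        # Notebook fallback: Databricks injects dbutils into the IPython user namespace.
        import IPython
        return IPython.get_ipython().user_ns["dbutils"]

class DatabricksUtils:
    def __init__(self):
        self.spark = SparkSession.getActiveSession()
        self.dbutils = get_dbutils(self.spark)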
by EdemSeitkh, New Contributor III
  • 2665 Views
  • 5 replies
  • 0 kudos

Resolved! Pass catalog/schema/table name as a parameter to sql task

Hi, I am trying to pass a catalog name as a parameter into the query for a SQL task, and it pastes it in with single quotes, which results in an error. Is there a way to pass the raw value, or are there other possible workarounds?

query: INSERT INTO {{ catalog }}.pas.product_snap...

Latest Reply
lathaniel
New Contributor III
  • 0 kudos

@EdemSeitkh can you elaborate on your workaround? Curious how you were able to implement an enum parameter in DBSQL. I'm running into this same issue now.

4 More Replies
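For later readers: one documented route around the quoting is the IDENTIFIER clause, which resolves a string parameter as a name rather than a quoted literal. A sketch under that assumption, reusing the table name from the post; the source table and catalog value are placeholders.

# Sketch: IDENTIFIER(...) resolves the parameter as a catalog name instead
# of pasting it in single quotes. The same form works in a DBSQL task query
# with :catalog as a named task parameter.
spark.sql(
    """
    INSERT INTO IDENTIFIER(:catalog || '.pas.product_snap')
    SELECT * FROM staging.product_snap  -- placeholder source
    """,
    args={"catalog": "my_catalog"},  # placeholder value
)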
by amelia1, New Contributor II
  • 375 Views
  • 1 reply
  • 0 kudos

pyspark read data using jdbc url returns column names only

Hello, I have a remote Azure SQL warehouse serverless instance that I can access using databricks-sql-connector. I can read/write/update tables no problem. But I'm also trying to read/write/update tables using local PySpark + JDBC drivers. But when I ...

Latest Reply
anardinelli
New Contributor III
  • 0 kudos

Hi @amelia1, how are you? What you got was indeed the top 5 rows (see that it was the Row class). What does it show when you run display(df)? I'm thinking it might be something related to your schema; since you did not define it, it can read the da...

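Following the reply's point about the missing schema, a sketch of a JDBC read that pins the schema explicitly with Spark's customSchema option; the URL, table, and column list are placeholders.

# Sketch: give the JDBC read an explicit schema instead of relying on
# inference. jdbc_url, dbo.my_table, and the column list are placeholders.
df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "dbo.my_table")
    .option("customSchema", "id LONG, name STRING, updated_at TIMESTAMP")
    .load()
)
df.show(5)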
by TWib, New Contributor III
  • 1887 Views
  • 7 replies
  • 3 kudos

DatabricksSession broken for 15.1

This code fails with exception: [NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, got Column.

File <command-4420517954891674>, line 7
      4 spark = DatabricksSession.builder.getOrCreate()
      6 df = spark.read.table("samples.nyctaxi.trips")
---->...

Latest Reply
jcap
New Contributor II
  • 3 kudos

We are also seeing this error in 14.3 LTS from a simple example:

from pyspark.sql.functions import col
df = spark.table('things')
things = df.select(col('thing_id')).collect()

[NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, got Column.

6 More Replies
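The "got Column" wording is usually reported when two PySpark installs are mixed, with Column coming from a standalone pyspark package while the session comes from databricks-connect. A sketch of the usual check and cleanup, assuming that cause:

# Check which pyspark the interpreter picks up; with databricks-connect it
# should resolve inside that package's install, not a standalone pyspark.
import pyspark
print(pyspark.__file__)

# Typical cleanup in the virtualenv (shell commands, shown as comments):
#   pip uninstall pyspark
#   pip install --upgrade databricks-connect

from databricks.connect import DatabricksSession
from pyspark.sql.functions import col

spark = DatabricksSession.builder.getOrCreate()
df = spark.read.table("samples.nyctaxi.trips")
df.select(col("trip_distance")).show(5)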
by gianni77, New Contributor
  • 41799 Views
  • 13 replies
  • 4 kudos

How can I export a result of a SQL query from a databricks notebook?

The "Download CSV" button in the notebook seems to work only for results <=1000 entries. How can I export larger result-sets as CSV?

Latest Reply
igorstar
New Contributor III
  • 4 kudos

If you have a large dataset, you might want to export it to a bucket in parquet format from your notebook:

%python
df = spark.sql("select * from your_table_name")
df.write.parquet(your_s3_path)

12 More Replies
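A CSV variant of the same idea, sketched since the question asked for CSV; the output path is a placeholder. coalesce(1) produces a single downloadable file but funnels the write through one task, so it suits modest result sets only.

# Sketch: export the full result set as one CSV file with a header row.
df = spark.sql("select * from your_table_name")
(
    df.coalesce(1)
    .write.mode("overwrite")
    .option("header", True)
    .csv("s3://your-bucket/exports/query_result")  # placeholder path
)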
by Mits, New Contributor II
  • 1550 Views
  • 4 replies
  • 3 kudos

Sending email alerts to non-databricks user

I am trying to send email alerts to a non-Databricks user. I am using the Alerts feature available in SQL. Can someone help me with the steps? Do I first need to add a Notification Destination through Admin settings and then use this newly added desti...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Mitali Lad, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

3 More Replies
by Phani1, Valued Contributor
  • 258 Views
  • 1 reply
  • 0 kudos

integrating Azure Databricks with AAD

Hi Team, could you please provide the details/process for integrating Azure Databricks - Unity Catalog and AAD? Regards, Phani

Latest Reply
raphaelblg
Contributor III
  • 0 kudos

Hello @Phani1, these doc pages might be useful for you:
  • Set up and manage Unity Catalog
  • Sync users and groups from Microsoft Entra ID

by ismaelhenzel, New Contributor III
  • 264 Views
  • 1 reply
  • 0 kudos

Upsert into a Delta Lake table with merge when using row masking function

I'm using Databricks RLS functions on my tables, and I need to run some merges into them, but tables with RLS functions do not support merge operations (https://docs.databricks.com/en/data-governance/unity-catalog/row-and-column-filters.html#limitation...

Latest Reply
raphaelblg
Contributor III
  • 0 kudos

Hi @ismaelhenzel, if you want to use the "MERGE INTO" SQL command, you must turn off RLS. This is by design.

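A sketch of the turn-off-then-restore sequence the reply implies, using the documented ALTER TABLE ... DROP/SET ROW FILTER syntax; the table, filter function, and column names are placeholders.

# Sketch: drop the row filter, run the merge, then reattach the filter.
# Note that the table is unfiltered between the two ALTER statements.
spark.sql("ALTER TABLE main.sales.orders DROP ROW FILTER")

spark.sql(
    """
    MERGE INTO main.sales.orders AS t
    USING updates AS s
    ON t.order_id = s.order_id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
    """
)

spark.sql(
    "ALTER TABLE main.sales.orders "
    "SET ROW FILTER main.sales.region_filter ON (region)"
)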
by SamarthJain, New Contributor II
  • 3736 Views
  • 5 replies
  • 2 kudos

Hi All, I'm facing an issue with my Spark Streaming Job. It gets stuck in the "Stream Initializing" phase for more than 3 hours. Need your...

Hi All, I'm facing an issue with my Spark Streaming job. It gets stuck in the "Stream Initializing" phase for more than 3 hours. Need your help here to understand what happens internally at the "Stream Initializing" phase of the Spark Streaming job tha...

Latest Reply
olivier_soucy
New Contributor II
  • 2 kudos

I also had the same issue, but it seems to happen only on DBR >= 15.0. Any idea why?

4 More Replies
by Mathias, New Contributor II
  • 192 Views
  • 1 reply
  • 0 kudos

Delay rows coming into DLT pipeline

Background and requirements: We are reading data from our factory and storing it in a DLT table called telemetry with columns sensorid, timestamp and value. We need to get rows where sensorid is "qrreader-x" and join with some other data from that sam...

Latest Reply
raphaelblg
Contributor III
  • 0 kudos

Hi @Mathias, I'd say that watermarking might be a good solution for your use case. Please check Control late data threshold with multiple watermark policy in Structured Streaming. If you want to dig in further, there's also: Spark Structured Streami...

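A sketch of what the suggested watermarking could look like for the telemetry table described above; the 10-minute watermark, the 5-minute join window, and reading the table directly with readStream are all assumptions.

from pyspark.sql.functions import col, expr

# QR-reader rows; rename the event-time column so the join condition can
# reference both sides unambiguously.
qr = (
    spark.readStream.table("telemetry")
    .where(col("sensorid") == "qrreader-x")
    .withColumnRenamed("timestamp", "qr_ts")
    .withWatermark("qr_ts", "10 minutes")
)

# Rows from every other sensor in the same table.
other = (
    spark.readStream.table("telemetry")
    .where(col("sensorid") != "qrreader-x")
    .withWatermark("timestamp", "10 minutes")
)

# Keep rows that land within 5 minutes of a QR read.
joined = qr.join(
    other,
    expr("timestamp BETWEEN qr_ts - INTERVAL 5 MINUTES AND qr_ts + INTERVAL 5 MINUTES"),
)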
by EcuaCrisCar, New Contributor III
  • 607 Views
  • 1 reply
  • 0 kudos

Sending a personalized message to email.

Greetings community, I am new to using Databricks and for some time I have tried some scripts in notebooks. I would like your help on a task: carry out a personalized mailing where, first, a query of the number of records in the test table is performe...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @EcuaCrisCar, to query the number of records in your test table, you can use SQL or DataFrame APIs in Databricks. Next, you'll need to check if the record count falls within the specified range (80,000 to 90,000). If it does, proceed with the note...

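A sketch of the count-then-branch step from the reply; the table name and the 80,000-90,000 range come from the thread, and the notification call is left as a placeholder since the excerpt is truncated.

# Sketch: count the records, branch on the range, and hand the message to
# whatever notification mechanism you use (test_table is a placeholder).
count = spark.sql("SELECT COUNT(*) AS n FROM test_table").first()["n"]

if 80_000 <= count <= 90_000:
    message = f"Record count is {count}, within the expected 80,000-90,000 range."
else:
    message = f"Record count is {count}, outside the expected range."

print(message)  # replace with the email-sending step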
by filipjankovic, New Contributor
  • 2645 Views
  • 1 reply
  • 0 kudos

JSON string object with nested Array and Struct column to dataframe in pyspark

I am trying to convert a JSON string stored in a variable into a Spark dataframe without specifying the schema, because I have a big number of different tables, so it has to be dynamic. I managed to do it with sc.parallelize, but since we are moving to Uni...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @filipjankovic, since you have multiple tables and need dynamic schema inference, I recommend the following approach. Schema inference from a JSON string: you can infer the schema from the JSON string and then create a DataFrame. Schema i...

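One sc-free way to do the inference, sketched below, since sparkContext APIs such as parallelize are restricted on Unity Catalog shared clusters: wrap the raw string in a single-column DataFrame and let schema_of_json/from_json infer dynamically. The sample JSON is a placeholder.

from pyspark.sql.functions import col, from_json, lit, schema_of_json

# json_str stands in for the variable from the post.
json_str = '{"id": 1, "items": [{"sku": "a", "qty": 2}]}'

# Infer the schema from the string itself, then parse and flatten it.
schema = schema_of_json(lit(json_str))
df = (
    spark.createDataFrame([(json_str,)], ["raw"])
    .select(from_json(col("raw"), schema).alias("parsed"))
    .select("parsed.*")
)
df.show()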
by NikhilK1998, New Contributor II
  • 1008 Views
  • 1 reply
  • 1 kudos

DataBricks Certification Exam Got Suspended. Require support for the same.

Hi, I applied for the Databricks Certified: Data Engineer Professional certification on 5th July 2023. The test was going fine for me, but suddenly there was an alert from the system (I think I was at a proper angle in front of the camera and was genuinely givin...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @NikhilK1998, I'm sorry to hear your exam was suspended. Thank you for filing a ticket with our support team. Please allow the support team 24-48 hours to resolve it. In the meantime, you can review the following documentation: Room requirements, Beh...

by Avinash_Narala, New Contributor III
  • 395 Views
  • 1 reply
  • 0 kudos

Instance profile failure while installing Databricks Overwatch

Despite following the steps mentioned in the provided link to create an instance profile, we encountered a problem in step 6, where we couldn't successfully add the instance profile to Databricks (Step 6: Add the instance profile to Databricks). https:/...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Avinash_Narala, The error message you provided indicates that the verification of the instance profile failed due to an AWS authorization issue. Specifically, the user associated with the assumed role arn:aws:sts::755231362028:assumed-role/databr...

