Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ros
by New Contributor III
  • 3223 Views
  • 2 replies
  • 3 kudos

Apache Hudi Table creation using hudi maven library

I installed the Hudi Maven library org.apache.hudi:hudi-spark3.3-bundle_2.12:0.13.0 on Databricks Runtime 12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12) with the Spark config: spark.sql.catalog.spark_catalog org.apache.spark.sql.hudi.catalog.HoodieCat...

Latest Reply
ros
New Contributor III
  • 3 kudos

@Shanmugavel Chandrakasu

%sql
create table hudi_cow_pt_tbl (
  id bigint,
  name string,
  ts bigint,
  dt string,
  hh string
) using hudi
tblproperties (
  type = 'cow',
  primaryKey = 'id',
  preCombineField = 'ts'
)
partitioned by (dt, hh)
location '/mnt/data/h...

1 More Replies
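For reference, the Hudi 0.13 quickstart pairs that catalog setting with two more cluster-level Spark configs; a typical configuration (taken from the Hudi docs, not confirmed in this thread) looks like:

    spark.serializer org.apache.spark.serializer.KryoSerializer
    spark.sql.catalog.spark_catalog org.apache.spark.sql.hudi.catalog.HoodieCatalog
    spark.sql.extensions org.apache.spark.sql.hudi.HoodieSparkSessionExtension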
Erik
by Valued Contributor III
  • 2226 Views
  • 2 replies
  • 2 kudos

Create Python modules for both Repos and Workspace

We are using the "databricks_notebook" terraform resource to deploy our notebooks into the "Workspace" as part of our CICD run, and our jobs run notebooks from the workspace. For development we clone the repo into "Repos". At the moment the only modu...

Latest Reply
RobiTakToRobi
New Contributor II
  • 2 kudos

You can create your own Python package and host it in Azure Artifacts. https://learn.microsoft.com/en-us/azure/devops/artifacts/quickstarts/python-packages?view=azure-devops

1 More Replies
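For reference, once the package is published to an Azure Artifacts feed, a cluster or notebook can install it with pip pointed at the feed's index; a sketch, with organization, feed, and package names as placeholders:

    %pip install my-shared-package --index-url=https://pkgs.dev.azure.com/<organization>/_packaging/<feed>/pypi/simple/

Both jobs running workspace notebooks and development clones in Repos can then import the module the same way, instead of relying on notebook-relative imports.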
Oliver_Angelil
by Valued Contributor II
  • 9976 Views
  • 4 replies
  • 0 kudos

Resolved! Python code linter in Databricks notebook

Is it possible to get syntax linting in a DB notebook? Say with flake8, like I do in VS code?

Latest Reply
artsheiko
Databricks Employee
  • 0 kudos

No linting is available in a Databricks notebook for now. The notebook is currently in the process of adopting Monaco as the underlying code editor, which will offer an improved code-authoring experience for notebook cells. Some of the Monaco editor features enab...

3 More Replies
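Until the editor gains built-in linting, one rough workaround is to run flake8 itself from a notebook cell via its programmatic API; a sketch, assuming flake8 has been installed (e.g. %pip install flake8) and that the code to check is written to a scratch file:

    # minimal sketch: lint a snippet of code with flake8 from a notebook cell
    from flake8.api import legacy as flake8

    code = "x=1\nprint( x )\n"            # e.g. the contents of a cell to check
    path = "/tmp/cell_under_lint.py"      # hypothetical scratch file on the driver
    with open(path, "w") as f:
        f.write(code)

    style_guide = flake8.get_style_guide()
    report = style_guide.check_files([path])   # violations are printed to stdout
    print(f"{report.total_errors} issue(s) found")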
Ancil
by Contributor II
  • 6967 Views
  • 8 replies
  • 6 kudos

Job aborted due to stage failure: Task 1863 in stage 10.0 failed 4 times, most recent failure: Lost task 1863.3 in stage 10.0 (TID 2021) (10.0.4.7 executor 2): org.apache.spark.SparkException: Python worker exited unexpectedly (crashed): Fatal Python erro

I am getting the error below sometimes when I run my Databricks notebook from ADF. If there is one executor node it works fine; if it increases to 2 or more, it sometimes fails on the same data. Cluster detail: Standard_F4s_v2 · Workers: Standard_F4s_v2 · 1-8 wo...

Latest Reply
swethaNandan
Databricks Employee
  • 6 kudos

Hi @Ancil P A, can you paste the complete stack trace from the failed task (from failed stage 10.0) and the code snippet that you are trying to run in the notebook? Also, do you think you can raise a Databricks support ticket for the same?

7 More Replies
DeviJaviya
by New Contributor II
  • 3486 Views
  • 2 replies
  • 1 kudos

Trying to build a subquery in a Databricks notebook, similar to SQL's TOP(1), in a data frame

Hello everyone, I am new to Databricks, so I am at the learning stage. It would be very helpful if someone could help resolve the issue, or rather, help me fix my code. I have built a query that fetches data based on CASE; in the Case I have a ...

Latest Reply
DeviJaviya
New Contributor II
  • 1 kudos

Hello Rishabh, thank you for your suggestion. We tried LIMIT 1, but the output values come out the same for all the dates, which is not correct.

1 More Replies
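The usual cause here is that LIMIT 1 applies to the whole result set, while TOP(1) per group needs a window function; a PySpark sketch, with the table and column names (my_table, dt, amount) as placeholder assumptions:

    # per-group TOP(1): rank rows within each date and keep the first
    from pyspark.sql.functions import col, row_number
    from pyspark.sql.window import Window

    df = spark.table("my_table")                       # hypothetical source table
    w = Window.partitionBy("dt").orderBy(col("amount").desc())

    top1_per_date = (df.withColumn("rn", row_number().over(w))
                       .filter(col("rn") == 1)
                       .drop("rn"))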
fijoy
by Contributor
  • 7950 Views
  • 1 reply
  • 2 kudos

Resolved! Using widget values in a shell script cell

I have a Databricks notebook containing a mix of SQL, Python, and shell script cells. I know I can retrieve and use values of widgets in Python cells using dbutils.widgets.get('key') and in SQL cells using ${key}. How can I use widget values in shell ...

Latest Reply
fijoy
Contributor
  • 2 kudos

For those interested, I found and am for now using this workaround: https://stackoverflow.com/questions/54662605/how-to-pass-a-python-variables-to-shell-script-in-azure-databricks-notebook while I wait for a more direct method.

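The linked workaround boils down to passing the value through a file on the driver, since %sh runs in a separate process that cannot call dbutils; a sketch, with a hypothetical scratch path:

    # Python cell: write the widget value where the shell can read it
    value = dbutils.widgets.get("key")
    with open("/tmp/widget_key.txt", "w") as f:
        f.write(value)

    # %sh cell: read it back
    #   VALUE=$(cat /tmp/widget_key.txt)
    #   echo "$VALUE"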
Hari_Dbrc
by New Contributor II
  • 3371 Views
  • 2 replies
  • 0 kudos

Issue while using community edition

Hello, is anyone facing an issue with their Community Edition? It shows the error below, and I can't access the workspace or previously created notebooks. I tried accessing from different devices (it's not a cache issue). Error popup (screenshots attached): Unable to view...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Hari N, thank you for reaching out, and we're sorry to hear about this login issue! We have this Community Edition login troubleshooting post on Community. Please take a look, and follow the troubleshooting steps. If the steps do not resolve the...

1 More Replies
moski
by New Contributor II
  • 2472 Views
  • 3 replies
  • 1 kudos

How to import a data table from SQLQuery2 into Databricks notebook

Can anyone show me a few commands to import a table, say "mytable2", from Microsoft SQL Server into a Databricks notebook using a Spark DataFrame, or at least a pandas DataFrame? Cheers!

Latest Reply
irfanaziz
Contributor II
  • 1 kudos

You can read any table from MSSQL. You would need to authenticate to the db, so you would need the connection string:

    def dbProps():
        return {
            "user": "db-user",
            "password": "your password",
            "driver": "com.microsoft.sqlserver.jdbc.SQLServerD...

2 More Replies
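A fuller sketch of that JDBC read, with the server, database, table, and secret-scope names as placeholder assumptions:

    # read a SQL Server table into a Spark DataFrame over JDBC
    jdbc_url = "jdbc:sqlserver://myserver.example.com:1433;databaseName=mydb"

    df = (spark.read
          .format("jdbc")
          .option("url", jdbc_url)
          .option("dbtable", "dbo.mytable2")
          .option("user", "db-user")
          .option("password", dbutils.secrets.get("my-scope", "db-password"))  # hypothetical secret scope
          .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
          .load())

    df.limit(10).display()      # or df.toPandas() for a pandas DataFrame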
shaunangcx
by New Contributor II
  • 5153 Views
  • 3 replies
  • 0 kudos

Resolved! Command output disappearing (Not sure what's the root cause)

I have a workflow which will run every month, and it will create a new notebook containing the outputs from the main notebook. However, after some time, the outputs from the created notebook disappear. Is there any way I can retain the outputs?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Shaun Ang: There are a few possible reasons why the outputs from the created notebook might be disappearing. Notebook permissions: it's possible that the user or service account running the workflow does not have permission to write to the destinati...

2 More Replies
Shubham039
by New Contributor III
  • 15650 Views
  • 8 replies
  • 6 kudos

Databricks notebook ipywidgets not working as expected ( button click issue)

I am working on Azure Databricks (IDE). I wanted to create a button that takes a text value as input; on the click of the button, a function should run that prints the value entered. For that I created this code: from IPython.display import disp...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Shubham Ringne, hope everything is going great. Just wanted to check in on whether you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us s...

7 More Replies
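A common gotcha with this pattern is that print inside an on_click callback does not render in the notebook output; routing it through an ipywidgets Output widget usually does. A minimal sketch:

    # button + text input; callback output is captured by an Output widget
    import ipywidgets as widgets

    text = widgets.Text(description="Value:")
    button = widgets.Button(description="Print it")
    out = widgets.Output()

    def on_click(_):
        with out:
            print(text.value)

    button.on_click(on_click)
    display(text, button, out)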
lugger1
by New Contributor III
  • 3758 Views
  • 1 reply
  • 1 kudos

Resolved! What is the best way to use credentials for API calls from databricks notebook?

Hello, I have a Databricks account on Azure, and the goal is to compare different image tagging services from Azure, GCP, and AWS via the corresponding API calls, from a Python notebook. I have problems with GCP Vision API calls, specifically with credentials...

Latest Reply
lugger1
New Contributor III
  • 1 kudos

OK, here is a trick: in my case, the file with the GCP credentials is stored in notebook workspace storage, which is not visible to the os.environ() call. So the solution is to read the content of this file and save it to the cluster storage attached to the no...

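A sketch of that trick, with the file paths as placeholder assumptions:

    # copy the credentials file to cluster-local storage and point the GCP client at it
    import os

    creds_json = open("gcp-credentials.json").read()     # file stored alongside the notebook
    local_path = "/tmp/gcp-credentials.json"             # cluster storage visible to the client library
    with open(local_path, "w") as f:
        f.write(creds_json)

    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = local_path
    # Google Cloud client libraries now pick the credentials up automatically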
Diego_MSFT
by New Contributor II
  • 7455 Views
  • 1 reply
  • 4 kudos

Automating the re-run of a job (with several tasks) // automating notification of a specific failed task after retrying // error handling in an Azure Data Factory pipeline with a Databricks notebook

Hi Databricks experts: I'm using Databricks on Azure. I'd like to understand the following: 1) whether there is a way to automate re-running some specific failed tasks from a job (with several tasks); for example, if I have 4 tasks, and tasks 1 and 2 h...

Latest Reply
Lindberg
New Contributor III
  • 4 kudos

You can use "retries".In Workflow, select your job, the task, and in the options below, configure retries.If so, you can also see more options at:https://learn.microsoft.com/pt-br/azure/databricks/dev-tools/api/2.0/jobs?source=recommendations

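In the Jobs API, those retry options live on each task; a sketch of the relevant fragment of a job definition (the values are examples, not from this thread):

    {
      "task_key": "task_1",
      "max_retries": 2,
      "min_retry_interval_millis": 60000,
      "retry_on_timeout": false
    }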
Data_Engineer3
by Contributor III
  • 13290 Views
  • 4 replies
  • 5 kudos

How can I use the same Spark session from one notebook in another notebook in Databricks

I want to use the same Spark session that was created in one notebook in another notebook in the same environment. For example, if an object (variable) got initialized in the first notebook, I need to use the same object in t...

Latest Reply
Manoj12421
Valued Contributor II
  • 5 kudos

You can use %run and then use the location of the notebook - %run "/folder/notebookname"

3 More Replies
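Note that %run executes the target notebook in the caller's context, so its variables and objects become available directly; a separate notebook run (for example, a job task) gets its own session and cannot share them. A minimal sketch, with the notebook path and variable name as placeholders:

    # cell in the consuming notebook
    # %run /folder/setup_notebook
    #
    # anything defined in setup_notebook, e.g. df_customers, is now usable here:
    # display(df_customers)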
Anonymous
by Not applicable
  • 7800 Views
  • 1 reply
  • 1 kudos

Testing framework using Databricks Notebook and Pytest.

Hi friends, I am designing a testing framework using Databricks and pytest. I am currently stuck with report generation: it generates a blank report with only the default parameters. For example: <testsuites><testsuite name="pytest" errors="0" failures="0" skippe...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Vijaya Palreddy: There are several testing frameworks available for data testing that you can consider using with Databricks and pytest. Great Expectations: Great Expectations is an open-source framework that provides a simple way to create and main...

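A blank JUnit report usually means pytest collected zero tests; a sketch of invoking pytest from a notebook against plain .py test files (the paths are placeholder assumptions):

    # run pytest programmatically and write a JUnit XML report
    import pytest

    exit_code = pytest.main([
        "/Workspace/Repos/me@example.com/my-repo/tests",   # hypothetical test directory
        "-v",
        "--junitxml=/dbfs/FileStore/reports/report.xml",
    ])
    print("pytest exit code:", exit_code)

If the report still contains only the default attributes, check that the files and functions match pytest's discovery rules (test_*.py files, test_* functions).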
Michael_Marquis
by New Contributor II
  • 4570 Views
  • 1 reply
  • 3 kudos

How can I change the font size on ipywidgets in a Databricks notebook?

I'm trying to create a simple UI for a notebook using the recently implemented support for ipywidgets, but I'm having a hard time figuring out how to change certain style attributes like font size and color in widgets that should accept those style p...

Latest Reply
Miguel_Suarez
Databricks Employee
  • 3 kudos

Hey Michael, the example you're trying to run is for ipywidgets 8; we currently have ipywidgets 7, which has fewer button customizations. I believe the only font customization available in 7 is "font_weight". I hope this helps. Best, Miguel

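A minimal sketch of that single knob in ipywidgets 7:

    import ipywidgets as widgets

    button = widgets.Button(description="Run")
    button.style.font_weight = "bold"   # font_size and text_color only arrive in ipywidgets 8
    display(button)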