Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

snarfed
by New Contributor II
  • 2693 Views
  • 3 replies
  • 5 kudos

Serverless SQL endpoints on Azure?

Serverless SQL Endpoints sound exciting! It sounds like they've been in preview on AWS for a couple of months. Any idea if/when they're coming to Azure?

Latest Reply
-werners-
Esteemed Contributor III

There is always Synapse Serverless muhahaha

2 More Replies
Jreco
by Contributor
  • 4649 Views
  • 2 replies
  • 5 kudos

Resolved! Reference py file from a notebook

Hi all, I'm trying to reference a .py file from a notebook following this documentation: Files in Repos. I downloaded and added the files to my repo, but when I try to run the notebook, the module is not recognized. Any idea why this is happening? Thanks ...

Latest Reply
-werners-
Esteemed Contributor III

You can find some more info in this topic: https://community.databricks.com/s/question/0D53f00001Pp5EhCAJ. The docs are not that clear.
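A rough, hedged sketch of the Files in Repos pattern that documentation describes; the repo path and module names below are illustrative, not taken from the original post:

import sys

# Illustrative repo location; on recent runtimes the repo root is often
# already on sys.path, so this append is the explicit fallback the docs mention.
repo_root = "/Workspace/Repos/my.name@example.com/my-repo"
if repo_root not in sys.path:
    sys.path.append(repo_root)

from utils.helpers import my_function  # hypothetical utils/helpers.py inside the repo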

1 More Replies
Mec_Mec
by New Contributor II
  • 5513 Views
  • 6 replies
  • 4 kudos

Resolved! Copy a script from the current subscription to a new subscription

I would like to check whether there is a process to copy or migrate a script/notebook from the current Azure Databricks subscription to a new Databricks subscription (new notebook).

Latest Reply
Mec_Mec
New Contributor II

How can I quickly move Databricks notebooks from one account to another?
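Not from the thread, but one common approach is to export notebooks from the old workspace and import them into the new one, either with the Databricks CLI or the Workspace REST API. A rough sketch with placeholder hosts, tokens, and paths:

import requests

SRC = {"host": "https://adb-1111111111111111.11.azuredatabricks.net", "token": "<old-workspace-pat>"}
DST = {"host": "https://adb-2222222222222222.22.azuredatabricks.net", "token": "<new-workspace-pat>"}
NOTEBOOK = "/Users/me@example.com/my_notebook"  # placeholder path

def headers(ws):
    return {"Authorization": f"Bearer {ws['token']}"}

# Export the notebook source from the old workspace (content comes back base64-encoded).
exported = requests.get(
    f"{SRC['host']}/api/2.0/workspace/export",
    headers=headers(SRC),
    params={"path": NOTEBOOK, "format": "SOURCE"},
)
exported.raise_for_status()

# Import the same base64 payload into the new workspace.
imported = requests.post(
    f"{DST['host']}/api/2.0/workspace/import",
    headers=headers(DST),
    json={
        "path": NOTEBOOK,
        "format": "SOURCE",
        "language": "PYTHON",
        "content": exported.json()["content"],
        "overwrite": True,
    },
)
imported.raise_for_status()

For whole folders, the CLI's workspace export_dir / import_dir commands do the same thing in bulk.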

5 More Replies
Håkon_Åmdal
by New Contributor III
  • 2834 Views
  • 1 reply
  • 1 kudos

Resolved! Incorrect length for `string` returned by the Databricks ODBC driver

Dear Databricks and community, I have been struggling with a bug related to using Go and the Databricks ODBC driver. It turns out that `SQLDescribeColW` consistently returns 256 as the length for `string` columns. However, in Spark, strings might b...

Latest Reply
User16829050420
Databricks Employee

Thanks for posting this issue, @Håkon Åmdal. We should be able to reproduce it and report it to the Magnitude team subsequently.

RasmusOlesen
by New Contributor III
  • 5375 Views
  • 5 replies
  • 1 kudos

Resolved! ciso8601 library stopped installing out of the blue on DB clusters

We have multiple DB clusters (6.4 Extended Support) that have not changed in terms of installed libraries or nodes. Suddenly, from one day to the next, after a cluster restart on August 7th, they stopped installing the ciso8601 lib as they usually would. Anyb...

Latest Reply
RasmusOlesen
New Contributor III

Just to close this old question: we solved this by switching to a PEP 517-free pip install, using a global init script: `/databricks/python/bin/pip install ciso8601 --disable-pip-version-check --no-use-pep517`. Now it works for us.

4 More Replies
ssm3819
by New Contributor III
  • 10612 Views
  • 2 replies
  • 3 kudos

Please let me know how I can install PyAudio using a Databricks notebook

Hi, I am trying to install the PyAudio package, but I am getting the following error: Collecting pyaudio Using cached PyAudio-0.2.11.tar.gz (37 kB) Building wheels for collected packages: pyaudio Building wheel for pyaudio (setup.py) ... error ERROR: Co...

Latest Reply
-werners-
Esteemed Contributor III

Looks like a missing dependency on the server (Linux): portaudio. This should be installed: https://stackoverflow.com/questions/48690984/portaudio-h-no-such-file-or-directory
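As an illustration of that suggestion (assumptions mine: an Ubuntu-based cluster where notebook code runs with root privileges), installing the PortAudio headers before building PyAudio might look like this in a notebook cell:

import subprocess

# Install the PortAudio development headers the PyAudio build needs,
# then retry the pip install against them.
subprocess.run(["apt-get", "update"], check=True)
subprocess.run(["apt-get", "install", "-y", "portaudio19-dev"], check=True)
subprocess.run(["/databricks/python/bin/pip", "install", "pyaudio"], check=True)

To cover every node rather than just the driver, the same commands would belong in a cluster init script.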

1 More Replies
NAS
by New Contributor III
  • 2627 Views
  • 1 reply
  • 0 kudos

Set tags for an MLflow experiment using Python?

There is this REST API: https://www.mlflow.org/docs/latest/rest-api.html#set-experiment-tag. Can I do the same from Python's MLflow API?

Latest Reply
NAS
New Contributor III

Someone answered first on Stack Overflow. Here it is:
from mlflow.tracking import MlflowClient
# Create an experiment with a name that is unique and case sensitive.
client = MlflowClient()
experiment_id = client.create_experiment("Social NLP Experime...

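For completeness, a small sketch of the Python-API equivalent of the REST set-experiment-tag endpoint the question asks about; the experiment name and tag values here are illustrative:

from mlflow.tracking import MlflowClient

client = MlflowClient()

# Create (or look up) an experiment, then attach a tag to it.
experiment_id = client.create_experiment("my-example-experiment")
client.set_experiment_tag(experiment_id, "team", "data-engineering")

# Recent MLflow versions also expose mlflow.set_experiment_tag(key, value)
# for whichever experiment is currently active.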
MadelynM
by Databricks Employee
  • 3198 Views
  • 2 replies
  • 4 kudos

Resolved! Why isn't my notebook search function working?

My search function is broken. I can't search for notebook contents.

Latest Reply
lizou
Contributor II

Here is a tool available: elsevierlabs-os/NotebookDiscovery: Notebook Discovery Tool for Databricks notebooks (github.com). See also: How to Catalog and Discover Your Databricks Notebooks Faster - The Databricks Blog.

1 More Replies
Prabakar
by Databricks Employee
  • 2210 Views
  • 0 replies
  • 2 kudos

Accessing the regions that are disabled by default in AWS from Databricks

Accessing the regions that are disabled by default in AWS from Databricks. In AWS we have four regions that are disabled by default. You must first enable them before you can create and manage resources. The following Regions are disabled by default: Africa...

Jreco
by Contributor
  • 13734 Views
  • 13 replies
  • 3 kudos

Event Hub streaming: improving the processing rate

Hi all, I'm working with Event Hubs and Databricks to process and enrich data in real time. Doing a "simple" test, I'm getting some weird values (input rate vs. processing rate) and I think I'm losing data. As you can see, there is a peak with 5k record...

Latest Reply
jose_gonzalez
Databricks Employee

Hi @Jhonatan Reyes, how many Event Hubs partitions are you reading from? Your micro-batch takes a few milliseconds to complete, which I think is a good time, but I would like to understand better what you are trying to improve here. Also, in this case ...

12 More Replies
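A generic sketch (not from the thread) of how to read the input-rate and processing-rate numbers the screenshots refer to, straight from the streaming query's progress metrics:

import json

# 'query' is the StreamingQuery handle returned by writeStream.start();
# grabbing the first active query here just for illustration.
query = spark.streams.active[0]

progress = query.lastProgress  # dict describing the most recent micro-batch
if progress:
    print("numInputRows:          ", progress["numInputRows"])
    print("inputRowsPerSecond:    ", progress["inputRowsPerSecond"])
    print("processedRowsPerSecond:", progress["processedRowsPerSecond"])
    print(json.dumps(progress["durationMs"], indent=2))  # where the batch time goes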
BigJay
by New Contributor II
  • 5279 Views
  • 5 replies
  • 5 kudos

Capture num_affected_rows in notebooks

If I run some code, say for an ETL process to migrate data from bronze to silver storage, when a cell executes it reports num_affected_rows in a table format. I want to capture that and log it in my logger. Is it stored in a variable or syslogged som...

Latest Reply
-werners-
Esteemed Contributor III

Good one, Dan! I never thought of using the Delta API for this, but there you go.
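A hedged sketch of that Delta-API idea (the table name is a placeholder): the row counts for the last commit can be read back from the table history's operationMetrics and sent to a logger.

from delta.tables import DeltaTable

# Inspect the most recent commit on the target table and pull its metrics.
silver = DeltaTable.forName(spark, "silver.my_table")  # placeholder name
last = silver.history(1).select("operation", "operationMetrics").first()

metrics = last["operationMetrics"]  # map of metric name -> string value
# MERGEs report keys like numTargetRowsInserted/numTargetRowsUpdated;
# plain writes report numOutputRows.
num_rows = int(metrics.get("numOutputRows", 0))
print(last["operation"], num_rows, metrics)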

4 More Replies
xiaozy
by New Contributor
  • 1542 Views
  • 1 reply
  • 1 kudos
Latest Reply
Prabakar
Databricks Employee

Hi @xiaojun wang, please check the blog and let us know if this helps you: https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html
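Since that blog introduces window functions in Spark SQL, a tiny generic example of the pattern (table and column names are illustrative):

from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Keep only the largest order per customer using a ranking window.
w = Window.partitionBy("customer_id").orderBy(F.col("amount").desc())

top_orders = (
    spark.table("orders")            # illustrative source table
    .withColumn("rn", F.row_number().over(w))
    .filter("rn = 1")
    .drop("rn")
)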

Frankooo
by New Contributor III
  • 7491 Views
  • 8 replies
  • 7 kudos

How to optimize exporting a dataframe to Delta format?

Scenario: I have a dataframe that has 5 billion records/rows and 100+ columns. Is there a way to write this in Delta format efficiently? I tried to export it but cancelled it after 2 hours (the write didn't finish), as this processing time is not ...

Latest Reply
jose_gonzalez
Databricks Employee

Hi @Franco Sia, I would recommend avoiding repartition(50); instead, enable optimized writes on your Delta table. You can find more details here. Enable optimized writes and auto compaction on your Delta table. Use AQE (docs here) to have eno...

7 More Replies
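A hedged sketch of the settings that reply points to, using the property names as they are commonly documented for Databricks (table names are placeholders, not from the thread):

# Session-level defaults: optimized writes, auto compaction and AQE.
spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")
spark.conf.set("spark.databricks.delta.autoCompact.enabled", "true")
spark.conf.set("spark.sql.adaptive.enabled", "true")

# Or pin the behaviour to the table itself via table properties.
spark.sql("""
    ALTER TABLE silver.big_table SET TBLPROPERTIES (
        'delta.autoOptimize.optimizeWrite' = 'true',
        'delta.autoOptimize.autoCompact'   = 'true'
    )
""")

# Let Delta and AQE size the output files instead of forcing repartition(50).
df = spark.table("bronze.big_table")  # stand-in for the 5-billion-row dataframe
df.write.format("delta").mode("append").saveAsTable("silver.big_table")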
dbu_spark
by New Contributor III
  • 7693 Views
  • 10 replies
  • 6 kudos

Older Spark Version loaded into the spark notebook

I have the Databricks runtime for a job set to the latest 10.0 Beta (includes Apache Spark 3.2.0, Scala 2.12). In the notebook, when I check the Spark version, I see version 3.1.0 instead of version 3.2.0. I need Spark version 3.2 to process workloads a...

Latest Reply
jose_gonzalez
Databricks Employee

Hi @Dhaivat Upadhyay, good news: DBR 10 was released yesterday, October 20th. You can find more details on the release notes website.

9 More Replies
