Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

asad77007
by New Contributor II
  • 2287 Views
  • 3 replies
  • 1 kudos

How to connect Analysis Service Cube with Databricks notebook

I am trying to connect an Analysis Services cube with a Databricks notebook but unfortunately haven't found any solution yet. Is there any possible way to connect an AS cube with a Databricks notebook? If yes, can someone please guide me?

Latest Reply
omfspartan
New Contributor III
  • 1 kudos

I am able to connect to Azure Analysis Services using the Azure Analysis Services REST API. Is yours on-prem?
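A minimal sketch of that approach, for reference only (this is not the thread's accepted answer; the tenant, app registration, server region, server name, and model below are placeholders, and the refresh endpoint only covers processing operations, not interactive queries):

```python
import msal
import requests

# Placeholder values; in Databricks, pull the client secret from a secret scope.
TENANT_ID = "<tenant-id>"
CLIENT_ID = "<app-client-id>"
CLIENT_SECRET = "<app-client-secret>"
REGION = "westus"        # e.g. westus.asazure.windows.net
SERVER = "myaasserver"   # Azure Analysis Services server name
MODEL = "MyModel"        # model (database) name

# Acquire a service-principal token scoped to Azure Analysis Services.
app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)
token = app.acquire_token_for_client(scopes=["https://*.asazure.windows.net/.default"])

# Trigger an asynchronous model refresh; the Location header points at the
# refresh operation, which can be polled for status.
resp = requests.post(
    f"https://{REGION}.asazure.windows.net/servers/{SERVER}/models/{MODEL}/refreshes",
    headers={"Authorization": f"Bearer {token['access_token']}"},
    json={"Type": "Full", "CommitMode": "transactional"},
)
resp.raise_for_status()
print(resp.status_code, resp.headers.get("Location"))
```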

2 More Replies
Baldrez
by New Contributor II
  • 3480 Views
  • 4 replies
  • 5 kudos

Resolved! REST API for Stream Monitoring

Hi, everyone. I just recently started using Databricks on Azure, so my question is probably very basic, but I am really stuck right now. I need to capture some streaming metrics (number of input rows and their time), so I tried using the Spark REST API ...
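The accepted answer is not reproduced above; for context, one common way to capture per-micro-batch row counts and timestamps without the Spark REST API is the query's own progress reports (the rate/memory stream below is a throwaway placeholder, not the poster's pipeline):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder stream: a rate source writing to an in-memory sink, just to show the metrics API.
query = (
    spark.readStream.format("rate").option("rowsPerSecond", 10).load()
         .writeStream.format("memory").queryName("probe")
         .trigger(processingTime="5 seconds")
         .start()
)

query.awaitTermination(20)        # let a few micro-batches run (demo only)
for p in query.recentProgress:    # list of per-batch progress dictionaries
    print(p["timestamp"], p["numInputRows"])
query.stop()
```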

Latest Reply
jose_gonzalez
Moderator
  • 5 kudos

Hi @Roberto Baldrez, if you think that @Gaurav Rupnar solved your question, then please select it as the best response so it can be moved to the top of the topic and help more users in the future. Thank you.

3 More Replies
zero234
by New Contributor III
  • 1801 Views
  • 2 replies
  • 2 kudos

I have created a DLT pipeline which reads data from JSON files stored in a Databricks volume

I have created a DLT pipeline which reads data from JSON files stored in a Databricks volume and puts the data into a streaming table. This was working fine. When I tried to read the data that is inserted into the table and compare the values with t...

Latest Reply
AmanSehgal
Honored Contributor III
  • 2 kudos

Keep your DLT code separate from your comparison code, and run your comparison code once your DLT data has been ingested.
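A minimal sketch of the ingestion half of that split, assuming JSON files land in a Unity Catalog volume (the volume path and table name are placeholders); the comparison would then run as a separate task after the pipeline update completes:

```python
import dlt
from pyspark.sql import functions as F

SOURCE_PATH = "/Volumes/my_catalog/my_schema/my_volume/landing/"  # placeholder volume path

@dlt.table(name="bronze_events", comment="Raw JSON ingested from a UC volume via Auto Loader")
def bronze_events():
    # `spark` is provided by the DLT runtime; Auto Loader handles incremental file discovery.
    return (
        spark.readStream.format("cloudFiles")
             .option("cloudFiles.format", "json")
             .load(SOURCE_PATH)
             .withColumn("_ingested_at", F.current_timestamp())
    )
```

The comparison notebook can then read bronze_events as a normal table once the pipeline update has finished.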

1 More Replies
Avinash_Narala
by Contributor
  • 1053 Views
  • 1 reply
  • 1 kudos

Unity Catalog Migration

Hello, we are in the process of migrating to Unity Catalog. Can I know how to automate the process of refactoring the notebooks to Unity Catalog?

Data Engineering
automation
migration
unitycatalog
Latest Reply
MinThuraZaw
New Contributor III
  • 1 kudos

Hi @Avinash_Narala There is no one-click solution to refactor all table names in notebooks to UC's three-level namespaces. At a minimum, manually updating table names is required during the migration process. One option is to use the search feature. Search ...
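The thread does not include a script, but the search-and-rewrite idea can be sketched roughly like this, assuming notebooks exported as source files and a hand-maintained mapping from old names to three-level UC names (all paths and names below are placeholders):

```python
import pathlib
import re

# Hypothetical mapping; build it from your own migration inventory.
TABLE_MAP = {
    "hive_metastore.sales.orders": "main.sales.orders",
    "sales.orders": "main.sales.orders",
}

EXPORT_DIR = pathlib.Path("./exported_notebooks")  # placeholder export location

for path in EXPORT_DIR.rglob("*.py"):
    text = path.read_text()
    updated = text
    for old, new in TABLE_MAP.items():
        updated = re.sub(rf"\b{re.escape(old)}\b", new, updated)
    if updated != text:
        print(f"Would rewrite table references in {path}")
        # path.write_text(updated)  # uncomment only after reviewing the diff
```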

valjas
by New Contributor III
  • 8043 Views
  • 3 replies
  • 0 kudos

Disable Machine Learning and Job Creation

We are working on creating a new Databricks workspace for external entities. We have disabled the Cluster and Warehouse creation permissions, but the external users are still able to create jobs and job clusters. Is there a way to revoke job creation permi...

Latest Reply
Venk1599
New Contributor II
  • 0 kudos

It permits cluster creation during Workflow/Job/DLT pipeline creation. However, when attempting to start any of these, it fails with a 'Not authorized to create compute' error. Please try it and let me know the outcome.

2 More Replies
jaimeperry12345
by New Contributor
  • 740 Views
  • 1 reply
  • 0 kudos

duplicate files in delta table

I have been facing this issue for a long time, but so far there is no solution. I have a Delta table. My bronze layer is randomly picking up old files (mostly files around 8 days old). My source of files is Azure Blob Storage.

Latest Reply
Palash01
Valued Contributor
  • 0 kudos

Hey @jaimeperry12345 I will need more information to point you in the right direction: Confirm the behavior: double-check that your Delta table is indeed reading 8-day-old files randomly. Provide any logs or error messages you have regarding this. Ex...
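One hedged way to gather that evidence, assuming a placeholder bronze table and business key (neither comes from the thread):

```python
from pyspark.sql import functions as F

df = spark.read.table("my_catalog.bronze.events")   # placeholder bronze table

# Rows whose business key appears more than once point at duplicate ingestion.
# If your bronze load captured the source file path at ingestion time
# (e.g. via _metadata.file_path), add that column to the aggregation to see
# whether the same blob was picked up twice.
dupes = (
    df.groupBy("event_id")                           # placeholder key column
      .agg(F.count(F.lit(1)).alias("n_rows"))
      .filter("n_rows > 1")
)
dupes.show(truncate=False)
```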

PaulineX
by New Contributor III
  • 2597 Views
  • 3 replies
  • 1 kudos

Resolved! can I use volume for external table location?

Hello, I have a parquet file test.parquet in the volume volume_ext_test. I tried to create an external table as below; it failed and says it "is not a valid URI". create table catalog_managed.schema_test.tbl_vol as select * from parquet.`/Volumes/catalog_...

Latest Reply
AmanSehgal
Honored Contributor III
  • 1 kudos

Hi @PaulineX As per the documentation, you cannot use volumes for storing table data. They are for loading, storing, and accessing files. You cannot use volumes as a location for tables. Volumes are intended for path-based data access only. Use tables for ...
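A workaround consistent with that guidance, sketched with placeholder paths: volumes still allow path-based reads, so read the file from the volume and persist it as a managed Unity Catalog table rather than pointing the table at the volume.

```python
# Path-based read from the volume is supported; the table itself is then
# stored as a managed UC table instead of using the volume as its location.
df = spark.read.parquet("/Volumes/my_catalog/my_schema/volume_ext_test/test.parquet")  # placeholder path
df.write.mode("overwrite").saveAsTable("catalog_managed.schema_test.tbl_vol")
```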

2 More Replies
Alex_O
by New Contributor II
  • 1168 Views
  • 1 reply
  • 0 kudos

Migrating Job Orchestration to Shared Compute and avoiding(?) refactoring

In an effort to migrate our data objects to Unity Catalog, we must migrate our job orchestration to leverage Shared Compute to interact with the three-level namespace hierarchy. We have some functions and references to code that are outside of the features ...

Data Engineering
Shared Compute
spark
Unity Catalog
Latest Reply
Alex_O
New Contributor II
  • 0 kudos

@Kaniz_Fatma Okay, that makes sense, thank you. What about the approach to identifying these unsupported methods? Is there any documentation of what is unsupported between Unrestricted and Shared?
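Not from the thread, but one rough way to flag candidates while waiting on documentation is to scan exported notebook source for API patterns commonly restricted in shared access mode; the pattern list below is an assumption and should be checked against the compute access-mode limitations page:

```python
import pathlib
import re

# Assumed patterns; extend from the access-mode limitations documentation.
SUSPECT_PATTERNS = [
    r"\bsparkContext\b",   # RDD / SparkContext access
    r"\bsc\.parallelize\b",
    r"spark\._jvm",        # direct JVM access
    r"setCheckpointDir",
]

EXPORT_DIR = pathlib.Path("./exported_notebooks")  # placeholder export location

for path in EXPORT_DIR.rglob("*.py"):
    for i, line in enumerate(path.read_text().splitlines(), start=1):
        for pattern in SUSPECT_PATTERNS:
            if re.search(pattern, line):
                print(f"{path}:{i}: matches {pattern}")
```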

hps2
by New Contributor II
  • 1217 Views
  • 0 replies
  • 0 kudos

duplicate files in bronze delta table

Hello All, I have been facing this issue for a long time, but so far there is no solution. I have a Delta table. My bronze layer is randomly picking up old files (mostly files around 8 days old). My source of files is Azure Blob Storage. Those files are not being upd...

irfanaziz
by Contributor II
  • 24767 Views
  • 7 replies
  • 8 kudos

Resolved! How to merge small parquet files into a single parquet file?

I have thousands of parquet files having the same schema, and each has one or more records. But reading these files with Spark is very, very slow. I want to know if there is any solution for merging the files before reading them with Spark. Or is there any ...

Latest Reply
mmore500
New Contributor II
  • 8 kudos

Give [*joinem*](https://github.com/mmore500/joinem) a try, available via PyPI: `python3 -m pip install joinem`. *joinem* provides a CLI for fast, flexible concatenation of tabular data using [polars](https://pola.rs). I/O is *lazily streamed* in order ...
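Independent of the joinem suggestion, a plain PySpark compaction pass is another option; a sketch with placeholder paths and an arbitrary target partition count:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

SOURCE = "s3://my-bucket/small-parquet-files/"   # placeholder input prefix
TARGET = "s3://my-bucket/compacted-parquet/"     # placeholder output prefix

# One wide read of the many small files, rewritten as a small number of larger files.
(
    spark.read.parquet(SOURCE)
         .repartition(16)   # choose so each output file lands around 128 MB to 1 GB
         .write.mode("overwrite")
         .parquet(TARGET)
)
```

Subsequent reads then target the compacted copy instead of the thousands of small files.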

6 More Replies
Prem1902
by New Contributor II
  • 1550 Views
  • 2 replies
  • 1 kudos

Resolved! Cost of running a job on databricks

Hi All, I need assistance with the cost of running a job on Databricks where I have 20-30 TB (a one-time job) and the daily data would be around 2 GB. The level of transformation would be medium. Source and destination are AWS S3. Looking for your quick re...

Latest Reply
Prem1902
New Contributor II
  • 1 kudos

Is there a way to predict the cost before building the solution? I mean, we want to evaluate our options across different platforms.
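The thread gives no sizing details, so only a back-of-the-envelope model is possible; every rate and cluster size below is an assumption for illustration, not a quote:

```python
# Illustrative cost model:
#   total ≈ run_hours × nodes × (DBU/node-hour × $/DBU + VM $/node-hour)
# Driver node and storage/egress costs are ignored for simplicity.
nodes = 10                  # assumed worker count
dbu_per_node_hour = 2.0     # assumed; depends on instance type
usd_per_dbu = 0.15          # assumed Jobs Compute rate; check current pricing
vm_usd_per_node_hour = 1.0  # assumed cloud VM rate
run_hours = 8               # assumed duration for the one-time 20-30 TB backfill

total = run_hours * nodes * (dbu_per_node_hour * usd_per_dbu + vm_usd_per_node_hour)
print(f"Rough one-time estimate: ${total:,.0f}")
```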

1 More Replies
valjas
by New Contributor III
  • 651 Views
  • 1 reply
  • 0 kudos

Is it possible to migrate SQL Objects from one workspace to another?

We have SQL queries and dashboards in workspace dev_01. A new workspace dev_02 has been created and Unity Catalog is enabled. I was able to migrate jobs, clusters, DLTs, SQL warehouses, and users using APIs. But while migrating queries using APIs, I can't get th...

Latest Reply
jcoggs
New Contributor II
  • 0 kudos

I'm doing something similar, but I haven't run into this parent directory issue. [Actually, to be clear, I ran into an issue around missing user directories, but I believe that was different from what you describe.] Before migrating the queries, I'm re...
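A rough sketch of that copy step using the legacy preview SQL Queries REST API; the endpoint paths and payload fields are assumptions to verify against the current REST reference, and owner directories should be recreated in the target workspace first, as noted above:

```python
import requests

SRC = {"host": "https://dev-01.cloud.databricks.com", "token": "<src-pat>"}  # placeholders
DST = {"host": "https://dev-02.cloud.databricks.com", "token": "<dst-pat>"}

def headers(ws):
    return {"Authorization": f"Bearer {ws['token']}"}

# List queries in the source workspace (legacy preview endpoint; paging omitted).
queries = requests.get(
    f"{SRC['host']}/api/2.0/preview/sql/queries",
    headers=headers(SRC), params={"page_size": 100},
).json().get("results", [])

for q in queries:
    payload = {
        "name": q["name"],
        "description": q.get("description") or "",
        "query": q["query"],
        "data_source_id": "<dst-warehouse-data-source-id>",  # placeholder: map to a dev_02 warehouse
    }
    r = requests.post(f"{DST['host']}/api/2.0/preview/sql/queries",
                      headers=headers(DST), json=payload)
    r.raise_for_status()
```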

Jaris
by New Contributor III
  • 2065 Views
  • 3 replies
  • 1 kudos

CDC Delta table select using startingVersion on Shared cluster running DBR 14.3 does not work

Hello everyone, we have switched from DBR 13.3 to 14.3 on our Shared development cluster, and I am no longer able to run the following read from a Delta table with CDC enabled: data = ( spark.read.format("delta") .option("readChangeFeed", "true") .op...
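For reference, an un-truncated change-data-feed read typically has this shape (a generic sketch with a placeholder table and version, not the poster's exact code):

```python
data = (
    spark.read.format("delta")
         .option("readChangeFeed", "true")
         .option("startingVersion", 5)            # placeholder version
         .table("my_catalog.my_schema.my_table")  # placeholder CDC-enabled table
)
# CDF adds change-tracking columns to the result.
data.select("_change_type", "_commit_version", "_commit_timestamp").show()
```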

2 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group