Data Engineering

Forum Posts

shraddharane (New Contributor)
  • 1377 Views
  • 2 replies
  • 0 kudos

Migrating a legacy SSAS cube to Databricks

We have a SQL database designed as a star schema, and we are migrating the data from SQL to Databricks. There are cubes built with SSAS that end users query from Excel for analysis. We are now looking for a solution for: 1) Can...

Latest Reply by Kaniz (Community Manager)

Hi @shraddharane, 1) Can the cubes be migrated? No, SSAS cubes cannot be directly migrated to Databricks; Databricks does not support the concept of multidimensional cubes like SSAS. Databricks is a Lakehouse architecture built on the foundation of Delta ...

1 More Replies
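
The reply's excerpt cuts off before any alternative. As a hedged sketch (not the reply's own solution), cube-style measures are often rebuilt as aggregated views over the star schema once it lands in Delta; every table and column name below is hypothetical:

```python
# Minimal sketch: expose a cube-style measure as an aggregated view over
# hypothetical star-schema tables migrated to Delta.
# `spark` is the ambient SparkSession in a Databricks notebook.
spark.sql("""
    CREATE OR REPLACE VIEW analytics.sales_by_month AS
    SELECT d.year, d.month, SUM(f.sales_amount) AS total_sales
    FROM analytics.sales_fact f
    JOIN analytics.date_dim d ON f.date_key = d.date_key
    GROUP BY d.year, d.month
""")
```

Excel users could then query such views through the Databricks ODBC driver, which covers much of the slice-and-dice analysis the SSAS cubes served.
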
140015 (New Contributor III)
  • 940 Views
  • 3 replies
  • 1 kudos

Resolved! Using DLT pipeline with non-incremental data

Hi, I would like to know what you think about using Delta Live Tables when the source for the pipeline is not incremental. What I mean is: suppose the data provider creates a new folder of files for me each time it has an update to the...

Latest Reply by Joe_Suarez (New Contributor III)

When dealing with B2B data building, the process of updating and managing your data can present unique challenges. Since your data updates involve new folders with files and you need to process the entire new folder, the concept of incremental proces...

2 More Replies
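
To make the trade-off concrete, here is a minimal sketch (the JSON source path is a placeholder) of a non-streaming DLT table: because it uses a batch read, DLT recomputes it in full on every pipeline update, with no incremental assumptions about the source:

```python
import dlt

SOURCE_PATH = "/mnt/provider/latest/"  # hypothetical folder delivered by the provider

@dlt.table(comment="Recomputed in full on every pipeline update")
def provider_snapshot():
    # A batch (non-streaming) read makes this a materialized table that
    # DLT rebuilds from scratch each run, which suits non-incremental sources.
    return spark.read.format("json").load(SOURCE_PATH)
```
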
GNarain (New Contributor II)
  • 3065 Views
  • 12 replies
  • 5 kudos

Resolved! Is there an API call to set the "Table access control" workspace config?

Is there an API call to set the "Table access control" workspace config?

Latest Reply by Kaniz (Community Manager)

Hi @GNarain, here is an example of the API call. Could you try it and let us know?

POST /api/2.0/workspace/update
{ "workspaceAccessControlEnabled": true }

This API call will enable table access control for your workspace. You can make this API call u...

11 More Replies
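
For reference, the call quoted in the reply could be issued from Python as below. The endpoint and payload are taken verbatim from the reply above, so double-check them against the current Databricks REST API documentation before relying on them; the host and token are placeholders:

```python
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder workspace URL
TOKEN = "<personal-access-token>"                        # placeholder PAT

# Endpoint and payload exactly as quoted in the reply above.
resp = requests.post(
    f"{HOST}/api/2.0/workspace/update",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"workspaceAccessControlEnabled": True},
)
resp.raise_for_status()
print(resp.status_code)
```
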
Eldar_Dragomir (New Contributor II)
  • 793 Views
  • 1 reply
  • 2 kudos

Resolved! Reprocessing the data with Auto Loader

Could you please give me an idea of how I can start reprocessing my data? Imagine I have a folder in ADLS Gen2, "/test", with binary files that were already processed by the current pipeline. I want to reprocess that data and continue receiving new data. What t...

Latest Reply by Tharun-Kumar (Honored Contributor II)

@Eldar_Dragomir In order to reprocess the data, we have to change the checkpoint directory. This will start processing the files from the beginning. You can use cloudFiles.maxFilesPerTrigger to limit the number of files processed per micro-...

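
A minimal sketch of that advice, assuming hypothetical paths and a binary-file source: pointing the stream at a fresh checkpoint location replays everything under /test and then keeps ingesting new files, while cloudFiles.maxFilesPerTrigger throttles the backfill:

```python
# Hypothetical paths and table name; `spark` is the notebook's SparkSession.
df = (spark.readStream
      .format("cloudFiles")                          # Auto Loader
      .option("cloudFiles.format", "binaryFile")
      .option("cloudFiles.maxFilesPerTrigger", 100)  # throttle the replay
      .load("abfss://container@account.dfs.core.windows.net/test"))

(df.writeStream
   .option("checkpointLocation", "/checkpoints/test_v2")  # NEW checkpoint dir
   .toTable("bronze.test_files"))
```
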
anarad429 (New Contributor)
  • 694 Views
  • 1 reply
  • 1 kudos

Resolved! Unity Catalog + Reading variable from external notebook

I am trying to run a notebook that reads some of its variables from an external notebook (I used the %run command for that purpose), but it keeps giving me an error that these variables are not defined. This sequence of notebooks runs perfectly fine on a...

Latest Reply by Atanu (Esteemed Contributor)

I think the issue here is that the variable is not created until a value is assigned to it. So you may need to assign a value to get_sql_schema.

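
A minimal sketch of the pattern, with a hypothetical notebook name; the point from the reply is that the external notebook must actually assign the variable, and the %run magic must sit alone in its own cell:

```python
# --- cell in the external notebook ./config_nb (hypothetical name) ---
get_sql_schema = "my_catalog.my_schema"  # assign a value, don't just reference it

# --- cell 1 of the main notebook (magic command alone in the cell) ---
# %run ./config_nb

# --- cell 2 of the main notebook ---
print(get_sql_schema)  # defined now, because %run shares the namespace
```
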
NathanLaw (New Contributor III)
  • 412 Views
  • 1 reply
  • 0 kudos

CPU and GPU Elapsed Runtimes

I have 2 questions about elapsed job runtimes. The same scoring notebook is run 3 times as 3 jobs. The jobs are identical, with the same PetaStorm code, CPU cluster config (not a Spot cluster), and data, but they have varying elapsed runtimes. Elapsed runtimes...

Latest Reply by shyam_9 (Valued Contributor)

Hi @NathanLaw, could you please confirm whether you have set any parameters for the best model? Does it stop after running some epochs if there is no improvement in model performance?

Sanjay_AMP (New Contributor II)
  • 397 Views
  • 1 reply
  • 1 kudos

Deployment-ready sample source-code for Delta Live Table & Autoloader

Hi all, we are planning to develop an Autoloader-based DLT pipeline that needs to be:
  • Deployable via a CI/CD pipeline
  • Observable
Can somebody please point me to source code we can start from as a firm foundation, instead of falling into a newbie pattern ...

Latest Reply by Priyanka_Biswas (Valued Contributor)

Hi @Sanjay_AMP Delta Live Tables and Auto Loader can be used together to incrementally ingest data from cloud object storage. Python code example: define a table called "customers" that reads data from a CSV file in cloud object storage; define a...

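
A hedged sketch of the pattern the reply describes; the path and CSV options are placeholders, not the reply's actual code:

```python
import dlt

@dlt.table(comment="Customers ingested incrementally from cloud object storage")
def customers():
    # Auto Loader (cloudFiles) discovers new files incrementally, which keeps
    # the pipeline restartable and easy to redeploy from source control.
    return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "csv")
            .option("header", "true")
            .load("/mnt/raw/customers/"))  # placeholder path
```

For the CI/CD side, pipeline definitions like this are commonly versioned and deployed with Databricks Asset Bundles or the Databricks Terraform provider.
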
wojciech_jakubo (New Contributor III)
  • 4705 Views
  • 7 replies
  • 2 kudos

Question about monitoring driver memory utilization

Hi Databricks/Spark experts! I have a piece of pandas-based 3rd-party code that I need to execute as part of a bigger Spark pipeline. By nature, pandas-based code is executed on the driver node. I ran into out-of-memory problems and started exploring th...

[Attached image: driver memory cycles on a busy cluster]
Latest Reply by Tharun-Kumar (Honored Contributor II)

Hi @wojciech_jakubo 1. JVM memory will not be utilized for Python-related activities. 2. In the image we can only see the storage memory. We also have execution memory, which would be the same size. Hence I came up with the executor memory to be of ...

6 More Replies
Thor (New Contributor III)
  • 2503 Views
  • 1 reply
  • 2 kudos

Resolved! Dynamically change spark.task.cpus

Hello, I'm facing a problem with big tarballs: to decompress them and fit them in memory, I had to stop Spark from processing too many files at the same time, so I changed the following property on my cluster of 8-core VMs: spark.task.cpus 4. This setting is the thresh...

Latest Reply by jose_gonzalez (Moderator)

Hi @Thor, Spark does not offer the capability to dynamically modify configuration settings, such as spark.task.cpus, for individual stages or transformations while the application is running. Once a configuration property is set for a Spark applicati...

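
In other words, the value has to be fixed before the application starts. A minimal sketch in plain PySpark (on Databricks the same settings go in the cluster's Spark config rather than in code; the app name is hypothetical):

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("tarball-decompress")        # hypothetical app name
         .config("spark.task.cpus", "4")       # each task reserves 4 cores...
         .config("spark.executor.cores", "8")  # ...so at most 8/4 = 2 tasks per executor
         .getOrCreate())
```

If only one stage needs the throttling, a common workaround is to split the work into two jobs with different configs.
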
bharanireddy (New Contributor)
  • 1049 Views
  • 1 reply
  • 0 kudos

Resolved! Unable to access the Data Engineering with Databricks V3 course

Hello, since yesterday noon EST the Data Engineering with Databricks V3 course has been in maintenance mode. Can someone please help restore access? Thank you, Bharani

Latest Reply by youssefmrini (Honored Contributor III)

I believe you will have access now.

ThomasVanBilsen (New Contributor III)
  • 1212 Views
  • 1 reply
  • 0 kudos

Resolved! Lineage graph not working.

Hey everyone, I've run the following code successfully:
CREATE CATALOG IF NOT EXISTS lineage_data;
CREATE SCHEMA IF NOT EXISTS lineage_data.lineagedemo;
CREATE TABLE IF NOT EXISTS lineage_data.lineagedemo.menu (
    recipe_id INT,
    app STRING,
    main ...

Latest Reply by youssefmrini (Honored Contributor III)

I recommend you open a ticket with support.

lazcanja (New Contributor)
  • 812 Views
  • 1 reply
  • 1 kudos

Resolved! How to update a table location from wasb to abfss

I created a table with a location such as wasb://<container>@<storageaccount>.blob.core.windows.net/foldername. We have updated access to the storage accounts to use abfss. I am trying to execute the following command: alter table mydatabase.mytable set...

Latest Reply by Kaniz (Community Manager)

Hi @lazcanja, the error message indicates an issue with the configuration value for the storage account key; the error might be due to an incorrect or invalid key. Given the information provided, you have correctly changed the configuration from spar...

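
A hedged sketch of the full change, assuming an external table and an unchanged container layout; note that the account-key property for abfss uses the dfs endpoint rather than blob, and all names and secrets below are placeholders:

```python
# Point Spark at the storage key for the abfss (dfs) endpoint.
spark.conf.set(
    "fs.azure.account.key.<storageaccount>.dfs.core.windows.net",
    dbutils.secrets.get(scope="my_scope", key="storage_key"),  # hypothetical secret
)

spark.sql("""
    ALTER TABLE mydatabase.mytable
    SET LOCATION 'abfss://<container>@<storageaccount>.dfs.core.windows.net/foldername'
""")
```
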
SamCallister (New Contributor II)
  • 13079 Views
  • 8 replies
  • 3 kudos

Dynamic Partition Overwrite for Delta Tables

Spark supports dynamic partition overwrite for Parquet tables by setting the config spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic") before writing to a partitioned table. With Delta tables it appears you need to manually specif...

Latest Reply by alijen (New Contributor II)

@SamCallister wrote: Spark supports dynamic partition overwrite for Parquet tables by setting the config spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic") before writing to a partitioned table. With Delta tables it appears you need ...

7 More Replies
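
For later readers: newer Delta Lake releases (Delta 2.0 / DBR 11.1 and later) do support dynamic partition overwrite, either via the session config quoted above or per write. A sketch with placeholder paths, table names, and predicate:

```python
# Staged data to merge in; the path and table names are placeholders.
df = spark.read.format("delta").load("/tmp/staged_updates")

# Per-write dynamic partition overwrite: only the partitions present in `df`
# are replaced; the rest of the table is untouched.
(df.write
   .format("delta")
   .mode("overwrite")
   .option("partitionOverwriteMode", "dynamic")
   .saveAsTable("sales_partitioned"))

# Older alternative: scope the overwrite explicitly with replaceWhere.
(df.write
   .format("delta")
   .mode("overwrite")
   .option("replaceWhere", "event_date = '2023-01-01'")  # hypothetical predicate
   .saveAsTable("sales_partitioned"))
```
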
AleksandraFrolo (New Contributor III)
  • 398 Views
  • 0 replies
  • 0 kudos

Web scraping with Databricks

Hello, what is the easiest way to do web scraping in Databricks? Let's imagine that from this link, http://automated.pythonanywhere.com, I need to grab the element "/html/body/div[1]/div/h1[1]" and return its text. How can I do it? Can somebody write ...

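
The thread got no replies. As a hedged sketch, the usual driver-side approach is requests plus lxml (install with %pip install requests lxml if needed); if the element turns out to be rendered client-side by JavaScript, a browser-based tool such as Selenium would be required instead:

```python
import requests
from lxml import html

resp = requests.get("http://automated.pythonanywhere.com", timeout=10)
resp.raise_for_status()

tree = html.fromstring(resp.content)
# XPath taken from the question; /text() returns the element's text nodes.
matches = tree.xpath("/html/body/div[1]/div/h1[1]/text()")
print(matches[0].strip() if matches else "element not found")
```
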
DatabricksPract (New Contributor II)
  • 3801 Views
  • 2 replies
  • 2 kudos

Resolved! Get metadata of tables in hive metastore

Hi team, I have a requirement to get the metadata of the tables available in the Databricks hive metastore. Is there any way to get the metadata of all the tables instead of looping through them with DESCRIBE table_name? As the hive metastore does not support inf...

Latest Reply by DatabricksPract (New Contributor II)

@Tharun-Kumar - Thanks for your quick reply, it worked.

1 More Replies
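
The accepted answer isn't shown in this digest; one approach that avoids a DESCRIBE call per table is the Spark catalog API, sketched below:

```python
# Collect basic metadata for every table in every hive metastore database
# without issuing DESCRIBE per table. `spark` is the notebook's SparkSession.
rows = []
for db in spark.catalog.listDatabases():
    for t in spark.catalog.listTables(db.name):
        rows.append((db.name, t.name, t.tableType, t.isTemporary))

meta_df = spark.createDataFrame(
    rows, "database string, table string, type string, is_temporary boolean")
meta_df.show(truncate=False)
```
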