Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Hubert-Dudek
by Esteemed Contributor III
  • 1921 Views
  • 1 reply
  • 10 kudos

Untitled

Databricks SQL now supports getting the SQLSTATE code of a query to identify errors. You can use the e.getSqlState() method in a try/catch block to get the five-character code that indicates the success or failure of an SQL command. I am still thinki...
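In PySpark 3.4 and later the same code is exposed via PySparkException; a minimal sketch (assumes a notebook where spark is defined, and a made-up table name):

```
from pyspark.errors import PySparkException

try:
    spark.sql("SELECT * FROM nonexistent_table")  # hypothetical table name
except PySparkException as e:
    # Five-character SQLSTATE, e.g. '42P01' for a missing table
    print("SQLSTATE:", e.getSqlState())
```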
Latest Reply
jose_gonzalez
Databricks Employee
  • 10 kudos

Thank you for sharing @Hubert Dudek

databicky
by Contributor II
  • 5218 Views
  • 1 reply
  • 0 kudos

How to create borders in Excel with Python

How to create borders in Excel with Python, like the following format. In the "ex" column, if I enter the value "data", it should be centered between those rows.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Mohammed sadamusean: You can use the openpyxl library in Python to create borders in Excel. Here is an example code snippet that creates a border around a range of cells and centers the text in a specific column: import openpyxl from openpyxl.styles...
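A minimal openpyxl sketch of that idea (range, merged rows, and filename are made up):

```
import openpyxl
from openpyxl.styles import Border, Side, Alignment

wb = openpyxl.Workbook()
ws = wb.active

thin = Side(style="thin")
box = Border(left=thin, right=thin, top=thin, bottom=thin)

# Put a border around every cell in a sample range
for row in ws["A1:C5"]:
    for cell in row:
        cell.border = box

# Merge a few rows in column B and center the value across them
ws.merge_cells("B2:B4")
ws["B2"] = "data"
ws["B2"].alignment = Alignment(horizontal="center", vertical="center")

wb.save("bordered.xlsx")  # hypothetical output path
```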

429957
by New Contributor
  • 917 Views
  • 1 reply
  • 0 kudos

'DeltaColumnMappingUnsupportedException' when performing 'Full refresh all' on DLT pipeline

Trigger: Perform 'Full refresh all' on a DLT pipeline (new or existing). The existing DLT table already existed beforehand. Issue: Getting the error 'DeltaColumnMappingUnsupportedException' during the "Setting up tables" stage. ```com.databricks.sql.transact...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Raeger Tay: The error message indicates that a schema change has been detected while changing the column mapping mode. It seems like you are trying to change the column mapping mode from the default (position) to the "name" mode, which maps columns...
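If name-based mapping is the goal, it is usually safer to declare it explicitly in the table definition rather than letting a schema change try to switch modes; a sketch (table and source names are placeholders, and one full refresh may still be needed):

```
import dlt

@dlt.table(
    name="my_table",  # hypothetical table name
    table_properties={
        "delta.columnMapping.mode": "name",
        "delta.minReaderVersion": "2",
        "delta.minWriterVersion": "5",
    },
)
def my_table():
    return spark.readStream.table("source_table")  # hypothetical source
```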

KVNARK
by Honored Contributor II
  • 2381 Views
  • 1 reply
  • 5 kudos

Accessing a Power BI dataset with an MDX query works on Windows, but the same does not work from a Python Linux server.

Trying to access the SSAS Power BI dataset using an MDX query from a Python Linux server. We are hitting a roadblock. The existing setup works as expected on Windows thanks to adodb.dll, but we are unable to connect from Linux. Any help would be much appreciated...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

@KVNARK: One potential solution would be to use an open-source MDX library for Python that can connect to SSAS, such as OLAP-XMLA for Python. This library can be used to execute MDX queries against an SSAS server, including Power BI datasets. Here's...
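The reply is cut off, but the underlying mechanism is worth noting: XMLA is SOAP over HTTP, so a Linux box needs no adodb.dll. A rough illustration with requests (endpoint, cube, and authentication are placeholders; Power BI XMLA endpoints additionally require an Azure AD bearer token):

```
import requests

endpoint = "https://your-ssas-server/olap/msmdpump.dll"  # placeholder
mdx = "SELECT {} ON COLUMNS FROM [Model]"                # placeholder cube

soap = f"""<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
  <Body>
    <Execute xmlns="urn:schemas-microsoft-com:xml-analysis">
      <Command><Statement>{mdx}</Statement></Command>
      <Properties/>
    </Execute>
  </Body>
</Envelope>"""

resp = requests.post(endpoint, data=soap.encode("utf-8"),
                     headers={"Content-Type": "text/xml"})
print(resp.status_code, resp.text[:500])
```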

Indra
by New Contributor
  • 1531 Views
  • 1 reply
  • 0 kudos

Performance issue with Simba ODBC Driver performing simple insert commands into Delta Lake

Hi, our team is using Simba ODBC to load data into Delta Lake, and for a table with 3 columns it took around 55 seconds to insert 15 records. How can we improve transactional loading into Delta Lake? Is there any option from the Simba ODBC driver t...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Indra Limena: There are several ways to improve transactional loading into Delta Lake: Use Delta Lake's native Delta JDBC/ODBC connector instead of a third-party ODBC driver like Simba. The native connector is optimized for Delta Lake and supports b...
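Independent of the driver, one easy win at that scale is batching: 15 single-row INSERTs means 15 commits, while one multi-row INSERT is a single transaction. A sketch with the databricks-sql-connector (connection details and table name are placeholders; literals are inlined for brevity, real code should parameterize them):

```
from databricks import sql  # databricks-sql-connector

rows = [(1, "a"), (2, "b"), (3, "c")]  # made-up sample rows

with sql.connect(server_hostname="...",  # placeholders
                 http_path="...",
                 access_token="...") as conn:
    with conn.cursor() as cur:
        # One INSERT for the whole batch instead of one round trip per row
        values = ", ".join(f"({i}, '{s}')" for i, s in rows)
        cur.execute(f"INSERT INTO my_table VALUES {values}")  # hypothetical table
```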

Istuti
by Contributor
  • 2531 Views
  • 1 reply
  • 2 kudos
Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Istuti Gupta: There are several algorithms you can use to mask a column in Databricks in a way that is compatible with SQL Server. One commonly used algorithm is called pseudonymization or tokenization. Here's an example of how you can implement pse...
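One Spark-side way to get that effect is a salted hash, which is deterministic (joins still line up) but not reversible; a sketch (secret scope, table, and column are made up):

```
from pyspark.sql import functions as F

salt = dbutils.secrets.get("masking", "salt")  # hypothetical secret scope
df = spark.table("customers")                  # hypothetical table

# Keyed SHA-256 pseudonymization of the sensitive column
masked = df.withColumn(
    "ssn_masked", F.sha2(F.concat(F.lit(salt), F.col("ssn")), 256)
).drop("ssn")
```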

Databrickguy
by New Contributor II
  • 1204 Views
  • 1 reply
  • 0 kudos

How to use Java MaskFormatter in Spark SQL?

I created a function based on the Java MaskFormatter in Databricks/Scala, but when I call it from Spark SQL, I received the error message: Error in SQL statement: AnalysisException: Undefined function: formatAccount. This function is neither a built-in/t...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Tim zhang: The issue is that the formatAccount function is defined as a Scala function, but Spark SQL is looking for a SQL function. You need to register the Scala function as a SQL function so that it can be called from Spark SQL. You can register t...
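The registration step looks like this in PySpark; the equivalent spark.udf.register call exists in Scala (the formatting body here is only a stand-in for the MaskFormatter logic):

```
from pyspark.sql.types import StringType

def format_account(acct):
    # Stand-in for the Java MaskFormatter logic (e.g. ####-####)
    s = "" if acct is None else str(acct)
    return s[:4] + "-" + s[4:8]

# Register under the name SQL will use
spark.udf.register("formatAccount", format_account, StringType())

spark.sql("SELECT formatAccount(account_id) FROM accounts")  # hypothetical table
```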

chanansh
by Contributor
  • 1247 Views
  • 1 reply
  • 0 kudos

Stream from Azure: credentials

I am trying to read a stream from Azure: (spark.readStream .format("cloudFiles") .option('cloudFiles.clientId', CLIENT_ID) .option('cloudFiles.clientSecret', CLIENT_SECRET) .option('cloudFiles.tenantId', TENTANT_ID) .option("header", "true") .opti...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Hanan Shteingart: It looks like you're using the Azure Blob Storage connector for Spark to read data from Azure. The error message suggests that the credentials you provided are not being used by the connector. To specify the credentials, you can se...
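One thing that often trips this up: the cloudFiles.clientId/clientSecret/tenantId options configure Auto Loader's file-notification resources, while reading the files themselves goes through the ABFS driver, which takes its credentials from Spark conf. A sketch of the standard OAuth settings (storage account is a placeholder; service-principal variables reused from the question):

```
account = "mystorageacct.dfs.core.windows.net"  # placeholder storage account

spark.conf.set(f"fs.azure.account.auth.type.{account}", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{account}",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{account}", CLIENT_ID)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{account}", CLIENT_SECRET)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{account}",
               f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/token")
```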

fhmessas
by New Contributor II
  • 2933 Views
  • 1 reply
  • 0 kudos

Resolved! Autoloader stream with EventBridge message

Hi all, I have a few streaming jobs running, but we have been facing an issue related to messaging. We have multiple feeds within the same root folder, i.e. logs/{accountId}/CloudWatch|CloudTrail|vpcflow/yyyy-mm-dd/logs. Hence, the SQS allows to set up o...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Fernando Messas: Yes, you can configure Autoloader to consume messages from an SQS queue using EventBridge. Here are the steps you can follow: Create an EventBridge rule to filter messages from the SQS queue based on specific criteria (such as the...
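The key option on the Auto Loader side is cloudFiles.queueUrl, which points a stream at a queue you manage yourself, so each feed can get its own EventBridge-fed queue; roughly (URL and path are placeholders):

```
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.useNotifications", "true")
      # Pre-created SQS queue that your EventBridge rule routes one feed into
      .option("cloudFiles.queueUrl",
              "https://sqs.us-east-1.amazonaws.com/123456789012/cloudtrail-feed")
      .load("s3://my-bucket/logs/"))  # placeholder path
```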

bchaubey
by Contributor II
  • 3925 Views
  • 1 reply
  • 0 kudos

Unable to connect to Azure Storage with Scala

Hi Team, I am unable to connect to a Storage account with Scala in Databricks; I am getting the below error. AbfsRestOperationException: Status code: -1 error code: null error message: Cannot resolve hostname: ptazsg5gfcivcrstrlrs.dfs.core.windows.net Caused by: Un...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Bhagwan Chaubey: The error message suggests that the hostname for your Azure Storage account could not be resolved. This could happen if there is a network issue, or if the hostname is incorrect. Here are some steps you can try to resolve the issue:...
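A quick check for the DNS case, run from a notebook on the affected cluster:

```
import socket

host = "ptazsg5gfcivcrstrlrs.dfs.core.windows.net"
try:
    print(socket.gethostbyname(host))
except socket.gaierror as err:
    # The same resolution failure the ABFS driver is reporting
    print("DNS lookup failed:", err)
```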

Data_Sam
by New Contributor II
  • 955 Views
  • 1 reply
  • 1 kudos

Streaming data apply changes error with incoming files

Hi all, when I design a streaming data pipeline with incoming moving files and use the apply changes function on the silver table, comparing changes between bronze and silver to remove duplicates based on key columns, do you know why I got ignore change to tr...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Raymond Huang: The error message "ignore changes to true" typically occurs when you are trying to apply changes to a table using Delta Lake's change data capture (CDC) feature, but you have set the option ignoreChanges to true. This option tells De...
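For reference, the option in question sits on the stream that reads the Delta table; a sketch (table name is made up):

```
stream = (spark.readStream
          .format("delta")
          # Re-emits rows from rewritten files instead of failing the stream;
          # downstream must deduplicate on the key columns
          .option("ignoreChanges", "true")
          .table("bronze_table"))  # hypothetical table
```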

NakedSnake
by New Contributor III
  • 981 Views
  • 1 reply
  • 0 kudos

Connect to resource in another AWS account using transit gateway, not working

I'm trying to reach a service hosted in another AWS account through a transit gateway. The Databricks environment was created using Terraform, from the template available in the official documentation. Placing a VM in Databricks' private subnets makes us ab...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Thomaz Moreira: It sounds like there might be an issue with the network configuration of your Databricks cluster. Here are a few things you can check: Make sure that your Databricks cluster is in the same VPC as your service in the other AWS account...

anonturtle
by New Contributor
  • 1484 Views
  • 1 reply
  • 0 kudos

How does AutoML classify whether a feature is numeric or categorical?

When running AutoML from its UI, it classifies the feature "local_convenience_store" as both a numeric and a categorical column. This affects the result, as a scaler is used for numeric columns while a categorical column is one-hot encoded. For contex...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@hr then: The approach taken by AutoML to classify features as numeric or categorical depends on the specific AutoML framework or library being used, as different implementations may use different methods or heuristics to make this determination. In...
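A common workaround when the inference is ambiguous is to pin the type upstream, e.g. casting the column to string so it can only be treated as categorical (dataset name is made up; the column name is from the question):

```
from pyspark.sql import functions as F

df = spark.table("training_data")  # hypothetical dataset
df = df.withColumn("local_convenience_store",
                   F.col("local_convenience_store").cast("string"))
```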

Llop
by New Contributor II
  • 1517 Views
  • 1 reply
  • 0 kudos

Delta Live Tables CDC doubts

We are trying to migrate an Azure Data Factory pipeline, which loads CSV files and outputs Delta tables in Databricks, to Delta Live Tables. The pipeline is triggered on demand via an external application which places the files in a Storage folder and t...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Enric Llop: When using Delta Live Tables to perform a "rip and replace" operation, where you want to replace the existing data in a table with new data, there are a few things to keep in mind. First, the apply_changes function is used to apply chang...
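The skeleton of that apply_changes call, for orientation (all names are placeholders):

```
import dlt
from pyspark.sql.functions import col

dlt.create_streaming_table("silver_orders")  # hypothetical target

dlt.apply_changes(
    target="silver_orders",
    source="bronze_orders_changes",   # hypothetical CDC source view
    keys=["order_id"],                # business keys to match rows on
    sequence_by=col("ingest_ts"),     # ordering for late/duplicate events
)
```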

190809
by Contributor
  • 1256 Views
  • 1 reply
  • 0 kudos

Trying to figure out what is causing non-null values in my bronze tables to be returned as NULL in silver tables.

I have a process which loads data from JSON into a bronze table. It then adds a couple of columns and creates a silver table. But the silver table has NULL values where there were values in the bronze table. The process is as follows: def load_to_silver(sourc...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Rachel Cunningham: One possible reason for this issue could be a data type mismatch between the bronze and silver tables. It is possible that the column in the bronze table has a non-null value, but the data type of that column is different from th...
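A quick way to confirm that is to diff the two schemas; a cast that cannot be represented (e.g. a string into an int column) silently comes out as NULL:

```
bronze = spark.table("bronze_table")  # hypothetical names
silver = spark.table("silver_table")

# Columns present in both tables but with different types
b, s = dict(bronze.dtypes), dict(silver.dtypes)
print({c: (b[c], s[c]) for c in b.keys() & s.keys() if b[c] != s[c]})
```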

