Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Mirza1
by New Contributor
  • 944 Views
  • 1 replies
  • 0 kudos

Error while Running a Table

Hi All, I am trying to run a table schema and facing the below error. Error - AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. com.databricks.backend.common.rpc.SparkDriverExceptions$SQLExecutionException: org.apache...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @Mirza1, greetings! Can you please confirm whether it is an ADLS Gen2 table? If yes, can you try running the table schema with the Spark configs for Gen2 set at the cluster level? You can refer to this document to set the Spark co...
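For readers hitting the same error: the document linked in the reply is truncated above, but cluster-level Spark config for ADLS Gen2 access typically follows the shape below. This is a hedged sketch using OAuth with a service principal; the storage account name, tenant ID, and secret scope/key are all placeholders, not values from the original thread.

```
fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net OAuth
fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net <application-id>
fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net {{secrets/<scope>/<key>}}
fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net https://login.microsoftonline.com/<tenant-id>/oauth2/token
```

These lines go in the cluster's Spark config field; secrets should be referenced via a secret scope rather than pasted in plain text.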

Silabs
by New Contributor II
  • 8612 Views
  • 3 replies
  • 4 kudos

Resolved! Set up connection to on-prem SQL Server

I've just set up our Databricks environment, hosted in AWS. We have an on-prem SQL Server and would like to connect. How can I do that?

Latest Reply
Yeshwanth
Databricks Employee
  • 4 kudos

@Silabs good day! To connect your Databricks environment (hosted on AWS) to your on-premise SQL server, follow these steps: 1. Network Setup: Establish a connection between your SQL server and the Databricks virtual private cloud (VPC) using VPN or A...
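The reply above is truncated; for context, once the network path (VPN or similar) exists, the JDBC URL for an on-prem SQL Server generally takes the following form. This is a sketch: the host, port, and database name are placeholders, and the TLS options depend on how the server's certificate is set up.

```
jdbc:sqlserver://<on-prem-host>:1433;databaseName=<database>;encrypt=true;trustServerCertificate=true
```

This URL would then be passed to Spark's JDBC reader along with the driver credentials.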

2 More Replies
dbengineer516
by New Contributor III
  • 5278 Views
  • 4 replies
  • 2 kudos

Resolved! Git Integration with Databricks Query Files and Azure DevOps

I’ve been trying to develop a solution for our team to be able to have Git integration between Databricks and Azure DevOps. However, the “query” file type/workspace item on Databricks can’t be committed and pushed to a Git repo, only regular file typ...

Latest Reply
Yeshwanth
Databricks Employee
  • 2 kudos

@dbengineer516 Good day! As per the Databricks documentation, only certain Databricks asset types are supported by Git folders. These include Files, Notebooks, and Folders. Databricks asset types that are currently not supported in Git folders includ...

3 More Replies
SreeG
by New Contributor II
  • 1453 Views
  • 2 replies
  • 0 kudos

Error Reading Kafka message into Azure Databricks

Team, I am trying to test the connection to the Kafka broker from Azure Databricks. Telnet and IP checks are successful. When I try to read the data, I get "Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid cer...

Data Engineering
Azure DB
kafka
Latest Reply
SreeG
New Contributor II
  • 0 kudos

I had to put this testing on hold, but when I get a chance to work on it I will update my findings. Thank you!

1 More Replies
PerformanceTest
by New Contributor
  • 927 Views
  • 1 replies
  • 0 kudos

Databricks to JMeter connectivity issue

Hi All, we are conducting a Databricks performance test with Apache JMeter. After configuring the JDBC config element, we get the below error: Cannot create PoolableConnectionFactory ([Databricks][JDBCDriver](700120) Host xxxxx-xxxxx-xxxx.cloud.databricks.com c...

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@PerformanceTest - can you please check your DB workspace terraform script to see if there is a different CNAME defined for your host workspace.  
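For anyone reproducing this JMeter setup: the JDBC Connection Configuration URL for Databricks generally has the shape below. This is a hedged sketch; the host, HTTP path, and token are placeholders, and the exact properties can vary with the Databricks/Simba driver version.

```
jdbc:databricks://xxxxx-xxxxx-xxxx.cloud.databricks.com:443;httpPath=/sql/1.0/warehouses/<warehouse-id>;AuthMech=3;UID=token;PWD=<personal-access-token>
```

A mismatch between the host in this URL and the workspace's actual CNAME (which the reply suggests checking) would produce exactly this kind of connection failure.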

ByteForge
by New Contributor
  • 1301 Views
  • 1 replies
  • 0 kudos

How to import .dbc files above size limit?

Above is the screenshot of the error. Is there any other way of processing .dbc files? I do not have access to, or a backup of, the previous workspace this code was imported from.

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@ByteForge - Kindly raise a support case with Databricks so Engineering can increase the limits for your workspace.

ac0
by Contributor
  • 1393 Views
  • 2 replies
  • 0 kudos

Resolved! Is it more performant to run optimize table commands on a serverless SQL warehouse or elsewhere?

Is it more performant to run optimize table commands on a serverless SQL warehouse or in a job or all-purpose compute cluster? I would presume a serverless warehouse would be faster, but I don't know how to test this.

Latest Reply
Yeshwanth
Databricks Employee
  • 0 kudos

@ac0 Good day! Serverless SQL warehouses are likely to execute "optimize table" commands faster than job or all-purpose compute clusters due to their rapid startup time, quick upscaling for low latency, and efficient handling of varying query demand....
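For reference, the command in question is standard Delta Lake SQL and is identical on either compute type, so a simple way to compare is to time the same statement on each. The table and column names below are placeholders, not from the original thread.

```sql
-- Compact small files for the table
OPTIMIZE main.sales.transactions;

-- Optionally co-locate data by a frequently filtered column
OPTIMIZE main.sales.transactions ZORDER BY (event_date);
```

Any observed difference then comes from the compute itself (startup latency, scaling behavior) rather than the command.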

1 More Replies
NTRT
by New Contributor III
  • 1137 Views
  • 1 replies
  • 0 kudos

How to transform a json-stat 2 file to a Spark DataFrame? How to keep order in a MapType structure?

Hi, I am using different JSON files of type json-stat 2. This kind of JSON file is quite commonly used by national statistics bureaus. It is multi-dimensional, with multiple arrays. In a Python environment we can use the pyjstat package to easily transform json...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

MapType does not maintain order (JSON itself doesn't either). Can you apply the ordering yourself afterwards?
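Since neither MapType nor JSON objects guarantee order, the ordering can be re-applied from the json-stat metadata itself: the dataset's "id" array and each dimension's category "index" define the canonical order. A minimal pure-Python sketch (the sample data is illustrative, not taken from the original post):

```python
import json

# A tiny json-stat-2-like fragment; real files come from statistics bureaus.
raw = '''{
  "id": ["year", "region"],
  "dimension": {
    "year":   {"category": {"index": {"2021": 0, "2022": 1}}},
    "region": {"category": {"index": {"north": 0, "south": 1}}}
  }
}'''

data = json.loads(raw)

def ordered_categories(data, dim):
    # Sort category labels by their json-stat "index" position,
    # recovering the order a MapType column would lose.
    index = data["dimension"][dim]["category"]["index"]
    return [label for label, pos in sorted(index.items(), key=lambda kv: kv[1])]

# Dimensions in dataset order, each with its categories in index order.
dims = {dim: ordered_categories(data, dim) for dim in data["id"]}
print(dims)  # {'year': ['2021', '2022'], 'region': ['north', 'south']}
```

The same lookup can be applied to a collected Spark map column to re-impose the order after reading.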

NTRT
by New Contributor III
  • 1253 Views
  • 2 replies
  • 0 kudos

Can't read a JSON file of just 1.75 MiB?

Hi, I am relatively new to Databricks, although I am conscious of lazy evaluation, transformations and actions, and persistence. I have a complex, nested JSON file of about 1.73 MiB. When df = spark.read.option("multiLine", "false").json('dbfs:/mnt...

Latest Reply
koushiknpvs
New Contributor III
  • 0 kudos

This can be resolved by defining the schema structure explicitly and using that schema to read the file. from pyspark.sql.types import StructType, StructField, StringType, IntegerType, ArrayType # Define the schema according to the JSON structure sch...

1 More Replies
NTRT
by New Contributor III
  • 2488 Views
  • 4 replies
  • 0 kudos

Resolved! Performance issues when reading json-stat 2

Hi, I am relatively new to Databricks, although I am conscious of lazy evaluation, transformations and actions, and persistence. I have a complex, nested JSON file of about 1.73 MiB. When df = spark.read.option("multiLine", "false").json('dbfs:/mnt...

Latest Reply
koushiknpvs
New Contributor III
  • 0 kudos

Please give me a kudos if this works. Efficiency in data collection: using .collect() on large datasets can lead to out-of-memory errors, as it collects all rows to the driver node. If the dataset is large, consider alternatives such as extracting only...

3 More Replies
Mathias_Peters
by Contributor
  • 1373 Views
  • 2 replies
  • 0 kudos

Asset Bundles: Adding project_directory in DBT task breaks previous python task

Hi, I have a job consisting of three tasks:

tasks:
  - task_key: Kinesis_to_S3_new
    spark_python_task:
      python_file: ../src/kinesis.py
      parameters: ["${var.stream_region}", "${var.s3_base_path}"]
    j...

Latest Reply
Mathias_Peters
Contributor
  • 0 kudos

Hi @Ajay-Pandey, thank you for the hints. I will try to recreate the job via the UI. I ran the tasks in a GitHub workflow. The file locations are mixed: the first two tasks (python and dlt) are located in the databricks/src folder. The dbt files come fro...

1 More Replies
chandan_a_v
by Valued Contributor
  • 2701 Views
  • 2 replies
  • 1 kudos

Can't import local files under repo

I have a YAML file inside one of the sub-directories in Databricks. I have appended the repo path to sys.path, but I still can't access this file. https://docs.databricks.com/_static/notebooks/files-in-repos.html
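For context, the pattern in the linked notebook is simply extending sys.path before importing. A minimal self-contained sketch of that mechanism, using a temporary directory to stand in for the repo sub-directory (the module name and contents are illustrative):

```python
import os
import sys
import tempfile

# Stand-in for a repo sub-directory containing helper code.
repo_subdir = tempfile.mkdtemp()
with open(os.path.join(repo_subdir, "helpers.py"), "w") as f:
    f.write("GREETING = 'hello from the repo'\n")

# Append the directory to sys.path so Python can resolve the import.
sys.path.append(repo_subdir)

import helpers
print(helpers.GREETING)  # hello from the repo
```

Note this only affects Python imports; reading a data file such as YAML still requires the full filesystem path, not an import.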

Latest Reply
Abhishek10745
New Contributor III
  • 1 kudos

Hello @chandan_a_v, were you able to solve this issue? I am also experiencing the same thing, where I cannot move a file with the extension .yml from a repo folder to a shared workspace folder. As per the documentation, this is the limitation or functionality of data...

1 More Replies
zero234
by New Contributor III
  • 982 Views
  • 1 replies
  • 0 kudos

Delta live table is inserting data multiple times

So I have created a Delta Live Table which uses spark.sql() to execute a query, and uses df.write.mode("append").insertInto() to insert data into the respective table, and at the end I return a dummy table, since this was the requirement. So now I have also ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

What's your source? Your sink is a Delta table, correct? How do you verify that there are no inserts happening?

Meshynix
by New Contributor III
  • 6665 Views
  • 6 replies
  • 0 kudos

Resolved! Not able to create external table in a schema under a Catalog.

Problem Statement: Cluster 1 (Shared Cluster) is not able to read the file location at "dbfs:/mnt/landingzone/landingzonecontainer/Inbound/", and hence we are not able to create an external table in a schema inside the Enterprise Catalog. Cluster 2 (No Isola...

Latest Reply
Avi_Bricks
New Contributor II
  • 0 kudos

External table creation is failing with the error: UnityCatalogServiceException: [RequestId=**** ErrorClass=INVALID_PARAMETER_VALUE] Unsupported path operation PATH_CREATE_TABLE on volume. I am able to access and create files on the external location.

5 More Replies
pshuk
by New Contributor III
  • 1511 Views
  • 1 replies
  • 0 kudos

run md5 using CLI

Hi, I want to run an md5 checksum on a file uploaded to Databricks. I can generate the md5 of the local file, but how do I generate one for the uploaded file using the Databricks CLI (command-line interface)? Any help would be appreciated. I tried running databr...
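The post is truncated, but one workable approach is to copy the uploaded file back down with the CLI (e.g. `databricks fs cp dbfs:/path/to/file /tmp/file`) and hash it locally, then compare against the original's checksum. A small Python sketch of the hashing step; the file here is a stand-in created for illustration, not a real DBFS download:

```python
import hashlib
import tempfile

def md5_of_file(path, chunk_size=8192):
    # Stream the file in chunks so large uploads don't exhaust memory.
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Illustrative stand-in for a file copied down from DBFS.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(b"id,value\n1,42\n")
    local_path = tmp.name

print(md5_of_file(local_path))
```

If both checksums match, the upload is byte-identical to the local original.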

