Data Engineering

Forum Posts

mh_db
by New Contributor III
  • 134 Views
  • 2 replies
  • 0 kudos

Unable to connect to Oracle server from Databricks notebook in AWS

I'm trying to connect to an Oracle server hosted in Azure from an AWS Databricks notebook, but the connection keeps timing out. I tested the connection IP using the telnet <hostIP> 1521 command from another EC2 instance, and that seems to reach the Oracle ...

Data Engineering
AWS
oracle
TCP
Latest Reply
Yeshwanth
Valued Contributor II
  • 0 kudos

@mh_db good day! Could you please confirm the Cluster type you used for testing? Was it a Shared Cluster, an Assigned/Single-User Cluster, or a No-Isolation cluster? Could you please try the same on the Assigned/Single User Cluster and No Isolation c...
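For what it's worth, one quick way to run the same reachability test from the Databricks cluster itself, rather than from a separate EC2 instance, is a short Python check in a notebook cell (a sketch; <hostIP> stands for the Oracle host, as in the original post):

  import socket

  # Attempt a raw TCP connection to the Oracle listener port from the driver node;
  # a timeout here points at VPC/firewall routing rather than Spark or JDBC settings
  try:
      with socket.create_connection(("<hostIP>", 1521), timeout=10):
          print("port 1521 reachable from the cluster")
  except OSError as e:
      print(f"connection failed: {e}")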

1 More Replies
Avinash_Narala
by New Contributor III
  • 59 Views
  • 1 reply
  • 0 kudos

Application Deployment in Marketplace

Hi, I want to deploy my Flask application in the Databricks Marketplace. How can I do it? Can you please share the details?

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Avinash_Narala, for more information, you can refer to the following resources:
  • Tutorial: Deploy and query a feature serving endpoint - Databricks
  • Python Flask App migrate to Databricks - Microsoft Q&A
  • Deploy custom models | Databricks on AWS
  • Ho...

NataliaCh
by New Contributor
  • 568 Views
  • 1 reply
  • 0 kudos

Delta table cannot be reached with INTERNAL_ERROR

Hi all! I've been dropping and recreating Delta tables at a new location. For one table something went wrong, and now I can neither DROP nor recreate it. It is visible in the catalog; however, when I click on the table I see the message: [INTERNAL_ERROR] The ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

I’m sorry to hear that you’re encountering this issue with your Delta table. Ensure that you are using a compatible version of Spark and its associated plugins. Sometimes, upgrading or downgrading Spark can resolve issues related to internal error...

kseyser
by New Contributor
  • 198 Views
  • 1 reply
  • 0 kudos

Predicting compute required to run Spark jobs

I'm working on a project to predict the compute (cores) required to run Spark jobs. Has anyone worked on this or something similar before? How did you get started?

Latest Reply
Yeshwanth
Valued Contributor II
  • 0 kudos

@kseyser good day, This documentation might help you in your use-case: https://docs.databricks.com/en/compute/cluster-config-best-practices.html#compute-sizing-considerations Kind regards, Yesh

dbal
by New Contributor III
  • 610 Views
  • 2 replies
  • 0 kudos

withColumnRenamed does not work with databricks-connect 14.3.0

I am not able to run our unit test suite due to a possible bug in the databricks-connect library. The problem is with the DataFrame transformation withColumnRenamed. When I run it in a Databricks cluster (Databricks Runtime 14.3 LTS), the column is ren...

Latest Reply
shan_chandra
Esteemed Contributor
  • 0 kudos

@dbal - Can you please try withColumnsRenamed() instead? Reference: https://docs.databricks.com/en/release-notes/dbconnect/index.html#databricks-connect-1430-python
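For reference, a minimal sketch of the suggested API (withColumnsRenamed, plural, takes a dict of old-to-new names; the sample frame here is hypothetical):

  from pyspark.sql import SparkSession

  spark = SparkSession.builder.getOrCreate()
  df = spark.createDataFrame([(1, "a")], ["id", "val"])

  # Rename several columns in one call instead of chaining withColumnRenamed
  renamed = df.withColumnsRenamed({"id": "user_id", "val": "value"})
  renamed.printSchema()  # user_id, value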

1 More Replies
StephanKnox
by New Contributor
  • 127 Views
  • 1 reply
  • 1 kudos

Parametrized SQL - Pass column names as a parameter?

Hi all, Is there a way to pass a column name (not a value) in a parametrized Spark SQL query? I am trying to do it like so; however, it does not work, as I think the column name gets expanded like 'value', i.e. surrounded by single quotes: def count_nulls(df:D...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @StephanKnox, you can use string interpolation (f-strings) to dynamically insert the column name into your query.
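A minimal sketch of that suggestion (count_nulls here is illustrative; since the name is spliced into the SQL text rather than bound as a parameter, it is worth checking it against df.columns first):

  from pyspark.sql import DataFrame

  def count_nulls(df: DataFrame, col_name: str) -> int:
      # Bound parameters only substitute values; interpolate the column name instead
      if col_name not in df.columns:
          raise ValueError(f"unknown column: {col_name}")
      df.createOrReplaceTempView("t")
      query = f"SELECT COUNT(*) AS n FROM t WHERE {col_name} IS NULL"
      return df.sparkSession.sql(query).first()["n"]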

NTRT
by New Contributor
  • 326 Views
  • 1 reply
  • 0 kudos

How to transform a json-stat 2 file to a Spark DataFrame? How to keep order in a MapType structure?

Hi, I am using different JSON files of type json-stat2. This kind of JSON file is commonly used by national statistics bureaus. It is multi-dimensional with multiple arrays. In a Python environment we can use the pyjstat package to easily transform json...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

MapType does not maintain order (neither does JSON itself). Can you apply the ordering yourself afterwards?
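As an illustration of imposing the order afterwards, one option is to convert the map into a sorted array of entries (a sketch; my_map is a hypothetical MapType column):

  from pyspark.sql import functions as F

  # map_entries yields an array of (key, value) structs; array_sort orders the
  # structs by their first field (the key), giving a deterministic order
  df = df.withColumn("ordered_entries", F.array_sort(F.map_entries("my_map")))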

NTRT
by New Contributor
  • 146 Views
  • 2 replies
  • 0 kudos

Can't read a JSON file of just 1.75 MiB?

Hi, I am relatively new to Databricks, although I am aware of lazy evaluation, transformations and actions, and persistence. I have a complex, nested JSON file of about 1.73 MiB. When df = spark.read.option("multiLine", "false").json('dbfs:/mnt...

Latest Reply
koushiknpvs
New Contributor III
  • 0 kudos

This can be resolved by defining the schema explicitly and using that schema to read the file:
  from pyspark.sql.types import StructType, StructField, StringType, IntegerType, ArrayType
  # Define the schema according to the JSON structure
  sch...
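A self-contained sketch of that approach (the field names here are hypothetical, and the path placeholder mirrors the truncated dbfs:/mnt path above):

  from pyspark.sql import SparkSession
  from pyspark.sql.types import StructType, StructField, StringType, IntegerType, ArrayType

  spark = SparkSession.builder.getOrCreate()

  # With an explicit schema, Spark skips inference, which is what most often
  # struggles with deeply nested JSON
  schema = StructType([
      StructField("id", IntegerType()),
      StructField("name", StringType()),
      StructField("tags", ArrayType(StringType())),
  ])

  df = spark.read.schema(schema).option("multiLine", "false").json("dbfs:/mnt/<path>")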

1 More Replies
Michael_Appiah
by New Contributor III
  • 2697 Views
  • 6 replies
  • 3 kudos

Resolved! Parameterized spark.sql() not working

Spark 3.4 introduced parameterized SQL queries, and Databricks also discussed this new functionality in a recent blog post (https://www.databricks.com/blog/parameterized-queries-pyspark). Problem: I cannot run any of the examples provided in the PySpark...

Latest Reply
Michael_Appiah
New Contributor III
  • 3 kudos

@Cas Unfortunately I do not have any information on this. However, I have seen that DBR 14.3 and 15.0 introduced some changes to spark.sql(). I have not checked whether those changes resolve the issue outlined here. Your best bet is probably to go ah...
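For context, the named-parameter style from the blog post looks like the following (a sketch; whether it runs depends on the Spark/DBR version, which is exactly the issue discussed in this thread):

  from pyspark.sql import SparkSession

  spark = SparkSession.builder.getOrCreate()
  spark.range(3).createOrReplaceTempView("t")

  # Spark 3.4+ parameterized SQL: the value is bound via args, not string-formatted
  spark.sql("SELECT * FROM t WHERE id = :id", args={"id": 1}).show()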

5 More Replies
JacobKesinger
by New Contributor
  • 836 Views
  • 3 replies
  • 0 kudos

Iterating over a pyspark.pandas.groupby.DataFrameGroupBy

I have a pyspark.pandas.frame.DataFrame object (created by calling `pandas_api` on a pyspark.sql.dataframe.DataFrame object). I have a complicated transformation that I would like to apply to this data, and in particular I would like to apply it in ...

Latest Reply
MichTalebzadeh
Contributor
  • 0 kudos

Hi, the error indicates that Unity Catalog does not support Spark higher-order functions, such as those used in pandas_udf. This limitation likely comes from architectural or compatibility constraints. To resolve the issue, consider alternative ap...
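One commonly suggested alternative for grouped transformations is applyInPandas on the underlying Spark DataFrame (a sketch with hypothetical column names and per-group logic):

  import pandas as pd
  from pyspark.sql import SparkSession

  spark = SparkSession.builder.getOrCreate()
  sdf = spark.createDataFrame([("a", 1.0), ("a", 2.0), ("b", 3.0)], ["key", "val"])

  def demean(pdf: pd.DataFrame) -> pd.DataFrame:
      # Arbitrary per-group pandas logic runs here
      pdf["val"] = pdf["val"] - pdf["val"].mean()
      return pdf

  sdf.groupBy("key").applyInPandas(demean, schema="key string, val double").show()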

2 More Replies
kazinahian
by New Contributor III
  • 413 Views
  • 2 replies
  • 1 kudos

Resolved! Lowcode ETL in Databricks

Hello everyone, I work as a Business Intelligence practitioner, employing tools like Alteryx and various low-code solutions to construct ETL processes and develop data pipelines for my dashboards and reports. Currently, I'm delving into Azure Databrick...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @kazinahian, in the Azure ecosystem, you have a few options for building ETL (Extract, Transform, Load) data pipelines, including low-code solutions. Let’s explore some relevant tools:
  • Azure Data Factory: Purpose: Azure Data Factory is a clou...

1 More Replies
etum
by New Contributor II
  • 168 Views
  • 1 reply
  • 2 kudos

Importing JSON files when format is subject to evolution

Hi there, I'm reaching out for some assistance with importing JSON files into Databricks. I'm still a beginner, though I've gained experience working with various data import batches (CSV/JSON) for application monitoring. I'm currently facing a challenge...

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @etum, in JSON Schema, you can use the allOf keyword. It allows you to specify that the data must be valid against both the parent schema (the original schema) and the child schema (the new fields). This way, you ensure compatibility with both ol...
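A small sketch of that pattern using the Python jsonschema package (the field names here are illustrative):

  from jsonschema import validate

  # allOf: a document must satisfy both the original schema and the new-fields schema
  schema = {
      "allOf": [
          {"type": "object", "required": ["id"],
           "properties": {"id": {"type": "string"}}},
          {"type": "object",
           "properties": {"new_field": {"type": "integer"}}},
      ]
  }

  validate({"id": "abc", "new_field": 42}, schema)  # older documents without new_field also pass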

mh_db
by New Contributor III
  • 317 Views
  • 0 replies
  • 0 kudos

How to get a different dynamic value for each task in a workflow

I created a workflow with two tasks. It runs the first notebook and then waits for it to finish before starting the second notebook. I want to use the dynamic value {{job.start_time.iso_datetime}} as one of the parameters for both tasks. This should gi...
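For illustration, the usual pattern is to set the dynamic value reference as a task parameter on each task and read it inside the notebook via a widget (a sketch; the parameter name job_start is hypothetical):

  # Task parameter configured in the workflow UI for both tasks:
  #   job_start = {{job.start_time.iso_datetime}}
  # Inside each notebook:
  job_start = dbutils.widgets.get("job_start")
  # job.start_time resolves at the job level, so both tasks see the same timestamp
  print(job_start)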

ashraf1395
by New Contributor II
  • 120 Views
  • 1 reply
  • 0 kudos

How to extend the free trial period or enter the free startup tier to complete our POC for a client

We are a data consultancy. Our free trial period is currently ending, and we are still doing a POC for one of our potential clients, focusing on providing expert services around Databricks. 1. Is there a possibility that we can extend the free t...

Latest Reply
Mo
Contributor III
  • 0 kudos

Hey @ashraf1395, I suggest you contact your Databricks representative or account manager.
