Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mriccardi
by New Contributor II
  • 3225 Views
  • 4 replies
  • 1 kudos

Spark streaming: Checkpoint not recognising new data

Hello everyone! We are currently facing an issue with a stream that has not picked up new data since the 20th of July. We've validated that the bronze table has data that the silver table doesn't have. Also, according to the logs, the silver stream is running but writing 0 files....

Latest Reply
mriccardi
New Contributor II
  • 1 kudos

Also, the trigger is configured to run once, but when we start the job it never ends; it keeps running in an endless loop.

  • 1 kudos
3 More Replies
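
For threads like the one above, where a run-once stream appears to loop or write nothing, it can help to isolate the write in a small function and check how many rows the last micro-batch saw. The following is a minimal sketch, not the poster's actual job: it assumes a Delta sink, and `run_available_now`, `stream_df`, `target_path`, and `checkpoint_path` are hypothetical names. `trigger(availableNow=True)` needs Spark 3.3 or later; on older runtimes `trigger(once=True)` plays the same role.

```python
def run_available_now(stream_df, target_path, checkpoint_path):
    """Process everything currently available in the source, then stop.

    stream_df is assumed to be a streaming DataFrame (for example one
    created with spark.readStream on the bronze table). All names here
    are illustrative and not taken from the original post.
    """
    query = (
        stream_df.writeStream
        .format("delta")  # assumption: the silver table is a Delta table
        .option("checkpointLocation", checkpoint_path)
        .trigger(availableNow=True)
        .start(target_path)
    )
    # Block until the available data has been processed and the query stops.
    query.awaitTermination()

    # lastProgress is None when no micro-batch ran at all.
    progress = query.lastProgress
    return progress["numInputRows"] if progress else None
```

A return value of 0 or None on every run, while the source keeps growing, would suggest that the checkpoint is tracking a different source location or table version than the one being written to. That is only one possible cause and would need to be confirmed against the actual job.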
Swostiman
by New Contributor II
  • 5822 Views
  • 5 replies
  • 4 kudos

Consuming data from databricks[Hive metastore] sql endpoint using pyspark

I was trying to read some Delta data from a Databricks [Hive metastore] SQL endpoint using PySpark, but found that every value in the fetched table is the same as its column name. Even when I try to just show the data it giv...

Latest Reply
sucan
New Contributor II
  • 4 kudos

Encountered the same issue and downgrading to 2.6.22 helped me resolve this issue.

  • 4 kudos
4 More Replies
Binesh
by New Contributor II
  • 10849 Views
  • 2 replies
  • 0 kudos

Databricks Logs some error messages while trying to read data using databricks-jdbc dependency

I have tried to read data from Databricks using the following Java code.
String TOKEN = "token...";
String url = "url...";
Properties properties = new Properties();
properties.setProperty("user", "token");
properties.setProperty("PWD", TOKEN);
Con...

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@Binesh J - The issue could be that the column's data type is not compatible with the getString() method on line 17. Use the getObject() method to retrieve the value as a generic object and then convert it to a string.

  • 0 kudos
1 More Replies
vanessafvg
by New Contributor III
  • 2094 Views
  • 1 reply
  • 3 kudos

Extracting data from excel in datalake storage using openpyxl

I am trying to extract some data into Databricks but am tripping all over openpyxl; I'm a newish user of Databricks.
from openpyxl import load_workbook
directory_id="hidden"
scope="hidden"
client_id="hidden"
service_credential_key="hidden"
container_name="hidden"
s...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Vanessa Van Gelder, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 3 kudos
Ram443
by New Contributor III
  • 34502 Views
  • 9 replies
  • 5 kudos

Resolved! I created a data frame but was not able to see the data

Code to create a data frame:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("oracle_queries").master("local[4]") \
    .config("spark.sql.warehouse.dir", "C:\\softwares\\git\\pyspark\\hive").getOrCreate()
from pyspark.sql.functions ...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 5 kudos

@ramanjaneyulu kancharla, can you please select my answer as the best answer?

  • 5 kudos
8 More Replies
Sas
by New Contributor II
  • 1600 Views
  • 1 reply
  • 0 kudos

A streaming job going into infinite looping

Hi, I am trying to read data from Kafka, determine whether it's fraud or not, and then write it back to MongoDB. Below is my code, read_kafka.py:
from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark.sql.types i...

Latest Reply
swethaNandan
Databricks Employee
  • 0 kudos

Hi Saswata,
Can you remove the filter and see if it is printing output to the console?
kafka_df5 = kafka_df4.filter(kafka_df4.status == "FRAUD")
Thanks and Regards,
Swetha Nandajan

  • 0 kudos
ankris
by New Contributor III
  • 5445 Views
  • 2 replies
  • 0 kudos

Could you please guide us on connecting ServiceNow data in databricks

We would like to extract data such as ticket info, resolve time, etc. from ServiceNow into Databricks. We are not finding much information in the community and would appreciate your guidance.

Latest Reply
crannow
New Contributor II
  • 0 kudos

ServiceNow offers API capabilities. You can consume the ServiceNow API within a Databricks notebook to extract data from ServiceNow. The following is a suggested prompt to use with ChatGPT to generate example Python code that connects to ServiceNow's API. PROMPT: ...

  • 0 kudos
1 More Replies
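
As a rough illustration of the API route suggested in this thread, here is a minimal sketch that calls the ServiceNow Table API using only the Python standard library. The instance name, credentials, and the helper names `build_table_url` and `fetch_records` are hypothetical. The field list (`number`, `opened_at`, `resolved_at`) is an assumption about which incident columns cover "ticket info" and "resolve time", and basic authentication is assumed to be enabled on the instance.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_table_url(instance, table, fields, limit=100):
    """Build a ServiceNow Table API URL for one table and a set of fields."""
    query = urllib.parse.urlencode(
        {"sysparm_fields": ",".join(fields), "sysparm_limit": limit}
    )
    return f"https://{instance}.service-now.com/api/now/table/{table}?{query}"


def fetch_records(url, user, password):
    """Call the API with HTTP basic auth and return the 'result' list."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["result"]


# Hypothetical instance name; replace with your own.
url = build_table_url(
    "example", "incident", ["number", "opened_at", "resolved_at"], limit=10
)
```

In a notebook, the list returned by `fetch_records(url, user, password)` could then be passed to `spark.createDataFrame(...)` and written to a table. Credentials are better read from a secret scope than hard-coded.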
naveenprabhun
by New Contributor III
  • 4770 Views
  • 2 replies
  • 3 kudos

Resolved! Unable to read data from ElasticSearch using Databricks (AWS) Cannot detect ES version - Caused by: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [IP:PORT]

I am trying to read data from Elasticsearch (ES version 8.5.2) using PySpark on Databricks (13.0 (includes Apache Spark 3.4.0, Scala 2.12)). The ecosystem is on AWS. I am able to run a curl command from the Databricks notebook against the ES ip:port and fetch...

Latest Reply
Hoviedo
New Contributor III
  • 3 kudos

I have the same problem. Did you find a solution? Thanks.

  • 3 kudos
1 More Replies
vijaykumarbotla
by New Contributor III
  • 7612 Views
  • 4 replies
  • 0 kudos

Resolved! Failed to merge fields 'LIFNR' and 'LIFNR'. Failed to merge incompatible data types IntegerType and StringType

I have imported a CSV file using the spark.read method with a custom schema that declares the column's type as string. I have a Delta table, and the type of the column in the table is also string. I am getting failed to merge fields errors in sp...

Latest Reply
vijaykumarbotla
New Contributor III
  • 0 kudos

Hi All, the issue is resolved. I executed a column conversion, and from the next run the code is working fine.
df = spark.read.format("delta").load("/mnt/dev/deltav2/X")
df = df.withColumn("LIFNR", df.LIFNR.cast("string"))
df.write.format('delta').option("o...

  • 0 kudos
3 More Replies
Rishitha
by New Contributor III
  • 1816 Views
  • 2 replies
  • 2 kudos

Resolved! Normalizing data from autoloader

I have data on S3 and I'm using Auto Loader to load it. My JSON docs have fields which are arrays of structures. When I don't specify any schema, the whole data is stored as strings; even the arrays of structures are just a blob of string, making it ...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Rishitha Reddy, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us s...

  • 2 kudos
1 More Replies
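
The behaviour described in this thread matches Auto Loader's default for JSON, where every column (including nested fields) is inferred as a string unless type inference is enabled. The following is a minimal sketch of one possible approach, not the accepted answer from the thread: it turns on `cloudFiles.inferColumnTypes` so that arrays of structs keep their structure. The function name and the `source_path` and `schema_location` parameters are hypothetical.

```python
def read_json_with_inferred_types(spark, source_path, schema_location):
    """Return a streaming DataFrame over JSON files with inferred column types.

    spark is assumed to be an active SparkSession on a Databricks runtime
    where the "cloudFiles" (Auto Loader) source is available.
    """
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        # Auto Loader stores the inferred schema here and reuses it on restart.
        .option("cloudFiles.schemaLocation", schema_location)
        # Without this option, JSON columns are inferred as strings.
        .option("cloudFiles.inferColumnTypes", "true")
        .load(source_path)
    )
```

Once the array columns are typed, they can be flattened into rows with `explode` from `pyspark.sql.functions`. If inference guesses a type wrongly, the `cloudFiles.schemaHints` option can pin individual columns.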
Swaroop
by New Contributor
  • 885 Views
  • 0 replies
  • 0 kudos

How to receive data from azure event hub in parquet ?

import asyncio
import os
from azure.eventhub.aio import EventHubConsumerClient
CONNECTION_STR = "Connection_string"
EVENTHUB_NAME = "event_hub"
async def on_event(partition_context, event):
    # Put your code here.
    # If the operation is i/o intensive, ...

Teja07
by New Contributor II
  • 6787 Views
  • 0 replies
  • 0 kudos

Ingesting data from oracle to databricks through IICS

While ingesting data from Oracle to Databricks through IICS, the target tables were created, but no data is being inserted. Below is the error. Could someone please help me?
Exception occurred when initializing data session. Root cause: java.lang....

Jits
by New Contributor II
  • 1424 Views
  • 2 replies
  • 3 kudos

Getting Error when Inserting data into table with the column as bigint

Hi All, I am creating a table using the Databricks SQL editor. The table definition is:
DROP TABLE IF EXISTS [database].***_test;
CREATE TABLE [database].***_jitu_test(
  id bigint
)
USING delta
LOCATION 'test/raw/***_jitu_test'
TBLPROPERTIES ('delta.minReaderVersi...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @jitendra goswami, we haven't heard from you since the last response from @Werner Stinckens, and I was checking back to see if her suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be helpf...

  • 3 kudos
1 More Replies
knowAsha
by New Contributor II
  • 3348 Views
  • 3 replies
  • 3 kudos

Error while running the data engineering course notebook : "DE 2.2 - Providing Options for External Sources"

Can somebody help me fix this problem? I am running this notebook on Databricks Community Edition.

Latest Reply
lemfo
New Contributor II
  • 3 kudos

df = spark.read.format('parquet').load(path=datasource_path)
df = df.select("*").toPandas()
df.to_sql('users', conn, if_exists='replace', index=False)

  • 3 kudos
2 More Replies
g96g
by New Contributor III
  • 2109 Views
  • 3 replies
  • 0 kudos

data is not written back to data lake

I have this strange case where data is not written back to the data lake. I have 3 containers: Bronze, Silver and Gold. I have done the mounting and have no problem reading the source data and writing it to the Bronze layer (using the Hive metastore catalog). T...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Givi Salu, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ...

  • 0 kudos
2 More Replies