Data Engineering

Forum Posts

aimas
by New Contributor III
  • 3817 Views
  • 8 replies
  • 5 kudos

Resolved! Error creating tables using the UI

Hi, I'm trying to create a table using the UI, but I keep getting the error "error creating table <table name> create a cluster first" even though I already have a cluster running. What is the problem?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

Be sure that a cluster is selected (the arrow in the database view) and that at least the Default database exists.

  • 5 kudos
7 More Replies
Orianh
by Valued Contributor II
  • 14092 Views
  • 11 replies
  • 10 kudos

Resolved! Read JSON files from the s3 bucket

Hello guys, I'm trying to read JSON files from an S3 bucket, but no matter what I try I get "Query returned no result", or, if I don't specify the schema, "unable to infer a schema". I tried mounting the S3 bucket; it still doesn't work. Here is some code th...

Latest Reply
Prabakar
Esteemed Contributor III
  • 10 kudos

Please refer to the doc that helps you read JSON. If you are getting this error, the problem is likely with the JSON schema, so please validate it. As a test, create a simple JSON file (you can get one on the internet), upload it to your S3 bucket, and ...

  • 10 kudos
10 More Replies
Data_Bricks1
by New Contributor III
  • 1927 Views
  • 7 replies
  • 0 kudos

Data from 10 BLOB containers with multiple hierarchical folders (per-day and per-hour folders) in each container to a Delta Lake table in Parquet format - incremental loading of the latest data only, inserts with no updates

I am able to load data for a single container by hard-coding it, but not from multiple containers. I used a for loop, but the dataframe ends up holding only the last container's last folder record. One more issue is that I have to flatten the data, and when I ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

The function (def) should definitely be declared outside the loop; move it to just after the library imports. The logic is a bit complicated, so you need to debug it using display(Flatten_df2) (or .show()) and validate the JSON after each iteration (using break or sleep, etc.).

  • 0 kudos
6 More Replies
StephanieRivera
by Valued Contributor II
  • 983 Views
  • 1 reply
  • 5 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

Hi - as these are transactional tables (there are history commits and snapshots), I would not store images or videos there: the same file can be saved several times, so you will have high storage costs, and it can also be slow when the data is big. I would definitely store images,...

  • 5 kudos
User16826994223
by Honored Contributor III
  • 3681 Views
  • 2 replies
  • 1 kudos

AssertionError: assertion failed: Unable to delete the record but I am able to select it though

Is there any reason this command works well:
%sql SELECT * FROM datanase.table WHERE salary > 1000
returning 2 rows, while the one below:
%sql DELETE FROM datanase.table WHERE salary > 1000
fails with:
Error in SQL statement: AssertionError: assertion failed:...

Latest Reply
User16826994223
Honored Contributor III
  • 1 kudos

DELETE FROM (and similarly UPDATE) isn't supported on Parquet files - right now on Databricks it's supported for the Delta format. You can convert your Parquet files to Delta using CONVERT TO DELTA, and then this command will work for you.
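In SQL terms, the fix the reply describes looks roughly like this, reusing the table name from the question (the path form is a placeholder):

```sql
-- One-time conversion of the existing Parquet table:
CONVERT TO DELTA datanase.table;
-- or, for a path-based table:
-- CONVERT TO DELTA parquet.`/mnt/path/to/table`;

-- After conversion, DML such as DELETE works:
DELETE FROM datanase.table WHERE salary > 1000;
```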

  • 1 kudos
1 More Replies
dataslicer
by Contributor
  • 5653 Views
  • 4 replies
  • 4 kudos

Resolved! Unable to save Spark Dataframe to driver node's local file system as CSV file

I'm running Azure Databricks Enterprise DBR 8.3 ML on a single node, with a Python notebook. I have 2 small Spark dataframes that I am able to source via credential passthrough, reading from ADLSgen2 via the `abfss://` method, and display the full content ...

Latest Reply
Dan_Z
Honored Contributor
  • 4 kudos

Modern Spark separates storage and compute by design, so saving a CSV to the driver's local disk doesn't make sense for a few reasons: the worker nodes don't have access to the driver's disk. They would need to send the data over to...

  • 4 kudos
3 More Replies
PaulHernandez
by New Contributor II
  • 14948 Views
  • 7 replies
  • 0 kudos

Resolved! How to show an image in a notebook using html?

Hi everyone, I'm just learning how to personalize Databricks notebooks and would like to show a logo in a cell. I installed the Databricks CLI and was able to upload the image file to DBFS. I try to display it like this: displayHTML("<im...

Latest Reply
_robschaper
New Contributor II
  • 0 kudos

@Paul Hernandez @Sean Owen @Navneet Tuteja I solved this after I also ran into the same issue, where my notebook suddenly wouldn't show an image sitting on the driver in an accessible folder - no matter what I was trying in the notebook, the display...
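One pattern that sidesteps filesystem-access issues entirely is inlining the image as base64, so the rendered HTML carries its own data. A hedged sketch (the helper name and the /dbfs path are placeholders; displayHTML itself only exists inside a Databricks notebook):

```python
import base64

def image_tag(path: str, width: int = 200) -> str:
    """Return an <img> tag with the file's bytes inlined as base64."""
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode("ascii")
    return f'<img src="data:image/png;base64,{data}" width="{width}">'

# Inside a Databricks notebook you would then call:
# displayHTML(image_tag("/dbfs/FileStore/logo.png"))
```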

  • 0 kudos
6 More Replies
daindana
by New Contributor III
  • 6413 Views
  • 4 replies
  • 4 kudos

Resolved! Why doesn't my notebook display widgets when I use 'dbutils' while it is displayed with '%sql CREATE WIDGET'?

The widget is not shown when I use dbutils, while it works perfectly with SQL. For example:
%sql CREATE WIDGET TEXT state DEFAULT "CA"
This one shows me the widget. But these do not:
dbutils.widgets.text("name", "Brickster", "Name")
dbutils.widgets.multiselect("colors", "oran...

Latest Reply
daindana
New Contributor III
  • 4 kudos

Hello, Ryan! For some reason, this problem is solved, and now it is working perfectly! I did nothing new, but it is just working now. Thank you!:)

  • 4 kudos
3 More Replies
BorislavBlagoev
by Valued Contributor III
  • 3080 Views
  • 5 replies
  • 4 kudos

Resolved! Databricks writeStream checkpoint

I'm trying to execute this writeStream:
data_frame.writeStream.format("delta") \
    .option("checkpointLocation", checkpoint_path) \
    .trigger(processingTime="1 second") \
    .option("mergeSchema", "true") \
    .o...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

You can remove that folder so it will be recreated automatically. Additionally, every new job run should have a new (or just empty) checkpoint location. You can add this to your code before starting the stream: dbutils.fs.rm(checkpoint_path, True). Additionally you...

  • 4 kudos
4 More Replies
halfwind22
by New Contributor III
  • 6680 Views
  • 11 replies
  • 12 kudos

Resolved! Unable to write csv files to Azure BLOB using pandas to_csv ()

I am using a Python function to read some data from a GET endpoint and write it as a CSV file to an Azure BLOB location. My GET endpoint takes 2 query parameters, param1 and param2. So initially I have a dataframe paramDf that has two columns, param1 and ...

Latest Reply
halfwind22
New Contributor III
  • 12 kudos

@Hubert Dudek I can't issue a Spark command to an executor node - it throws an error, because foreach distributes the processing.

  • 12 kudos
10 More Replies
ItsMe
by New Contributor II
  • 1683 Views
  • 4 replies
  • 7 kudos

Resolved! Run a PySpark job from a Python egg package using spark-submit on Databricks

Error: missing application resource. I'm getting this error while running the job with spark-submit. I gave the following parameters while creating the job:
--conf spark.yarn.appMasterEnv.PYSAPRK_PYTHON=databricks/path/python3
--py-files dbfs/path/to/.egg job_m...

Latest Reply
User16752246494
Contributor
  • 7 kudos

Hi, we tried to simulate the question on our end: we packaged a module inside a whl file. To access the wheel file, we created another Python file, test_whl_locally.py. Inside test_whl_locally.py, to access the content of the wheel file...
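The mechanism behind that pattern can be sketched briefly: wheels are zip-importable, so spark-submit's --py-files (or a manual sys.path entry for a local check) is enough to make the packaged module importable. The wheel and module names below are placeholders, not from the thread:

```python
import sys

# Wheels are zip-importable: adding the .whl to sys.path makes its modules
# importable without installation. spark-submit does the equivalent for
# archives passed via --py-files.
wheel = "dist/mymodule-0.1-py3-none-any.whl"   # placeholder wheel name
sys.path.insert(0, wheel)

# from mymodule import main   # placeholder module packaged inside the wheel
# main()
```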

  • 7 kudos
3 More Replies
afshinR
by New Contributor III
  • 467 Views
  • 1 reply
  • 1 kudos

Hi, could you please help me with my question? I have not gotten any answers.

Hi, could you please help me with my question? I have not gotten any answers.

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @afshin riahi, yes, I can definitely help you with that. Please wait while I or someone from the community gets back to you with a response. Thank you for your patience.

  • 1 kudos
User16868770416
by Contributor
  • 3198 Views
  • 1 reply
  • 0 kudos

What is the best way to decode protobuf using pyspark?

I am using Spark Structured Streaming to read a protobuf-encoded message from Event Hubs. We use a lot of Delta tables, but there isn't a simple way to integrate this. We are currently using K-SQL to transform into Avro on the fly and then use Dat...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Hi @Will Block, I think a related question was asked in the past - I think it was this one. I also found this library; I hope it helps.

  • 0 kudos
marchello
by New Contributor III
  • 3623 Views
  • 9 replies
  • 3 kudos

Resolved! Error when connecting to Snowflake

Hi team, I'm getting weird error in one of my jobs when connecting to Snowflake. All my other jobs (I've got plenty) work fine. The current one also works fine when I have only one coding step (except installing needed libraries in my very first step...

Latest Reply
Dan_Z
Honored Contributor
  • 3 kudos

@marchello​ I suggest you contact Snowflake to move forward on this one.

  • 3 kudos
8 More Replies