Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Henry
by New Contributor II
  • 2599 Views
  • 3 replies
  • 0 kudos

Cannot login Databricks Community Edition with new account

It seems I cannot log in to Databricks Community Edition. I recently created a new account and had it verified. However, whenever I try to log in, I am redirected to the same page with no error shown. When I do en...

Latest Reply
Henry
New Contributor II
  • 0 kudos

Hi @Piper Wilson, unfortunately the problem still persists when signing up via mobile.

2 More Replies
Dusko
by New Contributor III
  • 3486 Views
  • 6 replies
  • 1 kudos

How to access root mountPoint without "Access Denied"?

Hi, I'm trying to read a file from the S3 root bucket. I can ls all the files, but I can't read them because of access denied. When I mount the same S3 root bucket under some other mountPoint, I can touch and read all the files. I also see that this new mount...

Latest Reply
Dusko
New Contributor III
  • 1 kudos

Hi @Atanu Sarkar, @Piper Wilson, thanks for the replies. I don't understand the point about ownership. I believe that the root bucket is still under my ownership (I created it and I could upload/delete any files through the browser without any problem...

5 More Replies
fsm
by New Contributor II
  • 6610 Views
  • 4 replies
  • 2 kudos

Resolved! Implementation of a stable Spark Structured Streaming Application

Hi folks, I have an issue. It's not critical, but it's annoying. We have implemented a Spark Structured Streaming application. This application is triggered via Azure Data Factory (every 8 minutes). OK, this setup sounds a little bit weird and it's no...

Latest Reply
brickster_2018
Esteemed Contributor
  • 2 kudos

@Markus Freischlad It looks like the Spark driver was stuck. It would be good to capture a thread dump of the Spark driver to understand which operation is stuck.

3 More Replies
admo
by New Contributor III
  • 8819 Views
  • 4 replies
  • 7 kudos

Scaling issue for inference with a spark.mllib model

Hello, I'm writing this because I have tried a lot of different directions to get a simple model inference working, with no success. Here is the outline of the job: # 1 - Load the base data (~1 billion lines of ~6 columns) interaction = build_initial_df()...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 7 kudos

It is hard to analyze without the Spark UI and more detailed information, but here are a few tips anyway: look for data skew; some partitions can be very big and some small because of incorrect partitioning. You can use the Spark UI to do that, but also debug your code a bit...
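
As a rough illustration of the data-skew tip, here is a minimal sketch that compares the largest partition with the median one. The helper name `skew_report` and the sample sizes are made up for illustration; in PySpark the per-partition row counts could be collected as shown in the comment.

```python
from statistics import median


def skew_report(partition_sizes):
    """Summarise how unevenly rows are spread across partitions."""
    if not partition_sizes:
        raise ValueError("partition_sizes must not be empty")
    largest = max(partition_sizes)
    typical = median(partition_sizes)
    # Ratio of the largest partition to the median one; close to 1.0 means balanced.
    ratio = largest / typical if typical else float("inf")
    return {"max": largest, "median": typical, "skew_ratio": ratio}


# In PySpark the sizes could be gathered with something like:
#   from pyspark.sql.functions import spark_partition_id
#   rows = df.groupBy(spark_partition_id().alias("pid")).count().collect()
#   sizes = [r["count"] for r in rows]
sizes = [100, 120, 90, 5000]  # made-up numbers, one partition is far larger
print(skew_report(sizes))
```

A high ratio suggests that one task will run much longer than the rest, which is where repartitioning on a better key may help.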

3 More Replies
Mendes
by New Contributor
  • 2761 Views
  • 2 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Danilo Mendes, the table schema is stored in the default Azure Databricks internal metastore, and you can also configure and use external metastores. Ingest data into Azure Databricks. Access data in Apache Spark formats and from external data sources....

1 More Replies
Tahseen0354
by Valued Contributor
  • 2389 Views
  • 4 replies
  • 2 kudos

Resolved! A Standard cluster is recommended for a single user - what is meant by that ?

Hi, I have seen it written in the documentation that a Standard cluster is recommended for a single user. But why? What is meant by that? One of my colleagues and I were testing it on the same notebook. Both of us can use the same standard all purpo...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

A High Concurrency cluster just splits resources between users more evenly. So when 4 people run notebooks at the same time on a cluster with 4 CPUs, you can imagine that each gets 1 CPU. On a Standard cluster, 1 person could utilize all worker CPUs as you...

3 More Replies
Raie
by New Contributor III
  • 7045 Views
  • 3 replies
  • 4 kudos

Resolved! How do I specify column's data type with spark dataframes?

What I am doing: spark_df = spark.createDataFrame(dfnew); spark_df.write.saveAsTable("default.test_table", index=False, header=True). This automatically detects the datatypes and is working right now. BUT, what if the datatype cannot be detected or detect...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

Just create the table beforehand and set the column types (CREATE TABLE ... LOCATION ( path path). In the DataFrame you need to have corresponding data types, which you can set using cast syntax; your syntax is just incorrect. Here is an example of correct syntax: from p...
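
The reply's own example is cut off above, so the following is only a sketch of the cast approach it describes, not the original code. It relies on `DataFrame.withColumn` and `Column.cast`, which accept a type name string; the helper name `cast_columns` and the column names `id` and `price` are hypothetical.

```python
def cast_columns(df, type_map):
    """Return a DataFrame with each listed column cast to the given Spark SQL type.

    `df` is expected to be a PySpark DataFrame; `type_map` maps column
    names to type names such as "int", "double" or "timestamp".
    """
    for name, dtype in type_map.items():
        # Replace the column with a copy cast to the requested type.
        df = df.withColumn(name, df[name].cast(dtype))
    return df


# Hypothetical usage on a cluster (column names are assumptions):
#   spark_df = spark.createDataFrame(dfnew)
#   spark_df = cast_columns(spark_df, {"id": "int", "price": "double"})
#   spark_df.write.saveAsTable("default.test_table")
```

Casting before `saveAsTable` means the table is created with the intended column types rather than the inferred ones.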

2 More Replies
tomsyouruncle
by New Contributor III
  • 16096 Views
  • 14 replies
  • 3 kudos

How do I enable support for arbitrary files in Databricks Repos? Public Preview feature doesn't appear in admin console.

"Arbitrary files in Databricks Repos", allowing not just notebooks to be added to repos, is in Public Preview. I've tried to activate it following the instructions in the above link but the option doesn't appear in Admin Console. Minimum requirements...

Latest Reply
User16413245720
New Contributor II
  • 3 kudos

What environment is your deployment in?

13 More Replies
Sudeshna
by New Contributor III
  • 10727 Views
  • 6 replies
  • 7 kudos

Resolved! I am new to Databricks SQL and want to create a variable which can hold calculations either from static values or from select queries similar to SQL Server. Is there a way to do so?

I was trying to create a variable and I got the following error. Command: SET a = 5; Error: Error running query. Configuration a is not available.

Latest Reply
BilalAslamDbrx
Honored Contributor III
  • 7 kudos

@Sudeshna Bhakat, what @Joseph Kambourakis described works on clusters but is restricted on Databricks SQL endpoints, i.e. only a limited number of SET commands are allowed. I suggest you explore the curly-braces syntax (e.g. {{ my_variable }}) in Databrick...

5 More Replies
shelms
by New Contributor II
  • 13297 Views
  • 2 replies
  • 7 kudos

Resolved! SQL CONCAT returning null

Has anyone else experienced this problem? I'm attempting to SQL concat two fields, and if the second field is null, the entire string appears as null. The documentation is unclear on the expected outcome, and this is contrary to how concat_ws operates. SELECT ...

Latest Reply
BilalAslamDbrx
Honored Contributor III
  • 7 kudos

CONCAT is a function defined in the SQL standard and available across a wide variety of DBMSs. With the exception of Oracle, which uses VARCHAR2 semantics across the board, the function returns NULL on NULL input. CONCAT_WS() is not standard and is mostl...
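
To make the difference concrete, here is a small plain-Python emulation of the two behaviours, with `None` standing in for SQL NULL. This is only a sketch of the semantics described above, not Databricks code; the function names `sql_concat` and `sql_concat_ws` are made up.

```python
def sql_concat(*parts):
    """Emulate CONCAT: any NULL (None) input makes the whole result NULL."""
    if any(p is None for p in parts):
        return None
    return "".join(parts)


def sql_concat_ws(sep, *parts):
    """Emulate CONCAT_WS: NULL inputs are skipped instead of propagating."""
    return sep.join(p for p in parts if p is not None)


print(sql_concat("a", None))         # NULL propagates
print(sql_concat_ws("", "a", None))  # NULL is skipped
print(sql_concat("a", None or ""))   # same idea as wrapping the column in COALESCE(col, '')
```

In SQL, the usual way to keep CONCAT from returning NULL is to wrap nullable columns in COALESCE with an empty string, which is what the last line mimics.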

1 More Replies
cmotla
by New Contributor III
  • 2026 Views
  • 1 replies
  • 7 kudos

Issue with complex json based data frame select

We are getting the below error when trying to select nested columns (string type in a struct), even though we don't have more than 1000 records in the data frame. The schema is very complex and has a few columns as struct type and a few as array typ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 7 kudos

Please share your code and some example data.

mikep
by New Contributor II
  • 3789 Views
  • 4 replies
  • 0 kudos

Resolved! Kubernetes or ZooKeeper for HA?

Hello. I am trying to understand High Availability in Databricks. I understand that DB uses Kubernetes for the cluster manager and to manage Docker containers. And while DB runs on top of AWS or Azure or GCP, is HA automatically provisioned when I st...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

3 More Replies
george2020
by New Contributor II
  • 967 Views
  • 0 replies
  • 2 kudos

Using the Databricks Repos API to bring Repo in top-level production folder to latest version

I am having an issue with a GitHub Actions workflow using the Databricks Repos API. We want the API call in the GitHub Action to bring the Repo in our Databricks Repos top-level folder to the latest version on a merge into the main branch. The GitHub Actio...
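
For reference, the kind of call described here can be sketched as a `PATCH` to the Repos API endpoint `/api/2.0/repos/{repo_id}` with the target branch in the body. This is not the poster's workflow, which is cut off above; the host, token and repo ID below are placeholders, and the helper name is made up. The request is only built, not sent.

```python
import json
import urllib.request


def build_repo_update_request(host, token, repo_id, branch="main"):
    """Build a PATCH request that asks a Databricks Repo to check out the head of `branch`."""
    url = f"{host.rstrip('/')}/api/2.0/repos/{repo_id}"
    body = json.dumps({"branch": branch}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; in a CI job these would come from secrets.
req = build_repo_update_request("https://example.cloud.databricks.com", "dapi-PLACEHOLDER", 123)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

In a GitHub Actions job the same request could be issued from a step that runs after the merge into main, with the token read from the repository secrets.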

RicksDB
by Contributor II
  • 3745 Views
  • 3 replies
  • 6 kudos

Resolved! Restricting file upload to DBFS

Hi, is it possible to restrict file uploads to the DBFS root (since everyone has access)? The idea is to force users to use an ADLS2 mnt with credential passthrough for security reasons. Also, right now users use azure blob explorer to interact with ADLS2...

Latest Reply
User16764241763
Honored Contributor
  • 6 kudos

Hello @E H, you can disable the DBFS file browser in the workspace if users upload directly from there. This will prevent uploads to DBFS. https://docs.databricks.com/administration-guide/workspace/dbfs-browser.html Please let us know if this solution wo...

2 More Replies
wyzer
by Contributor II
  • 3132 Views
  • 2 replies
  • 2 kudos

Resolved! Insert data into an on-premise SQL Server

Hello, is it possible to insert data from Databricks into an on-premise SQL Server? Thanks.

Latest Reply
wyzer
Contributor II
  • 2 kudos

Hello, yes, we found out how to do it by installing a JDBC connector. It works fine. Thanks.
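
The post does not show the code that was used, so the following is only a sketch of what a JDBC write to SQL Server typically looks like from PySpark. The helper name and all connection values (host, database, table, user) are hypothetical, and the cluster is assumed to have network access to the on-premise server.

```python
def sqlserver_jdbc_options(host, database, table, user, password, port=1433):
    """Build the option map for Spark's generic JDBC writer against SQL Server."""
    return {
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        "dbtable": table,
        "user": user,
        "password": password,
        # Class name of Microsoft's JDBC driver for SQL Server.
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }


# Made-up connection details for illustration only.
opts = sqlserver_jdbc_options("sql.example.internal", "sales", "dbo.orders", "etl_user", "<secret>")
print(opts["url"])

# On a cluster, an existing DataFrame `df` could then be appended with:
#   df.write.format("jdbc").options(**opts).mode("append").save()
```

In practice the password would come from a secret scope rather than being written into the notebook.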

1 More Replies
