Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Anbazhagananbut
by New Contributor II
  • 3207 Views
  • 2 replies
  • 0 kudos

Pyspark Convert Struct Type to Map Type

Hello Sir, could you please advise on the below scenario in PySpark 2.4.3 in Databricks to load the data into the Delta table. I want to load the dataframe with this column "data" into the table as MapType in the Databricks Spark Delta table. Could you ...

Latest Reply
sherryellis
New Contributor II
  • 0 kudos

You can do it by making an API request to /api/2.0/clusters/permanent-delete. I don't see an option to delete or edit an automated cluster from the UI.

1 More Replies
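For readers landing on this thread: neither reply addresses the actual question, so here is a minimal sketch of one way to turn a struct column into a MapType column in PySpark 2.4 using create_map over the struct's fields. The column name "data" is from the question; the field names key1/key2 are assumptions for illustration.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical dataframe with a struct column named "data"
df = spark.createDataFrame(
    [(("a", "b"),)], "data struct<key1:string,key2:string>"
)

# Build a map<string,string> from the struct's field names and values
fields = df.schema["data"].dataType.fieldNames()
pairs = [p for f in fields for p in (F.lit(f), F.col("data." + f))]
df_map = df.select(F.create_map(*pairs).alias("data"))

df_map.printSchema()  # data: map<string,string>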
SimonNuss
by New Contributor II
  • 27763 Views
  • 6 replies
  • 5 kudos

Resolved! Databricks cannot access Azure Key Vault

I am trying to retrieve a secret from Azure Key Vault as follows: sqlPassword = dbutils.secrets.get(scope = "Admin", key = "SqlPassword") The scope has been created correctly, but I receive the following error message: com.databricks.common.clie...

Latest Reply
virahkumar
New Contributor II
  • 5 kudos

Sometimes turning it off and on again is underrated, so I gave up finding the problem, deleted it, and re-created the scope - worked a breeze! Mine seems like it was something silly; I was able to set up my vault but got the same issue when trying to ...

5 More Replies
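If re-creating the scope doesn't fix it, a quick sanity check is to list what the workspace can actually see before blaming Key Vault permissions. A minimal sketch; the scope and key names are the ones from the question:

# List every secret scope visible to this workspace
for s in dbutils.secrets.listScopes():
    print(s.name)

# List the keys inside the scope from the question
for k in dbutils.secrets.list("Admin"):
    print(k.key)

# Retrieve the secret (the value is redacted if printed in a notebook)
sqlPassword = dbutils.secrets.get(scope="Admin", key="SqlPassword")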
KutayKoralturk
by New Contributor
  • 7797 Views
  • 2 replies
  • 0 kudos

Filtering rows that do not contain a string

search = search.filter(!F.col("Name").contains("ABC"))
search = search.filter(F.not(F.col("Name").contains("ABC"))
Both methods fail due to a syntax error. Could you please help me filter rows that do not contain a certain string in PySpark? ^ Synta...

Latest Reply
User16857282152
Contributor
  • 0 kudos

Here is a complete example:
values = [("K1","true","false"),("K2","true","false")]
columns = ['Key', 'V1', 'V2']
df = spark.createDataFrame(values, columns)
display(df)
# filter
df2 = df.filter(df.V2 != "delete")
display(df2)

1 More Replies
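For reference, the piece both attempts in the question were missing is PySpark's negation: the operator is ~ (Scala's ! and an F.not function do not exist in the Python API). A minimal sketch, reusing the question's own dataframe name:

from pyspark.sql import functions as F

# Keep only the rows whose Name column does NOT contain "ABC"
search = search.filter(~F.col("Name").contains("ABC"))

# Equivalent spelling as a SQL expression
search = search.filter("NOT Name LIKE '%ABC%'")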
SergeyIvanchuk
by New Contributor
  • 9090 Views
  • 4 replies
  • 0 kudos

Seaborn plot display in Databricks

I am using Seaborn version 0.7.1 and matplotlib version 1.5.3. The following code does not display a graph in the end. Any idea how to resolve this? (It works in the Python CLI on my local computer.) import seaborn as sns sns.set(style="darkgrid") tips = sns.lo...

Latest Reply
AbbyLemon
New Contributor II
  • 0 kudos

I found that you can create a comparison plot similar to what you get from Seaborn by using display(sparkdf) and adding multiple columns to the 'Values' section while creating a 'Scatter plot'. You get to 'Customize Plot' by clicking on the icon ...

3 More Replies
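For later readers: in a Databricks notebook the usual trick is to hand the matplotlib figure behind the Seaborn plot to display(). A minimal sketch, assuming a recent Seaborn where scatterplot is available (the question's 0.7.1 would use a different plotting call):

import seaborn as sns
import matplotlib.pyplot as plt

sns.set(style="darkgrid")
tips = sns.load_dataset("tips")  # downloads the sample dataset
ax = sns.scatterplot(x="total_bill", y="tip", data=tips)

# Databricks renders matplotlib figures passed to display()
display(ax.get_figure())
# outside Databricks, plt.show() does the same job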
AlexRomano
by New Contributor
  • 6681 Views
  • 1 reply
  • 0 kudos

PicklingError: Could not pickle the task to send it to the workers.

I am using sklearn in a Databricks notebook to fit an estimator in parallel. Sklearn uses joblib with the loky backend to do this. Now, I have a file in Databricks which I can import my custom Classifier from, and everything works fine. However, if I lite...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi aromano, I know this issue was opened almost a year ago, but I faced the same problem and was able to solve it, so I'm sharing the solution in order to help others. Probably, you're using SparkTrials to optimize the model's hyperparameters ...

Mir_SakhawatHos
by New Contributor II
  • 31751 Views
  • 2 replies
  • 3 kudos

How can I delete folders from my DBFS?

I want to delete a folder I created in DBFS, but how? And how can I download files from there?

Latest Reply
IA
New Contributor II
  • 3 kudos

Hello, Max's answer focuses on the CLI. Instead, using the Community Edition platform, proceed as follows (you must first delete all files in your folder):
1. import org.apache.hadoop.fs.{Path, FileSystem}
2. dbutils.fs.rm("/FileStore/tables/file.cs...

1 More Replies
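In short, both operations come down to dbutils.fs; a minimal sketch (the paths are hypothetical):

# Recursively delete a folder and everything inside it
dbutils.fs.rm("/FileStore/tables/my_folder", True)  # True = recurse

# Files placed under /FileStore can be downloaded in the browser at
# https://<databricks-instance>/files/<path-under-FileStore>
dbutils.fs.cp("/FileStore/tables/results.csv", "/FileStore/download/results.csv")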
bhaumikg
by New Contributor II
  • 14823 Views
  • 7 replies
  • 2 kudos

Databricks throwing error "SQL DW failed to execute the JDBC query produced by the connector." while pushing a column with string length more than 255

I am using Databricks to transform the data and then pushing it into the data lake. The data gets pushed in if the length of the string field is 255 characters or less, but it throws the following error beyond that: "SQL DW failed to execute the JDB...

Latest Reply
bhaumikg
New Contributor II
  • 2 kudos

As suggested by ZAIvR, use append mode and provide maxlength while pushing the data. Overwrite may not work with this unless the Databricks team has fixed the issue.

6 More Replies
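The setting the reply calls "maxlength" is, in the Azure SQL DW connector, the maxStrLength option, which widens the default NVARCHAR(255) staging type. A hedged sketch of a write with it set; the URL, staging dir, and table name are placeholders:

(df.write
   .format("com.databricks.spark.sqldw")
   .option("url", jdbc_url)         # placeholder JDBC connection string
   .option("tempDir", temp_dir)     # placeholder staging path
   .option("forwardSparkAzureStorageCredentials", "true")
   .option("dbTable", "dbo.my_table")
   .option("maxStrLength", "4000")  # string columns map to NVARCHAR(4000)
   .mode("append")                  # append, as the reply suggests
   .save())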
Nik
by New Contributor III
  • 12156 Views
  • 19 replies
  • 0 kudos

Write from a DataFrame to a CSV file, CSV file is blank

Hi, I am reading from a text file in a blob:
val sparkDF = spark.read.format(file_type)
  .option("header", "true")
  .option("inferSchema", "true")
  .option("delimiter", file_delimiter)
  .load(wasbs_string + "/" + PR_FileName)
Then I test my Datafra...

Latest Reply
nl09
New Contributor II
  • 0 kudos

Create a temp folder inside the output folder, copy the part-00000* file to the output folder under the desired file name, then delete the temp folder. Python code snippet to do the same:
fpath = output + '/' + 'temp'
def file_exists(path):
    try:
        dbutils.fs.ls(path)
        return...

18 More Replies
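The pattern behind this reply, spelled out: force a single partition so Spark writes exactly one part file, then move that file to the name you want. A minimal sketch reusing the question's sparkDF; the output paths are hypothetical:

out_dir = "/mnt/out/report_tmp"

# coalesce(1) => exactly one part-00000* file in the output folder
sparkDF.coalesce(1).write.option("header", "true").mode("overwrite").csv(out_dir)

# Find the part file and copy it to its final name
part = [f.path for f in dbutils.fs.ls(out_dir) if f.name.startswith("part-")][0]
dbutils.fs.cp(part, "/mnt/out/report.csv")
dbutils.fs.rm(out_dir, True)  # clean up the temp folder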
pmezentsev
by New Contributor
  • 7596 Views
  • 7 replies
  • 0 kudos

PySpark: how to get best params in grid search

Hello! I am using Spark 2.1.1 in Python (Python 2.7 executed in a Jupyter notebook) and trying to run a grid search over linear regression parameters. My code looks like this:
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder
from pyspark.ml impo...

Latest Reply
phamyen
New Contributor II
  • 0 kudos

This is a great article. It gave me a lot of useful information. Thank you very much.

6 More Replies
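To answer the title question for anyone arriving from search: after fitting, CrossValidator pairs its param grid with avgMetrics, and bestModel holds the winning refit model. A minimal sketch, assuming cv is the configured CrossValidator and train_df a placeholder training dataframe:

cv_model = cv.fit(train_df)

# Pair every parameter combination with its mean cross-validation metric
for params, metric in zip(cv.getEstimatorParamMaps(), cv_model.avgMetrics):
    print({p.name: v for p, v in params.items()}, metric)

# The best model itself, with its resolved parameters
best = cv_model.bestModel
print(best.extractParamMap())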
BingQian
by New Contributor II
  • 12054 Views
  • 2 replies
  • 0 kudos

Resolved! Error "name 'IntegerType' is not defined" when attempting to convert a DF column to IntegerType

initialDF.withColumn("OriginalCol", initialDF.OriginalCol.cast(IntegerType))
or
initialDF.withColumn("OriginalCol", initialDF.OriginalCol.cast(IntegerType()))
However, it always failed with this error: NameError: name 'IntegerType' is not defined ...

Latest Reply
BingQian
New Contributor II
  • 0 kudos

Thank you @Kristo Raun!

1 More Replies
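The fix the thread converges on is the missing import; a minimal sketch:

from pyspark.sql.types import IntegerType

# Note the parentheses: cast() wants a DataType instance
df = initialDF.withColumn("OriginalCol", initialDF.OriginalCol.cast(IntegerType()))

# Equivalent, with no import at all
df = initialDF.withColumn("OriginalCol", initialDF.OriginalCol.cast("int"))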
prakharjain
by New Contributor
  • 17592 Views
  • 2 replies
  • 0 kudos

Resolved! I need to edit my parquet files and change field names, replacing spaces with underscores

Hello, I am facing the trouble described in the following Stack Overflow topics: https://stackoverflow.com/questions/45804534/pyspark-org-apache-spark-sql-analysisexception-attribute-name-contains-inv https://stackoverflow.com/questions/38191157/spark-...

Latest Reply
DimitriBlyumin
New Contributor III
  • 0 kudos

One option is to use something other than Spark to read the problematic file, e.g. Pandas, if your file is small enough to fit on the driver node (Pandas will only run on the driver). If you have multiple files, you can loop through them and fix on...

1 More Replies
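A sketch of the Pandas route the reply describes, for files small enough for the driver; the paths are hypothetical and a parquet engine such as pyarrow is assumed to be installed:

import pandas as pd

# /dbfs is the FUSE mount that exposes DBFS as local files
pdf = pd.read_parquet("/dbfs/mnt/data/bad columns.parquet")

# Replace spaces in every column name with underscores
pdf.columns = [c.replace(" ", "_") for c in pdf.columns]

pdf.to_parquet("/dbfs/mnt/data/fixed_columns.parquet", index=False)

# The rewritten file is now readable by Spark
df = spark.read.parquet("/mnt/data/fixed_columns.parquet")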
ChristianHofste
by New Contributor II
  • 11207 Views
  • 1 reply
  • 0 kudos

Drop duplicates in Table

Hi, there is a function to delete data from a Delta table:
deltaTable = DeltaTable.forPath(spark, "/data/events/")
deltaTable.delete(col("date") < "2017-01-01")
But is there also a way to drop duplicates somehow? Like deltaTable.dropDuplicates()...

Latest Reply
shyam_9
Valued Contributor
  • 0 kudos

Hi @Christian Hofstetter, you can check here for info on the same: https://docs.delta.io/0.4.0/delta-update.html#data-deduplication-when-writing-into-delta-tables

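The linked page covers de-duplication on the way in (a MERGE that inserts only when no match exists). For duplicates already in the table, one workable approach is simply to rewrite it; a hedged sketch using the path from the question:

# Read the current table, drop exact-duplicate rows, write it back
df = spark.read.format("delta").load("/data/events/")
(df.dropDuplicates()
   .write.format("delta")
   .mode("overwrite")
   .save("/data/events/"))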
JigaoLuo
by New Contributor
  • 4967 Views
  • 3 replies
  • 0 kudos

OPTIMIZE error: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'OPTIMIZE'

Hi everyone. I am trying to learn the OPTIMIZE keyword from this blog using Scala: https://docs.databricks.com/delta/optimizations/optimization-examples.html#delta-lake-on-databricks-optimizations-scala-notebook. But my local Spark seems not able t...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi Jigao, OPTIMIZE isn't in the open-source Delta API, so it won't run on your local Spark instance: https://docs.delta.io/latest/api/scala/io/delta/tables/index.html?search=optimize

2 More Replies
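On open-source Delta of that era, the documented stand-in for OPTIMIZE was manual compaction: rewrite the table into fewer files with dataChange set to false so concurrent readers are not disturbed. A minimal sketch; the path and file count are placeholders:

path = "/tmp/delta/events"
num_files = 16  # target file count, tune to your data volume

(spark.read.format("delta").load(path)
   .repartition(num_files)
   .write
   .option("dataChange", "false")  # same data, just fewer files
   .format("delta")
   .mode("overwrite")
   .save(path))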
EricThomas
by New Contributor
  • 11085 Views
  • 2 replies
  • 0 kudos

!pip install vs. dbutils.library.installPyPI()

Hello, Scenario: Trying to install some Python modules into a notebook (scoped to just the notebook) using...
```
dbutils.library.installPyPI("azure-identity")
dbutils.library.installPyPI("azure-storage-blob")
dbutils.library.restartPython()
```
...ge...

Latest Reply
eishbis
New Contributor II
  • 0 kudos

Hi @ericOnline, I also faced the same issue and eventually found that upgrading the Databricks runtime version from my current "5.5 LTS (includes Apache Spark 2.4.3, Scala 2.11)" to "6.5 (Scala 2.11, Spark 2.4.5)" resolved the issue. Though the offic...

1 More Replies
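For anyone on a newer runtime: notebook-scoped installs have since moved to the %pip magic, which replaced dbutils.library.installPyPI (removed in Databricks Runtime 7.0 and later). A minimal sketch:

# Run in its own notebook cell; the install is scoped to this notebook only
%pip install azure-identity azure-storage-blob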
RaghuMundru
by New Contributor III
  • 32942 Views
  • 15 replies
  • 0 kudos

Resolved! I am running a simple count and I am getting an error

Here is the error that I am getting when I run the following query:
statement = sqlContext.sql("SELECT count(*) FROM ARDATA_2015_09_01").show()
---------------------------------------------------------------------------
Py4JJavaError Traceback (most rec...

Latest Reply
muchave
New Contributor II
  • 0 kudos

192.168.0.1 is a private IP address used to log in to the admin panel of a router. 192.168.1.1 is the host address used to change default router settings.

14 More Replies
