Data Engineering

Forum Posts

BilongChen_
by New Contributor II
  • 575 Views
  • 2 replies
  • 3 kudos

AWS EC2 launched from Databricks tenancy

Hi, I was checking the EC2 details in our AWS account and found that all the EC2 instances launched by Databricks have "dedicated" tenancy. I double-checked the cluster launch configuration and didn't find anywhere to change the tenancy setting. How can we ...
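
If it helps to verify this yourself, here is a minimal boto3 sketch for listing the tenancy of instances; the region and the "Vendor" tag filter are assumptions for illustration, not taken from the thread:

import boto3

# List tenancy for instances that look Databricks-launched (the tag key used
# to filter is an assumption; check the actual tags in your account).
ec2 = boto3.client("ec2", region_name="us-east-1")
pages = ec2.get_paginator("describe_instances").paginate(
    Filters=[{"Name": "tag-key", "Values": ["Vendor"]}]
)
for page in pages:
    for reservation in page["Reservations"]:
        for instance in reservation["Instances"]:
            print(instance["InstanceId"], instance["Placement"]["Tenancy"])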

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 3 kudos

Connect with Databricks support; they will guide you on this.

1 More Replies
SIRIGIRI
by Contributor
  • 808 Views
  • 3 replies
  • 2 kudos

sharikrishna26.medium.com

Spark Dataframes Schema: schema inference is not reliable. We have the following problems with schema inference: automatic inference of the schema is often incorrect; inferring the schema is additional work for Spark, and it takes some extra time; schema inference is ...

Latest Reply
Varshith
New Contributor III
  • 2 kudos

One other difference between those two approaches is that in the schema DDL string approach we use SQL type names such as STRING, INT, etc., while in the StructType object approach we can only use Spark data types such as StringType(), IntegerType(), etc.
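
A small sketch of the two approaches side by side (column names and file path are illustrative; spark is the notebook's built-in session):

from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# Approach 1: schema as a DDL string, using SQL type names.
ddl_schema = "name STRING, age INT"

# Approach 2: schema as a StructType object, using Spark type classes.
struct_schema = StructType([
    StructField("name", StringType(), True),
    StructField("age", IntegerType(), True),
])

# Either form can be passed to the reader instead of relying on inference.
df_from_ddl = spark.read.schema(ddl_schema).csv("/tmp/people.csv")
df_from_struct = spark.read.schema(struct_schema).csv("/tmp/people.csv")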

2 More Replies
THIAM_HUATTAN
by Valued Contributor
  • 1383 Views
  • 2 replies
  • 4 kudos

Resolved! Why is this CTE having issues in Databricks?

df = spark.createDataFrame([(2018,'Apple1',45000),(2019,'Apple1',35000),(2020,'Apple1',75000),              (2018,'Samsung',15000),(2019,'Samsung',20000),(2020,'Samsung',25000),              (2018,'Nokia',21000),(2019,'Nokia',17000),(2020,'Nokia',140...

Latest Reply
Varshith
New Contributor III
  • 4 kudos

The issue is caused by the semicolon beside PhoneBrandSales. Try removing that semicolon and the issue will be resolved. Please refer to the screenshot below. Please select this answer as the best answer if it resolved your issue. Thanks, Varshith
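
For illustration, the shape of a working CTE with the stray semicolon removed (the column and table names are guesses based on the excerpt above):

# A semicolon right after the CTE name ends the statement early;
# WITH <name> AS (...) must run as one uninterrupted statement.
spark.sql("""
    WITH PhoneBrandSales AS (
        SELECT Year, Brand, SUM(Sales) AS TotalSales
        FROM phone_sales
        GROUP BY Year, Brand
    )
    SELECT * FROM PhoneBrandSales
""").show()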

1 More Replies
Cristhian_Plaza
by New Contributor III
  • 1489 Views
  • 5 replies
  • 2 kudos

Cannot sign in to the Databricks partner-academy portal

Hi, I registered at partner-academy.databricks a few days ago using my company ID, but I didn't log in that day. Now I'm trying to log in and it's impossible; I'm also trying to recover my password through the forgot-password option but never get the...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Cristhian Plazas, thanks for reaching out to us! Kindly email Nadiya to resolve the issue.

4 More Replies
anirudhnegi94
by New Contributor III
  • 919 Views
  • 4 replies
  • 6 kudos

Resolved! Databricks Lakehouse Fundamentals Badge not received

Hi, I successfully passed the test after completing the course, but I haven't received any badge from your side. I only got a certificate after completing the test. I need to post my credentials on LinkedIn with valid verificat...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Anirudh Negi, thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

3 More Replies
JLMP
by New Contributor II
  • 858 Views
  • 2 replies
  • 2 kudos

Badge not received for Databricks Lakehouse Fundamentals Accreditation

I have successfully passed the test, but I haven't received any badge or points; could you help me with this? The e-mail registered in the community is the same as the one registered in Databricks Academy and in credentials.databricks.com as well. Pdf...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

Please submit a ticket to the Databricks Training Team here: https://help.databricks.com/s/contact-us?ReqType=training. They will reach out to you soon; they may reply late due to the holidays, so please be patient.

1 More Replies
Direo
by Contributor
  • 9574 Views
  • 11 replies
  • 6 kudos

Resolved! Permanently add python file path to sys.path in Databricks

If your notebook is in a different directory or subdirectory than the Python module, you cannot import it until you add it to the Python path. That means that even though all users are using the same module, since they are all working from different rep...
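
The usual notebook-level workaround looks like this (a sketch; the repo path and module name are hypothetical):

import sys

# Hypothetical path to the repo folder containing the shared module.
module_dir = "/Workspace/Repos/shared/my_project"
if module_dir not in sys.path:
    sys.path.append(module_dir)

import my_module  # resolvable now, regardless of the notebook's own directory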

Latest Reply
uzadude
New Contributor III
  • 6 kudos

Setting `spark.executorEnv.PYTHONPATH` did not work for me; it looked like Spark/Databricks overwrites this somewhere. I used a simple Python UDF to print some properties like `sys.path` and `os.environ` and didn't see the path I added. Finally, I ...
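
A sketch of that kind of diagnostic UDF, for inspecting what an executor actually sees:

import os
import sys
from pyspark.sql.functions import udf

@udf("string")
def executor_env(_):
    # Runs on an executor, not the driver, so it reports the executor's view.
    return f"sys.path={sys.path} PYTHONPATH={os.environ.get('PYTHONPATH')}"

spark.range(1).select(executor_env("id")).show(truncate=False)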

10 More Replies
Jfoxyyc
by Valued Contributor
  • 841 Views
  • 2 replies
  • 4 kudos

Databricks Terraform - how to manage Databricks entirely through Terraform?

I'm stuck at a point where I can't automatically set up everything in a Databricks environment, because service principals can't be made account-level admins (accounts.azuredatabricks.net, and similarly for AWS). Going into a bare t...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 4 kudos

Unfortunately, there are still some limitations to doing IaC on Databricks with Terraform (for example, another one is that you can't set up Key Vault as a secret store with a service principal). I think that instead of doing things manually, you can authenticate ...

1 More Replies
Chris_Shehu
by Valued Contributor III
  • 763 Views
  • 2 replies
  • 2 kudos

Map("skipRows", "1") ignored during autoloader process. Something wrong with the format?

I've tried multiple variations of the following code. It seems like the map parameters are being completely ignored. CREATE LIVE TABLE a_raw2 TBLPROPERTIES ("quality" = "bronze") AS SELECT * FROM cloud_files("dbfs:/mnt/c-raw/a/c_medcheck_export*.csv"...

Latest Reply
jose_gonzalez
Moderator
  • 2 kudos

skipRows was added in DBR 11.1 -- what DBR is your DLT pipeline on?
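
For reference, a hedged sketch of the same table in the Python DLT flavor, where skipRows is passed as a reader option (the path and table name mirror the question and are illustrative; behavior assumes DBR 11.1+):

import dlt

@dlt.table(name="a_raw2", table_properties={"quality": "bronze"})
def a_raw2():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .option("skipRows", "1")  # requires DBR 11.1 or later, per the reply
        .load("dbfs:/mnt/c-raw/a/c_medcheck_export*.csv")
    )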

1 More Replies
jneira
by New Contributor III
  • 1400 Views
  • 2 replies
  • 2 kudos

"org.apache.hadoop.hive.ql.metadata.HiveException: at least one column must be specified for the table" non deterministic error in a `insert ... select ... ` clause

Hi, first of all, thanks for your work on Databricks SQL. Unfortunately, I am having a problem running insert-select statements programmatically using the JDBC driver. They all have the form `insert into `mytable` select 1, 'foo', moreLiterals`. The statem...

Latest Reply
jneira
New Contributor III
  • 2 kudos

Thanks for the suggestion. Could you tell me more about how to check the logs in the cluster?

1 More Replies
Veronika
by New Contributor III
  • 667 Views
  • 2 replies
  • 2 kudos

Scalable ML with Apache Spark course, introductory video "Install the courseware": "Repos" section missing on Databricks platform

Hello, I'm a beginner on Databricks. I have a Community Edition account on the Databricks platform and a Partner account on the Databricks Academy platform. The problem is that I don't have the "Repos" section which I'm supposed to have, as stated in the free ...

Latest Reply
Veronika
New Contributor III
  • 2 kudos

Ok, thank you! What type of account is required to get access to "repos" for training purposes? Is it possible with any free account, or which one is necessary?

1 More Replies
hello_world
by New Contributor III
  • 2172 Views
  • 3 replies
  • 6 kudos

Resolved! What exactly is Z Ordering and Bloom Filter?

I have gone through the documentation and still cannot understand it. How is Bloom filter indexing a column different from Z-ordering a column? Can somebody explain to me what exactly happens when these two techniques are applied?

Latest Reply
Rishabh264
Honored Contributor II
  • 6 kudos

Hey @Daniel Sahal, 1. A Bloom filter index is a space-efficient data structure that enables data skipping on chosen columns, particularly for fields containing arbitrary text. Refer to this code snippet to create a Bloom filter index: CREATE BLOOMFILTER INDEX ON [TAB...
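
A minimal sketch of both techniques for comparison (table and column names are placeholders):

# Bloom filter index: a per-file structure that lets reads skip files that
# definitely do not contain a looked-up value (good for point lookups on text).
spark.sql("""
    CREATE BLOOMFILTER INDEX ON TABLE events
    FOR COLUMNS (device_id OPTIONS (fpp = 0.1, numItems = 1000000))
""")

# Z-ordering: rewrites files so rows with nearby values of the chosen columns
# are co-located, letting min/max file statistics prune more files on scans.
spark.sql("OPTIMIZE events ZORDER BY (event_date, device_id)")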

2 More Replies
164079
by Contributor II
  • 1355 Views
  • 2 replies
  • 2 kudos

Resolved! Allow read access to S3 buckets from one AWS account to other AWS accounts.

Dear team, we have several AWS accounts with S3 buckets. The Databricks setup is in our dev AWS account, and we would like to allow the instance profile to have read permission on all our S3 buckets in the other AWS accounts (without using bucket policy...

Latest Reply
User16255483290
Contributor
  • 2 kudos

Can you please share the IAM role policy in the secondary account (the bucket account)? Also, have you tried setting this config on the cluster? spark.hadoop.fs.s3a.bucket.<s3-bucket-name>.aws.credentials.provider org.apache.hadoop.fs.s3a.aut...
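
The reply above is truncated; as a rough sketch only, per-bucket s3a settings of that shape usually go in the cluster's Spark config like the lines below. The provider class and role ARN shown are assumptions for illustration, not taken from the reply:

spark.hadoop.fs.s3a.bucket.<s3-bucket-name>.aws.credentials.provider org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider
spark.hadoop.fs.s3a.bucket.<s3-bucket-name>.assumed.role.arn arn:aws:iam::<other-account-id>:role/<cross-account-read-role>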

1 More Replies
lawrence009
by Contributor
  • 667 Views
  • 2 replies
  • 3 kudos

Advice on efficiently cleansing and transforming a Delta table

I have a Delta table that is updated nightly using Auto Loader. After the merge, the job kicks off a second notebook to clean and rewrite certain values using a series of UPDATE statements, e.g., UPDATE TABLE foo SET field1 = some_value WHER...

Latest Reply
Jfoxyyc
Valued Contributor
  • 3 kudos

I would partition the table by some sort of date that autoloader can use. You could then filter your update further and it'll automatically use partition pruning and only scan related files.
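
A sketch of the idea (all names are illustrative): partition on an ingest date, then scope the UPDATE so Delta prunes to the affected partitions:

# One-time: rebuild the table partitioned by an ingest date.
spark.sql("""
    CREATE TABLE foo_partitioned
    USING DELTA
    PARTITIONED BY (ingest_date)
    AS SELECT * FROM foo
""")

# Nightly: the ingest_date predicate triggers partition pruning, so only
# yesterday's files are scanned and rewritten.
spark.sql("""
    UPDATE foo_partitioned
    SET field1 = 'some_value'
    WHERE ingest_date = date_sub(current_date(), 1)
      AND field1 IS NULL
""")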

1 More Replies
Jennifer_Lu
by New Contributor III
  • 719 Views
  • 1 replies
  • 3 kudos

How do I programmatically get the database name in a DLT notebook?

I have configured a database in the settings of my DLT pipeline. Is there a way to retrieve that value programmatically from within a notebook? I want to do something like spark.read.table(f"{database}.table")

Latest Reply
Jfoxyyc
Valued Contributor
  • 3 kudos

You could also set it as a config value (database:value) and then retrieve it in the notebook using spark.conf.get(). I'm hoping they update DLT to support UC and then allow us to set database/schema at the notebook level in @dlt.table(schema_name,...
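
A tiny sketch of that pattern (the configuration key is made up for illustration):

# Pipeline settings -> configuration: {"mypipeline.database": "analytics"}
database = spark.conf.get("mypipeline.database")

# Then read tables relative to the configured database.
df = spark.read.table(f"{database}.some_table")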
