Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

HariharaSam
by Contributor
  • 132062 Views
  • 6 replies
  • 3 kudos

Resolved! Alter Delta table column datatype

Hi, I have a Delta table that contains data, and I need to alter the datatype of a particular column. For example: consider a table named A with a column Amount of datatype Decimal(9,4). I need to alter the Amount column's datatype from...

Latest Reply
saipujari_spark
Databricks Employee
  • 3 kudos

Hi @HariharaSam, the following documentation explains how to alter a Delta table schema: https://docs.databricks.com/delta/update-schema.html
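As a minimal sketch of what the linked docs describe: Delta Lake does not let you change a column's datatype in place with ALTER TABLE (only comments, nullability, and certain widenings), so a common workaround is to read the table, cast the column, and overwrite the table with the new schema. The table and column names below come from the post; the target precision Decimal(18,4) is an assumption, and the helper only assembles the equivalent Spark SQL for illustration.

```python
# Sketch only: assembles the Spark SQL for the cast-and-overwrite workaround.
# Note: SELECT * EXCEPT moves the recast column to the end of the schema.
def retype_column_sql(table: str, column: str, new_type: str) -> str:
    # CREATE OR REPLACE TABLE rewrites the Delta table with the new schema
    # while keeping earlier versions available via time travel.
    return (
        f"CREATE OR REPLACE TABLE {table} AS "
        f"SELECT * EXCEPT ({column}), CAST({column} AS {new_type}) AS {column} "
        f"FROM {table}"
    )

sql = retype_column_sql("A", "Amount", "DECIMAL(18,4)")
print(sql)
```

Running the generated statement in a Databricks SQL cell performs the overwrite in one step; the DataFrame equivalent is a read, withColumn cast, and write with overwriteSchema set to true.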

5 More Replies
Data_Engineer_3
by New Contributor III
  • 23000 Views
  • 12 replies
  • 4 kudos

FileNotFoundError: [Errno 2] No such file or directory: '/FileStore/tables/flight_data.zip', although the file exists at the location mentioned

I am new to learning Spark and working on some practice. I have uploaded a zip file to the DBFS /FileStore/tables directory and am trying to run Python code to unzip the file. The Python code is: from zipfile import * with ZipFile("/FileStore/tables/fli...
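A common cause of this Errno 2: Python's zipfile works with local-file paths, so a DBFS file must be addressed through the /dbfs FUSE mount (e.g. "/dbfs/FileStore/tables/flight_data.zip"), not the bare "/FileStore/tables/..." path. The snippet below is a self-contained local demo of the same extract pattern; the zip contents are made up for illustration.

```python
import os
import tempfile
from zipfile import ZipFile

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for "/dbfs/FileStore/tables/flight_data.zip" on Databricks.
    zip_path = os.path.join(tmp, "flight_data.zip")
    with ZipFile(zip_path, "w") as zf:
        zf.writestr("flights.csv", "origin,dest\nJFK,LAX\n")

    out_dir = os.path.join(tmp, "unzipped")
    with ZipFile(zip_path, "r") as zf:
        zf.extractall(out_dir)  # same call pattern as in the post

    extracted = os.listdir(out_dir)
    print(extracted)
```

On a cluster, the only change needed is pointing zip_path at the /dbfs-prefixed location of the uploaded file.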

Latest Reply
883022
New Contributor II
  • 4 kudos

What if changing the runtime is not an option? I'm experiencing a similar issue using the following: %pip install -r /dbfs/path/to/file.txt This worked for a while, but now I'm getting the Errno 2 mentioned above. I am still able to print the same file...

11 More Replies
Murthy1
by Contributor II
  • 3909 Views
  • 2 replies
  • 0 kudos

Terraform - Install egg file from S3

I am looking to install Python egg files on all my clusters. The egg file is located in an S3 location. I tried the following code, which didn't work: resource "databricks_dbfs_file" "app" { source = "${S3_Path}/foo.egg" path = "/FileStore...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Murthy1, does @Retired_mod's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

1 More Replies
scvbelle
by New Contributor III
  • 3743 Views
  • 2 replies
  • 0 kudos

Resolved! Optimising the creation of a change log for transactional sources in an ETL pipeline

I have multiple transactional sources feeding into my Azure Databricks environment (MySQL, MSSQL, MySQL data dumps) for a client's company-wide data lake. They are basically all "managed sources" (using different DBMSs, receiving application dumps, etc.), but I d...

Data Engineering
azure
Change Capture
ETL
JDBC
Workflows
Latest Reply
scvbelle
New Contributor III
  • 0 kudos

This is the implementation of the original post, for interest (excluded from the original post to reduce length): # some blacklisting and schema renamings where necessary (due to conflicts and/or illegal characters) SourceName = str SchemaName = str TableN...

1 More Replies
bchaubey
by Contributor II
  • 8359 Views
  • 5 replies
  • 0 kudos

Resolved! How to process all Azure storage file from Databricks

Hi, I want to process all the files in my Azure storage using Databricks. What is the process?
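As a hedged sketch of the usual approach: on Databricks you would list the container with dbutils.fs.ls over an abfss:// path (or read everything at once with a wildcard path in spark.read), then loop over the listing. The recursive walk below mirrors that loop using a local directory as a stand-in for the storage container; all paths and file names are hypothetical.

```python
import os
import tempfile

def list_all_files(root: str) -> list:
    # Recursively collect every file path under root, analogous to walking
    # a storage container directory by directory with dbutils.fs.ls.
    files = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            files.append(os.path.join(dirpath, name))
    return sorted(files)

with tempfile.TemporaryDirectory() as tmp:
    # Fabricate a small partitioned layout as a stand-in for the container.
    os.makedirs(os.path.join(tmp, "2023", "07"))
    for rel in ["2023/07/a.csv", "2023/07/b.csv", "top.csv"]:
        with open(os.path.join(tmp, rel), "w") as f:
            f.write("x\n")

    found = list_all_files(tmp)
    print(len(found))
```

In a real workspace, each discovered path would then be fed to spark.read (or the whole prefix read in one call with a glob such as ".../2023/*/*.csv").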

Latest Reply
bchaubey
Contributor II
  • 0 kudos

Could you please provide the code for my scenario?

4 More Replies
Govind_PS
by New Contributor II
  • 3263 Views
  • 3 replies
  • 2 kudos

Resolved! Databricks Certified Data Engineer Associate Certificate or Badge not received

I cleared the Databricks Certified Data Engineer Associate exam on 14th July 2023 at 14:30 hrs. I received the email stating that I cleared the exam, but I haven't received the certificate and badge yet. Attaching the screenshot here. Thanks...

Latest Reply
APadmanabhan
Databricks Employee
  • 2 kudos

Hi @Govind_PS, could you please let us know if you've received the credential? If not, please share your Webassessor email address.

2 More Replies
AW
by New Contributor III
  • 14995 Views
  • 3 replies
  • 8 kudos

Resolved! Creating a service principal with admin role on account level in Azure Databricks using Terraform

Dear Community, in the GUI I can grant the admin role to a service principal with a simple switch. How can I achieve the same in Terraform? Do you have some code examples?

Latest Reply
AW
New Contributor III
  • 8 kudos

Dear @Pat Sienkiewicz, works perfectly! It would be so easy if the documentation were better... Rg, Adrian

2 More Replies
missyT
by New Contributor III
  • 3285 Views
  • 3 replies
  • 4 kudos

Resolved! AI assistant and machine Learning

I am looking to create a basic virtual assistant (AI) that implements machine learning mechanisms. I have some basic knowledge of Python, and I have seen some courses on the internet (YouTube in particular) that look very interesting. But for the moment...

Latest Reply
valeryuaba
New Contributor III
  • 4 kudos

Hey everyone! I'm clearly excited about this topic since I'm a huge fan of AI assistants and machine learning. MissyT, creating a basic virtual assistant with machine learning capabilities is an excellent idea! With your basic knowledge of Python and...

2 More Replies
Data4
by New Contributor II
  • 4661 Views
  • 1 reply
  • 5 kudos

Resolved! Load multiple delta tables at once from Sql server

What's the best way to efficiently move multiple SQL tables in parallel into Delta tables?

Latest Reply
Tharun-Kumar
Databricks Employee
  • 5 kudos

@Data4 To enable parallel read and write operations, the ThreadPool functionality can be leveraged. This involves specifying a list of tables to read, then creating a method that reads each table from the JDBC source and saves t...
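The reply above can be sketched as follows. On Databricks, load_table would call spark.read.jdbc(...) and write the result out as a Delta table; here it is a stub so the parallel fan-out itself can be demonstrated, and the table names are hypothetical.

```python
from multiprocessing.pool import ThreadPool

tables = ["customers", "orders", "payments", "shipments"]

def load_table(name: str) -> str:
    # Placeholder for: spark.read.jdbc(url, name, properties=...)
    #                  .write.format("delta").saveAsTable(name)
    return f"loaded {name}"

# Threads (not processes) fit here: each call mostly waits on the JDBC source
# and on Spark, so the GIL is not a bottleneck and Spark sessions are shared.
with ThreadPool(processes=4) as pool:
    results = pool.map(load_table, tables)  # preserves input order

print(results)
```

The pool size caps how many tables hit the source database concurrently, which is worth tuning against the SQL server's connection limits.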

BriceBuso
by Contributor II
  • 7008 Views
  • 3 replies
  • 3 kudos

Run a multiple %command in the same cell

Hello, is there a way to run multiple %commands in the same cell? I heard that's not possible, but I would like confirmation, and maybe it could be an idea for future updates. Moreover, is there a way to mask the output of cells (especially markdown) w...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @BriceBuso, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks...

2 More Replies
bchaubey
by Contributor II
  • 5712 Views
  • 4 replies
  • 3 kudos

Data Pull from S3

I have some files in S3 that I want to process through Databricks. How is that possible? Could you please help me with this?

Latest Reply
dream
Contributor
  • 3 kudos

access_key = dbutils.secrets.get(scope="aws", key="aws-access-key")
secret_key = dbutils.secrets.get(scope="aws", key="aws-secret-key")
encoded_secret_key = secret_key.replace("/", "%2F")
aws_bucket_name = "<aws-bucket-name>"
mount_name = "<m...

3 More Replies
baatchus
by New Contributor III
  • 7225 Views
  • 3 replies
  • 1 kudos

Deduplication, Bronze (raw) or Silver (enriched)

Need some help choosing where to do deduplication of data. I have sensor data in blob storage that I'm picking up with Databricks Autoloader. The data and files can have duplicates in them. Which of the 2 options do I choose? Option 1: Cre...
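For reference, deduplicating once on ingest (so every downstream table inherits clean data) is commonly done with dropDuplicates on the key columns in the Auto Loader stream. The plain-Python version below shows the keep-latest-per-key logic with hypothetical sensor readings; it is a sketch of the idea, not the Spark API.

```python
# Hypothetical sensor rows: s1 appears three times (one exact duplicate,
# one later reading), s2 once.
readings = [
    {"sensor_id": "s1", "ts": 1, "value": 10.0},
    {"sensor_id": "s1", "ts": 1, "value": 10.0},   # exact duplicate
    {"sensor_id": "s2", "ts": 5, "value": 7.5},
    {"sensor_id": "s1", "ts": 3, "value": 11.2},   # later reading for s1
]

latest = {}
for row in readings:
    key = row["sensor_id"]
    # Keep only the newest record per sensor, dropping duplicates and stale rows.
    if key not in latest or row["ts"] > latest[key]["ts"]:
        latest[key] = row

deduped = sorted(latest.values(), key=lambda r: r["sensor_id"])
print(deduped)
```

In Spark this collapses to dropDuplicates(["sensor_id", "ts"]) for exact duplicates, or a window over sensor_id ordered by ts for keep-latest semantics.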

Latest Reply
Tharun-Kumar
Databricks Employee
  • 1 kudos

@peter_mcnally You can use a watermark to pick up late records and send only the latest records to the bronze table. This ensures you always have the latest information in your bronze table. This feature is explained in detail here - https://w...

2 More Replies
anastassia_kor1
by New Contributor
  • 6986 Views
  • 2 replies
  • 1 kudos

Error "Distributed package doesn't have nccl built in" with Transformers Library.

I am trying to run a simple training script using HF's transformers library and am running into the error `Distributed package doesn't have nccl built in`. Runtime: DBR 13.0 ML - Spark 3.4.0 - Scala 2.12. Driver: i3.xlarge - 4 cores. Note: This is a...

Latest Reply
patputnam-db
Databricks Employee
  • 1 kudos

Hi @anastassia_kor1, for CPU-only training, TrainingArguments has a no_cuda flag that should be set. For transformers==4.26.1 (MLR 13.0) and transformers==4.28.1 (MLR 13.1), there's an additional xpu_backend argument that needs to be set as well. Try u...

1 More Replies
mshettar
by New Contributor II
  • 3784 Views
  • 2 replies
  • 0 kudos

Databricks CLI's workspace export_dir command adds unnecessary edits despite not making any change in the workspace

The databricks workspace export_dir / export command with the overwrite option enabled adds non-existent changes in the target directory: 1. it introduces newline deletions, and 2. additions/deletions of MAGIC comments, despite not making any meaningful changes in th...

Latest Reply
RyanHager
Contributor
  • 0 kudos

I am encountering this issue as well, and it did not happen previously. Additionally, you see this pattern if you are using Repos internally and make a change to a notebook in another section.

1 More Replies
