Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Sujitha
by Community Manager
  • 1487 Views
  • 3 replies
  • 2 kudos

KB Feedback Discussion

In addition to the Databricks Community, we have a Support team that maintains a Knowledge Base (KB). The KB contains answers to common questions about Databricks, as well as information on optimisation and troubleshooting. These...

Latest Reply
martinez
New Contributor III
  • 2 kudos

Thanks for sharing!  

2 More Replies
VitorGhiotti
by New Contributor II
  • 844 Views
  • 2 replies
  • 1 kudos

Python error on install epmwebapi library

The error below occurred when trying to install the epmwebapi library. How do I fix this?

error.png
Latest Reply
Hemant
Valued Contributor II
  • 1 kudos

Hi @VitorGhiotti, I am able to install this package. Can you share your cluster configuration, and are you using a private endpoint?

1 More Replies
superanna
by New Contributor II
  • 602 Views
  • 1 replies
  • 1 kudos

Yes, still illegal. And I also don’t understand why it is equated with drugs, but alcohol is not! Not a single murder has yet been committed under can...

Yes, still illegal. And I also don’t understand why it is equated with drugs, but alcohol is not! Not a single murder has yet been committed under cannabis, not a single war has been unleashed. It's just that people who don't use don't understand how...

Latest Reply
Mz_Yvette
New Contributor II
  • 1 kudos

You are absolutely right! I have found it to be a big relief medically. I have a nerve condition that is not operable. The legal medical pills almost literally killed me, and if it wasn't for my husband's quick thinking, I wouldn't be here to share t...

Priyag1
by Honored Contributor II
  • 2320 Views
  • 4 replies
  • 4 kudos

Data preparation in Databricks

Good data is important to ensure accurate and useful results. To get good data, the following tasks must be done: Cleaning and formatting data - handling missing values or outliers, ensuring data is in the correct format, and...

Latest Reply
dplante
Contributor II
  • 4 kudos

Data governance and data lineage are other things to call out. Here's a cheat sheet that is also useful -> Data Preparation Cheatsheet

3 More Replies
gpierard
by New Contributor III
  • 490 Views
  • 1 replies
  • 0 kudos

Badge not received for Databricks Certified Data Engineer Associate

Hello, I passed the certification but haven't received a badge. In fact, I created my Databricks Academy account only after completing the test. Could you please ensure I receive that certification? Thanks

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @gpierard, thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

matty_f
by New Contributor II
  • 2854 Views
  • 1 replies
  • 0 kudos

Migration scripts for distribution, embedded in library

I'm working on a Python package that can be installed via pip. The package will manage a Delta table for the user, and new versions of the package may need to run migrations on this table. Is this an okay format to use? def migrate(table_path): mm_p...
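For readers weighing the same question, one common shape for package-embedded migrations is an ordered list of versioned steps. The sketch below is not the poster's code (which is cut off above). It is a minimal illustration in which the table is replaced by a plain dictionary, and every name (`MIGRATIONS`, `add_source_column`, `rename_ts_column`, the `version` key) is hypothetical. Where a real package would persist the version is left open.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical in-memory stand-in for the managed table's metadata.
# A real package would store the schema version somewhere durable.
TableState = Dict[str, Any]


def add_source_column(state: TableState) -> None:
    """Hypothetical migration 1: add a 'source' column."""
    state["columns"].append("source")


def rename_ts_column(state: TableState) -> None:
    """Hypothetical migration 2: rename 'ts' to 'event_time'."""
    state["columns"] = [
        "event_time" if col == "ts" else col for col in state["columns"]
    ]


# Each entry pairs a target version with the step that reaches it.
MIGRATIONS: List[Tuple[int, Callable[[TableState], None]]] = [
    (1, add_source_column),
    (2, rename_ts_column),
]


def migrate(state: TableState) -> List[int]:
    """Apply, in version order, every step newer than the stored version."""
    applied: List[int] = []
    for version, step in sorted(MIGRATIONS, key=lambda m: m[0]):
        if version > state["version"]:
            step(state)
            state["version"] = version  # record progress after each step
            applied.append(version)
    return applied
```

Because the version is updated after every step, calling `migrate` a second time on the same state applies nothing, which is the property that makes it safe to run on every package upgrade.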

Latest Reply
matty_f
New Contributor II
  • 0 kudos

Not much community happening here 

Dekova
by New Contributor II
  • 503 Views
  • 0 replies
  • 0 kudos

Structured Streaming & Workspace Job Limits

In "Advanced Data Engineering with Databricks", the section on Bronze Ingestion Patterns mentions that workspaces have a limit of 5000 jobs triggered in an hour. As a solution, it suggests multiplex streaming to a single bronze table and then using sub...

Screenshot 2023-07-14 at 9.49.43 PM.png
Data Engineering
structured streaming
brickster_2018
by Esteemed Contributor
  • 3278 Views
  • 2 replies
  • 3 kudos

Resolved! Can I install notebook scoped JAR/Maven libraries?

Notebook-scoped libraries are very handy. Is it possible to use the same mechanism for Maven JARs or application JARs as well?

Latest Reply
Pratik_Ghosh
New Contributor II
  • 3 kudos

Any further update on this topic?

1 More Replies
Ruby8376
by Valued Contributor
  • 448 Views
  • 0 replies
  • 0 kudos

Schema definition help in Scala notebook in Databricks

I am building a schema for an incoming Avro file (JSON message) and creating a final dataframe from it. The schema looks fine against the sample JSON message provided, but I am getting null values in all the fields. Can somebody look at this code and...

erigaud
by Honored Contributor
  • 8160 Views
  • 5 replies
  • 3 kudos

Resolved! Gracefully stop a job based on condition

Hello, I have a job with many tasks running on a schedule, and the first task checks a condition. Based on the condition, I want either to continue the job as normal or to stop right away and not run any of the other tasks. Is there a way to d...

Latest Reply
erigaud
Honored Contributor
  • 3 kudos

I think the best way to accomplish this would be to either propagate the check, as mentioned by @menotron, or have the initial task in another job and only run the second job if the condition is met. Obviously it depends on the use case. Thank you ...

4 More Replies
Ria
by New Contributor
  • 1962 Views
  • 4 replies
  • 0 kudos

How to build a master workflow for all the jobs present in Workflows using Databricks?

Suppose multiple jobs have been created using Databricks Workflows. The requirement now is to build one master workflow that triggers all of them depending on different conditions, e.g. some are supposed to trigger on a daily basis, some on mon...

Latest Reply
pvignesh92
Honored Contributor
  • 0 kudos

@Ria Hi, this feature was in development when I attended the last quarterly roadmap session, and I thought it was available in the latest versions or possibly in Private Preview. You can check with your Databricks Solution Architect. Even if not now, could be ...

3 More Replies
s041507
by New Contributor
  • 460 Views
  • 0 replies
  • 0 kudos

Autoloader cannot load files from Repos with runtime 13.0+

Since runtime 13.0+ it is no longer possible to reference Repos files with Autoloader using the "file:" prefix, e.g. "file:/Workspace/Repos/...". This was working before, but now Autoloader throws an error: com.databricks.sql.cloudfiles.errors.Cloud...

Data Engineering
autoloader
Repos
Magnus
by Contributor
  • 1743 Views
  • 3 replies
  • 1 kudos

Auto Loader fails when reading json element containing space

I'm using Auto Loader as part of a Delta Live Tables pipeline to ingest JSON files, and today it failed with this error message: com.databricks.sql.transaction.tahoe.DeltaAnalysisException: Found invalid character(s) among ' ,;{}()\n\t=' in the column ...

Data Engineering
Auto Loader
Delta Live Tables
Latest Reply
Tharun-Kumar
Honored Contributor II
  • 1 kudos

@Magnus You can read the input file using Pandas or Koalas (https://koalas.readthedocs.io/en/latest/index.html), then rename the columns, then convert the Pandas/Koalas dataframe to a Spark dataframe. You can write it back with the correct column name, so ...
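The renaming step in this suggestion can be sketched independently of any dataframe library. The helper below is a minimal, hypothetical example (the names `sanitize_column` and `sanitize_columns` do not come from the thread). It replaces each character listed in the error message with an underscore, and the resulting names could then be applied with whichever rename mechanism the dataframe in use provides.

```python
import re
from typing import List

# Characters named in the error message: space , ; { } ( ) newline tab =
_INVALID_CHARS = re.compile(r"[ ,;{}()\n\t=]")


def sanitize_column(name: str, replacement: str = "_") -> str:
    """Replace every character the error message lists as invalid."""
    return _INVALID_CHARS.sub(replacement, name)


def sanitize_columns(names: List[str]) -> List[str]:
    """Sanitize a list of column names, preserving their order."""
    return [sanitize_column(name) for name in names]
```

Note that two different source names can collapse to the same sanitized name (for example "a b" and "a;b"), so a real pipeline would want to check the result for duplicates before writing.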

2 More Replies
207474
by New Contributor
  • 1267 Views
  • 3 replies
  • 2 kudos

How do I get the total number of queries run per day on a Databricks SQL warehouse/endpoint?

I am trying to access the API: GET https://<databricks-instance>.cloud.databricks.com/api/2.0/sql/history/queries
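Once the query records have been fetched from that endpoint, the per-day total is a simple aggregation. The sketch below assumes each record is a dictionary carrying a `query_start_time_ms` field (epoch milliseconds). That field name and the record shape are assumptions about the response, and authentication, filtering by warehouse, and pagination are deliberately left out. The function name `queries_per_day` is hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Dict, Iterable, Mapping


def queries_per_day(records: Iterable[Mapping[str, int]]) -> Dict[str, int]:
    """Count query records per UTC calendar day.

    Assumes each record has a 'query_start_time_ms' key holding epoch
    milliseconds; adjust the key if the actual response differs.
    """
    counts: Counter = Counter()
    for record in records:
        started = datetime.fromtimestamp(
            record["query_start_time_ms"] / 1000, tz=timezone.utc
        )
        counts[started.date().isoformat()] += 1
    return dict(counts)
```

Days are bucketed in UTC here. If the counts should follow a local business day instead, the timestamp would need converting to that time zone before taking the date.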

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hi there @Sravan Burla, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you...

2 More Replies
THIAM_HUATTAN
by Valued Contributor
  • 689 Views
  • 0 replies
  • 0 kudos

Delta Live Tables Example Questions

I am testing some Delta Live Tables examples from https://github.com/databricks/delta-live-tables-notebooks/tree/main/divvy-bike-demo and have run all the relevant ingestion files: python-weatherinfo-api-ingest.py, python-divvybike-api-ingest-st...

THIAM_HUATTAN_0-1689301274037.png