Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Hi All, I am facing a performance issue with one of my PySpark UDFs that posts data to a REST API (which uses a Cosmos DB backend to store the data). Please find the details below: # The Spark DataFrame (df) contains about 30-40k records. # I am using pyt...
Hi @Sanjoy Sen, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback w...
If I were to stop a rather large job run, say halfway through execution, will any actions performed on our Delta tables persist or will they be rolled back? Are there any other risks that I need to be aware of in terms of cancelling a job run half way t...
Is it possible to attach a notebook to a cluster and run it via the REST API? The closest approach I have found is to run a notebook, export the results (HTML!) and import it into the workspace again, but this does not allow us to retain the original ex...
I'm looking for a way to programmatically copy a notebook in Databricks using the workspace/export and workspace/import APIs. Once the notebook is copied, I want to automatically attach it to a specific cluster using its cluster ID. The challenge is ...
Hello guys. I use pyspark in my daily life. A demand has arisen to collect information in Jira. I was able to do this via Talend ESB, but I wouldn't want to use different tools to get the job done. Do you have any example of how to extract data from ...
Hi, there is also a new Databricks for Jira add-on on the Atlassian Marketplace. It is easy to set up, and exports are created directly within Jira. They can be one-time, scheduled, or real-time. It can also export additional Jira data such as Assets, C...
Databricks Repos best practices recommend using the Repos REST API to update a repo via your git provider. The REST API requires authentication, which can be done in one of two ways: (1) a user / personal access token; (2) a service principal access token. Using a u...
Having the exact same problem. Did you find a solution, @michael_mehrten? In my case I'm using a managed identity, so the solution some topics suggest, generating an access token from an Entra ID service principal, is not applicable.
Hi, everyone. I just recently started using Databricks on Azure, so my question is probably very basic, but I am really stuck right now. I need to capture some streaming metrics (number of input rows and their time), so I tried using the Spark REST API ...
Hi @Roberto Baldrez, if you think that @Gaurav Rupnar solved your question, then please select it as the best response so it can be moved to the top of the topic and help more users in the future. Thank you
I am looking to create a basic virtual assistant (AI) that implements machine learning mechanisms. I have some basic knowledge of Python and I have seen some courses on the internet (YouTube in particular) that look very interesting. But for the moment...
Hey everyone! I'm clearly excited about this topic since I'm a huge fan of AI assistants and machine learning. MissyT, creating a basic virtual assistant with machine learning capabilities is an excellent idea! With your simple knowledge of Python and...
Take advantage of the Data + AI Summit Virtual Experience next week! Data + AI Summit is just a few days away! With data professionals from 155+ countries already registered, this is truly the premier event for the global data, analytics and AI commun...
The highly anticipated Data+AI summit is just around the corner. Are you curious about redefining your data governance strategy in the era of LLMs and generative AI? Look no further! This year's Data+AI summit offers a dynamic lineup of keynotes, dem...
Hi everyone! We’re excited to gather everyone for Data + AI Summit 2023, the premier AI and LLM global event for the data, analytics and AI community. Join thousands of data engineers, data scientists and data analyst experts virtually from June 29–3...
Attention Community! For a limited period, we are offering a generous 50% discount on training at the Data + AI Summit. Simply apply the code FLS4vop5ep during the registration process. Hurry, though, as this offer will expire on June 12, 2023. Don'...
A data engineer, User A, has promoted a new pipeline to production by using the REST API to programmatically create several jobs. A DataOps engineer, User B, has configured an external orchestration tool to trigger job runs through the REST API. Both...
@Ajay Pandey I really appreciate your efforts, and you are right in terms of the UI, but when we look carefully at the question, we find: Which statement describes the contents of the workspace audit logs concerning these events? Audit logs are generated and...
I am trying to run some API calls against the account console. I tried every possible syntax, encoded and decoded, to get the call to succeed, but it returns a "user not authenticated" error. When I tried with the Account admin, however, it worked. I need t...
We're excited to announce the first four winners of our Raffle, and we want to thank everyone who has participated so far. If you haven't yet entered, don't worry! We still have four more tickets to give away for the world's largest Data + AI summit...
Hello Everyone, I am thrilled to announce that we have our 4th winner for the raffle contest - @MUHAMMET EMIN KOSEOGLU. Please join me in congratulating him on this remarkable achievement! Your dedication and hard work have paid off, and we are del...