Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I am trying to sign up for the Community Edition (https://databricks.com/try-databricks) for use with a Databricks Academy course. However, I am unable to sign up, and I receive the following error (image attached). On going to the login page (link in ora...
I have a sample set of Power BI (.pbix) reports with all the dropdowns, tables, filters, etc. Now I would like to migrate these reports to Databricks. Whatever visuals were created in Power BI, I would like to create the same in Databricks from scratch. I wou...
First, make sure the Databricks cluster is operational. These are the steps needed to integrate Azure Databricks with Power BI Desktop. 1. Construct the URL for the connection: connect to the cluster, and click the Advanced...
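As an illustration, here is a minimal sketch of the two values the Power BI connector asks for; the hostname and HTTP path below are hypothetical placeholders, and the real ones are found under the cluster's Advanced Options > JDBC/ODBC tab:

# Values copied from the cluster's JDBC/ODBC tab (placeholders shown).
server_hostname = "adb-1234567890123456.7.azuredatabricks.net"
http_path = "sql/protocolv1/o/1234567890123456/0123-456789-abcde1"

# Power BI's Azure Databricks connector prompts for exactly these two strings.
print(server_hostname, http_path)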
@Panda To be precise, the Databricks REST API does exist, but it is not reachable from everywhere (REST stands for Representational State Transfer, not "Ready Everywhere"). You cannot connect to the API of workspace 1 from a notebook in workspace 2 when there is no network path between them. Workspace 1 cannot resolve the hostname for Works...
Access to the Databricks APIs requires the user to authenticate. This usually means creating a PAT (Personal Access Token). Conveniently, a token is readily available to you when you are using a Databricks notebook.

databricksURL = dbutils.notebook....
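Since the snippet above is cut off, here is a minimal sketch of the common pattern for pulling the API URL and an ephemeral token out of the notebook context; note that the entry_point/getContext() chain is an internal, undocumented interface, so treat it as an assumption that may change between runtime versions:

# Grab the API URL and an ephemeral token from the running notebook's context.
# entry_point and getContext() are internal Databricks interfaces.
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
databricksURL = ctx.apiUrl().getOrElse(None)
databricksToken = ctx.apiToken().getOrElse(None)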
@User16752245312 You can use a Databricks secret scope to manage sensitive data such as personal access tokens (PATs) securely. Storing your token in a secret scope ensures you don't hard-code credentials in your notebook, making it more secure. For mo...
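A minimal sketch of reading a token back out of a secret scope; the scope and key names here are hypothetical placeholders:

# Fetch a PAT stored in a secret scope instead of hard-coding it.
token = dbutils.secrets.get(scope="my-scope", key="databricks-pat")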
Is there a way to get the directory size in ADLS (Gen2) using dbutils in Databricks?
If I run this
dbutils.fs.ls("/mnt/abc/xyz")
I get the file sizes inside the xyz folder (there are about 5,000 files). I want to get the size of the xyz folder itself.
how ca...
File size is only reported for files, so if you specify a directory as your source, you have to iterate through it. The snippet below should work (and should be faster than the other solutions).

import glob
def get_directory_size_in_byt...
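Since the original snippet is cut off, here is a minimal sketch of the same idea using dbutils.fs.ls instead of glob (my assumption: dbutils works against ADLS mounts, where glob on the local filesystem generally does not):

# Recursively sum file sizes under a DBFS/ADLS path.
# FileInfo objects returned by dbutils.fs.ls expose .path, .size, and .isDir().
def get_directory_size_in_bytes(path):
    total = 0
    for item in dbutils.fs.ls(path):
        if item.isDir() and item.path != path:
            total += get_directory_size_in_bytes(item.path)
        else:
            total += item.size
    return total

print(get_directory_size_in_bytes("/mnt/abc/xyz"))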
I want to add custom logs that are redirected to the Spark driver logs. Can I use the existing logger classes to get my application logs or progress messages into the Spark driver logs?
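One common pattern, sketched below under the assumption that you are in a PySpark notebook where sc (the SparkContext) is available: grab the JVM-side log4j logger so your messages land in the driver log alongside Spark's own output. The logger name "my_app" is a placeholder.

# Obtain the JVM log4j logger via py4j; messages appear in the driver log.
log4j = sc._jvm.org.apache.log4j
logger = log4j.LogManager.getLogger("my_app")
logger.info("Pipeline stage 1 complete")
logger.warn("Row count lower than expected")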
1) Is it possible to save all the custom logging to its own file? Currently it is being logged together with all the other cluster logs (see image). 2) Also, it seems like Databricks is creating a lot of blank files for this. Is this a bug? This include...
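On question 1, a minimal sketch using Python's standard logging module to send your messages to a dedicated file. The path and logger name are hypothetical; I write to local /tmp first, since appending to /dbfs paths through the local file API can be unreliable, and copy the file out at the end of the job:

import logging

# Route application messages to a dedicated local file on the driver.
logger = logging.getLogger("my_app")
logger.setLevel(logging.INFO)
handler = logging.FileHandler("/tmp/my_app.log")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)
logger.info("This goes to /tmp/my_app.log only")

# Optionally persist the file to DBFS when the job finishes:
# dbutils.fs.cp("file:/tmp/my_app.log", "dbfs:/logs/my_app.log")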
Hello, I am doing the Data Science and Machine Learning course.
The Boston housing dataset has unintuitive column names. I want to rename them, e.g. so 'zn' becomes 'Zoning'.
When I run this command:
df_bostonLegible = df_boston.rename({'zn':'Zoning'}, axi...
If df_boston is a DataFrame but you still face issues, try an alternative syntax: df_boston = df_boston.rename(columns={'zn': 'Zoning'}). Make sure df_boston is a proper pandas DataFrame and that you're using a recent version of pandas.
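A minimal self-contained illustration of that syntax; the column values below are made up for the example:

import pandas as pd

# Two of the Boston housing columns with placeholder values.
df_boston = pd.DataFrame({"zn": [18.0, 0.0], "crim": [0.006, 0.027]})
df_bostonLegible = df_boston.rename(columns={"zn": "Zoning"})
print(df_bostonLegible.columns.tolist())  # ['Zoning', 'crim']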
If I want to run a TPC-DS test on Databricks, what are the steps involved? Is the data already available on the Databricks file system, or do I have to download or generate it from somewhere?
Hey there, User16776431030. Great question about those magic commands in Databricks! Let me shed some light on this mystical matter. The %pip and %sh pip commands may seem similar on the surface, but they're quite distinct in their powers. %sh pip is l...
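To make the distinction concrete, a minimal sketch (the package name is arbitrary); remember that a magic command must be the first line of its notebook cell:

%pip install nltk
# installs into the notebook's Python environment and is propagated
# to every node of the cluster for this notebook session

%sh pip install nltk
# runs pip as a plain shell command on the driver node only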
Yes, it is possible to connect Databricks to a kerberized HBase cluster. The attached article explains the steps: setting up a Kerberos client using a keytab on the cluster nodes, installing the hbase-spark integration library, and set...
When I am trying to read a Snowflake table from my Databricks notebook, it is giving an error. The code:

df1.read.format("snowflake") \
    .options(**options) \
    .option("query", "select * from abc") \
    .save()

Getting below error:

java.sql.SQLException: No suitable dri...
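As an aside, independent of the driver error, two things in that snippet look off: reads normally go through spark.read and end with .load(), not .save(). A minimal sketch of the usual Snowflake connector read pattern, assuming an options dict with sfUrl, sfUser, etc. is defined elsewhere:

# Read via the Snowflake Spark connector; .load() executes the query.
df1 = (spark.read.format("snowflake")
       .options(**options)
       .option("query", "select * from abc")
       .load())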
Is this random number not possible to extract from the notebook context? It is available in browser_hash, but that is not populated when running a job. Is this random number static, or does it change over time? If it is static, it can then be hardco...
What are the steps needed to connect to a DB2 AS/400 source to pull data to the lake using Databricks? I believe it requires establishing a JDBC connection, but I could not find much detail online.
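For what it's worth, a minimal sketch of such a JDBC read, assuming the IBM Toolbox for Java (jt400) driver is installed on the cluster; the host, library, table, user, and secret names are all hypothetical:

# Read a DB2 for i (AS/400) table over JDBC with the jt400 driver.
df = (spark.read.format("jdbc")
      .option("url", "jdbc:as400://my-as400-host")
      .option("driver", "com.ibm.as400.access.AS400JDBCDriver")
      .option("dbtable", "MYLIB.MYTABLE")
      .option("user", "my_user")
      .option("password", dbutils.secrets.get("my-scope", "as400-password"))
      .load())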
Hi @Ajay Menon, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!
@keenan_jones7 I had the same problem today. It looks like you've copied and pasted the JSON that Databricks displays in the GUI when you select View JSON from the dropdown menu when viewing a job.In order to use that JSON in a request to the Jobs ...
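To illustrate the difference, a minimal sketch of a create call against the Jobs 2.1 API; the host, token, notebook path, and cluster spec below are hypothetical. The UI's View JSON output wraps these settings under a "settings" key and adds read-only fields such as job_id that don't belong in a create request, so send the bare settings object instead:

import requests

host = "https://<workspace-host>"   # hypothetical workspace URL
token = "<pat>"                     # hypothetical personal access token

resp = requests.post(
    f"{host}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "name": "example-job",
        "tasks": [{
            "task_key": "main",
            "notebook_task": {"notebook_path": "/Users/me/my_notebook"},
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 1,
            },
        }],
    },
)
resp.raise_for_status()
print(resp.json())   # {"job_id": ...}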