Hello, I am getting the following error when trying to copy data to Databricks from ADLS with SQL, using a SAS token: Failure to initialize configuration for storage account <storage account>: Invalid configuration value detected for fs.azure.acco...
I'm trying to find the best strategy for handling big data sets. In this case I have a data set of 450 million records. I'm pulling the data from SQL Server very quickly, but when I try to push the data to the Delta table or an Azure container the...
I am currently running dark mode for my Databricks notebooks, using the "new UI" released a few days ago (May 2023) and the "New notebook editor." Currently all plots (like matplotlib) show the wrong colors. For example, denoting:```... p...
We're trying to send email through Amazon SES with boto3.client in Python. We've added SES full access to the cluster's IAM role. We were able to send email in "No isolation shared" mode in DBR 11.2 using ses = boto3.client('ses', region_name='us-****-2') s...
This appears to be an intentional design choice to prevent users from using the credentials of the host machine to carry out arbitrary AWS API calls. I really wish there was a workaround or setting to disable this behavior because we put a lot of wor...
Hi all, I have a problem with reading responses generated by Unity Catalog API 2.1, as they are missing fields that are otherwise described in the specification: List functions - The fields routine_dependencies, return_params, and input_params are missi...
Hi @DatabricksHero,
view_dependencies
View dependencies (when table_type == VIEW or MATERIALIZED_VIEW, STREAMING_TABLE)
when DependencyList is None, the dependency is not provided; when DependencyList is an empty list, the dependency is provided b...
In our case, we have multiple sources writing to the same target table. A target table can be populated from multiple source tables, each contributing a set of fields. How can we add or update columns in a target table from multiple sources? In a delta live...
I have a DLT table in schema A that is loaded by a DLT pipeline. I want to move the table from schema A to schema B and repoint my existing DLT pipeline to the table in schema B. I also need to avoid a full reload in the DLT pipeline on the table in schema B....
@a_t_h_i Our engineers are actively working on this feature. The plan is that you change the schema name in the DLT pipeline settings, and DLT then moves the managed DLT table to the other schema.
Hello,
forecast_date = '2017-12-01'
spark.conf.set('spark.sql.shuffle.partitions', 500)
# generate forecast for this data
forecasts = (
history
.where(history.date < forecast_date) # limit training data to prior to our forecast date
.groupBy...
@Mr_K applyInPandas is a higher-order function in Python. As of now, we do not support higher-order functions in Unity Catalog. We do support direct calls to Python UDFs. Here is an example of how to reference UDFs in UC - https://docs.databrick...
I have a JSON data set that contains a price in a string like "USD 5.00". I'd like to convert the numeric portion to a Double to use in an MLlib LabeledPoint, and have managed to split the price string into an array of strings. The below creates a data...
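For the parsing step itself, here is a minimal sketch in plain Python rather than Spark. It assumes every price looks like a three-letter currency code, whitespace, then a decimal number, as in "USD 5.00". The helper name `parse_price` is hypothetical and not taken from the original post.

```python
import re
from typing import Optional, Tuple

# Assumed format: three letters, whitespace, then a decimal number ("USD 5.00").
_PRICE_RE = re.compile(r"^\s*([A-Za-z]{3})\s+(-?\d+(?:\.\d+)?)\s*$")


def parse_price(text: str) -> Optional[Tuple[str, float]]:
    """Split a price string into (currency, amount); return None if it does not match."""
    match = _PRICE_RE.match(text)
    if match is None:
        return None
    currency, amount = match.groups()
    return currency.upper(), float(amount)


print(parse_price("USD 5.00"))  # ('USD', 5.0)
print(parse_price("free"))      # None
```

Returning None for unparseable strings keeps bad rows visible so they can be filtered out before building feature vectors, rather than failing the whole job.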
Hi, when I try to use ipywidgets, it returns the following error. I'm using Databricks with PrivateLink enabled on AWS, and the Runtime version is 12.2 LTS. Is there something I need in order to use ipywidgets in my environment?
Hi @NCat, The error message "uncaught reference error: require is not defined" indicates that the require function is not defined in the current scope. This error can occur when using Databricks Connect with a version of Node.js that does not support...
We have our BI facts and dimensions built as Delta tables in our Databricks environment, and they are used for reporting by connecting Power BI reports through the Databricks connection. We now have a need to use this data for another application utilizing SSRS repor...
I am curious what is going on under the hood when using the `multiprocessing` module to parallelize a function call and apply it to a Pandas DataFrame along the row axis. Specifically, how does it work with the Databricks architecture / compute? My cluster ...
@Keval Shah: When using the multiprocessing module in Python to parallelize a function call and apply it to a Pandas DataFrame along the row axis, the following happens under the hood: The Pool object is created with the specified number of processes...
Hi, I am trying to read a CSV file in which one column has double quotes, like below.
James,Butt,"Benton, John B Jr",6649 N Blue Gum St
Josephine,Darakjy,"Chanay, Jeffrey A Esq",4 B Blue Ridge Blvd
Art,Venere,"Chemel, James L Cpa",8 W Cerritos Ave #54...
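As a quick sanity check on the data itself, the sketch below parses the first two sample rows with Python's standard `csv` module, treating `"` as the quote character. This only illustrates how a quote-aware parser keeps the embedded comma inside one field. It is not the Spark reader from the question, although Spark's CSV reader has analogous `quote` and `escape` options.

```python
import csv
import io

# First two sample rows from the post; the third is cut off in the original.
sample = (
    'James,Butt,"Benton, John B Jr",6649 N Blue Gum St\n'
    'Josephine,Darakjy,"Chanay, Jeffrey A Esq",4 B Blue Ridge Blvd\n'
)

# quotechar='"' tells the parser that commas inside double quotes are data.
rows = list(csv.reader(io.StringIO(sample), delimiter=",", quotechar='"'))

for row in rows:
    print(len(row), row[2])
# 4 Benton, John B Jr
# 4 Chanay, Jeffrey A Esq
```

If each row comes back with four fields here but not in the real pipeline, the difference is likely in the reader options rather than in the file.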
Hi Team, I am also facing the same issue, and I have applied all the options mentioned in the posts above. I will just post my dataset here: Attached is my input data with 3 different columns, of which the comment column contains text values with double quotes...
I'm looking to know programmatically how many files a Delta table is made of. I know I can run `%sql DESCRIBE DETAIL my_table`, but that only gives me the number of files of the current version. I am looking to know the total number of files (basically ...
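One possible approach, assuming the goal is to count files across all versions (the question is cut off, so this is a guess), is to read the JSON commit files under the table's `_delta_log` directory and collect the paths in their `add` and `remove` actions. The function name `count_files_in_log` is hypothetical, and the sketch assumes the log directory is readable as a local path. Commits that log cleanup has already deleted, or that exist only in Parquet checkpoints, are not counted.

```python
import json
from pathlib import Path


def count_files_in_log(table_path: str) -> dict:
    """Count distinct data-file paths in the JSON commits under _delta_log."""
    added, removed = set(), set()
    log_dir = Path(table_path) / "_delta_log"
    # Each commit file is newline-delimited JSON, one action per line.
    for commit in sorted(log_dir.glob("*.json")):
        with commit.open() as handle:
            for line in handle:
                if not line.strip():
                    continue
                action = json.loads(line)
                if "add" in action:
                    added.add(action["add"]["path"])
                elif "remove" in action:
                    removed.add(action["remove"]["path"])
    return {"ever_added": len(added), "removed": len(removed)}
```

Here `ever_added` approximates the number of files written over the retained history, while `removed` counts files that were logically deleted and may or may not still exist on storage, depending on whether VACUUM has run.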