Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Transcarent
by New Contributor II
  • 8194 Views
  • 6 replies
  • 0 kudos

Error: ConnectionError: HTTPSConnectionPool(host='https', port=443): Max retries exceeded with url: /api/2.0/workspace/list?path=%2F (Caused b...

Error: ConnectionError: HTTPSConnectionPool(host='https', port=443): Max retries exceeded with url: /api/2.0/workspace/list?path=%2F (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi @prakash reddy, is this an intermittent error, or are you able to reproduce it? Please let us know.

5 More Replies
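The host='https' in the traceback suggests the URL scheme was pasted into the host field of the client configuration, so DNS tries to resolve the literal word "https" and getaddrinfo fails. A minimal sketch of a sanity check (the `normalize_host` helper and the workspace URL are hypothetical, not part of the thread):

```python
from urllib.parse import urlparse

def normalize_host(configured: str) -> str:
    """Return a bare hostname from a value that may include a scheme or path.

    If a full URL (or just its scheme) ends up in a host field, the HTTP
    client resolves the literal string 'https', which fails DNS lookup.
    """
    # urlparse only fills .hostname when a scheme is present; add one if missing
    if "://" not in configured:
        configured = "https://" + configured
    host = urlparse(configured).hostname
    if not host or host in ("http", "https"):
        raise ValueError(f"invalid workspace host: {configured!r}")
    return host

print(normalize_host("https://adb-123.4.azuredatabricks.net/?o=123"))
# → adb-123.4.azuredatabricks.net
```

If the check raises, the configured value was a bare scheme rather than a hostname, which matches the error in this thread.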
s_plank
by New Contributor III
  • 5677 Views
  • 6 replies
  • 5 kudos

Resolved! Databricks-Connect shows different partitions than Databricks for the same delta table

Hello, here is a small code snippet: from pyspark.sql import SparkSession; spark = SparkSession.builder.appName('example_app').getOrCreate(); spark.sql('SHOW PARTITIONS database.table').show() The output inside the Databricks notebook: +-------------+--...

Latest Reply
s_plank
New Contributor III
  • 5 kudos

Hi @Jose Gonzalez, yes, the SQL connector works fine. Thank you!

5 More Replies
Krishscientist
by New Contributor III
  • 1809 Views
  • 1 replies
  • 0 kudos

Resolved! AutoML : data set for problem type "Classification"

Hi, I am working on an AutoML experiment. Could you please help me with a data set for the problem type "Classification"? Regards.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

There are a lot of datasets available in /databricks-datasets/ that you can look through. You'll have to turn them into a table so that you can access them in AutoML. There are datasets associated with the Spark Definitive Guide and Learning Spark ...

Rex
by New Contributor III
  • 7341 Views
  • 4 replies
  • 3 kudos

Resolved! Cannot use prepared statements with date functions

We are using PHP and the Databricks SQL ODBC driver and cannot run a query that uses DATE functions with prepared statements. Sample script/Docker setup here: https://github.com/rlorenzo/databricks_php/blob/odbc_prepare_error/test_connection.phpFor e...

Latest Reply
Rex
New Contributor III
  • 3 kudos

@Bilal Aslam We tried CAST and CONVERT and are still getting the same error.

3 More Replies
TS
by New Contributor III
  • 1059 Views
  • 0 replies
  • 1 kudos

Is there a better way for this matching?

I have an array: var arg = condColumnsKeys with the elements arg: Array[String] = Array(LOT_PREFIX, PS_NAME_BOOK_TEMPLATE_NAME, PS_NAME_PAGE_NAME, PS_NAME_FIELD_NAME). The desired outcome is to get the string "LOT_PREFIX" and store it in var ccLotPrefix. My fir...

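The lookup described in the post amounts to picking the first exact match out of the key array. A minimal sketch of that matching, translated from the post's Scala context into Python for illustration (variable names follow the post; the `pick_key` helper is hypothetical):

```python
def pick_key(keys, target="LOT_PREFIX"):
    """Return the first key exactly equal to target, or None if absent."""
    return next((k for k in keys if k == target), None)

cond_columns_keys = [
    "LOT_PREFIX",
    "PS_NAME_BOOK_TEMPLATE_NAME",
    "PS_NAME_PAGE_NAME",
    "PS_NAME_FIELD_NAME",
]
cc_lot_prefix = pick_key(cond_columns_keys)
print(cc_lot_prefix)  # → LOT_PREFIX
```

In Scala the equivalent would be `condColumnsKeys.find(_ == "LOT_PREFIX")`, which avoids writing an explicit match over every element.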
Taha_Hussain
by Databricks Employee
  • 1858 Views
  • 1 replies
  • 1 kudos

Databricks Office Hours Our next Office Hours session is scheduled for April 27 2022 - 8:00 am PT. Do you have questions about how to set up or use Da...

Databricks Office Hours. Our next Office Hours session is scheduled for April 27, 2022 - 8:00 am PT. Do you have questions about how to set up or use Databricks? Do you want to learn more about the best practices for deploying your use case or tips on da...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 1 kudos

Just registered. Thank you and happy weekend.

StephanieAlba
by Databricks Employee
  • 3769 Views
  • 1 replies
  • 6 kudos

Resolved! Is it possible to use Autoloader with a daily update file structure?

We get new files from a third party each day. The files could be the same or different. However, each day all CSV files arrive in the same dated folder. Is it possible to use Auto Loader on this structure? We want each CSV file to be a table that gets ...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 6 kudos

@Stephanie Rivera, you can use pathGlobFilter, but you will need a separate Auto Loader stream for each type of file. df_alert = spark.readStream.format("cloudFiles") \.option("cloudFiles.format", "binaryFile") \.option("pathGlobFilter", "alert.csv") \.load...

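To see what the per-stream glob does, the filter can be sketched locally: pathGlobFilter narrows each Auto Loader stream to file names matching a glob pattern. This is only a local illustration of that matching with Python's fnmatch (the file names are made up), not Auto Loader itself:

```python
from fnmatch import fnmatch

def split_by_glob(paths, pattern):
    """Keep only paths whose file name matches the glob, mimicking how a
    per-stream pathGlobFilter narrows one Auto Loader stream to one file type."""
    return [p for p in paths if fnmatch(p.rsplit("/", 1)[-1], pattern)]

# Example daily drop: several CSV types land in the same dated folder
arrivals = [
    "2022-04-01/alert.csv",
    "2022-04-01/orders.csv",
    "2022-04-02/alert.csv",
]
print(split_by_glob(arrivals, "alert.csv"))
# → ['2022-04-01/alert.csv', '2022-04-02/alert.csv']
```

One stream per pattern ("alert.csv", "orders.csv", ...) then gives one target table per file type, as the reply suggests.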
User16835756816
by Databricks Employee
  • 3023 Views
  • 1 replies
  • 5 kudos

Announcing: Delta Live Tables!

Databricks is excited to announce the general availability of Delta Live Tables to you, our community. Anxiously awaited, Delta Live Tables (DLT) is the first ETL framework that uses a simple, declarative approach to building reliable streaming or ...

Latest Reply
User16725394280
Databricks Employee
  • 5 kudos

Informative content, thanks for sharing.

Kush22
by New Contributor
  • 2514 Views
  • 0 replies
  • 0 kudos

Delete the file

While exporting data from Databricks to Azure Blob Storage, how can I delete the committed, started, and success files?

sgannavaram
by New Contributor III
  • 4850 Views
  • 1 replies
  • 2 kudos

Resolved! How to pass variables into query string?

I have two variables, StartTimeStmp and EndTimeStmp. I am going to assign the start timestamp based on the last successful job runtime, and EndTimeStmp would be the current system time. SET StartTimeStmp = '2022-03-24 15:40:00.000'; SET EndTimeStmp = '...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 2 kudos

@Srinivas Gannavaram, in Python: spark.sql(f""" SELECT CI.CORPORATE_ITEM_INTEGRATION_ID, CI.CORPORATE_ITEM_CD WHERE CI.DW_CREATE_TS < '{my_timestamp_variable}' ; """)

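The reply's f-string approach can be sketched end to end without a cluster; the table and column names below are hypothetical, and the formatting matches the millisecond-precision timestamps in the question. With user-supplied values you would prefer the connector's parameter binding over string interpolation:

```python
from datetime import datetime

def build_window_query(start_ts: datetime, end_ts: datetime) -> str:
    """Interpolate two timestamps into a SQL string, f-string style.

    Table/column names are made up for illustration; strftime('%f') yields
    microseconds, so [:-3] trims it to the milliseconds used in the thread.
    """
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    return (
        "SELECT * FROM job_audit "
        f"WHERE run_ts >= '{start_ts.strftime(fmt)[:-3]}' "
        f"AND run_ts < '{end_ts.strftime(fmt)[:-3]}'"
    )

q = build_window_query(datetime(2022, 3, 24, 15, 40), datetime(2022, 3, 24, 16, 40))
print(q)
```

The resulting string can be handed to spark.sql(...) exactly as in the reply above.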
Direo
by Contributor II
  • 16114 Views
  • 2 replies
  • 3 kudos
Latest Reply
User16873043212
Databricks Employee
  • 3 kudos

@Direo Direo, yeah, this is a location inside your DBFS. You have full control over it; Databricks does not delete anything you keep in this location.

1 More Replies
Direo
by Contributor II
  • 2605 Views
  • 1 replies
  • 5 kudos
Latest Reply
Hubert-Dudek
Databricks MVP
  • 5 kudos

@Direo Direo, yes, you use MERGE syntax for that: https://docs.delta.io/latest/delta-update.html. It is more efficient than overwriting if you want to update only part of the data, but you need to think about the logic of what to update, so overwriti...

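The MERGE pattern the reply points to follows a fixed shape (match on a key, update matched rows, insert the rest). A sketch that assembles such a statement as a string, with hypothetical table and column names, so the shape is visible without a Delta cluster:

```python
def build_merge_sql(target: str, source: str, key: str, cols) -> str:
    """Assemble a Delta-style MERGE statement (illustrative names only)."""
    sets = ", ".join(f"t.{c} = s.{c}" for c in cols)
    names = ", ".join(cols)
    values = ", ".join(f"s.{c}" for c in cols)
    return (
        f"MERGE INTO {target} t USING {source} s ON t.{key} = s.{key} "
        f"WHEN MATCHED THEN UPDATE SET {sets} "
        f"WHEN NOT MATCHED THEN INSERT ({key}, {names}) VALUES (s.{key}, {values})"
    )

sql = build_merge_sql("events", "updates", "id", ["value", "updated_at"])
print(sql)
```

Compared with overwriting the table, MERGE rewrites only the files touched by matched keys, which is the efficiency point made in the reply; the trade-off is that you must define the match logic yourself.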
Constantine
by Contributor III
  • 2336 Views
  • 1 replies
  • 4 kudos

Resolved! What's the best architecture for Structured Streaming and why?

I am building an ETL pipeline which reads data from a Kafka topic (data is serialized in Thrift format) and writes it to a Delta table in Databricks. I want to have two layers: Bronze Layer -> which has raw Kafka data; Silver Layer -> which has deserializ...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 4 kudos

@John Constantine, "Bronze Layer -> which has raw Kafka data": if you use confluent.io, you can also utilize a direct sink to Data Lake Storage for the bronze layer. "Silver Layer -> which has deserialized data": then use Delta Live Tables to process it to del...

cal
by New Contributor
  • 803 Views
  • 0 replies
  • 0 kudos

G.I.S., Inc. is a distributor and fabricator of thermal and acoustical insulation systems for industrial, commercial, power, process, original equipme...

G.I.S., Inc. is a distributor and fabricator of thermal and acoustical insulation systems for industrial, commercial, power, process, original equipment manufacturers, plumbing and HVAC industries. In today's fast paced market, consumers have a multi...

Anonymous
by Not applicable
  • 2514 Views
  • 1 replies
  • 1 kudos

Resolved! "policy_id" parameter in JOB API

I can't find information about that parameter in https://docs.databricks.com/dev-tools/api/latest/jobs.html. Where is it documented?

Latest Reply
Ryan_Chynoweth
Databricks Employee
  • 1 kudos

I believe it is just "policy_id". As an incomplete example the specification via API would be something like: { "cluster_id": "1234-567890-abd35gh", "spark_context_id": 1234567890, "cluster_name": "my_cluster", "spark_version": "9.1.x-scala2....

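Building on the partial spec in the reply, the field sits inside the cluster specification of the job payload. A sketch of assembling such a payload in Python; the version, node type, worker count, and policy ID values are placeholders, so check them against your workspace and the Jobs API reference:

```python
import json

def new_cluster_spec(policy_id: str) -> dict:
    """Minimal new_cluster fragment carrying a cluster policy (illustrative values)."""
    return {
        "new_cluster": {
            "spark_version": "9.1.x-scala2.12",  # placeholder runtime version
            "node_type_id": "i3.xlarge",          # placeholder node type
            "num_workers": 2,
            "policy_id": policy_id,               # the parameter asked about
        }
    }

print(json.dumps(new_cluster_spec("ABC123DEF456"), indent=2))
```

The resulting dict can be embedded in a Jobs API create/update request body; the policy then constrains which cluster settings the job may use.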
Labels