Data Engineering

Forum Posts

alexgv12
by New Contributor III
  • 1614 Views
  • 2 replies
  • 0 kudos

How can I somehow run spark.something in a worker? - rdd foreach spark.context

I am using an RDD to parallelize a function. In this function I format the record I want to save. How can I store the record from inside this function using a DataFrame? Every time I use spark..... an error is generated: Caused by: org.apache.spark.api.p...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @alexander grajales vanegas, hope all is well! Just checking in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear ...

1 More Replies
jakubk
by Contributor
  • 480 Views
  • 0 replies
  • 0 kudos

databricks spark sql Custom table valued function + struct really slow (minutes for a single row)

I'm using Azure Databricks. I have a custom table-valued function which takes a URL as a parameter and outputs a single-row table with certain elements from the URL extracted/labelled (I get search activity URLs, and when in a specific format I can retri...

Vegard_Stikbakk
by New Contributor II
  • 1362 Views
  • 2 replies
  • 3 kudos

Resolved! External functions on a SQL endpoint

I want to create an external function using CREATE FUNCTION (External) and expose it to users of my SQL endpoint. Although this works from a SQL notebook, if I try to use the function from a SQL endpoint, I get "User defined expression is not supporte...

Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @Vegard Stikbakke, were you able to resolve your problem?

1 More Replies
Jeff1
by Contributor II
  • 920 Views
  • 3 replies
  • 1 kudos

Resolved! Strange object returned using sparklyr

Community,
I'm running a sparklyr "group_by" function and the function returns the following info:
# group by event_type
acled_grp_tbl <- acled_tbl %>% group_by("event_type") %>% summary(count = n())
                   Length Cl...

Latest Reply
Jeff1
Contributor II
  • 1 kudos

I should have deleted the post. While you are correct that "event_type" should be without quotes, the problem was the summary function. I was using the wrong function; it should have been "summarize."

2 More Replies
irfanaziz
by Contributor II
  • 1787 Views
  • 3 replies
  • 1 kudos

Resolved! What is the difference between passing the schema in the options or using the .schema() function in pyspark for a csv file?

I have observed very strange behavior with some of our integration pipelines. This week one of the CSV files was getting broken when read with the read function given below.
def ReadCSV(files,schema_struct,header,delimiter,timestampformat,encode="utf8...

Latest Reply
jose_gonzalez
Moderator
  • 1 kudos

Hi @nafri A, what is the error you are getting? Can you share it, please? As @Hubert Dudek mentioned, both will call the same APIs.

2 More Replies
gbrueckl
by Contributor II
  • 2997 Views
  • 6 replies
  • 4 kudos

Resolved! CREATE FUNCTION from Python file

Is it somehow possible to create an SQL external function using Python code? The examples only show how to use JARs:
https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-ddl-create-function.html
Something like:
CREATE TEMPORAR...

Latest Reply
pts
New Contributor II
  • 4 kudos

As a user of your code, I'd find it a less pleasant API because I'd have to call some_module.some_func.some_func() rather than just some_module.some_func(). There is no reason to have "some_func" exist twice in the hierarchy. It's kind of redundant. If some_func is ...

5 More Replies
antoooks
by New Contributor III
  • 1603 Views
  • 3 replies
  • 5 kudos

Resolved! display() function always returns connection refused on tunneling despite successfully retrieving the schema

Hi everyone, I am using SSH tunnelling with SSHTunnelForwarder to reach a target AWS RDS PostgreSQL database. The connection got through; however, when I tried to display the retrieved data frame, it always throws a "connection refused" error. Please see ...

Latest Reply
jose_gonzalez
Moderator
  • 5 kudos

Hi @Kurnianto Trilaksono Sutjipto, this seems like a connectivity issue with the URL you are trying to connect to. It fails during the display() command because read is a lazy transformation and is not executed right away. On the other hand,...
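To illustrate the point about laziness, here is a plain-Python analogy (not Spark code). The `fake_read` function is a hypothetical stand-in for a lazily evaluated read: defining it succeeds, and the connection error only appears once the rows are actually consumed, which is roughly what happens when display() triggers the work.

```python
# Plain-Python analogy for lazy evaluation; `fake_read` is hypothetical
# and does not use any Spark or JDBC API.

calls = []  # records when the "connection" is actually attempted


def fake_read():
    """Return a generator; nothing inside runs until it is consumed."""
    def rows():
        calls.append("connect")  # side effect happens only on first iteration
        raise ConnectionRefusedError("connection refused")
        yield  # unreachable, but makes `rows` a generator function
    return rows()


df = fake_read()   # defining the read succeeds: no connection attempted yet
assert calls == []

error = None
try:
    list(df)       # the "action": only now is the connection attempted
except ConnectionRefusedError as exc:
    error = str(exc)
```

The failure surfaces at the consuming step, not at the step that defined the read, so the place where the error is reported is not necessarily the place where the problem lies.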

2 More Replies
maranBH
by New Contributor III
  • 19867 Views
  • 5 replies
  • 12 kudos

Resolved! How to import a function to another notebook using Repos without %run?

Hi all, I was reading the Repos documentation: https://docs.databricks.com/repos.html#migrate-from-run-commands
It explains that one advantage of Repos is that it is no longer necessary to use the %run magic command to make functions available in one notebook to ...

Latest Reply
maranBH
New Contributor III
  • 12 kudos

Thank you all for your help! I tried all that was suggested, but I finally realized it was my fault in the first place:
  • I was testing Files in Repos with a runtime < 8.4.
  • I was trying to import a file from a DB Notebook instead of a static .py file.
Upgradi...
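As a minimal sketch of the plain Python mechanism behind this (importing from a static .py file), the example below simulates a repo folder with a temporary directory. The module name `repo_helpers_demo` and the function `add_one` are hypothetical, and path handling inside Databricks Repos may differ from this generic setup.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout: a folder containing a plain .py file (not a notebook).
repo_dir = Path(tempfile.mkdtemp())
(repo_dir / "repo_helpers_demo.py").write_text(
    "def add_one(x):\n"
    "    return x + 1\n"
)

# Make the folder importable, then import the module like any other.
if str(repo_dir) not in sys.path:
    sys.path.append(str(repo_dir))

importlib.invalidate_caches()  # pick up the file that was just created
repo_helpers_demo = importlib.import_module("repo_helpers_demo")

result = repo_helpers_demo.add_one(41)
```

The import works because the target is an ordinary Python source file on the module search path, which matches the second point in the reply above.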

4 More Replies
Kaniz
by Community Manager
  • 489 Views
  • 1 replies
  • 0 kudos
Latest Reply
Ryan_Chynoweth
Honored Contributor III
  • 0 kudos

Typically a method is associated with an object and/or class, while a function is not. For example, the following class has a single method called "my_method":
class MyClass():
    def __init__(self, a):
        self.a = a

    def my_method(self): ...
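A short runnable sketch of the distinction follows. The body of `my_method` and the standalone `my_function` are assumptions added for illustration, since the original reply is truncated.

```python
class MyClass:
    def __init__(self, a):
        self.a = a

    def my_method(self):
        # Hypothetical body: a method reads state from its instance via `self`.
        return self.a * 2


def my_function(a):
    # A function is not bound to any object; all data is passed explicitly.
    return a * 2


obj = MyClass(3)
method_result = obj.my_method()    # called on an instance; `self` is implicit
function_result = my_function(3)   # called directly with its argument
```

Both calls compute the same value; the difference is only in how the data reaches the code.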

User15787040559
by New Contributor III
  • 2898 Views
  • 2 replies
  • 0 kudos

How to do a unionAll() when the number and the name of columns are different?

Looking at the API for DataFrame.unionAll(): when you have two DataFrames with different numbers of columns and different column names, unionAll() doesn't work. How can you do it? One possible solution is using the following function, which performs the union of tw...

Latest Reply
sean_owen
Honored Contributor II
  • 0 kudos

I'm not sure union is the right tool, if the DataFrames have fundamentally different information in them. If the difference is merely column name, yes, rename. If they don't, then the 'union' contemplated here is really a union of columns as well as ...
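The "union of columns as well as rows" idea can be sketched in plain Python over lists of dict rows. This is not the function from the original post (which is truncated) and not Spark code; `union_with_missing_columns` is a hypothetical helper. In Spark 3.1 and later, `DataFrame.unionByName(other, allowMissingColumns=True)` offers comparable behavior on DataFrames.

```python
def union_with_missing_columns(rows_a, rows_b):
    """Union two lists of dict rows, filling absent columns with None."""
    all_rows = rows_a + rows_b

    # Collect column names in first-seen order across both inputs.
    columns = []
    for row in all_rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    # Rebuild every row with the full column set; missing values become None.
    return [{name: row.get(name) for name in columns} for row in all_rows]


left = [{"id": 1, "name": "a"}]
right = [{"id": 2, "score": 0.5}]
merged = union_with_missing_columns(left, right)
```

Every output row carries the same columns, which is the precondition a row-wise union needs.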

1 More Replies
User16790091296
by Contributor II
  • 1664 Views
  • 1 replies
  • 0 kudos

Resolved! How can I use a Python function defined in my git-repo module within the DB notebook?

I have a function within a module in my git-repo. I want to import that to my DB notebook - how can I do that?

Latest Reply
aladda
Honored Contributor II
  • 0 kudos

Databricks Repos allows you to sync your work in Databricks with a remote Git repository. This makes it easier to implement development best practices. Databricks supports integrations with GitHub, Bitbucket, and GitLab. Using Repos you can bring you...

Yogi
by New Contributor III
  • 6952 Views
  • 15 replies
  • 0 kudos

Resolved! Can we pass Databricks output to Azure function body?

Hi, can anyone help me with Databricks and Azure Functions? I'm trying to pass Databricks JSON output to an Azure Function body in an ADF job. Is it possible? If yes, how? If not, what alternative is there to do the same?

Latest Reply
AbhishekNarain_
New Contributor III
  • 0 kudos

You can now pass values back to ADF from a notebook. @Yogi, there is a size limit though, so if you are passing a dataset larger than 2MB, write it to storage instead and consume it directly with Azure Functions. You can pass the file path/ refe...
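A rough sketch of that decision is shown below. The helper `build_exit_value`, the storage path, and the exact byte threshold are assumptions for illustration; only the approximate 2MB figure comes from the reply. In a Databricks notebook the returned string would typically be handed to `dbutils.notebook.exit(...)`, and writing the large dataset to storage is left to the caller.

```python
import json

# Assumed threshold: the reply mentions roughly 2MB; the exact value may differ.
SIZE_LIMIT_BYTES = 2 * 1024 * 1024


def build_exit_value(records, storage_path):
    """Return inline JSON when small enough, else a reference to storage."""
    payload = json.dumps({"data": records})
    if len(payload.encode("utf-8")) <= SIZE_LIMIT_BYTES:
        return payload
    # Too large to pass back: the caller is expected to write `records`
    # to `storage_path` and let the downstream consumer read it from there.
    return json.dumps({"path": storage_path})


# Hypothetical path, used only for this example.
small = build_exit_value([{"id": 1}], "/mnt/out/result.json")
large = build_exit_value(["x" * 1024] * 3000, "/mnt/out/result.json")
```

The downstream consumer can check for the `path` key to decide whether to read the data inline or fetch it from storage.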

14 More Replies