Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I am trying to lowercase one of the columns (A_description) of a DataFrame (df) and am getting the error "'Column' object is not callable". Code: def new_desc(): for line in df: line = df['A_description'].map(str.lower) return line new_desc() Have used...
Hi, I'm facing an issue when writing to a Salesforce object. I'm using the springml/spark-salesforce library. I have the above libraries installed as recommended based on my research. I try to write like this: (_sqldf .write .format("com.springml.spar...
Privileges
SELECT: gives read access to an object.
CREATE: gives ability to create an object (for example, a table in a schema).
MODIFY: gives ability to add, delete, and modify data to or from an object.
USAGE: does not give any abilities, but is an add...
I'm getting the error...
AttributeError: 'SparkSession' object has no attribute '_wrapped'
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<command-2311820097584616> in <cell li...
This can happen on a 10.x version. Try 7.3 LTS and share your observations. If it does not work there, try creating an init script and loading it to your Databricks cluster, so that the library is available whenever your machine starts up, because some...
Hi, I am doing a grammar check with Spark NLP on Azure Databricks, but I am getting TypeError: 'JavaPackage' object is not callable on the DocumentAssembler() initialization line. document_assembler = DocumentAssembler()\ .setInputCol("text")\ .setOutpu...
Hi, it's working after adding the configs below to the cluster. Please check this URL for more info: https://nlp.johnsnowlabs.com/docs/en/install#databricks-support
Hi all, I am getting the error below when ingesting data from a source file (the source file is attached). I have tried both Community Edition and Azure Databricks and get the same error in both. Can anyone suggest a solution? # ...
Hello, I have a Databricks question I was not able to answer myself. I have this query:
select count(*) from table
where object[0].value is not null and object[0].value.value1 = "s"
and created_year = 2022 and created_month = 7 and created_day = 4
you can se...
SELECT count(*)
FROM (
  SELECT explode(mmycolumn)
  FROM table
  WHERE created_year = 2022 and created_month = 7 and created_day = 5
)
WHERE col.field is not null and col.field.field != "signal"
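One thing worth keeping in mind when comparing the two queries above: indexing with [0] only looks at the first array element of each row, while explode produces one output row per array element, so the two counts can differ. Here is a rough plain-Python sketch of that difference. The rows, the "items" key, and both helper functions are made up for illustration and are not from the original tables.

```python
# Hypothetical rows: each row carries an array column, standing in for
# the `object` / `mmycolumn` columns in the queries above.
rows = [
    {"items": [{"value1": "s"}, {"value1": "t"}]},
    {"items": [{"value1": "t"}, {"value1": "s"}]},
    {"items": [{"value1": "s"}, {"value1": "s"}]},
    {"items": []},  # empty array: explode emits no row for it
]


def count_first_element_matches(rows, wanted):
    """Count rows whose FIRST array element matches (like object[0]...)."""
    return sum(
        1
        for row in rows
        if row["items"] and row["items"][0]["value1"] == wanted
    )


def count_exploded_matches(rows, wanted):
    """Count matching array ELEMENTS, one per exploded row (like explode(...))."""
    return sum(
        1
        for row in rows
        for item in row["items"]
        if item["value1"] == wanted
    )


first_only = count_first_element_matches(rows, "s")
exploded = count_exploded_matches(rows, "s")
print(first_only, exploded)  # prints "2 4"
```

So if the goal is a count of original rows rather than of array elements, the exploded version may need an extra step to collapse back to one result per row.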
By any chance, was the cluster restarted after installing the libraries or was it detached and reattached from/to the notebook? Notebook-scoped libraries do not persist across sessions. You must reinstall notebook-scoped libraries at the beginning of...
Community, I'm running a sparklyr "group_by" function and the function returns the following info:
# group by event_type
acled_grp_tbl <- acled_tbl %>% group_by("event_type") %>% summary(count = n())
 Length Cl...
I should have deleted the post. While you are correct that "event_type" should be without quotes, the problem was the summary function. I was using the wrong function; it should have been "summarize."
Some libraries return intermediate IPython HTML objects to the notebook cell output. Since this happens while training a machine learning model, the statements are typically buried within the library, so I cannot easily interfere. (e.g. in or...
Hi @Kaniz Fatma, thanks for showing me the link. This helps if you are in control of the generated HTML object. If the HTML content comes from a library, that is where the problems start, because I cannot wrap displayHTML(). (I can of course look for...
I am using a framework and I have a query where I am doing df = seg_df.select("*").write.option("compression", "gzip") and I am getting the error below. When I don't do the write.option I am not getting the error. Why is it giving me a repartition error? Wh...