Can't read a JSON file of just 1.75 MiB?
05-16-2024 12:48 AM
Hi,
I am relatively new to Databricks, although I am aware of lazy evaluation, transformations and actions, and persistence.
I have a complex, nested JSON file of about 1.73 MiB.
when
query_07129 = {"query": [], "response": {"format": "json-stat2"}}
- Labels: Spark
05-16-2024 04:16 AM
This can be resolved by defining the schema explicitly and using that schema to read the file.
from pyspark.sql.types import StructType, StructField, StringType, IntegerType, ArrayType

# Define the schema according to the JSON structure
schema = StructType([
    StructField("field1", StringType(), True),
    StructField("field2", IntegerType(), True),
    # Add fields according to the JSON structure
])

# Read the JSON file with the defined schema
df = spark.read.schema(schema).json('dbfs:/mnt/makro/bronze/json_ssb/07129_20240514.json')
df.show()
05-16-2024 06:08 AM
Thanks for your reply. In my case I'll need to read different JSON files in a loop, and they don't all have the same schema. How should I proceed in that case? Thanks.

