Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Limited Results When Trying to Access Delta Shared Tables from C#

Shawn_Eary
Contributor

This C# code only seems to return about 8 rows when it should return 100:

// Per Databricks AI, you must run the following command if you don't
// want to get an error about delta.enableDeletionVectors from C#:
// ALTER TABLE [myCatalog].setest.billboard_hot_100 SET TBLPROPERTIES (delta.enableDeletionVectors = false);
//
// This code was mostly written by Databricks AI but some other AI tools
// were used.
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class Program
{
  private static readonly string bearerToken = "[myBearerToken]";
  private static readonly string endpoint = "https://[myRegion].azuredatabricks.net/api/2.0/delta-sharing/metastores/[myGUID]/shares/seary_billboard_100_test_share/schemas/setest/tables/billboard_hot_100/query";

  static async Task Main(string[] args)
  {
    using (HttpClient client = new HttpClient())
    {
      client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", bearerToken);
      client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

      var content = new StringContent("{\"responseFormat\": \"delta\"}", System.Text.Encoding.UTF8, "application/json");
      HttpResponseMessage response = await client.PostAsync(endpoint, content);

      if (response.IsSuccessStatusCode)
      {
        string responseData = await response.Content.ReadAsStringAsync();
        Console.WriteLine("Response Data:");
        Console.WriteLine(responseData);
        File.WriteAllText("c:\\Users\\seary\\Downloads\\billBoard100-deltaShareResults.json", responseData);
      }
      else
      {
        Console.WriteLine($"Error: {response.StatusCode}");
        string errorData = await response.Content.ReadAsStringAsync();
        Console.WriteLine("Error Data:");
        Console.WriteLine(errorData);
      }
    }
  }
}

Also, the returned JSON seems slightly corrupt to me. How can I get the above code to return all 100 rows from seary_billboard_100_test_share.setest.billboard_hot_100?
3 REPLIES

gchandra
Databricks Employee

Can you try with the ODBC driver?

https://docs.databricks.com/en/integrations/odbc/index.html

 




@gchandra you have a really good point.

If I remember correctly, C# can get to the data I need using a service account and a personal access token (PAT) against the SQL Statement Execution API. I don't think I've tried ODBC yet, though.

My team specifically wants me to use Delta Sharing (in this case) for two different reasons:

  1. Delta Sharing does not require us to spin a cluster up to get a result.
  2. Delta Sharing is easier to use with customers outside our organization.

It would be preferable if I could access Delta Sharing directly from C#, either through a native library or through the REST API mentioned in my original post. Right now, I don't think that works, so I've experimented with other means.

I managed to trigger Python code that calls the Delta Sharing libraries using C#'s System.Diagnostics.Process. This works well for simple examples, but it's clunky, and I'm concerned it might break in complex situations. A coworker also suggested calling the Delta Sharing libraries through Python.NET. Unfortunately, the latest version of Python.NET seems a little buggy to me: it does appear to return the correct data, but there seem to be some "security issues" that prevent the processes it launches from terminating properly.
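For readers trying the System.Diagnostics.Process route, here is a minimal sketch of one way to make it less fragile. It is not the code used in this thread. The names PythonBridge, BuildStartInfo, and RunAsync are hypothetical, and so is the Python script, which is assumed to take the table reference as an argument and print the rows to standard output. The sketch passes arguments as a list to avoid quoting problems, reads both output streams so the child cannot block on a full pipe, and kills the whole process tree on timeout so nothing is left running.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class PythonBridge
{
    // Builds the start info for a hypothetical script that prints rows to stdout.
    public static ProcessStartInfo BuildStartInfo(string pythonExe, string scriptPath, string tableRef)
    {
        var psi = new ProcessStartInfo
        {
            FileName = pythonExe,
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            UseShellExecute = false, // required for stream redirection
            CreateNoWindow = true,
        };
        // ArgumentList escapes each argument separately, so spaces and '#' are safe.
        psi.ArgumentList.Add(scriptPath);
        psi.ArgumentList.Add(tableRef);
        return psi;
    }

    // Runs the process and returns stdout; throws on timeout or non-zero exit code.
    public static async Task<string> RunAsync(ProcessStartInfo psi, TimeSpan timeout)
    {
        using Process process = Process.Start(psi)
            ?? throw new InvalidOperationException("Could not start the process.");

        // Start draining both streams before waiting, to avoid a pipe deadlock.
        Task<string> stdout = process.StandardOutput.ReadToEndAsync();
        Task<string> stderr = process.StandardError.ReadToEndAsync();

        using var cts = new CancellationTokenSource(timeout);
        try
        {
            await process.WaitForExitAsync(cts.Token);
        }
        catch (OperationCanceledException)
        {
            // Kill the child and anything it spawned so no process is left behind.
            process.Kill(entireProcessTree: true);
            throw new TimeoutException($"Process did not exit within {timeout}.");
        }

        if (process.ExitCode != 0)
        {
            throw new InvalidOperationException(
                $"Process exited with code {process.ExitCode}: {await stderr}");
        }
        return await stdout;
    }
}
```

A call might look like `await PythonBridge.RunAsync(PythonBridge.BuildStartInfo("python", "read_share.py", "config.share#share.schema.table"), TimeSpan.FromMinutes(2))`, where the script name and profile file name are placeholders.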

Yes, ODBC (or the SQL Statement Execution API) might be how I would personally work around these problems, but I'm not sure my team agrees with me. From a team perspective, it would be useful if Databricks could support a native C# library for Delta Sharing. It's currently a lot of unnecessary work to use C# to pull data from Databricks.

This is very frustrating. The documentation for the Delta Sharing REST API is very bad. When I run a query, I get back a bunch of metadata describing the Parquet files instead of the actual rows of data. The Delta Sharing REST API is a huge hassle to use from C#. For this reason, many of my team members have resolved to simply use Python, which unfortunately is not as useful a language as C#, IMO.
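The behavior described in this thread matches how the Delta Sharing protocol's table query endpoint is specified. The response is newline-delimited JSON: one JSON object per line (a protocol line, a metadata line, then one line per data file). That would explain why the saved file looks "slightly corrupt" when read as a single JSON document, and why there are only a handful of lines. The rows themselves live in the Parquet files that the file lines point to through pre-signed URLs. The sketch below parses such a body line by line and collects those URLs. The class and method names are hypothetical. The field paths (`url` for the Parquet response format, `deltaSingleAction.add.path` for the delta response format) are assumptions based on a reading of the protocol and should be checked against an actual response.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class DeltaSharingNdjson
{
    // Parses each non-empty line as its own JSON document and returns
    // (wrapper name, payload) pairs, e.g. ("protocol", {...}) or ("file", {...}).
    public static List<(string Kind, JsonElement Payload)> ParseLines(string body)
    {
        var actions = new List<(string Kind, JsonElement Payload)>();
        var options = StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries;
        foreach (string line in body.Split('\n', options))
        {
            using JsonDocument doc = JsonDocument.Parse(line);
            foreach (JsonProperty prop in doc.RootElement.EnumerateObject())
            {
                // Clone so the element stays valid after the document is disposed.
                actions.Add((prop.Name, prop.Value.Clone()));
            }
        }
        return actions;
    }

    // Collects the data file URLs from the "file" lines.
    // Assumed layouts: file.url (parquet format) or
    // file.deltaSingleAction.add.path (delta format).
    public static List<string> FileUrls(string body)
    {
        var urls = new List<string>();
        foreach (var (kind, payload) in ParseLines(body))
        {
            if (kind != "file") continue;

            if (payload.TryGetProperty("url", out JsonElement url))
            {
                urls.Add(url.GetString() ?? "");
            }
            else if (payload.TryGetProperty("deltaSingleAction", out JsonElement action)
                     && action.TryGetProperty("add", out JsonElement add)
                     && add.TryGetProperty("path", out JsonElement path))
            {
                urls.Add(path.GetString() ?? "");
            }
        }
        return urls;
    }
}
```

With the original program, `DeltaSharingNdjson.FileUrls(responseData)` would give the list of files to download. Getting the 100 rows then requires fetching each URL with a plain GET and reading the result with a Parquet library for .NET. That step is left out here because it depends on which library is chosen.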
