06-05-2023 05:12 AM
The special characters are most likely converted to '?????' because of how the Hive Metastore stores view definitions. When you create a view, the metastore saves the view's query text, and any non-ASCII characters that its storage cannot represent are replaced with '?' during that step.
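The effect described above, non-ASCII characters turning into question marks, can be reproduced in isolation. The sketch below is only an illustration: Python's codecs stand in for whatever conversion happens on the metastore side, which is an assumption and not Hive's actual code path.

```python
# Illustration only: Python's codecs stand in for the conversion applied
# when the text is stored (an assumption, not Hive's actual code).
text = "シミュレータに接続されていません"

# A charset that cannot represent the characters replaces each one with '?'.
lossy = text.encode("ascii", errors="replace").decode("ascii")

# UTF-8 can represent every character, so the round trip is lossless.
lossless = text.encode("utf-8").decode("utf-8")

print(lossy)              # one '?' per original character
print(lossless == text)   # the UTF-8 round trip keeps the text intact
```

Once the characters have been replaced this way, the original text cannot be recovered from the stored value, which is why the workarounds try to keep the raw characters out of the stored representation.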
To preserve the special characters in the view representation, you can use the following workaround:
- Create a new column in the view that contains the encoded version of the string attribute.
- Use the new column in the WHERE clause of queries against the view.
The following code shows how to do this:
CREATE VIEW my_view AS
SELECT
*,
encode(string_attribute, 'utf-8') AS encoded_string_attribute
FROM my_table;
SELECT
*
FROM my_view
WHERE
encoded_string_attribute != 'シミュレータに接続されていません';

This code creates a new view called my_view that contains the original data from my_table plus a new column called encoded_string_attribute, which holds the encoded version of the string_attribute column. The WHERE clause of the second query uses the encoded_string_attribute column to filter out rows where the value of the string_attribute column is equal to 'シミュレータに接続されていません'.
This workaround will preserve the special characters in the view representation. However, it will also make the view slightly larger, because the encoded version of the string_attribute column will be slightly larger than the original string_attribute column.
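To get a feel for what the encoded column holds, here is a minimal sketch that uses Python's str.encode as a stand-in for the SQL encode(..., 'utf-8') call. That substitution is an assumption for illustration; how the query engine represents and sizes strings internally is not modeled here.

```python
# Sketch: Python's str.encode stands in for SQL encode(col, 'utf-8').
text = "シミュレータに接続されていません"

encoded = text.encode("utf-8")   # the result is bytes, not a string

print(type(encoded).__name__)            # the encoded value is binary
print(len(text), len(encoded))           # character count vs. byte count
print(encoded == text)                   # bytes never equal the raw string
print(encoded.decode("utf-8") == text)   # decoding restores the original
```

One consequence worth checking in your environment: because the encoded value is binary, the value you compare it against may need to be encoded the same way before the comparison behaves as expected.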
If you want to avoid the performance penalty of storing the encoded version of the string_attribute column, you can use the following alternative workaround:
- Create a new column in the view that contains the escaped version of the string attribute.
- Use the new column in the WHERE clause of queries against the view.
The following code shows how to do this:
CREATE VIEW my_view AS
SELECT
*,
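-- escape() is assumed to be available here (for example as a registered UDF); it is not a standard Hive or Spark SQL built-in function.
escape(string_attribute) AS escaped_string_attribute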
FROM my_table;
SELECT
*
FROM my_view
WHERE
escaped_string_attribute != 'シミュレータに接続されていません';

This code creates a new view called my_view that contains the original data from my_table plus a new column called escaped_string_attribute, which holds the escaped version of the string_attribute column. The WHERE clause of the second query uses the escaped_string_attribute column to filter out rows where the value of the string_attribute column is equal to 'シミュレータに接続されていません'.
This workaround will preserve the special characters in the view representation without making the view any larger. However, it will make queries against the view slightly slower, because extra work is needed to escape the special characters.
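What "escaped" means is not spelled out above. One plausible reading, and it is only an assumption, is that non-ASCII characters are replaced by \uXXXX sequences so that the value is pure ASCII. The helper names escape_non_ascii and unescape below are hypothetical and exist only for this sketch.

```python
# Assumption: "escaping" means replacing non-ASCII characters with \uXXXX
# sequences. The helper names are hypothetical, not part of Hive or Spark.
def escape_non_ascii(value: str) -> str:
    """Return an ASCII-only version of value using Python's unicode_escape codec."""
    return value.encode("unicode_escape").decode("ascii")


def unescape(value: str) -> str:
    """Reverse escape_non_ascii and recover the original text."""
    return value.encode("ascii").decode("unicode_escape")


text = "シミュレータに接続されていません"
escaped = escape_non_ascii(text)

print(escaped.isascii())           # the escaped form contains only ASCII
print(escaped == text)             # it no longer equals the raw string
print(unescape(escaped) == text)   # unescaping restores the original
```

Under this reading, the escaped column no longer equals the raw string, so the comparison value would have to be escaped the same way for the filter to match.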
Which workaround you choose will depend on your specific needs. Both preserve the special characters in the view representation. If you need the view to be as fast as possible, use the first workaround. If speed matters less to you than size, use the second workaround.