Save unicode in redis but fetch error

11,091

Solution 1

Update: for a global setting, see jmoz's answer.

If you're using a third-party library such as django-redis, you may need to specify a custom ConnectionFactory:

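import redis_cache.pool  # module name as used in this answer

class DecodeConnectionFactory(redis_cache.pool.ConnectionFactory):
    def get_connection(self, params):
        # Ask redis-py to decode replies instead of returning raw byte strings.
        params['decode_responses'] = True
        # super() is already bound, so self must not be passed a second time.
        return super(DecodeConnectionFactory, self).get_connection(params)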

Assuming you're using redis-py, you had better pass str instead of unicode to Redis; otherwise redis-py will encode it automatically for the *set commands, normally as UTF-8. For the *get commands, redis-py has no idea what the original type of a value was, so it simply returns the value as str.

Thus, as Denis said, the way you store the object in Redis is critical. You need to convert the values to str yourself so that the Redis layer is transparent to you.

Also, set the default encoding to UTF-8 instead of ascii.

Solution 2

I think I've discovered the problem. After reading this, I had to explicitly decode the values from Redis, which is a pain but works.

I stumbled across a blog post where the author's output was all unicode strings, which was obviously different from mine.

Looking into StrictRedis.__init__, there is a parameter decode_responses, which is False by default. https://github.com/andymccurdy/redis-py/blob/273a47e299a499ed0053b8b90966dc2124504983/redis/client.py#L446

Pass decode_responses=True when constructing the client; for me this fixes the OP's issue.
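A minimal sketch of that constructor call, assuming redis-py is installed. The helper name make_client and the host, port and db defaults are placeholders for illustration, not part of the original answer.

```python
def make_client(host="localhost", port=6379, db=0):
    """Build a redis-py client whose replies are decoded to text.

    host, port and db are placeholder defaults; adjust them to your setup.
    """
    # Imported here so the sketch can be loaded without redis-py installed;
    # in real code a top-level import is the usual choice.
    import redis

    # decode_responses=True makes the client decode replies (UTF-8 by
    # default) instead of returning raw byte strings.
    return redis.StrictRedis(host=host, port=port, db=db,
                             decode_responses=True)
```

With such a client, hgetall on the OP's key should return u'\u6bd4\u8d5b' for title rather than '\xe6\xaf\x94\xe8\xb5\x9b'. Numeric fields still come back as strings such as '1', because Redis stores no type information with the value.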

Solution 3

For each string, you can use the decode method to convert it from UTF-8 bytes to unicode, e.g. for the value of the title field in your code:

In [7]: a='\xe6\xaf\x94\xe8\xb5\x9b'

In [8]: a.decode('utf8')
Out[8]: u'\u6bd4\u8d5b'
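To apply the same decode call to a whole hash rather than one field at a time, something like the following could work. It is a sketch written for Python 3, where hgetall returns bytes by default. The helper name decode_hash is hypothetical, and the sample data is copied from the question.

```python
def decode_hash(raw, encoding="utf-8"):
    """Return a copy of `raw` with every bytes key and value decoded to text."""
    def to_text(item):
        # Leave items that are not byte strings untouched.
        return item.decode(encoding) if isinstance(item, bytes) else item

    return {to_text(key): to_text(value) for key, value in raw.items()}


# Byte strings from the question, in the shape hgetall returns them.
cached = {
    b"title": b"\xe6\xaf\x94\xe8\xb5\x9b",
    b"section_title": b"\xe6\xb4\xbb\xe5\x8a\xa8",
    b"section_id": b"1",
}
decoded = decode_hash(cached)
print(decoded["title"] == u"\u6bd4\u8d5b")  # True
```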

Solution 4

I suggest you always encode to UTF-8 before writing to MongoDB or Redis (or any external system), and that you decode('utf-8') when you fetch results, so that you always work with Unicode in Python.
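One way to keep that rule in a single place is a thin wrapper around the client. The sketch below is for Python 3 and is only an illustration: Utf8HashCache and FakeRedis are hypothetical names, and FakeRedis is an in-memory stand-in so the example runs without a server. A real redis-py client exposes hset and hgetall methods that could be passed in instead.

```python
class FakeRedis(object):
    """In-memory stand-in for a Redis client; it only stores byte strings."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, key, value):
        if not isinstance(key, bytes) or not isinstance(value, bytes):
            raise TypeError("FakeRedis only accepts bytes")
        self._hashes.setdefault(name, {})[key] = value

    def hgetall(self, name):
        return dict(self._hashes.get(name, {}))


class Utf8HashCache(object):
    """Encode text on the way out and decode it on the way back in."""

    def __init__(self, client, encoding="utf-8"):
        self.client = client
        self.encoding = encoding

    def _encode(self, item):
        # Non-text values (ints, ids) are converted to text first.
        if isinstance(item, bytes):
            return item
        return str(item).encode(self.encoding)

    def save(self, name, mapping):
        for key, value in mapping.items():
            self.client.hset(name, self._encode(key), self._encode(value))

    def load(self, name):
        raw = self.client.hgetall(name)
        return {k.decode(self.encoding): v.decode(self.encoding)
                for k, v in raw.items()}


cache = Utf8HashCache(FakeRedis())
cache.save("section:1", {"title": "\u6bd4\u8d5b", "section_id": 1})
loaded = cache.load("section:1")
print(loaded["title"] == "\u6bd4\u8d5b")  # True
```

Note that section_id comes back as the string "1", not the int 1: the wrapper restores text, not the original types.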

Author by

goofansu

Updated on June 04, 2022

Comments

  • goofansu
    goofansu almost 2 years

    I'm using mongodb and redis, redis is my cache.

    I'm caching mongodb objects with redis-py:

    obj in mongodb: {u'name': u'match', u'section_title': u'\u6d3b\u52a8', u'title': 
    u'\u6bd4\u8d5b', u'section_id': 1, u'_id': ObjectId('4fb1ed859b10ed2041000001'), u'id': 1}
    

    the obj fetched from redis with hgetall(key, obj) is:

    {'name': 'match', 'title': '\xe6\xaf\x94\xe8\xb5\x9b', 'section_title': 
    '\xe6\xb4\xbb\xe5\x8a\xa8', 'section_id': '1', '_id': '4fb1ed859b10ed2041000001', 'id': '1'}
    

    As you can see, the obj fetched from the cache is str instead of unicode, so in my app there are errors like: 'ascii' codec can't decode byte 0xe6 in position 12: ordinal not in range(128)

    Can anyone give some suggestions? thank u

    • Denis
      Denis almost 12 years
      And how do you save the MongoDB objects in Redis?
  • Denis
    Denis almost 12 years
    Man, I think he wants the cache to speed up his system, not to have fun with decoding and encoding.
  • jmoz
    jmoz over 11 years
    Why would they auto encode it but then just leave you a string on get?
  • okm
    okm over 11 years
    @jmoz I'm not sure, maybe the author knows the reason =p. But unlike adapters such as psycopg2, the redis-py client normally does not store the original datatype with the data. Thus there is no way to know exactly what type the data (a string) originally had and how to decode it. Maybe insisting on str, instead of accepting other types of value and converting them to str implicitly, would be better, but who knows.
  • jmoz
    jmoz over 11 years
    @okm I found something the other day regarding this, check my answer.
  • goofansu
    goofansu over 11 years
    Thank you, I'll try it later. This may be the best solution, as I won't mess up my code.
  • maryokhin
    maryokhin almost 8 years
    I enabled this setting and it seems that I now get UnicodeDecodeError when trying to store and retrieve a dictionary in Django's cache
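
One plausible explanation for the error in the last comment, offered as an assumption rather than a confirmed diagnosis: cache backends commonly serialize values with pickle, and pickled data is binary rather than UTF-8 text, so a client created with decode_responses=True can fail when it tries to decode it. The sketch below reproduces only that decoding step in plain Python, without Django or Redis. The sample dictionary is made up for illustration.

```python
import pickle

# A value similar to what a cache backend might store (assumed example data).
payload = pickle.dumps({"title": "\u6bd4\u8d5b"}, protocol=2)

# Pickle protocol 2 and later start with the byte 0x80, which is not a
# valid first byte in UTF-8, so decoding the payload as text fails.
try:
    payload.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

print(decodable)  # False
```

If this is the cause, keeping decode_responses off for the connection that holds pickled cache entries, and decoding text values explicitly elsewhere, would avoid the conflict.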