Strip Trademark Symbol from string Python
Solution 1
The trademark symbol is Unicode character U+2122, or in Python notation u"\u2122".
Just do a search and replace:
'string'.replace(u"\u2122", '')
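A minimal sketch of that search and replace in Python 3, where every str is already Unicode and the u prefix is optional. The helper name strip_trademark is made up for illustration and is not part of the original answer.

```python
# U+2122 is the trademark sign; the escape avoids depending on the
# source file's encoding.
TRADEMARK = "\u2122"


def strip_trademark(text):
    """Return text with every trademark sign removed (hypothetical helper)."""
    return text.replace(TRADEMARK, "")


cleaned = strip_trademark("Official Trademark\u2122")
print(cleaned)
```

Note that str.replace returns a new string, so the result has to be assigned to something.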
Solution 2
>>> 'Official Trademark™'.strip('™')
'Official Trademark'
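One caveat worth illustrating: str.strip only removes characters from the two ends of a string, so it leaves a symbol in the middle untouched, whereas str.replace removes every occurrence. A small sketch with a made-up sample string:

```python
# Sample text with one trademark sign in the middle and one at the end.
text = "TM\u2122 Data\u2122"

# strip() only trims the leading/trailing characters it is given.
ends_only = text.strip("\u2122")

# replace() removes the symbol wherever it appears.
everywhere = text.replace("\u2122", "")

print(repr(ends_only))
print(repr(everywhere))
```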
Author: Chris
Updated on June 14, 2022
Comments
-
Chris almost 2 years
I'm trying to prep some data for a designer. I'm pulling data out of SQL Server with Python on a Windows machine (not sure if the OS is important). How would I turn the string 'Official Trademark™' into 'Official Trademark'? Also, any further information or reading on Unicode or the pertinent subject matter would help me become a little more independent. Thanks for any help!
Edited:
Perhaps I should have included some code. I'm now getting this error at runtime: "UnicodeDecodeError: 'ascii' codec can't decode byte 0x99 in position 2: ordinal not in range(128)". Here is my code:
row.note = 'TM™ Data\n'
t = row.note
t = t.rstrip(os.linesep).lstrip(os.linesep)
t = t.replace(u"\u2122",'')
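A sketch of one likely cause, written for Python 3. It assumes the value arrives as raw bytes in the Windows-1252 code page, where byte 0x99 is the trademark sign; the question does not state the actual encoding, so treat cp1252 as a guess. Decoding the bytes explicitly before calling replace avoids the implicit ASCII decode that raises the error.

```python
# Assumed raw value: 'TM(tm) Data\n' as cp1252 bytes (0x99 at index 2).
raw = b"TM\x99 Data\n"

# Decoding as ASCII fails at the same position the traceback reports.
error_position = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error_position = exc.start

# Decode with the assumed code page, then trim newlines and drop U+2122.
text = raw.decode("cp1252")
cleaned = text.strip("\r\n").replace("\u2122", "")

print(error_position, repr(cleaned))
```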
-
Blender over 12 years
You can't use non-ASCII characters in Python source code: codepad.org/ymgtruH9
-
infrared over 12 years
You can, you just have to make # -*- coding: utf-8 -*- the first line in your source file: codepad.org/BYIPuDox
-
Petr Viktorin over 12 years
In Python 3, the declaration is not required any more. In Python 2, you should use unicode strings, u'™'.
-
Jochen Ritzel over 12 years
The "(gotta be careful with that Unicode, as you must .decode() it when printing)" part is misleading: you only have to encode yourself when Python thinks your terminal only supports ASCII. I think for some reason on Windows it does.
-
heltonbiker over 12 years
I put #coding: utf-8 as the second line of the script (the first being the shebang #!/usr/bin/env python) and it works too.
-
Jochen Ritzel over 12 years
When you print unicode, Python automatically encodes it with the encoding given by sys.stdout.encoding. If your terminal is set up properly then that should be UTF-8 or whatever encoding makes sense for your language. Otherwise the encoding defaults to ASCII and printing non-ASCII characters will fail.