Is there any reason to prefer unicode(somestring, 'utf8') as opposed to somestring.decode('utf8')?
My only thought is that .decode() is a bound method, so Python may be able to resolve it more efficiently, but correct me if I'm wrong.
From stackoverflow
-
It's easy to benchmark it:
>>> from timeit import Timer
>>> ts = Timer("s.decode('utf-8')", "s = 'ééé'")
>>> ts.timeit()
8.9185450077056885
>>> tu = Timer("unicode(s, 'utf-8')", "s = 'ééé'")
>>> tu.timeit()
2.7656929492950439

Obviously, unicode() is faster.
FWIW, I don't know where you get the impression that methods would be faster - it's quite the contrary.
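For anyone who wants to repeat the measurement on a current interpreter, here is a rough sketch. It assumes Python 3, where unicode() is gone, so str(s, 'utf-8') stands in as the closest built-in equivalent; the variable names and the reduced iteration count are my own choices, not part of the session above. Timings vary by version and machine, so treat the numbers as indicative only.

```python
from timeit import Timer

# Assumption: Python 3. unicode() no longer exists there, so
# str(bytes, encoding) is used as the nearest built-in counterpart.
setup = "s = 'ééé'.encode('utf-8')"

method_timer = Timer("s.decode('utf-8')", setup)
builtin_timer = Timer("str(s, 'utf-8')", setup)

# Fewer iterations than timeit's default of 1,000,000 keep the run short;
# taking the minimum of several repeats reduces noise from other processes.
n = 100000
method_seconds = min(method_timer.repeat(repeat=3, number=n))
builtin_seconds = min(builtin_timer.repeat(repeat=3, number=n))

print("s.decode('utf-8'):", method_seconds)
print("str(s, 'utf-8')  :", builtin_seconds)
```

Whichever call wins on your setup, the gap per call is tiny, so it only matters in tight loops.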
J.F. Sebastian : Fixed the example output.
J.F. Sebastian : Python25: 3.0 vs. 0.9; Python26: 2.6 vs. 0.6, that is, `unicode()` is about 4 times faster than `s.decode()`
-
I'd prefer 'something'.decode(...) since the unicode type is no longer there in Python 3.0, while text = b'binarydata'.decode(encoding) is still valid.
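To illustrate the portability point, here is a minimal sketch of a helper that relies only on .decode(). The name to_text is hypothetical and not from any library. I have only checked it against Python 3; it should also work on Python 2.6+, where bytes is an alias for str.

```python
def to_text(value, encoding='utf-8'):
    """Return text, decoding only when given a byte string.

    Hypothetical helper for illustration: it never calls unicode(),
    so the same source works where that built-in does not exist.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


data = b'caf\xc3\xa9'    # UTF-8 encoded bytes for "café"
text = to_text(data)     # decoded via the bytes method
already = to_text(text)  # text input is passed through unchanged
```

The isinstance check is there so the helper can be called on values that may already have been decoded.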