Wednesday, 14 August 2013

Why is numpy/pandas returning only the first character when I use .astype(str)

I'm trying to use the .astype() method to convert an int32 array to
strings. I first noticed the problem when converting a pandas Series, but
I see the same behavior when I test with numpy directly, so I'm assuming
that numpy is the underlying cause.
In [0]: import numpy as np
In [1]: test = np.array([1, 22, 333, 4444])
In [2]: test.astype(str)
Out [2]: array(['1', '2', '3', '4'],
      dtype='|S1')
Why does it default to S1 rather than S4, which is what I would expect so
that the longest value fits? It seems simple, but maybe I'm missing
something. When I explicitly specify S4 (or greater), it works fine:
In [3]: test.astype('S10')
Out [3]: array(['1', '22', '333', '4444'],
dtype='|S10')
Based on the examples I've seen online, it doesn't seem like I should have
to specify the width explicitly. I have numpy 1.6.1 installed.
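
For reference, here is a minimal sketch of the explicit-width workaround described above, where the width is computed from the data instead of hard-coded. The names width, as_bytes and as_text are only illustrative, and the sketch assumes a non-empty integer array. Note that on Python 3 an 'S' dtype holds bytes rather than text.

```python
import numpy as np

test = np.array([1, 22, 333, 4444])

# Length of the longest decimal representation (assumes a non-empty array).
width = max(len(str(v)) for v in test.tolist())

# Request a fixed-width byte-string dtype wide enough for every value ('S4' here).
as_bytes = test.astype("S%d" % width)

# Alternative: build Python strings first and let numpy infer the width.
as_text = np.array([str(v) for v in test.tolist()])

print(as_bytes.dtype, as_text.tolist())
```

Either way the string width is derived from the longest value, so nothing is truncated.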
