python - Pandas IndexError for large dataframe
When I try to add a new column to a large DataFrame, I get an IndexError. Can anyone explain this error to me?
> vec
           0  1  2  3  4  5  6
v1.uc8.0   0  0  0  0  0  0  0
v1.uc48.0  0  0  0  0  0  0  0

           7  8  9  ...  2546531  2546532  2546533
v1.uc8.0   0  0  0  ...        0        0        0
v1.uc48.0  0  0  0  ...        0        0        0

           2546534  2546535  2546536  2546537  2546538  2546539  2546540
v1.uc8.0         0        0        0        0        0        0        0
v1.uc48.0        0        0        0        0        0        0        0

[2 rows x 2546541 columns]

> vec['todrop']=0
IndexError                                Traceback (most recent call last)
<ipython-input-40-9868611037ed> in <module>()
----> 1 vec['todrop']=0

c:\anaconda\lib\site-packages\pandas\core\frame.pyc in __setitem__(self, key, value)
   2115         else:
   2116             # set column
-> 2117             self._set_item(key, value)
   2118
   2119     def _setitem_slice(self, key, value):

c:\anaconda\lib\site-packages\pandas\core\frame.pyc in _set_item(self, key, value)
   2193         self._ensure_valid_index(value)
   2194         value = self._sanitize_column(key, value)
-> 2195         NDFrame._set_item(self, key, value)
   2196
   2197         # check if modifying copy

c:\anaconda\lib\site-packages\pandas\core\generic.pyc in _set_item(self, key, value)
   1188
   1189     def _set_item(self, key, value):
-> 1190         self._data.set(key, value)
   1191         self._clear_item_cache()
   1192

c:\anaconda\lib\site-packages\pandas\core\internals.pyc in set(self, item, value, check)
   2970
   2971         try:
-> 2972             loc = self.items.get_loc(item)
   2973         except KeyError:
   2974             # item wasn't present, insert at end

c:\anaconda\lib\site-packages\pandas\core\index.pyc in get_loc(self, key, method)
   1435         """
   1436         if method is None:
-> 1437             return self._engine.get_loc(_values_from_object(key))
   1438
   1439         indexer = self.get_indexer([key], method=method)

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:3824)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:3578)()

pandas\src\util.pxd in util.get_value_at (pandas\index.c:15287)()

IndexError: index out of bounds
I have also tried adding a new row to the transposed DataFrame (vec.T), but I got the same error.
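This is weird indeed.
You can use this workaround:
vec = pd.merge(vec, pd.DataFrame([0, 0], columns=["new"]), right_index=True, left_index=True)  # optional: pass copy=False
Make sure the new one-column DataFrame has the same index as vec (for example, by passing index=vec.index when you build it); otherwise the index-on-index merge will not match any rows.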
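As a rough illustration of the merge workaround above, here is a minimal sketch on a small stand-in frame. The frame size, the `new_col` name, and the use of `index=vec.index` are assumptions made for the example; it has not been checked against the 2546541-column frame or the pandas version from the question.

```python
import numpy as np
import pandas as pd

# Small stand-in for the OP's frame (hypothetical size; the real one has 2546541 columns).
vec = pd.DataFrame(np.zeros((2, 5)), index=["v1.uc8.0", "v1.uc48.0"])

# One-column frame that reuses vec's index so the index-on-index merge matches every row.
new_col = pd.DataFrame({"todrop": [0, 0]}, index=vec.index)

# Join on the row index of both frames instead of assigning with vec["todrop"] = 0.
merged = pd.merge(vec, new_col, left_index=True, right_index=True)
print(merged.shape)
```

Building the one-column frame from `vec.index` is what keeps the two row labels from being dropped by the default inner join.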
More on why this is weird (hopefully someone else can provide a proper answer):
df = pd.DataFrame(np.zeros((2, 2546540)))
df[2546540] = 0
Output: the same IndexError as in the OP's post.
df["blah"] = 0
Output:
TypeError: unorderable types: numpy.ndarray() < str()
Meanwhile, everything is fine with a small DataFrame:
df = pd.DataFrame(np.zeros((2, 200)))
df[200] = 0
The output is as expected:
   0  1  2  3  4  5  6  7  8  9  ...  191  192  193  194
0  0  0  0  0  0  0  0  0  0  0  ...    0    0    0    0
1  0  0  0  0  0  0  0  0  0  0  ...    0    0    0    0

   195  196  197  198  199  200
0    0    0    0    0    0    0
1    0    0    0    0    0    0

[2 rows x 201 columns]
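To make the comparison between frame sizes easier to repeat, the experiments above could be wrapped in a small helper along these lines. This is only a sketch: `try_add_column` is a hypothetical name, and the result for the large case will depend on the installed pandas version, so only small sizes are run here.

```python
import numpy as np
import pandas as pd


def try_add_column(n_cols, key):
    """Build a 2 x n_cols frame of zeros, assign a new column, and report what happened."""
    df = pd.DataFrame(np.zeros((2, n_cols)))
    try:
        df[key] = 0
    except Exception as exc:  # report any failure instead of stopping
        return type(exc).__name__
    return "ok, shape {}".format(df.shape)


# Small sizes only; raise n_cols toward 2546540 to probe the failing case.
for n_cols, key in [(200, 200), (200, "blah")]:
    print(n_cols, repr(key), try_add_column(n_cols, key))
```

Returning the exception name rather than re-raising lets the integer-key and string-key cases be compared side by side in one run.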
Hope this helps, and that someone can explain this behavior of pandas.