Describe the bug
Messytables should guess decimal values correctly according to the configured locale.
For example: in Germany the comma (,) is used as the decimal separator, but a value such as 1,200 is guessed as type "text".
This was initially reported as CKAN issue https://github.com/ckan/ckan/issues/5769, where I first noticed it.
The type guessing seems to happen here: https://github.com/okfn/messytables/blob/51b736892a48e420ab313675f54901c77b446dec/messytables/types.py
and seems to be locale-specific. I think the magic happens in line 100:
value = locale.atof(value)
Unfortunately, Python seems to recognize only a dot as the decimal point even when a German locale is set, which I could reproduce in my local environment:
>>> locale.getlocale()
('de_DE', 'cp1252')
>>> locale.atof('1,200')
Traceback (most recent call last):
  File "<pyshell#35>", line 1, in <module>
    locale.atof('1,200')
  File "C:\Program Files\Python27\lib\locale.py", line 318, in atof
    return func(string)
ValueError: invalid literal for float(): 1,200
>>> locale.localeconv()
{'mon_decimal_point': '', 'int_frac_digits': 127, 'p_sep_by_space': 127, 'frac_digits': 127, 'thousands_sep': '', 'n_sign_posn': 127, 'decimal_point': '.', 'int_curr_symbol': '', 'n_cs_precedes': 127, 'p_sign_posn': 127, 'mon_thousands_sep': '', 'negative_sign': '', 'currency_symbol': '', 'n_sep_by_space': 127, 'mon_grouping': [], 'p_cs_precedes': 127, 'positive_sign': '', 'grouping': []}
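
A possible explanation, which I have not verified against messytables itself: locale.getlocale() without arguments reports the LC_CTYPE category, while locale.atof() uses the separators from locale.localeconv(), which follow LC_NUMERIC. The localeconv() output above (decimal_point '.', empty thousands_sep) looks like the default "C" conventions, so the German numeric settings may never have been activated in that session. Below is a minimal sketch that first shows what atof() would really use, and then parses a value with explicit separators so the result does not depend on the process locale. The helper names numeric_conventions and parse_localized_float are hypothetical and not part of messytables; the default separators are my assumption for German-style input.

```python
import locale


def numeric_conventions():
    """Report the numeric settings that locale.atof() relies on.

    Hypothetical diagnostic helper. Calling setlocale() with only a
    category queries the current setting and does not change it.
    """
    conv = locale.localeconv()
    return {
        "lc_numeric": locale.setlocale(locale.LC_NUMERIC),
        "decimal_point": conv["decimal_point"],
        "thousands_sep": conv["thousands_sep"],
    }


def parse_localized_float(text, decimal_point=",", thousands_sep="."):
    """Parse a number using explicit separators instead of the process locale.

    Hypothetical helper, not part of messytables. The defaults assume
    German-style input. Raises ValueError if the text is not a number.
    """
    cleaned = text.strip()
    if thousands_sep:
        # Drop grouping characters first, e.g. "1.234,5" -> "1234,5".
        cleaned = cleaned.replace(thousands_sep, "")
    if decimal_point != ".":
        # Normalise the decimal separator to the "." that float() expects.
        cleaned = cleaned.replace(decimal_point, ".")
    return float(cleaned)


if __name__ == "__main__":
    print(numeric_conventions())
    print(parse_localized_float("1,200"))
```

With explicit separators, '1,200' is read as a decimal for German-style input and as a grouped integer value when called with decimal_point="." and thousands_sep=",". A type guesser could take the separators from configuration rather than from the global locale, though whether that fits messytables' design is an open question.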