Problem description
When exporting a trained model using `WeightedLinearModel.save`, floats are formatted with Python's `format` using the `"g"` presentation type, which has a default precision of 6 significant digits. The class that handles this is `CompactJSONEncoder`:
https://github.com/uf3/uf3/blob/f2ad1a23f5b38695894378803dc54f25139b1a3d/uf3/util/json_io.py#L138-L140
For a model I trained, this precision is too low, resulting in prediction errors when the model is loaded back from the `.json` file.
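The truncation is easy to reproduce with plain `format`, independent of any uf3 code:

```python
# Full-precision one-body offset from the trained model below.
x = -20560.33609234794

# The "g" presentation type defaults to 6 significant digits,
# so most of the mantissa is dropped on export.
print(format(x, "g"))  # -20560.3

# Round-tripping through the formatted string no longer recovers x.
print(float(format(x, "g")) == x)  # False
```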
Example
I attached the model's solutions below. The first output shows the solutions directly after training; the second shows them after saving and loading.
Solutions after training
{'solution': {'Zr': -20560.33609234794,
'O': -41120.412798809746,
('O',
'O'): array([ 5.26435439e+00, 5.20311465e+00, 5.08063516e+00, 4.89691593e+00,
4.65195696e+00, 4.34575824e+00, 3.97831977e+00, 3.54964157e+00,
3.05972362e+00, 2.50856592e+00, 1.89616849e+00, 1.22868659e+00,
7.30053245e-01, 4.31556715e-01, 2.46165334e-01, 1.24437004e-01,
5.93001720e-02, 2.74566934e-02, 1.30984740e-02, 3.16545758e-03,
8.15071387e-04, 1.64258524e-04, -6.46071392e-03, -9.32398478e-03,
-1.06880467e-02, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00]),
('Zr',
'O'): array([ 2.57773141, 2.48379049, 2.29590865, 2.01408589, 1.63832222,
1.16861763, 0.60497211, -0.05261432, -0.80414166, -1.64960993,
-2.08524704, -1.9352091 , -1.61253899, -1.27462536, -0.97011593,
-0.72222357, -0.53059997, -0.37511157, -0.26761948, -0.19630742,
-0.1364386 , -0.08911125, -0.05464061, -0.02995521, -0.0185196 ,
0. , 0. , 0. ]),
('Zr',
'Zr'): array([7.63423726, 7.59087987, 7.50416508, 7.37409289, 7.20066331,
6.98387633, 6.72373196, 6.42023019, 6.07337103, 5.68315447,
5.24958052, 4.77264917, 4.25236043, 3.68871429, 3.08171076,
2.43142205, 1.77394657, 1.3332101 , 0.99423793, 0.73174477,
0.51741219, 0.35220643, 0.2297151 , 0.13608243, 0.05597425,
0. , 0. , 0. ])},
'knots': {('O',
'O'): array([1.0e-06, 1.0e-06, 1.0e-06, 1.0e-06, 2.0e-01, 4.0e-01, 6.0e-01,
8.0e-01, 1.0e+00, 1.2e+00, 1.4e+00, 1.6e+00, 1.8e+00, 2.0e+00,
2.2e+00, 2.4e+00, 2.6e+00, 2.8e+00, 3.0e+00, 3.2e+00, 3.4e+00,
3.6e+00, 3.8e+00, 4.0e+00, 4.2e+00, 4.4e+00, 4.6e+00, 4.8e+00,
5.0e+00, 5.0e+00, 5.0e+00, 5.0e+00]),
('Zr',
'O'): array([1.0e-06, 1.0e-06, 1.0e-06, 1.0e-06, 2.0e-01, 4.0e-01, 6.0e-01,
8.0e-01, 1.0e+00, 1.2e+00, 1.4e+00, 1.6e+00, 1.8e+00, 2.0e+00,
2.2e+00, 2.4e+00, 2.6e+00, 2.8e+00, 3.0e+00, 3.2e+00, 3.4e+00,
3.6e+00, 3.8e+00, 4.0e+00, 4.2e+00, 4.4e+00, 4.6e+00, 4.8e+00,
5.0e+00, 5.0e+00, 5.0e+00, 5.0e+00]),
('Zr',
'Zr'): array([1.0e-06, 1.0e-06, 1.0e-06, 1.0e-06, 2.0e-01, 4.0e-01, 6.0e-01,
8.0e-01, 1.0e+00, 1.2e+00, 1.4e+00, 1.6e+00, 1.8e+00, 2.0e+00,
2.2e+00, 2.4e+00, 2.6e+00, 2.8e+00, 3.0e+00, 3.2e+00, 3.4e+00,
3.6e+00, 3.8e+00, 4.0e+00, 4.2e+00, 4.4e+00, 4.6e+00, 4.8e+00,
5.0e+00, 5.0e+00, 5.0e+00, 5.0e+00])}}
Solutions after saving and loading
{'solution': {'Zr': -20560.3,
'O': -41120.4,
('O',
'O'): array([ 5.26435e+00, 5.20311e+00, 5.08064e+00, 4.89692e+00,
4.65196e+00, 4.34576e+00, 3.97832e+00, 3.54964e+00,
3.05972e+00, 2.50857e+00, 1.89617e+00, 1.22869e+00,
7.30053e-01, 4.31557e-01, 2.46165e-01, 1.24437e-01,
5.93001e-02, 2.74567e-02, 1.30984e-02, 3.16543e-03,
8.15054e-04, 1.64246e-04, -6.46072e-03, -9.32399e-03,
-1.06880e-02, 0.00000e+00, 0.00000e+00, 0.00000e+00]),
('Zr',
'O'): array([ 2.57773 , 2.48379 , 2.29591 , 2.01409 , 1.63832 ,
1.16862 , 0.604972 , -0.0526146, -0.804142 , -1.64961 ,
-2.08525 , -1.93521 , -1.61254 , -1.27463 , -0.970116 ,
-0.722224 , -0.5306 , -0.375112 , -0.26762 , -0.196307 ,
-0.136439 , -0.0891113, -0.0546406, -0.0299552, -0.0185196,
0. , 0. , 0. ]),
('Zr',
'Zr'): array([7.63424 , 7.59088 , 7.50417 , 7.37409 , 7.20066 , 6.98388 ,
6.72373 , 6.42023 , 6.07337 , 5.68316 , 5.24958 , 4.77265 ,
4.25236 , 3.68872 , 3.08171 , 2.43142 , 1.77395 , 1.33321 ,
0.994238 , 0.731745 , 0.517412 , 0.352207 , 0.229715 , 0.136083 ,
0.0559743, 0. , 0. , 0. ])},
'knots': {('O',
'O'): array([1.0e-06, 1.0e-06, 1.0e-06, 1.0e-06, 2.0e-01, 4.0e-01, 6.0e-01,
8.0e-01, 1.0e+00, 1.2e+00, 1.4e+00, 1.6e+00, 1.8e+00, 2.0e+00,
2.2e+00, 2.4e+00, 2.6e+00, 2.8e+00, 3.0e+00, 3.2e+00, 3.4e+00,
3.6e+00, 3.8e+00, 4.0e+00, 4.2e+00, 4.4e+00, 4.6e+00, 4.8e+00,
5.0e+00, 5.0e+00, 5.0e+00, 5.0e+00]),
('Zr',
'O'): array([1.0e-06, 1.0e-06, 1.0e-06, 1.0e-06, 2.0e-01, 4.0e-01, 6.0e-01,
8.0e-01, 1.0e+00, 1.2e+00, 1.4e+00, 1.6e+00, 1.8e+00, 2.0e+00,
2.2e+00, 2.4e+00, 2.6e+00, 2.8e+00, 3.0e+00, 3.2e+00, 3.4e+00,
3.6e+00, 3.8e+00, 4.0e+00, 4.2e+00, 4.4e+00, 4.6e+00, 4.8e+00,
5.0e+00, 5.0e+00, 5.0e+00, 5.0e+00]),
('Zr',
'Zr'): array([1.0e-06, 1.0e-06, 1.0e-06, 1.0e-06, 2.0e-01, 4.0e-01, 6.0e-01,
8.0e-01, 1.0e+00, 1.2e+00, 1.4e+00, 1.6e+00, 1.8e+00, 2.0e+00,
2.2e+00, 2.4e+00, 2.6e+00, 2.8e+00, 3.0e+00, 3.2e+00, 3.4e+00,
3.6e+00, 3.8e+00, 4.0e+00, 4.2e+00, 4.4e+00, 4.6e+00, 4.8e+00,
5.0e+00, 5.0e+00, 5.0e+00, 5.0e+00])}}
The most apparent difference is in the one-body offsets:
After training: `'Zr': -20560.33609234794`
After loading: `'Zr': -20560.3`
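For the Zr offset above, the absolute error introduced by the 6-digit export can be computed directly (the unit is whatever energy unit the model was trained in):

```python
original = -20560.33609234794
loaded = float(format(original, "g"))  # the value that ends up in the .json

error = abs(loaded - original)
print(error)  # ~0.0361 per Zr atom
```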
Impact on prediction accuracy
This decreased precision noticeably affects the model's performance:
Model with original precision: [attached plot]
Loaded model with 6-digit precision: [attached plot]
Solution
Increasing the precision to 20 significant digits, i.e. using `format(o, ".20g")` instead of `format(o, "g")`, solves the issue for my model. However, I cannot assess whether this precision is sufficient for other models, or whether the change causes other issues. Maybe someone has input on this so that the issue can be resolved.
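For what it's worth, 17 significant digits are guaranteed to round-trip any IEEE-754 double exactly, so `.17g` (and a fortiori `.20g`) should be lossless for arbitrary models; Python's `repr` of a float is another exact option. A quick check:

```python
x = -20560.33609234794

# 17 significant digits always round-trip an IEEE-754 double exactly,
# so ".20g" as proposed above is also lossless.
assert float(format(x, ".17g")) == x
assert float(format(x, ".20g")) == x

# repr() emits the shortest string that round-trips exactly.
assert float(repr(x)) == x
```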