I went through the code and one thing is bothering me. I think there is a major bug in the implementation. It is possible that I don't understand something, so please correct me if I'm wrong, but as far as I currently understand, this code trains and validates using information "from the future".
If you examine the values at the line below, you will see that there are negative values for the time delta (I sketch below where I think they come from).
https://github.com/DyGRec/TGSRec/blob/0c7ba17b1b787648ac0af0b57dfc7b91f2f00654/model.py#L557
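To illustrate what I mean, here is a toy sketch with made-up numbers, using `np.searchsorted` as a stand-in for the binary search over neighbor timestamps: if the per-node timestamp array is not sorted, the lookup can return neighbors that occur *after* the query time.

```python
import numpy as np

# Toy example: neighbor timestamps stored in edge-id order, not time order.
ngh_ts = np.array([100.0, 300.0, 150.0, 250.0])
cut_time = 200.0

# The neighbor lookup does a binary search over this array, which only
# works correctly if it is sorted; np.searchsorted stands in for it here.
k = np.searchsorted(ngh_ts, cut_time)
before = ngh_ts[:k]        # supposedly the neighbors "before" cut_time

print(before)              # [100. 300. 150.] -- 300.0 is in the future
print(cut_time - before)   # [ 100. -100.   50.] -- negative delta
```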
I can see that the mask is created only for the 0 values, so the negative ones are still used (illustrated below).
https://github.com/DyGRec/TGSRec/blob/0c7ba17b1b787648ac0af0b57dfc7b91f2f00654/model.py#L575
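For illustration, a toy sketch of the masking concern; `ngh_eidx` and `delta_t` are hypothetical stand-ins for the batched neighbor edge indices and time deltas, and I'm assuming index 0 marks a padded slot:

```python
import numpy as np

ngh_eidx = np.array([[0, 3, 7]])          # 0 = padded neighbor slot (my assumption)
delta_t  = np.array([[0.0, -12.0, 4.0]])  # note the negative delta

pad_mask    = (ngh_eidx == 0)             # what I understand the current mask covers
future_mask = pad_mask | (delta_t < 0)    # would also hide "future" neighbors

print(pad_mask)     # [[ True False False]]
print(future_mask)  # [[ True  True False]]
```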
The data here is sorted by edge_ids, not timestamps, so a possible fix would be to sort by x[2] instead of x[1] (a small sketch follows the link below):
https://github.com/DyGRec/TGSRec/blob/0c7ba17b1b787648ac0af0b57dfc7b91f2f00654/graph.py#L34-L39
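A sketch of the change I have in mind, assuming the adjacency-list entries are (neighbor_id, edge_idx, timestamp) tuples as they appear to be in graph.py:

```python
# Example adjacency-list entries for one node: (neighbor_id, edge_idx, timestamp).
neighbors = [(5, 12, 100.0), (7, 3, 250.0), (2, 8, 50.0)]

# Current behaviour: sort by edge index (x[1]); timestamps end up out of order.
by_edge_id = sorted(neighbors, key=lambda x: x[1])
print([x[2] for x in by_edge_id])    # [250.0, 50.0, 100.0]

# Proposed fix: sort by timestamp (x[2]), which is what a binary search
# over the timestamp array actually requires.
by_timestamp = sorted(neighbors, key=lambda x: x[2])
print([x[2] for x in by_timestamp])  # [50.0, 100.0, 250.0]
```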
If you look at:
TGSRec/datasets/ml-100k/u.data
the data is not sorted by timestamp, and as far as I can tell there is no point in the codebase where this sorting happens.
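A quick way to confirm this, assuming the file uses the standard MovieLens-100K layout (user, item, rating, timestamp, tab-separated):

```python
import numpy as np

# Load only the timestamp column (4th column in the standard u.data layout).
ts = np.loadtxt("datasets/ml-100k/u.data", usecols=3, dtype=np.int64)

# True only if the file is globally non-decreasing in time.
print("sorted by timestamp:", bool(np.all(np.diff(ts) >= 0)))
```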
I ran experiments on ml-100k for both scenarios: your original implementation and one with the input data sorted by timestamp. The results with sorting are significantly worse, at least in the early stages of training. I haven't run it for the full 200 epochs, so the final results may end up closer to each other, but first I would like to check whether my assumption is correct.
Results after 20 epochs:
| Metric | Without sorting | With sorting |
| --- | --- | --- |
| valid acc | 0.7069337926425662 | 0.5271334211112526 |
| valid auc | 0.8038448618385599 | 0.7374971517076878 |
| valid f1 | 0.7070636805233961 | 0.6789490018391757 |
| valid ap | 0.8172432697828477 | 0.7001478155001294 |