I'm trying to run trRosetta on about 3000 different proteins.
For each protein I have generated an MSA file with HHblits.
For the first few proteins everything goes smoothly, and it predicts the distogram/angles in about 1-3 minutes.
However, at some point it reaches one of the larger MSAs. The particular protein it stalls on is about 1600 amino acids long, with roughly 3500 aligned sequences.
So far trRosetta has been running for about 1 hour on this sequence, and has printed the following:
2020-08-06 15:38:16.462083: W tensorflow/core/framework/allocator.cc:124] Allocation of 4498921476 exceeds 10% of system memory.
2020-08-06 15:38:24.750222: W tensorflow/core/framework/allocator.cc:124] Allocation of 4498921476 exceeds 10% of system memory.
2020-08-06 16:46:23.006386: W tensorflow/core/framework/allocator.cc:124] Allocation of 4498921476 exceeds 10% of system memory.
2020-08-06 16:46:23.006592: W tensorflow/core/framework/allocator.cc:124] Allocation of 4080654400 exceeds 10% of system memory.
2020-08-06 16:46:30.095595: W tensorflow/core/framework/allocator.cc:124] Allocation of 4080654400 exceeds 10% of system memory.
./data/model2019_07/model.xaa - done
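As a rough sanity check on the allocation sizes in those warnings, here is a small back-of-the-envelope calculation. It assumes the tensors are float32 (4 bytes per element), which I have not verified. The factor 21 is my guess at 20 amino acids plus a gap character.

```python
import math

BYTES_PER_FLOAT32 = 4  # assumption: the warned allocations are float32 tensors


def element_count(allocation_bytes):
    """Number of float32 elements that fit in an allocation of this size."""
    return allocation_bytes // BYTES_PER_FLOAT32


# Sizes copied from the allocator warnings in the log.
first = element_count(4498921476)
second = element_count(4080654400)

# The first allocation is a perfect square, i.e. it could be a square matrix.
side = math.isqrt(first)
assert side * side == first

# If the side is 21 * L, this gives the effective sequence length L.
length = side // 21

print("side of square matrix:", side)
print("implied length L:", length)
print("second allocation / L^2:", second / length**2)
```

If that reading is right, both allocations grow with the square of the sequence length, so a protein around 1600 residues would need hundreds of times more memory per tensor than a 100-residue one, whatever the depth of the MSA.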
I'm guessing the whole thing doesn't fit in memory and ends up in swap instead, which would explain why it takes such an awfully long time. However, I have 32 GB of RAM on this machine and the process isn't really using more than 11 GB. Furthermore, everything seems to run on the CPU rather than the GPU. Is that normal for this code?
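On the CPU/GPU question, something like the following should show which devices TensorFlow can actually see. It relies on `device_lib.list_local_devices()`, which as far as I know exists in both TensorFlow 1.x and 2.x. The helper name `visible_devices` is just mine.

```python
def visible_devices():
    """Return (device_type, name) pairs TensorFlow can see.

    Returns None if TensorFlow cannot be imported in this environment.
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [(d.device_type, d.name) for d in device_lib.list_local_devices()]


devices = visible_devices()
if devices is None:
    print("TensorFlow is not importable in this environment")
else:
    gpus = [name for kind, name in devices if kind == "GPU"]
    print("GPUs visible to TensorFlow:", gpus or "none")
```

An empty GPU list would suggest a CPU-only TensorFlow build or a CUDA setup that is not being picked up, rather than anything specific to trRosetta.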
Finally, one thing I have been wondering about: as the code runs right now, the model seems to be loaded every time, which takes quite a while and seems like a waste. I would imagine the model only needs to be loaded once and can then be run on each of the MSAs to predict the distograms/angles. However, the model being loaded seems to depend on the specific MSA used as input, which I guess would make this less trivial?
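What I have in mind is roughly the pattern below. This is only a sketch of the load-once idea, not trRosetta's actual API. `load_model` and `predict_one` are hypothetical callables standing in for whatever the real script does, and sorting by file size is just my way of getting the cheap MSAs done first.

```python
from pathlib import Path


def run_batch(msa_paths, load_model, predict_one):
    """Load the model once, then predict every MSA, smallest file first.

    load_model and predict_one are hypothetical placeholders:
      load_model()             -> model object (expensive, called once)
      predict_one(model, path) -> prediction for a single MSA file
    """
    model = load_model()
    results = {}
    # Process small MSAs first so one huge input does not block the rest.
    for path in sorted(msa_paths, key=lambda p: Path(p).stat().st_size):
        results[str(path)] = predict_one(model, path)
    return results
```

This only works if the network can accept inputs of varying length and depth after it has been built, which is exactly the part I am unsure about.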