STAR
Official implementation of STAR: Sparse Transformer-based Action Recognition.
Dataset
Download the NTU RGB+D 60 action recognition 2D/3D skeleton data from http://rose1.ntu.edu.sg/datasets/actionRecognition.asp, or use Google Drive.
Unzip the data into the following file structure: $(project_folder)/raw/*.skeleton or $(project_folder)/dataset/raw/*.skeleton
(create a "raw" folder under $(project_folder) or under $(project_folder)/dataset, then put the raw skeleton files in that "raw" folder).
Run the command below to generate the dataset:
python datagen.py
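As a quick sanity check before generating the dataset, the following minimal Python sketch (not part of the repository; folder names taken from the layout described above) confirms that raw skeleton files are discoverable in one of the two supported locations:

import os
import os.path as osp
from glob import glob

# Hypothetical check: look for raw *.skeleton files in either supported location
candidates = [osp.join(os.getcwd(), 'raw'),
              osp.join(os.getcwd(), 'dataset', 'raw')]
for raw_dir in candidates:
    files = glob(osp.join(raw_dir, '*.skeleton'))
    if files:
        print(f'Found {len(files)} raw skeleton files in {raw_dir}')
        break
else:
    print('No raw .skeleton files found; check the folder layout above.')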
Training
Fetch and check out the "distributed" branch (git fetch, then git checkout distributed), then run:
python train_dist.py  # distributed training
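For orientation only, here is a rough sketch of a typical PyTorch distributed training entry point; this assumes the project uses PyTorch and is not the actual contents of train_dist.py:

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun / torch.distributed.launch set RANK, WORLD_SIZE and LOCAL_RANK
    dist.init_process_group(backend='nccl')
    local_rank = int(os.environ.get('LOCAL_RANK', 0))
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(64, 60).cuda(local_rank)  # placeholder model
    model = DDP(model, device_ids=[local_rank])

    # ... build the dataset with a DistributedSampler and run the training loop ...
    dist.destroy_process_group()

if __name__ == '__main__':
    main()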
Configuration
Default configuration with 64 hidden/output channels:
parser.set_defaults(gpu=True,
                    batch_size=128,
                    dataset_name='NTU',
                    dataset_root=osp.join(os.getcwd()),  # or dataset_root=osp.join(os.getcwd(), 'dataset')
                    load_model=False,
                    in_channels=9,
                    num_enc_layers=5,
                    num_conv_layers=2,
                    weight_decay=4e-5,
                    drop_rate=[0.4, 0.4, 0.4, 0.4],  # linear_attention, sparse_attention, add_norm, ffn
                    hid_channels=64,
                    out_channels=64,
                    heads=8,
                    data_parallel=False,
                    cross_k=5,
                    mlp_head_hidden=128)
Default configuration with 128 hidden/output channels:
parser.set_defaults(gpu=True,
                    batch_size=128,
                    dataset_name='NTU',
                    dataset_root=osp.join(os.getcwd()),
                    load_model=False,
                    in_channels=9,
                    num_enc_layers=5,
                    num_conv_layers=2,
                    weight_decay=4e-5,
                    drop_rate=[0.4, 0.4, 0.4, 0.4],  # linear_attention, sparse_attention, add_norm, ffn
                    hid_channels=128,
                    out_channels=128,
                    heads=8,
                    data_parallel=False,
                    cross_k=5,
                    mlp_head_hidden=128)