Page to PAGE Layout Analysis (P2PaLA) is a toolkit for Document Layout Analysis based on Neural Networks.

💥 Try our new DEMO for online baseline detection. ❗ ❗

If you find this toolkit useful in your research, please cite:

  author = {Lorenzo Quirós},
  title = {P2PaLA: Page to PAGE Layout Analysis toolkit},
  year = {2017},
  publisher = {GitHub},
  note = {GitHub repository},
  howpublished = {\url{}},

See the arXiv paper for more details.


  • Linux (OSX may work, but is untested).
  • Python (2.7 or 3.6; a conda virtual environment is recommended)
  • Numpy
  • PyTorch (1.0). PyTorch 0.3.1 compatible on this branch
  • OpenCv (
  • NVIDIA GPU + CUDA CuDNN (CPU mode and CUDA without CuDNN work, but are not recommended for training).
  • tensorboard-pytorch (v0.9) [Optional]: pip install tensorboardX. A different conda env is recommended to keep TensorFlow separate from PyTorch.
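
To see which of the Python packages listed above are present in the active environment, something like the following sketch may help. The helper name `installed_versions` is hypothetical, and the import names (`numpy`, `torch`, `cv2`, `tensorboardX`) are the usual ones for these packages rather than anything this toolkit defines.

```python
import importlib


def installed_versions(module_names):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The package is missing from this environment.
            versions[name] = None
            continue
        # Not every module exposes __version__; fall back to a placeholder.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

For example, `installed_versions(["numpy", "torch", "cv2", "tensorboardX"])` returns a dictionary in which a missing optional package such as tensorboardX shows up as `None`.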


python install

To install only the Python dependencies, use the requirements file: conda env create --file conda_requirements.yml


  1. Input data must follow the folder structure data_tag/page: images go in the data_tag folder and XML files in its page subfolder. For example:
mkdir -p data/{train,val,test,prod}/page;
tree data;
├── prod
│   ├── page
│   │   ├── prod_0.xml
│   │   └── prod_1.xml
│   ├── prod_0.jpg
│   └── prod_1.jpg
├── test
│   ├── page
│   │   ├── test_0.xml
│   │   └── test_1.xml
│   ├── test_0.jpg
│   └── test_1.jpg
├── train
│   ├── page
│   │   ├── train_0.xml
│   │   └── train_1.xml
│   ├── train_0.jpg
│   └── train_1.jpg
└── val
    ├── page
    │   ├── val_0.xml
    │   └── val_1.xml
    ├── val_0.jpg
    └── val_1.jpg
  2. Run the tool.
python --config config.txt --tr_data ./data/train --te_data ./data/test --log_comment "_foo"

❗ Pre-trained models available here

  3. Use TensorBoard to visualize the training status:
tensorboard --logdir ./work/runs
  4. The resulting PAGE-XML files are written to "./work/results/test/".
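
The PAGE-XML results can also be read from a script. The sketch below lists the baseline points stored for each text line, assuming the usual PAGE layout in which a `TextLine` element holds a `Baseline` child with a `points="x1,y1 x2,y2 ..."` attribute. The function names are hypothetical and not part of this toolkit; tags are matched by local name so that the exact PAGE schema version does not matter.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def read_baselines(source):
    """Return {TextLine id: [(x, y), ...]} for a PAGE-XML path or file object."""
    root = ET.parse(source).getroot()
    baselines = {}
    for line in root.iter():
        if local_name(line.tag) != "TextLine":
            continue
        for child in line:
            if local_name(child.tag) == "Baseline":
                # Assumed format: integer pairs "x,y" separated by spaces.
                points = [tuple(int(v) for v in pair.split(","))
                          for pair in child.get("points", "").split()]
                baselines[line.get("id")] = points
    return baselines
```

Calling `read_baselines(path)` on one of the XML files in the results folder returns a dictionary keyed by text-line id; lines without a `Baseline` element are left out.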

We recommend Transkribus or nw-page-editor to visualize and edit PAGE-xml files.
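
Before training, it can be worth checking that the data_tag/page layout described above is complete. The following is a minimal sketch, assuming that each image and its XML file share the same base name (as in the example tree, e.g. train_0.jpg and page/train_0.xml). The helper name `unpaired_files` and the list of image extensions are assumptions, not part of this toolkit.

```python
import os

# Assumed set of image extensions; adjust to the data at hand.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".tif", ".tiff")


def unpaired_files(data_dir):
    """Return (images without XML, XML files without image) as sorted base names."""
    page_dir = os.path.join(data_dir, "page")
    images = {os.path.splitext(f)[0] for f in os.listdir(data_dir)
              if f.lower().endswith(IMAGE_EXTENSIONS)}
    xmls = set()
    if os.path.isdir(page_dir):
        xmls = {os.path.splitext(f)[0] for f in os.listdir(page_dir)
                if f.lower().endswith(".xml")}
    return sorted(images - xmls), sorted(xmls - images)
```

For a folder such as ./data/train, two empty lists mean that every image has a matching file in page and vice versa.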

  1. For details about the arguments and the config file, see docs or python -h.
  2. For more detailed examples, see egs:
    • Bozen dataset
    • cBAD complex competition dataset
    • OHG dataset


GNU General Public License v3.0. See LICENSE for the full text.


Code is inspired by pix2pix and pytorch-CycleGAN-and-pix2pix

  • RTX cards require minimum Pytorch 1.0 [CUDNN_STATUS_EXECUTION_FAILED]

    On Linux Mint 19.1 with an RTX 2070.

    When trying to recognize using the default installation:

    (p3p) home@home-lnx:~/Desktop/programs/P2PaLA$ python --config config_ALAR_min_model_17_12_18.txt --prev_model ALAR_min_model_17_12_18.pth --prod_data ./images/
    2019-01-21 13:42:19,280 - optparse - INFO - Reading configuration from config_ALAR_min_model_17_12_18.txt
    2019-01-21 13:42:19,282 - P2PaLA - INFO - Working on prod inference...
    2019-01-21 13:42:19,283 - P2PaLA - INFO - Results will be saved to ./work/results/prod
    2019-01-21 13:42:19,599 - P2PaLA - INFO - Resumming from model ALAR_min_model_17_12_18.pth
    /home/home/.conda/envs/p3p/lib/python3.6/site-packages/torch/cuda/ UserWarning: 
        Found GPU0 GeForce RTX 2070 which requires CUDA_VERSION >= 9000 for
         optimal performance and fast startup time, but your PyTorch was compiled
         with CUDA_VERSION 8000. Please install the correct PyTorch binary
         using instructions from
      warnings.warn(incorrect_binary_warn % (d, name, 9000, CUDA_VERSION))

    So I installed the latest torch and torchvision:

    (p3p) home@home-lnx:~/Desktop/programs/P2PaLA$ pip install --ignore-installed torch torchvision

    Then ran recognition:

    (p3p) home@home-lnx:~/Desktop/programs/P2PaLA$ python --config config_ALAR_min_model_17_12_18.txt --prev_model ALAR_min_model_17_12_18.pth --prod_data ./images/
    /home/home/.conda/envs/p3p/lib/python3.6/site-packages/torch/nn/ UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
    2019-01-21 13:58:31,771 - optparse - INFO - Reading configuration from config_ALAR_min_model_17_12_18.txt
    2019-01-21 13:58:31,773 - P2PaLA - INFO - Working on prod inference...
    2019-01-21 13:58:31,774 - P2PaLA - INFO - Results will be saved to ./work/results/prod
    2019-01-21 13:58:32,125 - P2PaLA - INFO - Resumming from model ALAR_min_model_17_12_18.pth
    2019-01-21 13:58:34,859 - P2PaLA - INFO - Preprocessing data from ./images/ UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
      pr_x = Variable(sample["image"], volatile=True)
    THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=405 error=11 : invalid argument
    2019-01-21 13:58:35,463 - P2PaLA - INFO - Production stage done. total time taken: 0.604010820388794
    2019-01-21 13:58:35,463 - P2PaLA - INFO - Average time per page: 0.604010820388794
    2019-01-21 13:58:35,463 - P2PaLA - INFO - All Done...

    Now the problem appears when trying to train:

    (p3p) home@home-lnx:~/Desktop/programs/P2PaLA$ python --config config_BL_only.txt --tr_data ./data/train --te_data ./data/test --log_comment "_foo"
    /home/home/.conda/envs/p3p/lib/python3.6/site-packages/torch/nn/ UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
    2019-01-21 14:06:09,788 - optparse - INFO - Reading configuration from config_BL_only.txt
    2019-01-21 14:06:09,789 - optparse - DEBUG - Creating output dir: ./work_BL_only
    2019-01-21 14:06:09,790 - optparse - DEBUG - Creating checkpoints dir: ./work_BL_only/checkpoints
    2019-01-21 14:06:09,790 - P2PaLA - INFO - Working on training stage...
    2019-01-21 14:06:09,791 - P2PaLA - WARNING - tensorboardX is not installed, display logger set to OFF.
    2019-01-21 14:06:09,791 - P2PaLA - INFO - Preprocessing data from ./data/train
    /home/home/Desktop/programs/P2PaLA/nn_models/ UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
      init.uniform(, 0.0, 0.02)
    /home/home/Desktop/programs/P2PaLA/nn_models/ UserWarning: nn.init.uniform is now deprecated in favor of nn.init.uniform_.
      init.uniform(, 1.0, 0.02)
    THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=405 error=11 : invalid argument
    Traceback (most recent call last):
      File "", line 1262, in <module>
      File "", line 606, in main
        epoch_lossD +=[0]
    IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
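
The IndexError at the end of the traceback is what recent PyTorch releases raise when a 0-dim tensor (such as a loss value) is indexed with [0]; the message itself points to tensor.item(). A minimal sketch of a helper that tolerates both behaviours is shown below. The name `to_float` is hypothetical and this is not the fix applied in the repository; it only illustrates the conversion the error message asks for.

```python
def to_float(value):
    """Convert a loss-like value to a Python float.

    Uses .item() when the object provides it (0-dim tensors in newer PyTorch)
    and falls back to indexing the first element (1-element tensors in older
    releases, or plain sequences).
    """
    if hasattr(value, "item"):
        return float(value.item())
    return float(value[0])
```

With such a helper, an accumulation of the form `epoch_loss += to_float(loss)` (hypothetical variable names) would not depend on whether the loss is a 0-dim or a 1-element tensor.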
Lorenzo Quirós Díaz
