Layout Parser is a deep learning based tool for document image layout analysis tasks.

Overview


Layout Parser is a deep learning based tool for document image layout analysis tasks.

Installation

Use pip or conda to install the library:

pip install layoutparser

# Install Detectron2 to use the deep learning layout detection models.
# Make sure the installed PyTorch version is compatible with
# the Detectron2 version being installed.
pip install 'git+https://github.com/facebookresearch/detectron2.git#egg=detectron2'

# Install the OCR components when necessary (quoted so that shells
# such as zsh do not treat the square brackets as a glob pattern)
pip install "layoutparser[ocr]"

By default, this installs the CPU version of Detectron2, which should run on most computers. If you have a GPU, consider installing the GPU version of Detectron2 by following the official instructions.
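After installing, it can help to confirm which optional components Python can actually import. The following is a minimal sketch using only the standard library; the module names in `OPTIONAL_MODULES` (in particular `pytesseract` for the OCR extra) are assumptions and may need adjusting for your setup.

```python
from importlib.util import find_spec

# Assumed top-level module names for the optional components; adjust as needed.
OPTIONAL_MODULES = {
    "layoutparser": "layoutparser",
    "Detectron2": "detectron2",
    "Tesseract OCR bindings": "pytesseract",
}


def check_optional_modules(modules=OPTIONAL_MODULES):
    """Return {label: bool} telling whether each module can be imported."""
    return {label: find_spec(name) is not None for label, name in modules.items()}


if __name__ == "__main__":
    for label, found in check_optional_modules().items():
        print(f"{label}: {'found' if found else 'missing'}")
```

A component reported as missing usually means the corresponding install step above was skipped or ran in a different environment.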

Quick Start

We provide a series of examples to help you start using the layoutparser library:

  1. Table OCR and Results Parsing: layoutparser can be used to conveniently OCR documents and convert the output into structured data.

  2. Deep Layout Parsing Example: With the help of deep learning, layoutparser supports the analysis of very complex documents and the processing of the hierarchical structure in their layouts.

DL Assisted Layout Prediction Example

Example Usage

The images shown in the figure above are: a screenshot of this paper, an image from the PRIMA Layout Analysis Dataset, a screenshot of the WSJ website, and an image from the HJDataset.

With only 4 lines of code in layoutparser, you can unlock information from complex documents that existing tools could not provide. You can either choose a deep learning model from the ModelZoo or load a model that you trained on your own. Then use the following code to predict the layout and visualize it:

>>> import layoutparser as lp
>>> model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config')
>>> layout = model.detect(image)  # Load the image beforehand, e.g., image = cv2.imread(...)
>>> lp.draw_box(image, layout)  # Extra configuration options can be passed here
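The four lines above assume that `image` already exists. As a rough sketch, the loading step and the detection can be wrapped in one helper; the function name `detect_layout` is hypothetical, and the BGR-to-RGB conversion follows the usage shown in the project's examples. The heavy imports happen inside the function so that the path check works even when Detectron2 is not installed.

```python
import os


def detect_layout(image_path, config_path="lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config"):
    """Load an image from disk and return (rgb_image, layout)."""
    if not os.path.exists(image_path):
        raise FileNotFoundError(image_path)

    # Imported lazily; both packages must be installed for the steps below.
    import cv2
    import layoutparser as lp

    image = cv2.imread(image_path)
    if image is None:
        raise ValueError(f"Could not decode image: {image_path}")
    image = image[..., ::-1]  # cv2 loads BGR; convert to RGB

    model = lp.Detectron2LayoutModel(config_path)
    layout = model.detect(image)
    return image, layout
```

The returned pair can then be passed to `lp.draw_box(image, layout)` for visualization.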

Citing layoutparser

If you find layoutparser helpful to your work, please consider citing our tool and paper using the following BibTeX entry.

@article{shen2021layoutparser,
  title={LayoutParser: A Unified Toolkit for Deep Learning Based Document Image Analysis},
  author={Shen, Zejiang and Zhang, Ruochen and Dell, Melissa and Lee, Benjamin Charles Germain and Carlson, Jacob and Li, Weining},
  journal={arXiv preprint arXiv:2103.15348},
  year={2021}
}
Comments
  • Apply detect() on readable PDF files

    Hi there, from the docs I infer that detect() operates on, for example, PIL.Image objects. Is there a way to operate directly on already readable PDF files (which would also obviate the need to apply OCR)? Greetings

    enhancement 
    opened by simonschoe 12
  • AttributeError: module layoutparser has no attribute Detectron2LayoutModel

    Hi,

    Thank you for this awesome program! I successfully installed layout-parser and Detectron2 on my Windows 10 laptop. When I run the following code:

    import cv2
    import layoutparser as lp
    import numpy as np
    from pdf2image import convert_from_bytes

    # Raw string so that "\t" in the Windows path is not read as a tab
    images = convert_from_bytes(open(r'C:\temp\ConsigneeList\Doc 4 Distribution List.pdf', 'rb').read())

    model = lp.Detectron2LayoutModel(
        config_path='lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config',  # In model catalog
        label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},  # In model `label_map`
        extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],  # Optional
    )
    ocr_agent = lp.ocr.TesseractAgent()

    # Loop through each page
    for image in images:
        image = np.array(image)
        layout = model.detect(image)
        text_blocks = lp.Layout([b for b in layout if b.type == 'Text'])

        # Loop through each text box on the page
        for block in text_blocks:
            segment_image = block.pad(left=5, right=5, top=5, bottom=5).crop_image(image)
            text = ocr_agent.detect(segment_image)
            block.set(text=text, inplace=True)

        # Append the recognized text of this page to the output file
        with open("OUTPUT FILE PATH/FILENAME.TXT", "a+") as my_file:
            for txt in text_blocks.get_texts():
                my_file.write(txt)
    

    I get the following errors:


    AttributeError                            Traceback (most recent call last) in
    ----> 1 model = lp.Detectron2LayoutModel(
          2     config_path ='lp://PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config', # In model catalog
          3     label_map = {0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model `label_map`
          4     extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
          5 )

    C:\ProgramData\Anaconda3\lib\site-packages\layoutparser\file_utils.py in __getattr__(self, name)
        224             value = getattr(module, name)
        225         else:
    --> 226             raise AttributeError(f"module {self.__name__} has no attribute {name}")
        227
        228         setattr(self, name, value)

    AttributeError: module layoutparser has no attribute Detectron2LayoutModel

    Any ideas on what is wrong? Thank you!!

    Sincerely,

    tom

    bug 
    opened by theiman112860 10
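A side note on the Windows path used in the report above: in an ordinary Python string literal, sequences such as `\t` and `\n` are escape codes, so a path can silently change before it reaches `open()`. The sketch below uses a made-up path (`C:\temp\new.pdf`, chosen only for illustration) to show the difference between a plain literal and a raw string. This is separate from the AttributeError itself.

```python
from pathlib import PureWindowsPath

plain = "C:\temp\new.pdf"   # "\t" becomes a tab and "\n" a newline
raw = r"C:\temp\new.pdf"    # raw string: backslashes are kept literally

print(repr(plain))
print(repr(raw))

# PureWindowsPath also accepts forward slashes, which avoids escaping altogether.
same_path = PureWindowsPath("C:/temp/new.pdf") == PureWindowsPath(raw)
print(same_path)
```

Using a raw string or forward slashes keeps the path intact regardless of which letters follow each backslash.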
  • 'GCVAgent' object has no attribute '_client'

    Hi, while running the tutorial "OCR tables and parse the output", I tried to obtain the result with:

    res = ocr_agent.detect(image, return_response=True)

    The response was:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/layoutparser/ocr/gcv_agent.py", line 168, in detect
        res = self._detect(img_content)
      File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/layoutparser/ocr/gcv_agent.py", line 134, in _detect
        response = self._client.document_text_detection(
    AttributeError: 'GCVAgent' object has no attribute '_client'

    I searched online, and some sites said that the Client() class was removed in the Client Library v0.25.1 and replaced with ImageAnnotatorClient().

    Could this be the problem? Thank you.

    bug 
    opened by junxi-liu 8
  • Error installing dependencies

    Hi Team, thank you for all the great work. It looks amazing. I tried running pip install layoutparser, but it threw the error below. Can you please let me know how to fix this?

    ERROR: Command errored out with exit status 1: command: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-wheel-awmfv0cr' cwd: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632
    Complete output (22 lines): running bdist_wheel running build running build_py creating build creating build\lib.win-amd64-3.8 creating build\lib.win-amd64-3.8\pycocotools copying pycocotools\coco.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\cocoeval.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\mask.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools_init_.py -> build\lib.win-amd64-3.8\pycocotools running build_ext cythoning pycocotools/_mask.pyx to pycocotools_mask.c C:\Users\pss.ch\AppData\Roaming\Python\Python38\site-packages\Cython\Compiler\Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\pycocotools_mask.pyx tree = Parsing.p_module(s, pxd, full_module_name) building 'pycocotools._mask' extension creating build\temp.win-amd64-3.8 creating build\temp.win-amd64-3.8\Release creating build\temp.win-amd64-3.8\Release\common creating build\temp.win-amd64-3.8\Release\pycocotools C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\pss.ch\AppData\Roaming\Python\Python38\site-packages\numpy\core\include -I./common "-IC:\Program Files\Anaconda\include" "-IC:\Program Files\Anaconda\include" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tc./common/maskApi.c /Fobuild\temp.win-amd64-3.8\Release./common/maskApi.obj -Wno-cpp -Wno-unused-function -std=c99 cl : Command line error D8021 : invalid numeric argument '/Wno-cpp' error: 
command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' failed with exit status 2

    ERROR: Failed building wheel for pycocotools ERROR: Command errored out with exit status 1: command: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-record-w4euj5sb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\Users\pss.ch\AppData\Roaming\Python\Python38\Include\pycocotools' cwd: C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632
    Complete output (20 lines): running install running build running build_py creating build creating build\lib.win-amd64-3.8 creating build\lib.win-amd64-3.8\pycocotools copying pycocotools\coco.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\cocoeval.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools\mask.py -> build\lib.win-amd64-3.8\pycocotools copying pycocotools_init_.py -> build\lib.win-amd64-3.8\pycocotools running build_ext skipping 'pycocotools_mask.c' Cython extension (up-to-date) building 'pycocotools._mask' extension creating build\temp.win-amd64-3.8 creating build\temp.win-amd64-3.8\Release creating build\temp.win-amd64-3.8\Release\common creating build\temp.win-amd64-3.8\Release\pycocotools C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\AppData\Roaming\Python\Python38\site-packages\numpy\core\include -I./common "-IC:\Program Files\Anaconda\include" "-IC:\Program Files\Anaconda\include" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\INCLUDE" "-IC:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\ATLMFC\INCLUDE" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.10240.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\8.1\include\shared" "-IC:\Program Files (x86)\Windows Kits\8.1\include\um" "-IC:\Program Files (x86)\Windows Kits\8.1\include\winrt" /Tc./common/maskApi.c /Fobuild\temp.win-amd64-3.8\Release./common/maskApi.obj -Wno-cpp -Wno-unused-function -std=c99 cl : Command line error D8021 : invalid numeric argument '/Wno-cpp' error: command 'C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\BIN\x86_amd64\cl.exe' failed with exit status 2 ---------------------------------------- ERROR: Command errored out with exit status 1: 'C:\Program Files\Anaconda\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = 
'"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"'; file='"'"'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-install-s13j7o41\pycocotools_6c1fc2cce84542a8be1c0cbeacfda632\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\Public\Documents\Wondershare\CreatorTemp\pip-record-w4euj5sb\install-record.txt' --single-version-externally-managed --user --prefix= --compile --install-headers 'C:\AppData\Roaming\Python\Python38\Include\pycocotools' Check the logs for full command output.

    opened by sriprad 8
  • enforce_cpu not working

    When enforce_cpu is set to True, the model still uses CUDA instead of the CPU. I think it is due to this line: https://github.com/Layout-Parser/layout-parser/blob/e035fc8f952addc620670e5b47864fe213db0e10/src/layoutparser/models/layoutmodel.py#L120

    A possible fix could be: cfg.MODEL.DEVICE = "cuda" if torch.cuda.is_available() and (not enforce_cpu) else "cpu"

    bug 
    opened by lkluo 5
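The fix proposed in the enforce_cpu report above can be isolated into a small pure function, which makes the intended behaviour easy to check without a GPU. This is only a sketch of the reporter's suggestion; the helper name `select_device` is hypothetical and not part of layoutparser.

```python
def select_device(cuda_available: bool, enforce_cpu: bool) -> str:
    """Return "cuda" only when a GPU is available and CPU use is not enforced."""
    return "cuda" if cuda_available and not enforce_cpu else "cpu"


# All four combinations of the two flags
for cuda_available in (True, False):
    for enforce_cpu in (True, False):
        device = select_device(cuda_available, enforce_cpu)
        print(f"cuda_available={cuda_available}, enforce_cpu={enforce_cpu} -> {device}")
```

In the model code, this would correspond to something like `cfg.MODEL.DEVICE = select_device(torch.cuda.is_available(), enforce_cpu)`.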
  • Adding support for mathematical formula recognition

    Have you considered adding support for mathematical formula recognition? Identifying the position of mathematical formulas in documents has always been a problem.

    modeling 
    opened by SleepyCelery 5
  • draw_box draw only one box from layout

    Describe the bug I just installed everything according to the installation guide and launched your Jupyter notebook, the Deep Layout Parsing Example. After the first draw_box call, only one box is shown, but print(layout) lists all boxes. The same happens with the second draw_box call from your guide. I am not sure what I am doing wrong.

    To Reproduce Steps to reproduce the behavior:

    1. installation guide + detectron2 install also from your guide
    2. Run jupyter notebook

    Environment

    1. MacOS
    2. VS Code
    3. Here some stuff from pip:
    torch==1.11.0
    torchvision==0.12.0
    Pillow==9.1.0
    opencv-python==4.5.5.64
    layoutparser==0.3.3
    

    Error traceback No errors; the behaviour just differs from what this guide and other guides show.

    Screenshots attached

    output1 output2

    bug 
    opened by Moo1234567 4
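A version list like the one in the report above can be gathered programmatically, which helps when filing similar bug reports. The sketch below uses the standard library's `importlib.metadata`; the distribution names are taken from that report, and the helper name `installed_versions` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear in the pip listing of the report
PACKAGES = ["layoutparser", "torch", "torchvision", "Pillow", "opencv-python"]


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}=={ver}" if ver else f"{name}: not installed")
```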
  • Gives wrong results when the code is run for some images in a loop

    The code works when it is run for a single image. But when I run the same code in a loop for a few images from the PubLayNet dataset, cached results seem to be applied (i.e., the bounding boxes overlap, and the boxes from the previous images are also drawn on the current image).

    opened by surajsubramanian 4
  • ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.7/dist-packages/PIL/_util.py)

    While using this code, I get the Pillow error shown below. I tried reinstalling Pillow but am still struggling with this issue. Any help getting this code to run?

    import layoutparser as lp
    model = lp.Detectron2LayoutModel(
                config_path ='lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config', # In model catalog
                label_map   ={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
                extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
            )
    model.detect(image)
    

    Getting this error:

    ImportError                               Traceback (most recent call last)
    [<ipython-input-6-59f0fb07b7e3>](https://localhost:8080/#) in <module>
          1 import layoutparser as lp
    ----> 2 model = lp.Detectron2LayoutModel(
          3             config_path ='lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config', # In model catalog
          4             label_map   ={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"}, # In model`label_map`
          5             extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8] # Optional
    
    31 frames
    [/usr/local/lib/python3.7/dist-packages/PIL/ImageFont.py](https://localhost:8080/#) in <module>
         35 from . import Image
         36 from ._deprecate import deprecate
    ---> 37 from ._util import is_directory, is_path
         38 
         39 
    
    ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.7/dist-packages/PIL/_util.py)
    
    
    opened by arhamshah 3
  • TypeError: inner() got an unexpected keyword argument 'image_context'

    Hello! Recently encountered an issue when trying to use Google's OCR when running ocr_agent.detect

    Running this:

    image = cv2.imread("/Users/liz/Documents/Projects/LayoutParser/test2.png")
    ocr_agent = lp.GCVAgent.with_credential("/Users/liz/Documents/Projects/Keys/GoogleCloud/vision-341523-e3cbd0df8d19.json",languages = ['en'])
    res = ocr_agent.detect(image, return_response=True)
    

    Gives me the following error:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-9-76614ef6a3e8> in <module>
          1 image = cv2.imread("/Users/liz/Documents/Projects/LayoutParser/test2.png")
          2 ocr_agent = lp.GCVAgent.with_credential("/Users/liz/Documents/Projects/Keys/GoogleCloud/vision-341523-e3cbd0df8d19.json",languages = ['en'])
    ----> 3 res = ocr_agent.detect(image, return_response=True)
          4 
          5 #layout = ocr_agent.gather_full_text_annotation(res, agg_level=lp.GCVFeatureType.WORD)
    
    /opt/homebrew/Caskroom/miniforge/base/envs/data310/lib/python3.9/site-packages/layoutparser/ocr.py in detect(self, image, return_response, return_only_text, agg_output_level)
        222                 img_content = image_file.read()
        223 
    --> 224         res = self._detect(img_content)
        225 
        226         if return_response:
    
    /opt/homebrew/Caskroom/miniforge/base/envs/data310/lib/python3.9/site-packages/layoutparser/ocr.py in _detect(self, img_content)
        188     def _detect(self, img_content):
        189         img_content = self._vision.types.Image(content=img_content)
    --> 190         response = self._client.document_text_detection(
        191             image=img_content, image_context=self._context
        192         )
    
    TypeError: inner() got an unexpected keyword argument 'image_context'
    

    I am not sure what causes it. It might be user error, but I haven't been able to find anything else about it, and I've tried everything I can think of (all the packages are up to date or, in Google Cloud Vision's case, downgraded to stay on the old API). Thanks!

    bug 
    opened by liz-goodwin 3
  • bad result detected

    I got a bad result using layout-parser. Here is the image I used: 1

    Here is the code I ran in Python:

    image = cv2.imread("1.png")
    # Convert the image from BGR (cv2 default loading style)
    # to RGB
    image = image[..., ::-1]
    origin_image = image.copy()
    
    model = lp.Detectron2LayoutModel('lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config', 
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"})
    # Load the deep layout model from the layoutparser API 
    # For all the supported model, please check the Model 
    # Zoo Page: https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html
    
    layout = model.detect(image)
    # print("layout : ", layout)
    # Detect the layout of the input image
    text_blocks = lp.Layout([b for b in layout if b.type=='Text'])
    drawRectangleInImage(origin_image, text_blocks, (36,255,12))
    
    titles_blocks = lp.Layout([b for b in layout if b.type=='Title'])
    drawRectangleInImage(origin_image, titles_blocks, (76, 155, 175))
    
    figure_blocks = lp.Layout([b for b in layout if b.type=='Figure'])
    drawRectangleInImage(origin_image, figure_blocks, (122, 96, 216))
    
    lists_blocks = lp.Layout([b for b in layout if b.type=='List'])
    drawRectangleInImage(origin_image, lists_blocks, (176, 155, 175))
    
    tables_blocks = lp.Layout([b for b in layout if b.type=='Table'])
    drawRectangleInImage(origin_image, tables_blocks, (76, 255, 75))
    
    cv2.imshow('image', origin_image)
    cv2.waitKey()
    

    Here is the result:

    Screenshot 2022-01-18 11 45 06

    By the way, some warnings were generated:

    /usr/local/lib/python3.9/site-packages/detectron2/structures/image_list.py:99: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
      max_size = (max_size + (stride - 1)) // stride * stride
    /usr/local/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
      return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]

    bug 
    opened by DamonsJ 3
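The script in the report above filters the layout once per category. A single-pass grouping is one way to shorten this; the sketch below relies only on each block exposing a `type` attribute, as the report's own list comprehensions do. The helper name `group_blocks_by_type` is hypothetical, and the demo blocks are stand-ins rather than real detection output.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_blocks_by_type(layout):
    """Group blocks by their `type` attribute in a single pass."""
    groups = defaultdict(list)
    for block in layout:
        groups[block.type].append(block)
    return dict(groups)


# Stand-in blocks for illustration only; real blocks come from model.detect(image).
demo_layout = [
    SimpleNamespace(type="Text", id=0),
    SimpleNamespace(type="Title", id=1),
    SimpleNamespace(type="Text", id=2),
]
groups = group_blocks_by_type(demo_layout)
print(sorted(groups))
```

Each group can then be drawn with its own colour by iterating over `groups.items()`.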
  • Any idea about Detectron gets overlapping and sometimes misses some blocks

    The problem: I am currently using layout-parser to detect the blocks in scanned book pages, and I am trying to extract each block separately from the page and do some processing on it.

    Checklist

    To Reproduce

    import layoutparser as lp
    import cv2
    
    image = cv2.imread("/content/image_0.jpg")
    # Convert the image from BGR (cv2 default loading style) to RGB
    image = image[..., ::-1]
    
    model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
                                     extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                     label_map={1:"TextRegion", 2:"ImageRegion", 3:"TableRegion", 4:"MathsRegion", 5:"SeparatorRegion", 6:"OtherRegion"})
    
    
    # Detect the layout of the input image
    layout = model.detect(image)
    
    # Show the detected layout of the input image
    lp.draw_box(image, layout, box_width=3)
    

    Environment

    1. Platform [Linux] (on colab)
    2. Installation commands
    !sudo apt-get update
    !sudo apt-get install libleptonica-dev tesseract-ocr libtesseract-dev python3-pil tesseract-ocr-eng tesseract-ocr-script-latn
    !pip install layoutparser	
    !pip install layoutparser torchvision && pip install "git+https://github.com/facebookresearch/[email protected]#egg=detectron2"	
    !pip install "layoutparser[ocr]"	
    !pip install "layoutparser[layoutmodels]" # Install DL layout model toolkit 
    

    Screenshots

    1- Overlapping: image_3

    2- Missing: image_7

    I know this may not be the right place to raise this issue, but I think you may have an idea about the problem.

    bug 
    opened by rrrokhtar 0
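For the overlapping detections described in the report above, one generic post-processing idea is to keep only the highest-scoring box among boxes that overlap heavily. The sketch below is a plain-Python illustration of that idea (a simple non-maximum suppression), not a layoutparser feature. It assumes each detection has been reduced to an `(x1, y1, x2, y2)` tuple and a score; the function names and the 0.5 threshold are assumptions. It does not address missed blocks.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def drop_overlapping(detections, iou_threshold=0.5):
    """Keep the highest-scoring box among heavily overlapping ones.

    detections: list of (box, score) pairs.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep the box only if it does not overlap too much with any kept box.
        if all(iou(box, kept_box) <= iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((0, 0, 10, 10), 0.9),
    ((1, 1, 10, 10), 0.8),    # overlaps heavily with the first box
    ((20, 20, 30, 30), 0.7),  # separate region
]
filtered = drop_overlapping(detections)
print(filtered)
```

Whether dropping the lower-scoring box is appropriate depends on the document; nested regions such as a figure inside a text column may need a different rule.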
  • [Bug] has_torch_function_variadic error

    Describe the bug When attempting to initialise a model (I've tried with AutoLayoutModel and Detectron2LayoutModel), torch.jit throws a RuntimeError as below...

    RuntimeError: 
    undefined value has_torch_function_variadic:
      File "/opt/conda/lib/python3.8/site-packages/torch/utils/smdebug.py", line 2962
             >>> loss.backward()
        """
        if has_torch_function_variadic(input, target, weight, pos_weight):
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            return handle_torch_function(
                binary_cross_entropy_with_logits,
    'binary_cross_entropy_with_logits' is being compiled since it was called from 'sigmoid_focal_loss'
      File "/opt/conda/lib/python3.8/site-packages/fvcore/nn/focal_loss.py", line 34
        """
        p = torch.sigmoid(inputs)
        ce_loss = F.binary_cross_entropy_with_logits(inputs, targets, reduction="none")
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
        p_t = p * targets + (1 - p) * (1 - targets)
        loss = ce_loss * ((1 - p_t) ** gamma)
    

    To Reproduce Steps to reproduce the behavior:

    1. Install layout-parser, OpenCV, Detectron2 as below
    %pip install opensearch-py opencv-python --quiet
    %pip install -U layoutparser[ocr] --quiet
    !python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cpu/torch1.10/index.html
    
    2. Import layoutparser and attempt to init model with lp.models.Detectron2LayoutModel(...)
    3. Error appears

    Environment Linux with layoutparser latest

    bug 
    opened by lucafrost 0
  • cannot import name 'is_directory' from 'PIL._util'(lp.Detectron2LayoutModel)

    Describe the bug When I tried the sample code:

    !pip install layoutparser
    !pip install 'git+https://github.com/facebookresearch/[email protected]#egg=detectron2'
    
    import layoutparser as lp
    import cv2
    import PIL
    
    image = cv2.imread("image.png")
    model = lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config')
    layout = model.detect(image)
    

    Colab link(Python 3.8.16): https://colab.research.google.com/drive/1lb8_Pcw8_NNdeKPL80HOYca8gaCB0f-E?usp=sharing

    I got an error on this line:

    lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config')

    The error message is:

    ImportError: cannot import name 'is_directory' from 'PIL._util' (/usr/local/lib/python3.8/dist-packages/PIL/_util.py)

    I hope that I can get your help. Thanks!

    bug 
    opened by sudoghut 0
  • [Fix] reduce memory consumption and close pdf stream after usage

    Flushes the pages and the PDF after use to reduce memory (RAM) consumption.

    Opens the PDF stream as a context manager so that the file is closed afterwards.

    opened by jakobnrmnn 0
  • Minor installation instruction error

    On Mac, the command

    pip3 install -U layoutparser[ocr]
    

    does not work (it returns "zsh: no matches found: layoutparser[ocr]"); you need to run

    pip3 install -U "layoutparser[ocr]"
    
    bug 
    opened by bholtdwyer 0
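The quoting problem in the report above only arises because the shell interprets the square brackets. When installing from a script, one way to sidestep it is to pass the arguments as a list so that no shell is involved. This is a minimal sketch; the helper name `pip_install_command` is hypothetical, and the actual install call is left commented out.

```python
import subprocess
import sys


def pip_install_command(requirement, upgrade=True):
    """Build an argument list for `python -m pip install`; no shell is involved."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        cmd.append("-U")
    cmd.append(requirement)
    return cmd


command = pip_install_command("layoutparser[ocr]")
print(command)
# subprocess.run(command, check=True)  # uncomment to actually run the install
```

Because the requirement is passed as a single list element, the brackets reach pip unchanged on any platform.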
Releases(v0.3.4)
  • v0.3.4(Apr 6, 2022)

    Bug fixes

    • Fix a critical visualization bug mentioned in #131 by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/132

    Full Changelog: https://github.com/Layout-Parser/layout-parser/compare/v0.3.3...v0.3.4

    Source code(tar.gz)
    Source code(zip)
  • v0.3.3(Apr 3, 2022)

    Functional Updates

    • Robust pdf loading for empty pages by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/115
    • Fix for issue #94: prevent TesseractAgent.detect() from inferring any sequence of digits as a float by @k-for-code in https://github.com/Layout-Parser/layout-parser/pull/95
    • Better layout comparison by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/128
    • Better visualization functions by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/129

    Example Updates

    • Minor update to Deep Learning Parser example notebook by @Jim-Salmons in https://github.com/Layout-Parser/layout-parser/pull/56
    • Set inplace to True in sorting function by @yusanshi in https://github.com/Layout-Parser/layout-parser/pull/104
    • Add notebook for customizing LayoutParser Models with Label Studio Annotation by @lolipopshock in https://github.com/Layout-Parser/layout-parser/pull/124

    New Contributors

    • @Jim-Salmons made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/56
    • @yusanshi made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/104
    • @k-for-code made their first contribution in https://github.com/Layout-Parser/layout-parser/pull/95

    Full Changelog: https://github.com/Layout-Parser/layout-parser/compare/v0.3.2...v0.3.3

    Source code(tar.gz)
    Source code(zip)
  • v0.3.2(Sep 23, 2021)

    Important fixes for multibackend layout model support:

    • Resolves the issues mentioned in #78 with other fixes to improve the multibackend layout model support #79
    • Better tests for different backends #79 for preventing future related issues
    Source code(tar.gz)
    Source code(zip)
  • v0.3.1(Sep 15, 2021)

    • Fixes for automatically setting label_map in Detectron2LayoutModel #75
    • Remove unnecessary class annotations (that might break things for Python 3.6 users) #75
    Source code(tar.gz)
    Source code(zip)
  • v0.3.0(Sep 13, 2021)

    We are excited to release LayoutParser v0.3.0, with a lot of exciting updates and functional improvements.

    New Features

    • The biggest change in this version is that LayoutParser now supports multiple deep learning backends: Detectron2, effdet, and paddledetection. This allows for more flexible usage of the layoutparser library, and makes it easier for implementing customized layout models in the future. #54 #67
    • Additionally, the newly added AutoModel and the improved model configuration parsing make it easier to load and use the layout detection models. #69
      • e.g, model = lp.AutoLayoutModel("lp://efficientdet/PubLayNet").
    • To support this multi-backend framework, we implement the dynamic importing mechanism as well as better ways for installing layoutparser and the needed dependencies (see instructions). #65 #68
    • layoutparser now also supports directly loading PDF files as layout objects: #71
      import layoutparser as lp
      pdf_layout, pdf_images = lp.load_pdf("path/to/pdf", load_images=True)
      lp.draw_box(pdf_images[0], pdf_layout[0])
      
    • To support more flexible processing of the layout objects, a set of new toolkits are available: #72
      import layoutparser as lp
      page_layout = lp.load_pdf("tests/fixtures/io/example.pdf")[0]
      pdf_lines = lp.simple_line_detection(page_layout)
      

    New Models

    • Add MFD model that can detect (display) equation regions within scientific documents #59
    Source code(tar.gz)
    Source code(zip)
  • v0.2.0(Apr 12, 2021)

    Layout Parser v0.2.0 Release Notes

    New Features

    1. Support for loading and exporting the layout data in JSON and CSV, see #6
    2. Add support for union and intersect operations, see #20 and the detailed explanation
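
To make the union and intersect operations concrete, here is a minimal sketch on axis-aligned rectangles. The `Rect` class is hypothetical and written only for this illustration; it is not layoutparser's own class, and the choice to return `None` for non-overlapping rectangles is an assumption of the sketch rather than the library's documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """Hypothetical axis-aligned rectangle: top-left (x_1, y_1), bottom-right (x_2, y_2)."""
    x_1: float
    y_1: float
    x_2: float
    y_2: float

    def union(self, other: "Rect") -> "Rect":
        # Smallest rectangle that contains both inputs.
        return Rect(
            min(self.x_1, other.x_1),
            min(self.y_1, other.y_1),
            max(self.x_2, other.x_2),
            max(self.y_2, other.y_2),
        )

    def intersect(self, other: "Rect") -> Optional["Rect"]:
        # Overlapping region, or None when the rectangles do not overlap.
        x_1 = max(self.x_1, other.x_1)
        y_1 = max(self.y_1, other.y_1)
        x_2 = min(self.x_2, other.x_2)
        y_2 = min(self.y_2, other.y_2)
        if x_1 >= x_2 or y_1 >= y_2:
            return None
        return Rect(x_1, y_1, x_2, y_2)


a = Rect(0, 0, 10, 10)
b = Rect(5, 5, 20, 15)
merged = a.union(b)
overlap = a.intersect(b)
```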

    Improvements

    1. Functional improvements:
      1. When loading Layout Parser official models, Detectron2LayoutModel can automatically detect the label_map. For example,

        model = lp.Detectron2LayoutModel("lp://HJDataset/faster_rcnn_R_50_FPN_3x/config")
        model.label_map
        # {1: 'Page Frame', ... }
        
      2. Detectron2LayoutModel now supports the enforce_cpu flag, which forces the model to run on the CPU even when CUDA devices are available.

      3. visualization.draw_box now supports a show_element_type flag that shows the bbox category name at the top-left corner of each layout object.

    2. Improve installation command and documentation, especially for installing Detectron2 on Windows platforms #25

    New Models

    1. Add the TableBank detection models that can identify table regions

    Fixes

    1. Fix the incorrect layout issue mentioned in #9 - Thanks to @remidbs.
    2. Fix some of the dependency issues mentioned in #11 and #13 by using iopath instead of fvcore. See #18. Thanks to @edisongustavo.
  • v0.1.3(Dec 21, 2020)

    Improvements:

    • Supports lazy loading for the Detectron2 module. The Detectron2 dependency is now required only when you explicitly create a Detectron2LayoutModel object, so the plain layoutparser library can be used without installing Detectron2.
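
The lazy-loading pattern can be sketched as follows. This is an illustration of the general technique under stated assumptions, not layoutparser's actual code: `LazyBackendModel` and `MissingBackendModel` are hypothetical names, and the standard-library `json` module stands in for a heavy optional dependency such as Detectron2.

```python
import importlib


class LazyBackendModel:
    """Hypothetical model wrapper that imports its backend only on construction."""

    BACKEND_MODULE = "json"  # stand-in for a heavy optional dependency

    def __init__(self):
        try:
            self.backend = importlib.import_module(self.BACKEND_MODULE)
        except ImportError as exc:
            raise ImportError(
                f"{self.BACKEND_MODULE!r} is required to create "
                f"{type(self).__name__}; install it first."
            ) from exc


class MissingBackendModel(LazyBackendModel):
    # A module name assumed not to exist, to show the deferred failure.
    BACKEND_MODULE = "a_backend_that_is_not_installed"


# Defining the classes above imports nothing; the backend is requested
# only here, when an object is created.
model = LazyBackendModel()

try:
    MissingBackendModel()
    missing_error = None
except ImportError as exc:
    missing_error = str(exc)
```

Because the import happens inside `__init__`, code that never creates the model never needs the backend installed.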

    New models:

    • Incorporated a pre-trained model based on the NewspaperNavigator dataset: lp://NewspaperNavigator/faster_rcnn_R_50_FPN_3x/config

    Fixes:

    • Corrected a bug in visualization that might overwrite the original image
  • v0.1.2(Oct 30, 2020)

    In this version, we released a new model for PubLayNet and made several improvements:

    1. We released the mask_rcnn_X_101_32x8d_FPN_3x model trained on the PubLayNet dataset. Note: it has been trained on the full training set (while the others are only trained on the validation set), and you can expect a 15% performance improvement with this new model.
    2. We improved the support for PIL images in both layout modeling and visualization.
    3. We improved the default language settings for the Tesseract OCR model.
  • v0.1.1(Jul 16, 2020)

    Fixes

    • Fixed a bug that could cause errors in loading Prima Models

    Updates

    • Updated the PRIMA Mask R-CNN model with higher accuracy, and listed detailed evaluation reports.
  • v0.1.0(Jun 24, 2020)

    layoutparser now supports the following functionalities:

    • Coordinate system:

      • Supports the 3 basic coordinate systems and their geometric relationships
      • Supports the TextBlock and Layout system for convenient coordinate and text processing
    • OCR System:

      • Supports OCR based on the Google Cloud Vision and Tesseract APIs.
    • Layout Modeling:

      • Supports using pre-trained Deep Learning models for layout object detection using Detectron2
    • Visualization:

      • Supports highly-customizable presentation of the box coordinates and text in the detected layout