About: A library for extracting information from Thai personal identity cards.

Overview

ThaiPersonalCardExtract


A library for extracting information from Thai personal identity cards, implemented on top of easyocr and tesseract.

New Feature v1.3.2 🎁

  • Improved performance.
  • Support for Thai Government Lottery tickets: extracts data from lottery tickets and works well with images produced by a scanner. (16 Aug. 2021)
  • Refactored output structure.
  • Support for Thai driving licenses (beta): data can be extracted from photos of only some license layouts. The Department of Land Transport issues many card layouts, each with its data in different positions, so extraction performance is currently low.

Examples

Example image files.

[Figure: three real card photos]

Card image cropped with warpPerspective.

[Figure: two perspective-warped card crops]

Keypoints detected in the image.

[Figure: detected keypoints]

Results of the library's region-of-interest extraction:

Identification Number

FullNameTH

NameEN

LastNameEN

BirthdayTH

BirthdayEN

Religion

Address

DateOfIssueTH

DateOfIssueEN

DateOfExpiryTH

DateOfExpiryEN

Recommendations ⚠

  • Images should be at least 600x350 pixels.
  • Use images with minimal reflections for good results.
  • The identity card should fill about 75% of the image. If it does not, crop the image so that only the card area remains.
  • For faster processing, resize the image and use a CUDA GPU.
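The size and coverage recommendations above can be checked before running OCR. The following is a minimal sketch, not part of the library: the `Box` type, the `check_image` function, and the 10% tolerance are hypothetical names and values chosen for illustration, and the card bounding box is assumed to come from your own detection step.

```python
from typing import List, NamedTuple

MIN_WIDTH, MIN_HEIGHT = 600, 350   # minimum image size from the recommendations
TARGET_CARD_COVERAGE = 0.75        # the card should fill about 75% of the image


class Box(NamedTuple):
    """Axis-aligned rectangle in pixels (hypothetical helper type)."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def area(self) -> int:
        return max(0, self.right - self.left) * max(0, self.bottom - self.top)


def check_image(width: int, height: int, card: Box, tolerance: float = 0.10) -> List[str]:
    """Return human-readable warnings; an empty list means no issue was found."""
    warnings = []
    if width < MIN_WIDTH or height < MIN_HEIGHT:
        warnings.append(
            f"image is {width}x{height}, below the recommended {MIN_WIDTH}x{MIN_HEIGHT}"
        )
    # Fraction of the image covered by the card (0 for a degenerate image).
    coverage = card.area / (width * height) if width > 0 and height > 0 else 0.0
    if coverage < TARGET_CARD_COVERAGE - tolerance:
        warnings.append(f"card covers {coverage:.0%} of the image; crop closer to the card")
    return warnings


# A 1200x700 photo where the card only fills the top-left quarter.
for message in check_image(1200, 700, Box(0, 0, 600, 350)):
    print(message)
```

An empty result does not guarantee good OCR output; reflections and blur still have to be judged separately.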

Installation

Install the stable release with pip:

pip install thai-personal-card-extract

For the latest development release:

pip install git+https://github.com/ggafiled/ThaiPersonalCardExtract.git

Note 1: On Windows, install tesseract first by following the instructions in this Medium article: https://medium.com/@navapat.tpb/734dae2fb4d3. Make sure the setup is complete before using the library.

Note 2: On Linux, install tesseract by following the official instructions at https://github.com/tesseract-ocr/tesseract
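Before creating a reader, it can help to confirm that the tesseract executable is actually reachable. This is a small sketch using only the Python standard library; `find_tesseract` is a hypothetical helper name, not something the library provides.

```python
import shutil
from typing import Optional


def find_tesseract(explicit_path: Optional[str] = None) -> Optional[str]:
    """Return the path to a usable tesseract executable, or None if none is found."""
    # shutil.which accepts either a bare command name (searched on PATH)
    # or a path to a specific executable.
    return shutil.which(explicit_path or "tesseract")


path = find_tesseract()
if path is None:
    print("tesseract not found on PATH; install it or pass tesseract_cmd explicitly")
else:
    print(f"tesseract found at {path}")
```

On Windows, the value passed as `tesseract_cmd` in the usage examples can be given to `find_tesseract` to check that the path is correct.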

Usage

# With built-in config options.

import ThaiPersonalCardExtract as card
reader = card.PersonalCard(
    lang=card.THAI,
    provider=card.DEFAULT,
    tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract",
    save_extract_result=True,
    path_to_save="D:/dev/ThaiPersonalCardExtract/examples/extract")
result = reader.extractInfo('examples/card.jpg')
print(result)


# Free-style: example of using the PersonalCard class to extract data from a Thai national ID card

from ThaiPersonalCardExtract import PersonalCard
reader = PersonalCard(lang="mix", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # On Windows, pass tesseract_cmd to set the path to your tesseract command.
result = reader.extractInfo('examples/card.jpg')
print(result)


# Free-style: example of using the DrivingLicense class to extract data from a driving license

from ThaiPersonalCardExtract import DrivingLicense
reader = DrivingLicense(lang="mix", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # On Windows, pass tesseract_cmd to set the path to your tesseract command.
result = reader.extractInfo('examples/card.jpg')
print(result)


# Free-style: example of using the ThaiGovernmentLottery class to extract data from a lottery ticket

from ThaiPersonalCardExtract import ThaiGovernmentLottery
reader = ThaiGovernmentLottery(save_extract_result=True, path_to_save="D:/dev/ThaiPersonalCardExtract/examples/extract/thai_government_lottery") # On Windows, pass tesseract_cmd to set the path to your tesseract command.
result = reader.extractInfo("../examples/card7.jpg")
print(result)

The output is a namedtuple. Each field holds the corresponding value that the library was able to extract. Refer to the namedtuple documentation for more on how to work with it.

#Output of PersonalCard
    Card(Identification_Number='9999999999999', FullNameTH='นาย อายุมบมุราเสะ', PrefixTH='นาย', NameTH='อายุมบมุราเสะ', LastNameTH='อายุมบมุราเสะ', PrefixEN='.Mr.Shoyo', NameEN='', LastNameEN='Hinatao', BirthdayTH='21 มี.ย. 2539', BirthdayEN='21 Jun..1996', Religion='พุทธ', Address='ท8ปบ` 99/1 มิซีโฮะ เขตฮานามิกาวา อำเภอชิบ', DateOfIssueTH='11 ส.ค. 2554', DateOfIssueEN='11 Ang. 2021', DateOfExpiryTH='11 ส.ค. 2574', DateOfExpiryEN='11 Aug. 2031,')

#Output of DrivingLicense
    Card(License_Number='98765432', IssueDateTH='ผังทาทม', ExpiryDateTH='', IssueDateEN='14 August 2664', ExpiryDateEN='14 August 2574', NameTH='า? โนบกะ โนบี', NameEN='MRONOREAUMANE', BirthDayTH='', BirthDayEN='wa hs OKRA', Identity_Number='', Province='นคาราชศีมา')

#Output of ThaiGovernmentLottery
    Lottery(LotteryNumber='424603', LessonNumber='08', SetNumber='23', Year='2564') #type namedtuple 
    
 # The fields can be accessed by name, like this:
 print(result.LotteryNumber)
 print(result.LessonNumber)
 print(result.SetNumber)
 print(result.Year)
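Because the result is a namedtuple, the usual standard-library tools apply to it. The sketch below rebuilds the sample lottery output with a stand-in `Lottery` type (the real type comes from the library) and shows tuple unpacking, conversion to a dict, and a check for empty fields. Treating an empty string as "not extracted" is an assumption based on the sample outputs, where unread fields appear as `''`.

```python
import json
from collections import namedtuple

# Stand-in for the library's result type, built from the sample output above.
Lottery = namedtuple("Lottery", ["LotteryNumber", "LessonNumber", "SetNumber", "Year"])
result = Lottery(LotteryNumber="424603", LessonNumber="08", SetNumber="23", Year="2564")

# Tuple unpacking works because a namedtuple is still a tuple.
number, lesson, set_no, year = result

# _asdict() returns a plain dict, which is convenient for JSON or database rows.
as_dict = result._asdict()
print(json.dumps(as_dict, ensure_ascii=False))

# Assumption: an empty string means the field could not be read.
missing = [name for name, value in as_dict.items() if value == ""]
print("fields not extracted:", missing)
```

Passing `ensure_ascii=False` keeps Thai text readable in the JSON output instead of escaping it.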

To get only the Thai-language fields, set the lang attribute to tha:

from ThaiPersonalCardExtract import PersonalCard
reader = PersonalCard(lang="tha", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # On Windows, pass tesseract_cmd to set the path to your tesseract command.
result = reader.extractInfo('examples/card.jpg')
print(result)

The output contains one entry for each field the library was able to extract:

{
   "Identification_Number": "9999999999999",
   "FullNameTH": "นาย อายุมบมุราเสะ",
   "PrefixTH": "นาย",
   "NameTH": "อายุมบมุราเสะ",
   "LastNameTH": "อายุมบมุราเสะ",
   "BirthdayTH": "21 มี.ย. 2539",
   "Religion": "พุทธ",
   "Address": "ท๒ 99/1 มิชีโฮะ เขตฮานามิกาวา อำเภอชิบ;",
   "DateOfIssueTH": "11 ส.ค. 2554",
   "DateOfExpiryTH": "11 ส.ค. 2574"
}
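The Thai date fields in the sample outputs use Buddhist Era (BE) years, which run 543 years ahead of the Common Era (CE): the sample birthday year 2539 corresponds to 1996. If only the Thai fields are extracted, the CE year can be recovered afterwards. This is a hedged sketch for post-processing, not library functionality; `be_to_ce` and `ce_year_from_thai_date` are hypothetical helper names, and the code assumes the year is the last four-digit group in the string.

```python
import re
from typing import Optional

BE_OFFSET = 543  # Buddhist Era year = Common Era year + 543


def be_to_ce(year_be: int) -> int:
    """Convert a Buddhist Era year to a Common Era year."""
    return year_be - BE_OFFSET


def ce_year_from_thai_date(text: str) -> Optional[int]:
    """Return the CE year for a date string such as '11 ส.ค. 2574', or None."""
    # Assumption: the year is the last run of exactly four digits in the string.
    groups = re.findall(r"(?<!\d)\d{4}(?!\d)", text)
    if not groups:
        return None
    return be_to_ce(int(groups[-1]))


print(ce_year_from_thai_date("21 มี.ย. 2539"))
print(ce_year_from_thai_date("11 ส.ค. 2574"))
```

OCR noise can corrupt digits, so it is worth sanity-checking the converted year against a plausible range before relying on it.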

You can also set the OCR provider to one of the following:

  • default: uses both easyocr and tesseract (recommended)
  • easyocr
  • tesseract

from ThaiPersonalCardExtract import PersonalCard
reader = PersonalCard(lang="tha", provider="default", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract") # On Windows, pass tesseract_cmd to set the path to your tesseract command.
result = reader.extractInfo('examples/card.jpg')
print(result)

Config Options

You can configure an instance with the following keyword arguments:

Parameter name | Value type | Description
lang | String | Language of the expected results: mix (all areas, both tha and eng), tha, or eng. Default is 'mix'. Applies to DrivingLicense and PersonalCard.
provider | String | OCR provider: default (uses both easyocr and tesseract, recommended), easyocr, or tesseract. Default is 'default'. Applies to DrivingLicense and PersonalCard.
template_threshold | Double | Rate used to calculate template similarity. Default is 0.7.
sift_rate | Int | Feature keypoint rate. Default is 25,000.
tesseract_cmd | String | Path to your tesseract command. For Windows only.
save_extract_result | Boolean | Set to True to save the extracted images. Default is False.
path_to_save | String | Path where the extracted images are saved; used together with save_extract_result=True.
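When the same settings are reused across readers, it can be convenient to collect them in one place and validate them before constructing anything. The following is a sketch only: `ExtractOptions` is a hypothetical container that mirrors the documented names and defaults, and the rule that `path_to_save` must be set when `save_extract_result=True` is an assumption, not confirmed library behaviour.

```python
from dataclasses import asdict, dataclass
from typing import Optional

VALID_LANGS = {"mix", "tha", "eng"}
VALID_PROVIDERS = {"default", "easyocr", "tesseract"}


@dataclass
class ExtractOptions:
    """Hypothetical container mirroring the documented keyword options."""
    lang: str = "mix"
    provider: str = "default"
    template_threshold: float = 0.7
    sift_rate: int = 25000
    tesseract_cmd: Optional[str] = None      # Windows only
    save_extract_result: bool = False
    path_to_save: Optional[str] = None

    def validate(self) -> None:
        if self.lang not in VALID_LANGS:
            raise ValueError(f"lang must be one of {sorted(VALID_LANGS)}")
        if self.provider not in VALID_PROVIDERS:
            raise ValueError(f"provider must be one of {sorted(VALID_PROVIDERS)}")
        # Assumption: saving extracted images needs a destination path.
        if self.save_extract_result and not self.path_to_save:
            raise ValueError("path_to_save is required when save_extract_result=True")

    def as_kwargs(self) -> dict:
        """Return the options that are set, ready to pass as **kwargs."""
        self.validate()
        return {key: value for key, value in asdict(self).items() if value is not None}


options = ExtractOptions(lang="tha", tesseract_cmd="D:/Program Files/Tesseract-OCR/tesseract")
kwargs = options.as_kwargs()
print(kwargs)
# reader = PersonalCard(**kwargs)  # requires ThaiPersonalCardExtract to be installed
```

Since lang and provider are documented for DrivingLicense and PersonalCard only, drop those two keys before passing the options to other classes.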

Donate Me ☕

promptpay

Mr.Nattapol Krobklang

Comments
  • 'PersonalCard' object has no attribute 'extractInfo'

    AttributeError Traceback (most recent call last)
    /var/folders/cc/d15mjmhn5c5fscqfwq3fql5h0000gp/T/ipykernel_6920/1182933257.py in
    ----> 1 result = reader.extractInfo('ThaiPersonalCardExtract/examples/extract/image_scan.jpg')
          2 print(result)

    AttributeError: 'PersonalCard' object has no attribute 'extractInfo'

    opened by suwika
Releases(v1.3.4)
  • v1.3.4(Sep 2, 2021)

    New Feature v1.3.4 🎁

    • Support for extracting the laser code from Thai identity cards. (02 Sep. 2021)
    • Fixed a bug where the dataset folder did not include the thai_government_lottery resource. (23 Aug. 2021) #1
    • Improved performance.
    • Support for Thai Government Lottery tickets: extracts data from lottery tickets and works well with images produced by a scanner. (16 Aug. 2021)
    • Refactored output structure.
    • Support for Thai driving licenses (beta): data can be extracted from photos of only some license layouts. The Department of Land Transport issues many card layouts, each with its data in different positions, so extraction performance is currently low.
  • v1.3.1(Aug 16, 2021)

    New Feature v1.3.1 🎁

    • Improved performance.
    • Support for Thai driving licenses (beta): data can be extracted from photos of only some license layouts. The Department of Land Transport issues many card layouts, each with its data in different positions, so extraction performance is currently low.
    • Support for Thai Government Lottery tickets. (16 Aug. 2021)
  • v1.3.0(Aug 14, 2021)

    • Improved performance.
    • Support for Thai driving licenses (beta): data can be extracted from photos of only some license layouts. The Department of Land Transport issues many card layouts, each with its data in different positions, so extraction performance is currently low.
    • Reorganized the system file layout.
  • v1.2.1(Aug 13, 2021)

    New Feature 🎁

    • More areas extracted.
    • lang attribute: get only the areas of the given language.
    • provider attribute: set the OCR provider; easyocr and tesseract are now supported.
  • v1.0-beta(Aug 11, 2021)

Owner
ggafiled
I only push updates once in a while.