Code for the Open Data Day 2022 publicbodies.org Nepal data scraping activities.

Overview

Open Data Day Publicbodies.org Nepal

We've gathered on Saturday, 5th March 2022 with Open Knowledge Nepal in order to try and automate the collection of data about Nepalese public organizations.

Participants

Data source

The data source used was the government directory at:

mofaga.gov.np

It was not in machine readable format, so we had to use web scaping for that.

Code

The code is in nepal-public-bodies.ipynb. in Jupyter Notebook format.

Next steps

Some things we still need to do:

  • transform the notebook into a simple scraping script
  • map the fields of the resulting csv to the format used by publicbodies.org
  • create a Github Actions job to automate collection by publicbodies.org
You might also like...
An open-source CLI tool for backing up RDS(PostgreSQL) Locally or to Amazon S3 bucket

An open-source CLI tool for backing up RDS(PostgreSQL) Locally or to Amazon S3 bucket

Quickly open any path on your terminal window in your $EDITOR of choice!
Quickly open any path on your terminal window in your $EDITOR of choice!

Tmux fpp Plugin wrapper around Facebook PathPicker. Quickly open any path on your terminal window in your $EDITOR of choice! Demo Dependencies fpp - F

An open source terminal project made in python

Calamity-Terminal An open source terminal project made in python. Calamity Terminal is a free and open source lightweight terminal. Its made 100% off

WebApp Maker make web apps (Duh). It is open source and make with python and shell.

WebApp Maker make web apps (Duh). It is open source and make with python and shell. This app can take any website and turn it into an app. I highly recommend turning these few websites into webapps: - Krunker.io (Fps Game) - play.fancade.com (Minigame Arcade) - Your Own Website If You Have One Apart from that enjoy my app By 220735540 (a.k.a RP400)

gget is a free and open-source command-line tool and Python package that enables efficient querying of genomic databases.
gget is a free and open-source command-line tool and Python package that enables efficient querying of genomic databases.

gget is a free and open-source command-line tool and Python package that enables efficient querying of genomic databases. gget consists of a collection of separate but interoperable modules, each designed to facilitate one type of database querying in a single line of code.

CLabel is a terminal-based cluster labeling tool that allows you to explore text data interactively and label clusters based on reviewing that data.
CLabel is a terminal-based cluster labeling tool that allows you to explore text data interactively and label clusters based on reviewing that data.

CLabel is a terminal-based cluster labeling tool that allows you to explore text data interactively and label clusters based on reviewing that

Seamlessly run Python code in IPython from Vim
Seamlessly run Python code in IPython from Vim

Seamlessly run Python code from Vim in IPython, including executing individual code cells similar to Jupyter notebooks and MATLAB. This plugin also supports other languages and REPLs such as Julia.

A terminal UI dashboard to monitor requests for code review across Github and Gitlab repositories.
A terminal UI dashboard to monitor requests for code review across Github and Gitlab repositories.

A terminal UI dashboard to monitor requests for code review across Github and Gitlab repositories.

Magma is a NeoVim plugin for running code interactively with Jupyter.
Magma is a NeoVim plugin for running code interactively with Jupyter.

Magma Magma is a NeoVim plugin for running code interactively with Jupyter. Requirements NeoVim 0.5+ Python 3.8+ Required Python packages: pynvim (for

Owner
Augusto Herrmann
Creator of dados.gov.br, co-founder of dadosabertos.social, data engineer and contributor to open data and free software projects.
Augusto Herrmann
gcptree - Like the unix tree command but for GCP Org Heirarchy

gcptree Like the unix tree command but for GCP Org Heirarchy. For a note on coloring, the org node is green, folders and blue, and projects that are n

Ryan Canty 25 Sep 6, 2022
A mini command line tool to spellcheck text files using tadqeek.alsharekh.org

tadqeek_sakhr A mini command line tool to spellcheck text files using tadqeek.alsharekh.org Usage usage: python tadqeek_sakhr.py [-h] -i INPUT [-o OUT

Youssif Shaaban Alsager 5 Dec 11, 2022
Regis-ltmpt-auto - Program register ltmpt 2022 automatis

LTMPT Register Otomatis 2022 Program register ltmpt 2022 automatis dibuat untuk

null 1 Jan 13, 2022
🏃 Python3 Solutions of All Problems in GCJ 2022 (In Progress)

GoogleCodeJam 2022 Python3 solutions of Google Code Jam 2022. Solution begins with * means it will get TLE in the largest data set. Total computation

kamyu 12 Dec 20, 2022
Open a file in your locally running Visual Studio Code instance from arbitrary terminal connections.

code-connect Open a file in your locally running Visual Studio Code instance from arbitrary terminal connections. Motivation VS Code supports opening

Christian Volkmann 56 Nov 19, 2022
open a remote repo locally quickly

A command line tool to peek a remote repo hosted on github or gitlab locally and view it in your favorite editor. The tool handles cleanup of the repo once you exit your editor.

Rahul Nair 44 Dec 16, 2022
Free and Open-Source Command Line tool for Text Replacement

Sniplet Free and Open Source Text Replacement Tool Description: Sniplet is a work in progress CLI tool which can do text replacement globally in Linux

Veeraraghavan Narasimhan 13 Nov 28, 2022
Open-Source Python CLI package for copying DynamoDB tables and items in parallel batch processing + query natural & Global Secondary Indexes (GSIs)

Python Command-Line Interface Package to copy Dynamodb data in parallel batch processing + query natural & Global Secondary Indexes (GSIs).

null 1 Oct 31, 2021
Professor Wordlist is a free open source command line tool written in python

Professor Wordlist is a free open source command line tool written in python, With the aim of generating custom wordlists with a variety of unique parameters and functions providing many possibilities.

オークO A K Z E H オーク 1 Oct 28, 2021