A proof-of-concept Jupyter extension which converts English queries into relevant Python code

Overview

Text2Code for Jupyter notebook

Blog post with more details:

Data analysis made easy: Text2Code for Jupyter notebook

Demo Video:

Text2Code for Jupyter notebook

Supported Operating Systems:

  • Ubuntu
  • macOS

Installation

NOTE: The plugin has been renamed from mopp to jupyter-text2code. Uninstall mopp before installing the new jupyter-text2code version.

pip uninstall mopp

CPU-only install:

On macOS, and on Ubuntu machines without an NVIDIA GPU, explicitly set an environment variable at install time.

export JUPYTER_TEXT2CODE_MODE="cpu"
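
As a rough illustration of why the variable matters: the cpu-only change describes faiss and faiss-cpu as different libraries, so an install script could choose between them based on this variable. The sketch below is an assumption about how such a switch might look, not the project's actual setup code; the function name select_faiss_requirement is hypothetical.

```python
import os


def select_faiss_requirement(env=None):
    """Return the faiss package name to request for the given environment.

    Hypothetical helper: assumes "cpu" selects the CPU-only package and
    any other value (or no value) keeps the default GPU-capable one.
    """
    env = os.environ if env is None else env
    mode = env.get("JUPYTER_TEXT2CODE_MODE", "").strip().lower()
    return "faiss-cpu" if mode == "cpu" else "faiss"


if __name__ == "__main__":
    # Reads the variable exported in the shell before `pip install .`
    print(select_faiss_requirement())
```

Because the variable is read at install time, it needs to be exported in the same shell session that runs pip install.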

GPU install dependencies:

sudo apt-get install libopenblas-dev libomp-dev

Installation commands:

git clone https://github.com/deepklarity/jupyter-text2code.git
cd jupyter-text2code
pip install .
jupyter nbextension enable jupyter-text2code/main

Uninstallation:

pip uninstall jupyter-text2code

Usage Instructions:

  • Start the Jupyter notebook server by running: jupyter notebook
  • If you don't see the Nbextensions tab in Jupyter notebook, run: jupyter contrib nbextension install --user
  • Open the sample notebook notebooks/ctds.ipynb for testing
  • If installation succeeded, the Universal Sentence Encoder model is downloaded from tensorflow_hub on first use
  • Click the Terminal icon that appears in the menu to activate the extension
  • Type "help" to see a list of currently supported commands
  • Watch the demo video for some examples

Docker containers for jupyter-text2code

We have published CPU and GPU images to Docker Hub with all dependencies pre-installed.

Visit https://hub.docker.com/r/deepklarity/jupyter-text2code/ to download the images and usage instructions.
CPU image size: 1.51 GB
GPU image size: 2.56 GB

Model training:

Generate training data:

From a list of templates present at jupyter_text2code/jupyter_text2code_serverextension/data/ner_templates.csv, generate training data by running the following command:

cd scripts && python generate_training_data.py

This command generates data for intent matching and NER (Named Entity Recognition).
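
To make the idea of template-based generation concrete, here is a minimal sketch. It assumes each template is a sentence with $-prefixed slots and an intent_id; the templates, slot names, and value lists below are invented for illustration and do not reflect the actual contents of ner_templates.csv or generate_training_data.py.

```python
import random

# Hypothetical templates: (intent_id, sentence with $slots)
TEMPLATES = [
    (0, "import $libname"),
    (1, "show $numrows rows from $varname"),
]

# Hypothetical values used to fill each slot
VALUES = {
    "libname": ["numpy", "pandas", "spacy"],
    "varname": ["df", "sales", "train_data"],
    "numrows": ["5", "10", "20"],
}


def fill_template(template, rng):
    """Fill every $slot with a random value and record its character span."""
    text, entities = "", []
    for token in template.split():
        if text:
            text += " "
        if token.startswith("$"):
            label = token[1:]
            value = rng.choice(VALUES[label])
            # (start, end, label) span, the usual shape for NER training data
            entities.append((len(text), len(text) + len(value), label))
            text += value
        else:
            text += token
    return text, entities


def generate(n_per_template, seed=0):
    """Produce rows usable for both intent matching and NER training."""
    rng = random.Random(seed)
    rows = []
    for intent_id, template in TEMPLATES:
        for _ in range(n_per_template):
            text, entities = fill_template(template, rng)
            rows.append({"intent_id": intent_id, "text": text, "entities": entities})
    return rows


if __name__ == "__main__":
    for row in generate(2):
        print(row)
```

Each row carries both labels at once: intent_id feeds the intent matcher, and the entity spans feed the NER model.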

Create the faiss intent index

Use the generated data to create an intent matcher using faiss.

cd scripts && python create_intent_index.py
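
Conceptually, the intent index maps a query to the closest known training sentence and returns that sentence's intent. The sketch below shows the lookup with a toy bag-of-words vector and a brute-force cosine search in pure Python. It is a stand-in under stated assumptions: the real pipeline uses Universal Sentence Encoder embeddings and a faiss index, and the IntentIndex class and example sentences here are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; stands in for a real sentence encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class IntentIndex:
    """Brute-force nearest-neighbour lookup over labelled example sentences."""

    def __init__(self, examples):
        # examples: iterable of (intent_id, sentence)
        self.items = [(intent_id, embed(sentence)) for intent_id, sentence in examples]

    def search(self, query):
        """Return (best_score, intent_id) for the most similar example."""
        q = embed(query)
        return max((cosine(q, vec), intent_id) for intent_id, vec in self.items)


if __name__ == "__main__":
    index = IntentIndex([
        (0, "import pandas"),
        (1, "show 5 rows from df"),
        (2, "plot histogram of column age"),
    ])
    print(index.search("show 10 rows from sales"))
```

A dedicated index library does the same nearest-neighbour search far faster on dense vectors, which matters once the generated training set grows large.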

Train NER model

cd scripts && python train_spacy_ner.py

Steps to add more intents:

  • Add more templates in ner_templates with a new intent_id
  • Generate training data. Modify generate_training_data.py if different generation techniques are needed or if introducing a new entity.
  • Train intent index
  • Train NER model
  • Modify jupyter_text2code/jupyter_text2code_serverextension/__init__.py to add a condition for the new intent and the code it should generate
  • Reinstall plugin by running: pip install .
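
The last code change in the steps above, wiring a new intent to the code it emits, can be pictured as follows. This is a sketch of one possible structure (a dictionary of handlers), not the project's actual __init__.py; the intent ids, entity names, and function names are all hypothetical.

```python
def gen_import(entities):
    """Hypothetical intent 0: import a library."""
    return f"import {entities['libname']}"


def gen_head(entities):
    """Hypothetical intent 1: show the first rows of a dataframe."""
    return f"{entities['varname']}.head({entities['numrows']})"


# Adding an intent means adding one entry here plus its handler function.
INTENT_HANDLERS = {
    0: gen_import,
    1: gen_head,
}


def generate_code(intent_id, entities):
    """Turn a matched intent and its extracted entities into a code string."""
    handler = INTENT_HANDLERS.get(intent_id)
    if handler is None:
        return "# no code template for this intent"
    return handler(entities)


if __name__ == "__main__":
    print(generate_code(1, {"varname": "df", "numrows": "5"}))
```

The intent_id used as a key must match the intent_id given to the new templates, since that is the value the intent matcher returns.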

TODO:

  • Publish Docker image
  • Refactor code to make it more modular, remove duplicate code, etc.
  • Add support for Windows
  • Add support for more commands
  • Improve intent detection and NER
  • Explore sentence Paraphrasing to generate higher-quality training data
  • Gather real-world variable names, library names as opposed to randomly generating them
  • Try NER with a transformer-based model
  • With enough data, train a language model to directly do English->code like GPT-3 does, instead of having separate stages in the pipeline
  • Create a survey to collect linguistic data
  • Add Speech2Code support

Comments
  • I'm using ubuntu and is it mandatory to run the cd scripts and python create_intent_index.py?

    I've installed the library and it downloaded the model from tensorflow_hub, but no pop-up appears. Is it mandatory to run the commands mentioned in the documentation on GitHub?

    I've run the generate and create-intent scripts, but I didn't understand the "modify" step that is mentioned.

    The library is installed, but there is no extra button for entering input.

    opened by nithinreddyy 12
  • Installation working but not Text2Code

    Hi,

    I find the concept really interesting. However, after installing I get the following error:

    [W 19:07:57.734 NotebookApp] Error loading server extension jupyter_text2code.jupyter_text2code_serverextension
    Traceback (most recent call last):
      File "/Users/hicham/.local/lib/python3.7/site-packages/notebook/notebookapp.py", line 1942, in init_server_extensions
        mod = importlib.import_module(modulename)
      File "/opt/anaconda3/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 728, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/opt/anaconda3/lib/python3.7/site-packages/jupyter_text2code/jupyter_text2code_serverextension/__init__.py", line 5, in <module>
        import faiss
      File "/opt/anaconda3/lib/python3.7/site-packages/faiss/__init__.py", line 19, in <module>
        from .swigfaiss import *
      File "/opt/anaconda3/lib/python3.7/site-packages/faiss/swigfaiss.py", line 649, in <module>
        class HammingComputer4(_object):
      File "/opt/anaconda3/lib/python3.7/site-packages/faiss/swigfaiss.py", line 655, in HammingComputer4
        __swig_setmethods__["a0"] = _swigfaiss.HammingComputer4_a0_set
    AttributeError: module 'faiss._swigfaiss' has no attribute 'HammingComputer4_a0_set'

    Resulting in this error when trying to use the terminal on a notebook:

    [W 19:08:00.061 NotebookApp] 404 GET /nbextensions/jupyter-text2code/icon.jpg (::1) 7.30ms referer=http://localhost:8888/tree

    [W 19:08:21.215 NotebookApp] 404 GET /jupyter-text2code?query=help&dataframes_info=%27%7B%7D%27 (::1) 2.78ms referer=http://localhost:8888/notebooks/Untitled6.ipynb

    Thanks for your help!

    opened by Rundel12 3
  • Terminal icon not showing

    I am installing on macOS Mojave, CPU only. I have followed the CPU-only instructions from the README. However, the terminal icon does not show up in Jupyter. How do I fix this?

    opened by geekjr 3
  • Can anyone or author post a video on how to install this library?

    I've tried my best to install this library. It installed, but no input box appears for entering a query. Can anyone, or the author of this library, create a video on how to install it from scratch?

    opened by nithinreddyy 2
  • ERROR: No matching distribution found for faiss (from mopp==0.0.1)

    Hi, I'm getting the error below. Please help.

    ERROR: Could not find a version that satisfies the requirement faiss (from mopp==0.0.1) (from versions: none)
    ERROR: No matching distribution found for faiss (from mopp==0.0.1)

    opened by Manikumar34 2
  • NotImplementedError: Cannot convert a symbolic Tensor (strided_slice_2:0) to a numpy array.

    Hi, when playing around with the demo notebook, I received the above error after executing the following cell:

    cg.generate_code("import spacy")

    What do you think is the solution?

    Operating system: macOS
    Python version: 3.7
    TensorFlow version: 1.15.2

    opened by ahmedahmedov 2
  • Faiss isn't available on Windows as pip install

    (virtenv)\>jupyter-text2code>pip install .
    Processing \jupyter-text2code
    Requirement already satisfied: jupyter in c:\programdata\anaconda3\envs\virtenv\lib\site-packages (from mopp==0.0.1) (1.0.0)
    Requirement already satisfied: jupyter_nbextensions_configurator in c:\programdata\anaconda3\envs\virtenv\lib\site-packages (from mopp==0.0.1) (0.4.1)
    ERROR: Could not find a version that satisfies the requirement faiss (from mopp==0.0.1) (from versions: none)
    ERROR: No matching distribution found for faiss (from mopp==0.0.1)
    

    Perhaps an alternative to faiss could be introduced for cosine similarity

    bug 
    opened by dk-crazydiv 2
  • Getting the error below when running Jupyter notebook after pip-installing text2code

    [W 14:18:03.862 NotebookApp] Error loading server extension mopp.mopp_serverextension
    Traceback (most recent call last):
      File "/Users/varun/venv/ml/lib/python3.7/site-packages/notebook/notebookapp.py", line 1942, in init_server_extensions
        mod = importlib.import_module(modulename)
      File "/Users/varun/venv/ml/lib/python3.7/importlib/__init__.py", line 127, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
      File "<frozen importlib._bootstrap>", line 983, in _find_and_load
      File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
      File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 728, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "/Users/varun/venv/ml/lib/python3.7/site-packages/mopp/mopp_serverextension/__init__.py", line 487, in <module>
        CG = CodeGenerator()
      File "/Users/varun/venv/ml/lib/python3.7/site-packages/mopp/mopp_serverextension/__init__.py", line 48, in __init__
        self.nlp = spacy.load(SPACY_MODEL_DIR)
      File "/Users/varun/venv/ml/lib/python3.7/site-packages/spacy/__init__.py", line 30, in load
        return util.load_model(name, **overrides)
      File "/Users/varun/venv/ml/lib/python3.7/site-packages/spacy/util.py", line 166, in load_model
        return load_model_from_path(Path(name), **overrides)
      File "/Users/varun/venv/ml/lib/python3.7/site-packages/spacy/util.py", line 211, in load_model_from_path
        return nlp.from_disk(model_path, exclude=disable)
      File "/Users/varun/venv/ml/lib/python3.7/site-packages/spacy/language.py", line 947, in from_disk
        util.from_disk(path, deserializers, exclude)
      File "/Users/varun/venv/ml/lib/python3.7/site-packages/spacy/util.py", line 654, in from_disk
        reader(path / key)
      File "/Users/varun/venv/ml/lib/python3.7/site-packages/spacy/language.py", line 931, in p
      File "vocab.pyx", line 466, in spacy.vocab.Vocab.from_disk
      File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1203, in open
        opener=self._opener)
      File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/pathlib.py", line 1058, in _opener
        return self._accessor.open(self, flags, mode)
    FileNotFoundError: [Errno 2] No such file or directory: '/Users/varun/venv/ml/lib/python3.7/site-packages/mopp/mopp_serverextension/models/ner/vocab/lexemes.bin'

    opened by varunmittal50 1
  • Import error

    I installed the extension in Google Colab with the code below:

    import os
    MYDIR = ("jupyter-text2code-master")
    CHECK_FOLDER = os.path.isdir(MYDIR)
    if not CHECK_FOLDER:
        !apt install unzip
        !wget -q https://github.com/deepklarity/jupyter-text2code/archive/master.zip
        !unzip -q master.zip
        !pip install /content/jupyter-text2code-master
        !apt-get install libomp-dev
        !apt-get install libomp-doc
    !pwd

    But when I try to import the extension, I get the error below:

    import sys
    sys.path.insert(0,'../')
    from mopp.mopp_serverextension import CodeGenerator

    FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.6/dist-packages/mopp/mopp_serverextension/models/ner/vocab/lexemes.bin'

    opened by vj-python 1
  • Adding cpu-only support

    • Added support for faiss & faiss-cpu being different libraries to handle seamless installation on cpu-only machines.
    • Explicitly set tensorflow_hub Universal Sentence Encoder model download directory to make model persist across reboots.
    opened by dk-crazydiv 0
  • Why not use the Codex API?

    As far as I understand, this seems like GitHub Copilot for Jupyter, so why not use the official Codex API to convert English to code?

    P.S: I may be wrong in understanding the project, in which case this issue doesn't make much sense :)

    opened by frankhart2018 0
  • Can't find the terminal icon after loading the ctds notebook

    Which is one of the steps in usage instructions "Click on the Terminal Icon which appears on the menu (to activate the extension)"

    Any help would be appreciated.

    Thank you!

    opened by ronithomasAtDure 0
  • Usage issue with docker

    I used the Docker images for installation, and it throws an error. Every requirement has been fulfilled and I enabled the nbextensions as well, but it is not working.

    opened by aadilganigaie 0
  • Installation issue in Ubuntu VM

    I followed the tutorial and installed everything required on Ubuntu 20.04 in a VirtualBox VM. Everything works fine until I get to these commands:

    pip install .
    jupyter nbextension enable jupyter-text2code/main

    I get various errors, such as "jupyter-nbextension is not found" or "could not find a version that satisfies the requirement faiss (from jupyter-text2code==0.0.2)", and others. How can I proceed?

    opened by sant3e 0
  • Issue with installation on windows using anaconda

    I am facing an issue installing on Windows for Jupyter notebook using Anaconda.

    ERROR: Could not find a version that satisfies the requirement faiss (from jupyter-text2code==0.0.2) (from versions: none)
    ERROR: No matching distribution found for faiss (from jupyter-text2code==0.0.2)

    opened by aadilganigaie 4
Owner
DeepKlarity