Code Generation using a large neural network called GPT-J

Overview

CodeGenX


CodeGenX is a code generation system powered by artificial intelligence! It is delivered as a Visual Studio Code extension and is free and open-source!


Installation

You can find installation instructions and additional information about CodeGenX in the documentation here.


About CodeGenX

1. Languages Supported

CodeGenX currently only supports Python. We are planning to add additional languages in future releases.

2. Modules Trained On

CodeGenX was trained on Python code covering many common uses of the language. Libraries it was specifically trained on include:

  1. TensorFlow
  2. PyTorch
  3. scikit-learn
  4. Pandas
  5. NumPy
  6. OpenCV
  7. Django
  8. Flask
  9. PyGame

3. How CodeGenX Works

At the core of CodeGenX lies a large neural network called GPT-J. GPT-J is a 6-billion-parameter transformer model trained on hundreds of gigabytes of text from the internet. We fine-tuned this model on a dataset of open-source Python code. Given an input with suitable instructions, such as a comment or docstring describing the desired code, the fine-tuned model generates code to match.
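
As a rough illustration of the idea above, the sketch below shows how an editor extension might turn the text before the cursor into a prompt and splice the model's continuation back into the file. It is not CodeGenX's actual implementation: `build_prompt`, `complete_at_cursor`, and the `generate` callable are hypothetical names, and the stand-in generator returns a canned string instead of calling a model.

```python
from typing import Callable


def build_prompt(source: str, cursor: int) -> str:
    """Return everything before the cursor; the model continues from here."""
    return source[:cursor]


def complete_at_cursor(source: str, cursor: int,
                       generate: Callable[[str], str]) -> str:
    """Insert the generated continuation at the cursor position."""
    prompt = build_prompt(source, cursor)
    continuation = generate(prompt)  # hypothetical call to the fine-tuned model
    return source[:cursor] + continuation + source[cursor:]


def fake_generate(prompt: str) -> str:
    """Stand-in for the model: a canned body, not real model output."""
    return (
        "    with open(filename, 'w') as f:\n"
        "        f.write(content)\n"
    )


# A function signature plus docstring acts as the instruction.
source = (
    "def new_text_file(filename, content):\n"
    '    """Create a new text file and add content to it"""\n'
)
result = complete_at_cursor(source, len(source), fake_generate)
print(result)
```

In a real setup, `generate` would wrap a request to the hosted model; that part is omitted here because its interface is not described in this document.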


Contributors ✨

This project would not have been possible without the help of these wonderful people:



Arya Manjaramkar

Matthias Wijnsma


Thomas Houtrique


Dominic Rampas

Bilel Medimegh

Josh Hills

Alex

Tiimo

Acknowledgements

Many thanks to the Google TPU Research Cloud for providing the compute needed for this project.

Comments
  • Email Verification Fail

    Hey! I have a problem: when I clicked the link to verify my email address, this popped up: {"success":false,"error":{"code":"INVALID_VERIFICATION_CODE","message":"Verification code "code" is not valid."}}. It was shown as raw JSON when I clicked the link. I also tried it on mobile, but no luck either. Is there something I did wrong?

    JSON format (just a guess): { "success": false, "error": { "code": "INVALID_VERIFICATION_CODE", "message": "Verification code \"CODE HERE\" is not valid." } }

    opened by KingZumbie 8
  • plugin support for PyCharm

    Hi guys, I am so excited to see such a project! As an IDE developer, I would like to contribute an open-source PyCharm plugin to support it. Here is the repo: https://github.com/jtydhr88/codegenx-pycharm-plugin

    For a quick look, I took some screenshots:

    1. Settings
    2. Tool window
    3. Generate code
    4. Insert code

    Please let me know if you are interested in it and we can discuss more.

    opened by jtydhr88 5
  • word.replaceAll is not a function

    After pressing Ctrl + D, VS Code returns an error message that says "word.replaceAll is not a function". Are there any tips on how to get CodeGenX working properly?

    I'm just using the example code from CodeGenX's doc, and my cursor is at the end of the comment.

    def new_text_file(filename, content):
        """Create a new text file and add content to it"""
    
    opened by CharlesChiuGit 4
  • Migration Error after April 4th

    When I press Ctrl + D, the output says: "We are currently in the process of migrating CodeGenX to more powerful hardware. This will improve inference time and make the service a lot faster. CodeGenX will be temporarily inactive from the 28th of March 2022 for about one week. The code generation will not work and new users won't be able to sign up during this period. The new hardware will also allow us to start working on a new and improved version of CodeGenX with more accurate generation capabilities! We apologize for the inconvenience."

    A week after the 28th of March would be April 4th. Is it still down nearly a month later or is there something wrong with my setup?

    opened by GrahamboJangles 2
  • Training dataset format

    Hi! I am working on a similar project to generate JavaScript code using GPT-J. I would like help with the format of the fine-tuning dataset. What format should the training data be in to fine-tune the model?

    Thanks for your time and help!

    opened by samyakai 1
  • issue with the verification code

    Hi!

    When I try to verify my email address, the link in the email returns this: {"success":false,"error":{"code":"INVALID_VERIFICATION_CODE","message":"Verification code "code" is not valid."}}

    Surely this is not normal?

    opened by florianpetiot 1
  • No verification email or token/key received from deepgenx com

    I entered my email address at https://www.deepgenx.com/signin and, after pressing the 'Get API key!' button, an OK message popped up from www.deepgenx.com with the text 'Verification email sent to: [email protected]' (my email address at 'hotmail' is spelled correctly in the real pop-up message). I've tried two different email addresses so far.

    However, no email is received, either in the inbox or in the spam email folders.

    • How can I get a key/token for the installed 'DeepGenX.codegenx-0.1.8.vsix', please?

    (As a side note, the DeepGenX extension doesn't show up in my VSCodium online search by the extension's name, but it can be manually downloaded from https://marketplace.visualstudio.com/items?itemName=DeepGenX.codegenx and installed offline.)

    opened by RoGeorge 3
Owner
DeepGenX