This script has been created in order to find what are the most common demanded technologies in Data Engineering field.

Overview

Motivation

This script has been created in order to find what are the most common demanded technologies in Data Engineering field. The process of collecting every job offer is done manually but doing it automatically is not difficult at all. Maybe I will do it in a separate repository.

Job-Offers-Text-Mining

This is a Python script that given a whole corpus of job descriptions and a file with keywords it extracts the number of ocurrences of these keywords and write it to a file. This script it is easy to extend to accept more functionalities

To use the script you only need to create your own corpus of job offers(data_eng_corpus.txt) and configure what are your keywords (DataEngineer_Keywords.txt) and then execute

python job_offers_miner.py

Result:

alt text

You might also like...
The project is investigating methods to extract human-marked data from document forms such as surveys and tests.
The project is investigating methods to extract human-marked data from document forms such as surveys and tests.

The project is investigating methods to extract human-marked data from document forms such as surveys and tests. They can read questions, multiple-choice exam papers, and grade.

JSON and CSV data for Swahili dictionary with over 16600+ words

kamusi JSON and CSV data for swahili dictionary with over 16600+ words. This repo consists of data from swahili dictionary with about 16683 words toge

Maiden & Spell community player ranking based on tournament data.

MnSRank Maiden & Spell community player ranking based on tournament data. Why? 2021 just ended and this seemed like a cool idea. Elo doesn't work well

Investment and risk technologies maintained by Fortitudo Technologies.

Fortitudo Technologies Open Source This package allows you to freely explore open-source implementations of some of our fundamental technologies under

🖍️This is a feature-complete clone of the awesome Chalk (JavaScript) library.
🖍️This is a feature-complete clone of the awesome Chalk (JavaScript) library.

Terminal string styling done right This is a feature-complete clone of the awesome Chalk (JavaScript) library. All credits go to Sindre Sorhus. Highli

The goal of this program was to find the most common color in my living room.

The goal of this program was to find the most common color in my living room. I found a dataset online with colors names and their corr

The sequel to SquidNet. It has many of the previous features that were in the original script, however a lot of the functions that do not serve much functionality have been removed.

SquidNet2 The sequel to SquidNet. It has many of the previous features that were in the original script, however a lot of the functions that do not se

InverterApi - This project has been designed to take monitoring data from Voltronic, Axpert, Mppsolar PIP, Voltacon, Effekta

InverterApi - This project has been designed to take monitoring data from Voltronic, Axpert, Mppsolar PIP, Voltacon, Effekta

Arp-spoofing, this script was written for people who want to spoof any vulnerable machine such as Wİndows, of course it could have been more sophisticatedly created but these repos will be updated constantly

ARP-SPOOF ARP spoofing is a type of attack in which a malicious actor sends falsified ARP (Address Resolution Protocol) messages over a local area net

Use this script to track the gains of cryptocurrencies using historical data and display it on a super-imposed chart in order to find the highest performing cryptocurrencies historically

crypto-performance-tracker Use this script to track the gains of cryptocurrencies using historical data and display it on a super-imposed chart in ord

This script just scrapes the most recent Nepali news from Kathmandu Post and notifies the user about current events at regular intervals.It sends out the most recent news at random!

Nepali-news-notifier This script just scrapes the most recent Nepali news from Kathmandu Post and notifies the user about current events at regular in

A calendaring app for Django. It is now stable, Please feel free to use it now. Active development has been taken over by bartekgorny.

Django-schedule A calendaring/scheduling application, featuring: one-time and recurring events calendar exceptions (occurrences changed or cancelled)

:electric_plug: Generating short urls with python has never been easier
:electric_plug: Generating short urls with python has never been easier

pyshorteners A simple URL shortening API wrapper Python library. Installing pip install pyshorteners Documentation https://pyshorteners.readthedocs.i

:electric_plug: Generating short urls with python has never been easier
:electric_plug: Generating short urls with python has never been easier

pyshorteners A simple URL shortening API wrapper Python library. Installing pip install pyshorteners Documentation https://pyshorteners.readthedocs.i

A calendaring app for Django. It is now stable, Please feel free to use it now. Active development has been taken over by bartekgorny.

Django-schedule A calendaring/scheduling application, featuring: one-time and recurring events calendar exceptions (occurrences changed or cancelled)

BuddyPress is an open source WordPress plugin to build a community site. In releases of BuddyPress from 5.0.0 before 7.2.1 it's possible for a non-privileged, regular user to obtain administrator rights by exploiting an issue in the REST API members endpoint. The vulnerability has been fixed in BuddyPress 7.2.1. Existing installations of the plugin should be updated to this version to mitigate the issue. This is a rip off of the classical iPhone Calculator . This project has been made with PyQT5
This is a rip off of the classical iPhone Calculator . This project has been made with PyQT5

iPhoneCalcRIP-OFF This is a rip off of the classical iPhone Calculator . This project has been made with PyQT5

On the 11/11/21 the apache 2.4.49-2.4.50 remote command execution POC has been published online and this is a loader so that you can mass exploit servers using this.
On the 11/11/21 the apache 2.4.49-2.4.50 remote command execution POC has been published online and this is a loader so that you can mass exploit servers using this.

ApacheRCE ApacheRCE is a small little python script that will allow you to input the apache version 2.4.49-2.4.50 and then input a list of ip addresse

A simple projects to help your seo optimizing has been written with python

python-seo-projects it is a very simple projects to help your seo optimizing has been written with python broken link checker with python(it will give

Owner
Antonio Bri Pérez
Computer Science at University Of Alicante & Computer Scientist at FacePhi.
Antonio Bri Pérez
Search for terms(word / table / field name or any) under Snowflake schema names

snowflake-search-terms-in-ddl-views Search for terms(word / table / field name or any) under Snowflake schema names Version : 1.0v How to use ? Run th

Igal Emona 1 Dec 15, 2021
Markup is an online annotation tool that can be used to transform unstructured documents into structured formats for NLP and ML tasks, such as named-entity recognition. Markup learns as you annotate in order to predict and suggest complex annotations. Markup also provides integrated access to existing and custom ontologies, enabling the prediction and suggestion of ontology mappings based on the text you're annotating.

Markup is an online annotation tool that can be used to transform unstructured documents into structured formats for NLP and ML tasks, such as named-entity recognition. Markup learns as you annotate in order to predict and suggest complex annotations. Markup also provides integrated access to existing and custom ontologies, enabling the prediction and suggestion of ontology mappings based on the text you're annotating.

Samuel Dobbie 146 Dec 18, 2022
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.

TextDistance TextDistance -- python library for comparing distance between two or more sequences by many algorithms. Features: 30+ algorithms Pure pyt

Life4 3k Jan 2, 2023
Find a Doc is a free online resource aimed at helping connect the foreign community in Japan with health services in their native language.

Find a Doc - Localization Find a Doc is a free online resource aimed at helping connect the foreign community in Japan with health services in their n

Our Japan Life 18 Dec 19, 2022
Wordle strategy: Find frequency of letters appearing in 5-letter words in the English language

Find frequency of letters appearing in 5-letter words in the English language In

Gabriel Apolinário 1 Jan 17, 2022
A Python3 script that simulates the user typing a text on their keyboard.

A Python3 script that simulates the user typing a text on their keyboard. (control the speed, randomness, rate of typos and more!)

Jose Gracia Berenguer 3 Feb 22, 2022
A minimal python script for generating multiple onetime use bip39 seed phrases

seed_signer_ontimes WARNING This project has mainly been used for local development, and creation should be ran on a air-gapped machine. A minimal pyt

CypherToad 4 Sep 12, 2022
Little python script + dictionary to help solve Wordle puzzles

Wordle Solver Little python script + dictionary to help solve Wordle puzzles Usage Usage: ./wordlesolver.py [letters in word] [letters not in word] [p

Luke Stephens (hakluke) 4 Jul 24, 2022
Deasciify-highlighted - A Python script for deasciifying text to Turkish and copying clipboard

deasciify-highlighted is a Python script for deasciifying text to Turkish and copying clipboard.

Ümit Altıntaş 3 Mar 18, 2022
A working (ish) python script to convert text to a gradient.

verticle-horiontal-gradient-script A working (ish) python script to convert text to a gradient. This script is poorly made with the well known python

prmze 1 Feb 20, 2022