This script was created to find the most in-demand technologies in the Data Engineering field. The job offers are currently collected manually, although automating that step would not be difficult; I may do it in a separate repository.
It is a Python script that, given a corpus of job descriptions and a file of keywords, counts the occurrences of each keyword and writes the counts to a file. The script is easy to extend with more functionality.
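The counting step described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names `count_keywords` and `run`, the one-keyword-per-line file layout, the case-insensitive whole-word matching, and the tab-separated output are all assumptions made for the example.

```python
import re
from collections import Counter


def count_keywords(corpus_text, keywords):
    """Count case-insensitive, whole-word occurrences of each keyword.

    Hypothetical helper: the matching rules here are an assumption,
    not necessarily what the real script does.
    """
    text = corpus_text.lower()
    counts = Counter()
    for keyword in keywords:
        kw = keyword.strip().lower()
        if not kw:
            continue  # skip blank lines in the keyword file
        # Lookarounds instead of \b so keywords ending in symbols
        # (for example "c++") can still match.
        pattern = r"(?<!\w)" + re.escape(kw) + r"(?!\w)"
        counts[kw] = len(re.findall(pattern, text))
    return counts


def run(corpus_path, keywords_path, output_path):
    """Read the corpus and keyword files, then write one count per line."""
    with open(corpus_path, encoding="utf-8") as f:
        corpus_text = f.read()
    with open(keywords_path, encoding="utf-8") as f:
        # Assumes one keyword per line.
        keywords = f.read().splitlines()
    counts = count_keywords(corpus_text, keywords)
    with open(output_path, "w", encoding="utf-8") as f:
        # Most frequent keywords first, tab-separated.
        for kw, n in counts.most_common():
            f.write(f"{kw}\t{n}\n")
```

Matching whole words keeps a keyword such as "Java" from being counted inside "JavaScript"; if substring matches are wanted instead, the lookarounds can simply be dropped.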
To use the script, you only need to create your own corpus of job offers (data_eng_corpus.txt), list your keywords in DataEngineer_Keywords.txt, and then execute the script.