Datasets with Softcatalà website content

Overview

softcatala-web-dataset

This repository contains Sofcatalà web site content (articles and programs descriptions).

Dataset are available in the dataset directory.

Dataset size:

  • articles.json contains 623 articles with 366915 words
  • programes.json contains 330 program descripctions with 49110 words

The license of the data is Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

You might also like...
LeetComp - Background tasks powering the static content at LeetComp

LeetComp Analysing compensations mentioned on the Leetcode forums (https://kuuts

An OpenSource crowd-sourced cooking recipes website

An OpenSource crowd-sourced cooking recipes website

Checks for Vaccine Availability at your district and notifies you using E-mail, subscribe to our website.

Vaccine Availability Notifier Project Description Checks for Vaccine Availability at your district and notifies you using E-mail every 10 mins. Kindly

Find the remote website version based on a git repository
Find the remote website version based on a git repository

versionshaker Versionshaker is a tool to find a remote website version based on a git repository This tool will help you to find the website version o

A free website that keeps the people informed about housing and evictions.

Eviction Tracker Currently helping verify detainer warrant data for middle Tennessee - via Middle TN DSA - Red Door Collective Features Presents data

A website to collect vintage 4 tracks cassette recorders.

Vintage 4tk cassette recorders A website to collect vintage 4 tracks cassette recorders. Local development setup Copy and customize Django settings (e

Developed a website to analyze and generate report of students based on the curriculum that represents student’s academic performance.
Developed a website to analyze and generate report of students based on the curriculum that represents student’s academic performance.

Developed a website to analyze and generate report of students based on the curriculum that represents student’s academic performance. We have developed the system such that, it will automatically parse data onto the database from excel file, which will in return reduce time consumption of analysis of data.

Tool to automate the enumeration of a website (CTF)

had4ctf Tool to automate the enumeration of a website (CTF) DISCLAIMER: THE TOOL HAS BEEN DEVELOPED SOLELY FOR EDUCATIONAL PURPOSE ,I WILL NOT BE LIAB

Bring A Trailer(BAT) is a popular online auction website for enthusiast cars. This traverse auction results and saves them as CSV

BaT Data Grabber Bring A Trailer(BAT) is a popular online auction website for enthusiast cars. This traverse auction results and saves them as CSV Bri

Owner
Softcatalà
Softcatalà
A collection of existing KGQA datasets in the form of the huggingface datasets library, aiming to provide an easy-to-use access to them.

KGQA Datasets Brief Introduction This repository is a collection of existing KGQA datasets in the form of the huggingface datasets library, aiming to

Semantic Systems research group 21 Jan 6, 2023
A script that will warn you, by opening a new browser tab, when there are new content in your favourite websites.

web check A script that will warn you, by opening a new browser tab, when there are new content in your favourite websites. What it does The script wi

Jaime Álvarez 52 Mar 15, 2022
Demo content - Automate your automation!

Automate-AAP2 Demo Content - Automate your automation! A fully automated Ansible Automation Platform. Context Installing and configuring Ansible Autom

null 0 Oct 27, 2022
gwcheck is a tool to check .gnu.warning.* sections in ELF object files and display their content.

gwcheck Description gwcheck is a tool to check .gnu.warning.* sections in ELF object files and display their content. For an introduction to .gnu.warn

Frederic Cambus 11 Oct 28, 2022
Automated GitHub profile content using the USGS API, Plotly and GitHub Actions.

Top 20 Largest Earthquakes in the Past 24 Hours Location Mag Date and Time (UTC) 92 km SW of Sechura, Peru 5.2 11-05-2021 23:19:50 113 km NNE of Lobuj

Mr. Phantom 28 Oct 31, 2022
Automated Content Feed Curator

Gathers posts from content feeds, filters, formats, delivers to you.

Alper S. Soylu 2 Jan 22, 2022
A Kodi add-on for watching content hosted on PeerTube.

A Kodi add-on for watching content hosted on PeerTube. This add-on is under development so only basic features work, and you're welcome to improve it.

null 1 Dec 18, 2021
Generates images with semantic content from distribution A in the style of distribution B

A2B Generates images with semantic content from distribution A in the style of d

Richard Herbert 2 Dec 27, 2021
A reference implementation for processing the content.log files found at opendata.dwd.de/weather

A reference implementation for processing the content.log files found at opendata.dwd.de/weather.

Deutscher Wetterdienst (DWD) 6 Nov 26, 2022
Collection of functions for working with interlaced content in VapourSynth.

vsfieldkit Collection of functions for working with interlaced content in VapourSynth. It does not have any hard dependencies outside of VapourSynth.

Justin Turner Arthur 11 May 27, 2022