Informal Persian Universal Dependency Treebank

Overview

Informal Persian Universal Dependency Treebank (iPerUDT)

Informal Persian Universal Dependency Treebank, consisting of 3000 sentences and 54,904 tokens, is an open source collection of colloquial informal texts from Persian blogs. The corpus is annotated in CoNLL-U format within the Universal Dependencies scheme (Nivre et al., 2020).

The following Course-grained Universal Dependencies parts of speech tags (UPOS), and fine-grained language-specific parts of speech tags (XPOS) are used in this treebank.

UPOS XPOS Description
ADJ ADJ Adjective
ADJ ADJ_CMPR Comparative adjective
ADJ ADJ_SUP Superlative adjective
ADV ADV Adverb
ADV ADV_I Adverb of interrogation
ADV ADV_LOC Adverb of location
ADV ADV_NEG Adverb of Negation
ADV ADV_TIME Adverb of time
ADP P Preposition
AUX V_AUX Auxiliary/copula verb
CCONJ CON Coordinating conjunction
DET DET Determiner
INTJ INTJ Interjection
NOUN N_PL Plural noun
NOUN N_SING Singular noun
NUM NUM Numeral
PART PART Differential object marker, focus marker, negative particle, question particle
PRON PRO Pronoun
PROPN PROPN Proper nouns (persons,locations, months, organizations, geopolitical entities)
PUNCT DELM Punctuation/delimiter
SCONJ CON Subordinating conjunction
VERB V_IMP Imperative verb
VERB V_PA Past tense verb
VERB V_PP Past participle
VERB V_PRS Present tense verb
VERB V_SUB subjunctive verb
X FW Foreign word

We used the Universal Dependencies annotation scheme which produces syntactic analyses of sentences in terms of the dependency structures of dependency grammar, determined by the relation between a head and its dependents. The syntactic annotation consists of 42 dependency relations, including 32 universal and 10 language-specific relations (marked by *).

Dependency relation Description
acl Clausal modifier of noun
acl:relcl* relative clause modifier
advcl Adverbial clause modifier
advmod Adverbial modifier
amod Adjectival modifier
appos Appositional modifier
aux Auxiliary
aux:pass Passive auxiliary
case Accusative marker/case marking
cc Coordination
cc:preconj* Preconjunction
ccomp Clausal complement
compound Compound
compound:lvc* Nominal/adjectival NVE in complex predicates
compound:prt* Particle NVE in complex predicates
compound:redup* Reduplicative words
compound:svc* Serial verb constructions
conj Conjunct
Cop Copula
det Determiner
det:predet* Predeterminer
discourse Discourse element
discourse:top/foc* Topic/focus marker
dislocated Dislocated elements
fixed Fixed multiword expressions
flat Flat multiword expressions
goeswith Goes with for poorly-edited words
nmod Nominal modifier
nmod:poss* Possessive/genitive modifier
nsubj Nominal subject
nsubj:pass Passive nominal subject
nummod Numeric modifier
mark Complementizer/marker
obj Object
obl Oblique
obl:arg* Oblique core argument
orphan Ellipsis constructions
parataxis Parataxis
punct Punctuation
root Root
vocative Vocative
xcomp Open clausal complement

References

Nivre, Joakim, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič, Christopher D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis M. Tyers, and Dan Zeman. (2020). Universal dependencies v2: An evergrowing multilingual treebank collection. In Proceedings of the 12th Conference on Language Resources and Evaluation (LREC), 4027–4036.

You might also like...
A universal framework for learning timestamp-level representations of time series

TS2Vec This repository contains the official implementation for the paper Learning Timestamp-Level Representations for Time Series with Hierarchical C

This is the source code for our ICLR2021 paper: Adaptive Universal Generalized PageRank Graph Neural Network.
This is the source code for our ICLR2021 paper: Adaptive Universal Generalized PageRank Graph Neural Network.

GPRGNN This is the source code for our ICLR2021 paper: Adaptive Universal Generalized PageRank Graph Neural Network. Hidden state feature extraction i

LiDAR R-CNN: An Efficient and Universal 3D Object Detector

LiDAR R-CNN: An Efficient and Universal 3D Object Detector Introduction This is the official code of LiDAR R-CNN: An Efficient and Universal 3D Object

 URIE: Universal Image Enhancementfor Visual Recognition in the Wild
URIE: Universal Image Enhancementfor Visual Recognition in the Wild

URIE: Universal Image Enhancementfor Visual Recognition in the Wild This is the implementation of the paper "URIE: Universal Image Enhancement for Vis

[CVPR2021] Domain Consensus Clustering for Universal Domain Adaptation

[CVPR2021] Domain Consensus Clustering for Universal Domain Adaptation [Paper] Prerequisites To install requirements: pip install -r requirements.txt

Based on Yolo's low-power, ultra-lightweight universal target detection algorithm, the parameter is only 250k, and the speed of the smart phone mobile terminal can reach ~300fps+
Based on Yolo's low-power, ultra-lightweight universal target detection algorithm, the parameter is only 250k, and the speed of the smart phone mobile terminal can reach ~300fps+

Based on Yolo's low-power, ultra-lightweight universal target detection algorithm, the parameter is only 250k, and the speed of the smart phone mobile terminal can reach ~300fps+

Official Implementation of Domain-Aware Universal Style Transfer
Official Implementation of Domain-Aware Universal Style Transfer

Domain Aware Universal Style Transfer Official Pytorch Implementation of 'Domain Aware Universal Style Transfer' (ICCV 2021) Domain Aware Universal St

Universal Adversarial Triggers for Attacking and Analyzing NLP (EMNLP 2019)

Universal Adversarial Triggers for Attacking and Analyzing NLP This is the official code for the EMNLP 2019 paper, Universal Adversarial Triggers for

Official Implementation of 'UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers' ICLR 2021(spotlight)
Official Implementation of 'UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers' ICLR 2021(spotlight)

UPDeT Official Implementation of UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers (ICLR 2021 spotlight) The

Owner
Roya Kabiri
Computational Linguist
Roya Kabiri
codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification

DLCF-DCA codes for paper Combining Dynamic Local Context Focus and Dependency Cluster Attention for Aspect-level sentiment classification. submitted t

null 15 Aug 30, 2022
Pytorch implementation of “Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement”

Graph-to-Graph Transformers Self-attention models, such as Transformer, have been hugely successful in a wide range of natural language processing (NL

Idiap Research Institute 40 Aug 14, 2022
Implementation of fast algorithms for Maximum Spanning Tree (MST) parsing that includes fast ArcMax+Reweighting+Tarjan algorithm for single-root dependency parsing.

Fast MST Algorithm Implementation of fast algorithms for (Maximum Spanning Tree) MST parsing that includes fast ArcMax+Reweighting+Tarjan algorithm fo

Miloš Stanojević 11 Oct 14, 2022
Code for the AAAI 2022 paper "Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph".

multilingual-mrc-isdg Code for the AAAI 2022 paper "Zero-Shot Cross-Lingual Machine Reading Comprehension via Inter-Sentence Dependency Graph". This r

Liyan 5 Dec 7, 2022
Deep universal probabilistic programming with Python and PyTorch

Getting Started | Documentation | Community | Contributing Pyro is a flexible, scalable deep probabilistic programming library built on PyTorch. Notab

null 7.7k Dec 30, 2022
Official codebase for Pretrained Transformers as Universal Computation Engines.

universal-computation Overview Official codebase for Pretrained Transformers as Universal Computation Engines. Contains demo notebook and scripts to r

Kevin Lu 210 Dec 28, 2022
MagFace: A Universal Representation for Face Recognition and Quality Assessment

MagFace MagFace: A Universal Representation for Face Recognition and Quality Assessment in IEEE Conference on Computer Vision and Pattern Recognition

Qiang Meng 523 Jan 5, 2023
git《USD-Seg:Learning Universal Shape Dictionary for Realtime Instance Segmentation》(2020) GitHub: [fig2]

USD-Seg This project is an implement of paper USD-Seg:Learning Universal Shape Dictionary for Realtime Instance Segmentation, based on FCOS detector f

Ruolin Ye 80 Nov 28, 2022
CVPR 2021 Official Pytorch Code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training

UC2 UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training Mingyang Zhou, Luowei Zhou, Shuohang Wang, Yu Cheng, Linjie Li, Zhou Yu,

Mingyang Zhou 28 Dec 30, 2022