language: id
pipeline_tag: dependency-parsing
widget:
- text: Presiden Joko Widodo mengunjungi korban bencana alam di Palu.
license: mit
library_name: spacy
tags:
- id
- spacy
- dependency-parsing
- indonesian
- gsd
model-index:
- name: spacy-dep-parsing-id
  results:
  - task:
      type: dependency-parsing
      name: Dependency Parsing
    dataset:
      type: ud-id-gsd
      name: UD Indonesian GSD (Test Split)
      config: test
      split: test
      revision: main
    metrics:
    - type: dep_uas
      value: 0.8282
      name: UAS (Unlabeled Attachment Score)
    - type: dep_las
      value: 0.7436
      name: LAS (Labeled Attachment Score)
    - type: sents_f
      value: 0.9937
      name: Sentence F-Score
spaCy Dependency Parsing Model for Indonesian (UD-ID-GSD)
This repository contains a spaCy v3 model trained for dependency parsing of Indonesian text. The model was trained with the configuration generated by spacy init config, using the default settings for the parser component.
Dataset
The model was trained on the Universal Dependencies Indonesian GSD (UD-ID-GSD) dataset. (Reference: McDonald, R., Nivre, J., Quirmbach-Brundage, Y., Goldberg, Y., Das, D., Ganchev, K., Hall, K., Petrov, S., Zhang, H., Täckström, O., Bednářová, Z., Wang, S., & Lee, Y. (2013). Universal Dependency Annotation for Multilingual Parsing.)
The splits used contained the following numbers of documents:
- Total Sentences (approx.): 5,593
- Training Set: 4,477 documents
- Development (Dev) Set: 559 documents
- Test Set: 557 documents
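As a quick sanity check on the figures above, the following minimal sketch recomputes the total and each split's share of it. It uses only the counts listed here; the variable names are illustrative.

```python
# Split sizes as reported in the list above.
splits = {"train": 4477, "dev": 559, "test": 557}

total = sum(splits.values())
shares = {name: count / total for name, count in splits.items()}

print(f"total: {total}")
for name, share in shares.items():
    # Share of each split relative to the total, as a percentage.
    print(f"{name:<5} {splits[name]:>5}  {share:.1%}")
```

The three splits sum to the reported total of 5,593 and correspond to a split of roughly 80/10/10.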
Pipeline Components
This model's pipeline contains only the parser component. It does not include a tagger, NER, or other components by default. The parser relies on its internal token-to-vector embeddings, which were learned during training.
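One way to confirm which components a pipeline contains is to inspect nlp.pipe_names. The sketch below uses a stand-in pipeline (an assumption for illustration: a blank Indonesian pipeline with an untrained parser added) so that it runs without this model installed; describe_pipeline is a hypothetical helper name, not part of spaCy.

```python
import spacy


def describe_pipeline(nlp):
    """Return the component names of a pipeline, in processing order."""
    return list(nlp.pipe_names)


# Stand-in for the trained model (assumption): a blank Indonesian pipeline
# with an untrained parser. Replace `nlp` with the loaded model in practice.
nlp = spacy.blank("id")
nlp.add_pipe("parser")

components = describe_pipeline(nlp)
print(components)
print("has tagger:", "tagger" in components)
```

With a parser-only pipeline, attributes set by other components, such as token.pos_ from a tagger, stay empty.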
How to Use
You can load this model directly using spaCy after installing it:
import spacy

# NOTE: spacy.load() resolves an installed package name or a local path.
# It does not download from the Hugging Face Hub on its own, so this assumes
# the model has already been installed or saved locally under this name.
model_id = "freksowibowo/spacy-dep-parsing-id"

try:
    nlp = spacy.load(model_id)
    print(f"Model '{model_id}' loaded successfully.")

    # Example usage
    text = "Gubernur Jawa Barat Ridwan Kamil meresmikan jembatan baru di Cirebon."
    doc = nlp(text)

    print("\nDependency Parse Results:")
    print(f"{'Token':<15} {'Relation':<10} {'Head':<15} {'Head POS':<8}")
    print("-" * 50)
    for token in doc:
        # The pipeline has no tagger, so token.head.pos_ is expected to be empty.
        print(f"{token.text:<15} {token.dep_:<10} {token.head.text:<15} {token.head.pos_:<8}")

    # You can also visualize using displacy (if in Jupyter/IPython)
    # from spacy import displacy
    # displacy.render(doc, style="dep", jupyter=True, options={'distance': 100})

except OSError:
    # Raised by spacy.load() when the package or path cannot be found.
    print(f"Error: Model '{model_id}' not found.")
    print("Please ensure the model is installed and the repository ID is correct.")
except Exception as e:
    print(f"An error occurred: {e}")
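Beyond printing a table, the parse can be collected as (child, relation, head) triples for downstream use. The sketch below is a minimal illustration: dependency_triples is a hypothetical helper, and the Doc is built by hand with illustrative annotations (not output from this model) so that the example runs without the model installed.

```python
import spacy
from spacy.tokens import Doc


def dependency_triples(doc):
    """Return (child, relation, head) text triples, skipping the root token."""
    # In spaCy, the root token is its own head, so it is filtered out here.
    return [(t.text, t.dep_, t.head.text) for t in doc if t.head.i != t.i]


# Hand-annotated stand-in parse (assumption: illustrative labels only).
vocab = spacy.blank("id").vocab
words = ["Presiden", "mengunjungi", "Palu", "."]
heads = [1, 1, 1, 1]  # absolute index of each token's head; the root points to itself
deps = ["nsubj", "ROOT", "obj", "punct"]
doc = Doc(vocab, words=words, heads=heads, deps=deps)

triples = dependency_triples(doc)
for child, relation, head in triples:
    print(f"{child} --{relation}--> {head}")
```

The same helper can be applied to a doc produced by the loaded model, since it reads only token.dep_ and token.head.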