Set-up

Install

Download this repository from GitHub:

$ git clone https://github.uio.no/arthurd/in5550-exam
$ cd in5550-exam

The dataset ships with the repository, but you will need to provide the word embeddings yourself. You can either download the Norwegian-Bokmaal CoNLL17 embeddings (a.k.a. the 58.zip file) from the NLPL website, or use the copy available on the Saga server.

Make sure that you decode this file with encoding='latin1'.
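As a minimal sketch of the decoding step above: the snippet below loads a word2vec-style text file straight from the zip archive, decoding each line as latin1. The inner file name model.txt is an assumption about the archive layout, not something guaranteed by this repository.

```python
import zipfile

def load_embeddings(path, encoding="latin1"):
    """Load word vectors from a zipped word2vec-style text file.

    Assumes the archive contains 'model.txt' whose first line is a
    '<vocab_size> <dim>' header, followed by one 'word v1 v2 ...' per line.
    """
    vectors = {}
    with zipfile.ZipFile(path) as zf:
        with zf.open("model.txt") as f:
            next(f)  # skip the '<vocab_size> <dim>' header line
            for raw in f:
                # Decode with latin1, as the file is not valid UTF-8
                parts = raw.decode(encoding).rstrip().split(" ")
                vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors
```

Forgetting the latin1 decoding typically surfaces as a UnicodeDecodeError on Norwegian characters such as å and æ.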

Baseline

To run the baseline.py script, set device=torch.device('cpu'), as the CUDA version was not implemented.

$ python baseline.py

The following additional arguments can be provided:

  • --NUM_LAYERS: number of hidden layers for BiLSTM

  • --HIDDEN_DIM: dimensionality of LSTM layers

  • --BATCH_SIZE: number of examples to include in a batch

  • --DROPOUT: dropout to be applied after embedding layer

  • --EMBEDDING_DIM: dimensionality of embeddings

  • --EMBEDDINGS: location of pretrained embeddings

  • --TRAIN_EMBEDDINGS: whether to fine-tune the embeddings or leave them fixed

  • --LEARNING_RATE: learning rate for the Adam optimizer

  • --EPOCHS: number of epochs to train model
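The flags above could be wired up with argparse roughly as follows. This is a hypothetical sketch, not the actual declarations in baseline.py; in particular, the default values and the store_true behaviour of --TRAIN_EMBEDDINGS are illustrative assumptions.

```python
import argparse

def build_parser():
    # Illustrative parser mirroring the flag list above; defaults are
    # assumptions, not the values hard-coded in baseline.py.
    p = argparse.ArgumentParser(description="BiLSTM baseline")
    p.add_argument("--NUM_LAYERS", type=int, default=1,
                   help="number of hidden layers for BiLSTM")
    p.add_argument("--HIDDEN_DIM", type=int, default=100,
                   help="dimensionality of LSTM layers")
    p.add_argument("--BATCH_SIZE", type=int, default=32,
                   help="number of examples per batch")
    p.add_argument("--DROPOUT", type=float, default=0.1,
                   help="dropout applied after the embedding layer")
    p.add_argument("--EMBEDDING_DIM", type=int, default=100,
                   help="dimensionality of embeddings")
    p.add_argument("--EMBEDDINGS", type=str, default="58.zip",
                   help="location of pretrained embeddings")
    p.add_argument("--TRAIN_EMBEDDINGS", action="store_true",
                   help="fine-tune embeddings instead of keeping them fixed")
    p.add_argument("--LEARNING_RATE", type=float, default=1e-3,
                   help="learning rate for the Adam optimizer")
    p.add_argument("--EPOCHS", type=int, default=10,
                   help="number of training epochs")
    return p
```

A run overriding a couple of flags would then look like:

$ python baseline.py --HIDDEN_DIM 200 --EPOCHS 20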
