
The Nijmegen Arabic/Dutch Dictionary Project

The RBN (Referentie Bestand Nederlands, Reference File Dutch)


This Dutch Reference File can serve as a point of departure for dictionary projects involving various languages. It was compiled because the CLVV wanted to avoid paying for the use of a Dutch database for every single project (each of which involves different publishers).


The structure of the lexicon is presented as follows:
The Lexicon L is made up of a core + extensions.
Each extension consists of a core + extensions.
Different subject fields (economy, biology, etc.) can be considered extensions, where each extension in turn consists of a core of more or less generally known terminology and an extension of lesser-known specialized terminology.
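
The recursive "core + extensions" structure described above can be pictured with a small data model. This is only a minimal sketch: the class name `Lexicon`, its methods, and the sample words are hypothetical and chosen for illustration. They are not taken from the RBN or its actual encoding.

```python
from dataclasses import dataclass, field


@dataclass
class Lexicon:
    """Hypothetical model: a core word set plus nested extensions."""
    name: str
    core: set[str] = field(default_factory=set)
    extensions: list["Lexicon"] = field(default_factory=list)

    def all_entries(self) -> set[str]:
        """Collect the core and, recursively, every extension's entries."""
        entries = set(self.core)
        for extension in self.extensions:
            entries |= extension.all_entries()
        return entries

    def depth_of(self, word: str, depth: int = 0):
        """Return how many extension levels away from the top-level core
        a word sits, or None if the word is not in the lexicon."""
        if word in self.core:
            return depth
        for extension in self.extensions:
            found = extension.depth_of(word, depth + 1)
            if found is not None:
                return found
        return None


# Illustrative data only; these words are not claimed to be RBN entries.
lexicon = Lexicon(
    name="L",
    core={"huis", "lopen"},
    extensions=[
        Lexicon(
            name="economy",
            core={"inflatie"},  # more or less generally known term
            extensions=[
                # lesser-known specialized terminology
                Lexicon(name="economy-specialized", core={"contango"}),
            ],
        ),
    ],
)
```

In this sketch, a word's depth indicates how specialized it is: 0 for the general core, 1 for the generally known part of a subject field, and 2 for its specialized terminology.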


The macrostructure is based on a core of basic vocabulary. This core was the 'Basiswoordenboek van de Nederlandse Taal' (Basic Dictionary of the Dutch Language) published by Van Dale, with a macrostructure of 25,000 entries.
This core was extended with words extracted from a 5-million-word corpus. The selection criterion was a frequency of 10 or more.
Transparent compounds have been added, as well as encyclopedic terms, vulgar words, etc.
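
The frequency-based extension step can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used for the RBN: the function name `extend_core` is hypothetical, the corpus is assumed to be already tokenized, and tokens are counted exactly as they appear (no lemmatization or case folding).

```python
from collections import Counter


def extend_core(core, corpus_tokens, min_frequency=10):
    """Return the core plus every corpus word that occurs at least
    min_frequency times (assumption: tokens are counted as-is)."""
    counts = Counter(corpus_tokens)
    frequent = {word for word, n in counts.items() if n >= min_frequency}
    return set(core) | frequent


# Toy data for illustration; a real corpus would hold millions of tokens.
core = {"huis"}
tokens = ["fiets"] * 10 + ["zeldzaam"] * 9 + ["huis"] * 3
extended = extend_core(core, tokens)
```

Here "fiets" reaches the threshold of 10 and is added, "zeldzaam" occurs only 9 times and is left out, and "huis" stays in because it already belongs to the core.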


An example of the encoding of the RBN is shown in section 1.4.3 SGML Examples of RBN data.

Comments to: j.hoogland@let.kun.nl
last updated 26/10/2003 15:16 +0100