Hong Kong Budget 1997-2018
(Part of speech search)

Welcome to the Hong Kong Budget 1997-2018 corpus, hosted by the Research Centre for Professional Communication in English of The Hong Kong Polytechnic University. The Hong Kong Budget Corpus consists of the budget speeches delivered by the Financial Secretaries of Hong Kong from 1997 to 2018. The texts are as published on the Hong Kong Government website, entirely as spoken; the written headings in the published versions have been removed. There are currently 294,517 words in the corpus.

Remarks:

(Mozilla Firefox or Apple® Safari is recommended for searching. Best viewed at a screen resolution of 1680x1050.)

1. The query word/phrase accepts English letters and hyphens only (i.e. A-Z, a-z, -) and is case-insensitive.
2. Search string syntax: word^pos [e.g. book^nn, book^vb ^dt, ^vb ^dt ^nn, etc.]
  • The caret symbol ^ introduces the POS tag to be searched for (^tag). The string in front of the caret is the query word.
  • A query can contain a maximum of 5 units, each of the form 'word', 'word^pos' or '^pos'.
  • A caret followed by a number (e.g. ^2) in the middle of the search string instructs the search engine to skip that number of words between the two neighbouring query units (e.g. book ^2 advance skips two words between 'book' and 'advance').
  • Head and/or tail partial search of the query word/phrase is possible by adding '-' to the front or end of the query word/phrase (e.g. -ment, advan-, -fine-).
  • Tail partial search for the POS tag is possible by adding '-' at the end of the POS tag (e.g. ^NN- will return ^NN, ^NNS, ^NNP, ^NNPS).
  • Units are separated by a space. Punctuation is not accepted.
3. Part-of-speech tag symbols in alphabetical order*

^CC    Coordinating conjunction                  ^PRPS  Possessive pronoun
^CD    Cardinal number                           ^RB    Adverb
^DT    Determiner                                ^RBR   Adverb, comparative
^EX    Existential 'there'                       ^RBS   Adverb, superlative
^FW    Foreign word                              ^RP    Particle
^IN    Preposition or subordinating conjunction  ^TO    'to'
^JJ    Adjective                                 ^UH    Interjection
^JJR   Adjective, comparative                    ^VB    Verb, base form
^JJS   Adjective, superlative                    ^VBD   Verb, past tense
^MD    Modal                                     ^VBG   Verb, gerund or present participle
^NN    Noun, singular or mass                    ^VBN   Verb, past participle
^NNS   Noun, plural                              ^VBP   Verb, non-3rd person singular present
^NNP   Proper noun, singular                     ^VBZ   Verb, 3rd person singular present
^NNPS  Proper noun, plural                       ^WDT   Wh-determiner
^PDT   Predeterminer                             ^WP    Wh-pronoun
^POS   Possessive ending                         ^WPS   Possessive wh-pronoun
^PRP   Personal pronoun                          ^WRB   Wh-adverb

4. A maximum of 10,000 concordance lines can be listed.

* Source: Santorini, B. (1990). Part-of-speech tagging guidelines for the Penn Treebank Project (3rd revision). Retrieved December 20, 2016 from http://repository.upenn.edu/cgi/viewcontent.cgi?article=1603&context=cis_reports
* The corpus is tagged with the Stanford Part-of-Speech Tagger v3.6.0.
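
The query syntax described in the remarks can be illustrated with a small parser. The following is a minimal Python sketch, not the search engine's actual implementation: the names parse_query and expand_tag, the tuple layout of the result, and the choice not to count skip markers such as ^2 towards the 5-unit limit are all assumptions made for illustration.

```python
import re

# Tag names as listed in the remarks (without the leading caret).
TAGS = [
    "CC", "CD", "DT", "EX", "FW", "IN", "JJ", "JJR", "JJS", "MD",
    "NN", "NNS", "NNP", "NNPS", "PDT", "POS", "PRP", "PRPS",
    "RB", "RBR", "RBS", "RP", "TO", "UH", "VB", "VBD", "VBG",
    "VBN", "VBP", "VBZ", "WDT", "WP", "WPS", "WRB",
]

MAX_UNITS = 5  # limit stated in the remarks

SKIP_RE = re.compile(r"\^(\d+)")                            # e.g. ^2
UNIT_RE = re.compile(r"([A-Za-z-]*)(?:\^([A-Za-z]+-?))?")   # word, word^pos, ^pos


def expand_tag(tag):
    """Return the listed tags matched by a tag spec.

    A trailing '-' requests a tail partial search (prefix match).
    """
    tag = tag.upper()
    if tag.endswith("-"):
        matches = [t for t in TAGS if t.startswith(tag[:-1])]
    else:
        matches = [t for t in TAGS if t == tag]
    if not matches:
        raise ValueError(f"unknown POS tag: {tag!r}")
    return matches


def parse_query(query):
    """Split a search string into ('match', word, tags) and ('skip', n) units."""
    units = []
    for raw in query.split():  # units are separated by spaces
        skip = SKIP_RE.fullmatch(raw)
        if skip:
            units.append(("skip", int(skip.group(1))))
            continue
        m = UNIT_RE.fullmatch(raw)
        if not m or not (m.group(1) or m.group(2)):
            raise ValueError(f"invalid unit: {raw!r}")
        word = m.group(1).lower() or None  # queries are case-insensitive
        if word and not any(c.isalpha() for c in word):
            raise ValueError(f"word must contain a letter: {raw!r}")
        tags = expand_tag(m.group(2)) if m.group(2) else None
        units.append(("match", word, tags))
    # Assumption: only word/tag units count towards the limit, not skips.
    if sum(1 for u in units if u[0] == "match") > MAX_UNITS:
        raise ValueError(f"at most {MAX_UNITS} units are allowed")
    return units


if __name__ == "__main__":
    print(parse_query("book^vb ^dt"))
    print(parse_query("book ^2 advance"))
    print(parse_query("-ment^NN-"))
```

In this sketch a leading or trailing '-' is kept on the word (e.g. '-ment') so that a later matching step could treat it as a partial search; how the real engine represents partial matches is not documented here.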

       
