Hong Kong Engineering Corpus (Part of speech search)
Welcome to the Hong Kong Engineering Corpus (HKEC) developed by the Research Centre for Professional Communication in English of The Hong Kong Polytechnic University. The HKEC is a large collection of texts collected from the engineering sector of Hong Kong.
There are currently 9,224,384 words in the HKEC.
Remarks:
(Mozilla Firefox or Apple® Safari is recommended for searching. Best viewed at a screen resolution of 1680x1050.)
1. The query word/phrase accepts English letters and hyphens only (i.e. A-Z, a-z, -) and is case-insensitive.
2. Search string syntax: word^pos (e.g. book^nn, book^vb ^dt, ^vb ^dt ^nn).
- The symbol ^tag indicates the tag to be searched for. The string in front of the caret symbol ^ is the query word.
- A maximum of 5 units of 'word', 'word^pos' or '^pos' can be used in a query.
- A caret followed by a number in the middle of the search string instructs the search engine to skip that number of words between the two neighbouring query units (e.g. book ^2 advance skips two words between 'book' and 'advance').
- Head and/or tail partial search of the query word/phrase is possible by adding '-' to the front and/or end of the query word/phrase (e.g. -ment, advan-, -fine-).
- Tail partial search for the POS tag is possible by adding '-' at the end of the POS tag (e.g. ^NN- will return ^NN, ^NNS, ^NNP, ^NNPS)
- Units are separated by a space. Punctuation is not accepted.
3. Part-of-speech tag symbols in alphabetical order*
^CC | Coordinating conjunction | ^PRPS | Possessive pronoun |
^CD | Cardinal number | ^RB | Adverb |
^DT | Determiner | ^RBR | Adverb, comparative |
^EX | Existential 'there' | ^RBS | Adverb, superlative |
^FW | Foreign word | ^RP | Particle |
^IN | Preposition or subordinating conjunction | ^TO | 'to' |
^JJ | Adjective | ^UH | Interjection |
^JJR | Adjective, comparative | ^VB | Verb, base form |
^JJS | Adjective, superlative | ^VBD | Verb, past tense |
^MD | Modal | ^VBG | Verb, gerund or present participle |
^NN | Noun, singular or mass | ^VBN | Verb, past participle |
^NNS | Noun, plural | ^VBP | Verb, non-3rd person singular present |
^NNP | Proper noun, singular | ^VBZ | Verb, 3rd person singular present |
^NNPS | Proper noun, plural | ^WDT | Wh-determiner |
^PDT | Predeterminer | ^WP | Wh-pronoun |
^POS | Possessive ending | ^WPS | Possessive wh-pronoun |
^PRP | Personal pronoun | ^WRB | Wh-adverb |
4. A maximum of 10,000 concordance lines can be listed.
* Source: Santorini, B. (1990). Part-of-speech tagging guidelines for the Penn Treebank Project (3rd revision). Retrieved December 20, 2016 from http://repository.upenn.edu/cgi/viewcontent.cgi?article=1603&context=cis_reports
* The corpus is tagged with Stanford Part-of-speech Tagger v.3.6.0.
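The search string syntax described in remark 2 can be illustrated with a short Python sketch. This is not the HKEC search engine's own code, which is not published on this page; it is a minimal client-side check written under stated assumptions. The names `parse_query`, `expand_tag`, `TAGS` and `MAX_UNITS` are hypothetical. The sketch assumes that skip markers such as ^2 do not count towards the five-unit limit, that a skip marker stands alone and cannot be the first or last item, and that tags are matched case-insensitively.

```python
import re

# Tag symbols in the order they are listed in remark 3.
TAGS = [
    "CC", "CD", "DT", "EX", "FW", "IN", "JJ", "JJR", "JJS", "MD",
    "NN", "NNS", "NNP", "NNPS", "PDT", "POS", "PRP", "PRPS", "RB",
    "RBR", "RBS", "RP", "TO", "UH", "VB", "VBD", "VBG", "VBN", "VBP",
    "VBZ", "WDT", "WP", "WPS", "WRB",
]
MAX_UNITS = 5  # limit stated in remark 2

# A token is an optional word (letters and hyphens) followed by an
# optional ^tag (letters, optional trailing '-') or ^number (skip).
_UNIT = re.compile(r"(?P<word>[A-Za-z-]*)(?:\^(?P<tag>[A-Za-z]+-?|\d+))?")


def expand_tag(tag):
    """Return the listed tags matched by a tag or a tail-partial tag such as 'NN-'."""
    tag = tag.upper()
    if tag.endswith("-"):
        matches = [t for t in TAGS if t.startswith(tag[:-1])]
    else:
        matches = [t for t in TAGS if t == tag]
    if not matches:
        raise ValueError(f"unknown tag: {tag}")
    return matches


def parse_query(query):
    """Split a search string into ('unit', word, tag) and ('skip', n) tuples."""
    tokens = query.split()  # units are separated by spaces
    parsed = []
    for i, token in enumerate(tokens):
        m = _UNIT.fullmatch(token)
        if m is None:
            raise ValueError(f"invalid unit: {token!r}")
        word, tag = m.group("word"), m.group("tag")
        if tag is not None and tag.isdigit():
            # Assumption: a skip marker has no word and sits between two units.
            if word or i in (0, len(tokens) - 1):
                raise ValueError(f"misplaced skip marker: {token!r}")
            parsed.append(("skip", int(tag)))
            continue
        if word and not word.strip("-"):
            raise ValueError(f"word needs at least one letter: {token!r}")
        parsed.append(("unit", word.lower() or None, tag.upper() if tag else None))
    units = sum(1 for item in parsed if item[0] == "unit")
    if not 1 <= units <= MAX_UNITS:
        raise ValueError(f"a query needs 1 to {MAX_UNITS} units, got {units}")
    return parsed
```

For example, `parse_query("book ^2 advance")` yields two word units with a skip of two words between them, and `expand_tag("nn-")` returns the four noun tags given as the example in remark 2.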
Copyright - Every effort has been made to contact all the copyright holders to obtain their permission to include the texts contained in the HKEC. We are very grateful to the many organisations that have given their support to the HKEC. For relevant details of the copyright holders click here.
Please note that the contents in the HKEC do not represent the views of the organisation and/or writer.