The original ConcGram software was developed by Chris Greaves (2005). It is distributed with the John Benjamins book and can also be ordered online. ConcGram was designed to search automatically for concgrams and their frequencies in a corpus, which facilitates a truly 'corpus-driven' methodology. Unlike n-grams or skipgrams, concgrams capture both constituency variation (AB, ACB) and positional variation (AB, BA) (Cheng et al. 2006). The original ConcGram is closed-source software designed in the era of 32-bit Microsoft Windows. A 32-bit system limits the amount of memory the program can use, which prevents ConcGram from handling a large corpus effectively.
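To make constituency and positional variation concrete, here is a minimal Python sketch that collects two-word concgrams within a fixed span. It is an illustration only, not ConcGram's actual algorithm: the function name `find_concgrams`, the `span` parameter and the sample sentence are all assumptions introduced for this example.

```python
from collections import defaultdict


def find_concgrams(tokens, span=3):
    """Group co-occurring word pairs regardless of order or intervening words.

    Hypothetical helper for illustration; `span` is the maximum number of
    positions to look ahead from each token.
    """
    found = defaultdict(list)
    for i, first in enumerate(tokens):
        # Look ahead only, so every co-occurrence is recorded exactly once.
        for j in range(i + 1, min(i + 1 + span, len(tokens))):
            second = tokens[j]
            if first != second:
                # Sorting the pair puts AB and BA under the same key.
                key = tuple(sorted((first, second)))
                # Keep the surface form so each variation stays visible.
                found[key].append(" ".join(tokens[i : j + 1]))
    return found


tokens = "hard work pays when work is hard".split()
concgrams = find_concgrams(tokens)
print(concgrams[("hard", "work")])  # ['hard work', 'work is hard']
```

The pair hard/work is found once as an adjacent AB sequence ("hard work") and once in reversed order with an intervening word ("work is hard"). A contiguous n-gram count would treat these as unrelated sequences.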
In March 2018, RCPCE rewrote the core automatic concgram identification feature of ConcGram for teaching and research purposes and named the new software ConcGramCore.
ConcGramCore uses the SQL query engine of SQLite3 together with Strawberry Perl for the automatic concgram search, and identifies concgrams in a large corpus much more efficiently than the original ConcGram. The code can be modified to run on Linux and Apple computers. The program was designed not to be the fastest but to be simple to maintain and to scale. The core identification process is handled by SQLite, a robust and well-maintained engine. More importantly, all of these technologies are free, still widely used and supported.
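One way an SQL engine can carry out this kind of search is a self-join over a table of token positions. The sketch below uses Python's built-in `sqlite3` module rather than Perl, and it is not ConcGramCore's actual schema or query: the table name `tokens`, its columns and the function `count_cooccurrences` are assumptions made for illustration.

```python
import sqlite3


def count_cooccurrences(tokens, span=3):
    """Count unordered word pairs occurring within `span` positions (hypothetical helper)."""
    # ":memory:" keeps the example self-contained; a file path would store the table on disk.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tokens (pos INTEGER PRIMARY KEY, word TEXT NOT NULL)")
    conn.executemany("INSERT INTO tokens (pos, word) VALUES (?, ?)", enumerate(tokens))
    rows = conn.execute(
        """
        SELECT min(a.word, b.word) AS w1,   -- two-argument min/max are scalar in SQLite,
               max(a.word, b.word) AS w2,   -- so AB and BA collapse into one row
               COUNT(*) AS freq
        FROM tokens AS a
        JOIN tokens AS b
          ON b.pos > a.pos AND b.pos <= a.pos + ?
        WHERE a.word <> b.word
        GROUP BY w1, w2
        ORDER BY freq DESC, w1, w2
        """,
        (span,),
    ).fetchall()
    conn.close()
    return rows


rows = count_cooccurrences("hard work pays when work is hard".split())
for w1, w2, freq in rows:
    print(w1, w2, freq)
```

Leaving the pairing and counting to the database keeps the calling script short, which fits the stated goal of simple maintenance over raw speed.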
ConcGramCore users can also select the desired segmentation method. The simple segmentation method separates English words at punctuation marks, white space and paragraph marks. ConcGramCore can also use the Stanford Part-of-Speech Tagger engine for more accurate segmentation and, with modifications to the code, for segmenting other languages such as Arabic and Chinese. ConcGramCore processes corpora in batches, and the output is saved automatically.
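The simple segmentation method can be approximated with a single regular expression. This is a rough Python sketch, not ConcGramCore's own rules: the function name `simple_segment`, the lowercasing step and the handling of apostrophes are assumptions made for illustration.

```python
import re

# Runs of ASCII letters or digits, optionally followed by an apostrophe part
# (e.g. "don't"). Everything else, including punctuation, spaces and line
# breaks, acts as a separator.
WORD_PATTERN = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?")


def simple_segment(text):
    """Split English text into lowercase word tokens (hypothetical helper)."""
    # Lowercasing is an assumption here; a real tool may preserve case.
    return WORD_PATTERN.findall(text.lower())


sample = "Hard work pays.\n\nDon't stop, OK?"
print(simple_segment(sample))  # ['hard', 'work', 'pays', "don't", 'stop', 'ok']
```

A pattern like this only suits languages that separate words with spaces and punctuation. Unsegmented scripts such as Chinese need a dedicated segmenter.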
ConcGramCore runs on 32-bit and 64-bit Microsoft® Windows computers with 4GB or more of memory installed. Because the engine uses hard disk space to swap the concgrams it finds, a faster hard disk gives better performance. At least 250GB of free disk space and 4GB of memory are recommended for a wide-span search on a multi-million-word corpus.
You will still need ConcGram or other corpus software (e.g. AntConc) to look up concordances for the concgrams that ConcGramCore finds.