ChEnPat & LEnChPat: Chinese-English Parallel Patent Corpora

We have been building Chinese-English parallel corpora since 2007 at the Language Information Sciences Research Centre, City University of Hong Kong. There are currently two such corpora, both made up of parallel sentences extracted from comparable patents.

Note: The corpora are at a preliminary stage, and we are still working to improve them. Any comments or suggestions are greatly appreciated.
The data collections are copyrighted by the Language Information Sciences Research Centre, City University of Hong Kong.

Parallel Corpora

Other References

The main papers relevant to the corpora:

Contact Information

We are planning to make part of the corpora public to the research community.
If you are interested in the corpora or have any questions about them, please contact Professor Benjamin K. Tsou at the Language Information Sciences Research Centre, City University of Hong Kong.
    Bin LU (PhD student) : lubin2010 at gmail DOT com
    Professor Benjamin K. Tsou : rlbtsou at cityu DOT edu DOT hk

version 0.1
last modified March 16, 2010