The CSIS project builds upon a core research capability in multimedia content analysis within the health domain, focusing on a particular application: support systems for cancer management, both for individual patients and for population-level analyses.
The cancer "stage" is a categorisation of its progression in the body, in terms of the extent of the primary tumour and any spread to local or distant body sites. Although staging has a fundamental role in cancer management, cancer patients are not always staged, because the task is multidisciplinary and demands considerable expertise and time. Automating the collation, analysis, summarisation and classification of relevant patient data can lessen the reliance on expert clinical staff, improving the efficiency and availability of cancer staging.
Initial work will investigate the summarisation and categorisation of patient reports to assist with staging lung cancer. Longer-term research will investigate extending this in three ways:
- Extensions to handle other data and cancer types. Initial work is focusing on staging lung cancer using text reports (radiology, histology); however, opportunities exist to extend this to bowel and other cancers, and also to use information extracted from other forms of data, for example, radiological images.
- Classifying cancer characteristics other than stage. The techniques used to classify cancer stage may be extended to other tasks, such as filtering patient data (for example, screening for cancer versus non-cancer) or classifying cancer types.
- Population-level analyses. Statistical models may be used to identify trends and anomalies in cancer patient demographics or in treatment and response characteristics, based on metadata extracted through the automatic content analysis techniques (for example, cancer type and cancer stage).
The CSIS project, in collaboration with the Queensland Cancer Control Analysis Team (QCCAT), has produced a prototype software system for automatic pathological staging of lung cancer, developed on data from a set of 710 lung cancer patients. The system takes as input one or more free-text reports for a patient describing surgical resections of the lung, and outputs a pathological T and N stage. In addition, it produces an extract consisting of the sentences found to contribute to the final staging decision, together with their relationship to criteria from the formal staging guidelines for lung cancer. The system has been formally trialled in a clinical setting on a previously unseen set of 179 lung cancer cases. The trial compared the automatic stage decisions to the stages assigned by two expert pathologists. The promising results obtained in the trial have motivated the development of a production-quality system suitable for deployment within cancer registries. The first release of this product is scheduled for December 2007.
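The shape of the prototype's output (a T stage, an N stage, and an extract of supporting sentences linked to guideline criteria) can be illustrated with a minimal Python sketch. This is not the CSIS implementation: the names `Evidence`, `StagingResult` and `combine` are hypothetical, the criterion labels and example sentences are made up, and the rule of reporting the highest supported category on each axis (or `X` when nothing is found) is an assumption made only for illustration. The sketch also assumes that sentence-level evidence has already been extracted from the reports.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Evidence:
    """One report sentence linked to a staging criterion (hypothetical structure)."""
    sentence: str   # sentence taken from a free-text report
    criterion: str  # illustrative label for a guideline criterion
    axis: str       # "T" or "N"
    category: int   # category the sentence supports, e.g. 2 for T2


@dataclass
class StagingResult:
    t_stage: str
    n_stage: str
    extract: List[Evidence] = field(default_factory=list)


def combine(evidence: List[Evidence]) -> StagingResult:
    """Assumed rule: per axis, report the highest supported category, or X if none."""
    stage = {}
    for axis in ("T", "N"):
        categories = [e.category for e in evidence if e.axis == axis]
        stage[axis] = f"{axis}{max(categories)}" if categories else f"{axis}X"
    # Keep only the sentences that support the final decision on their axis.
    extract = [e for e in evidence if f"{e.axis}{e.category}" == stage[e.axis]]
    return StagingResult(stage["T"], stage["N"], extract)


# Made-up evidence for a single patient, for illustration only.
example = [
    Evidence("Tumour measures 4 cm in greatest dimension.", "tumour size", "T", 2),
    Evidence("A separate 1 cm nodule is noted.", "tumour size", "T", 1),
    Evidence("No metastasis identified in hilar lymph nodes.", "nodal involvement", "N", 0),
]
result = combine(example)
print(result.t_stage, result.n_stage)
for item in result.extract:
    print(f"[{item.criterion}] {item.sentence}")
```

Keeping the criterion label alongside each sentence mirrors the extract described above, so a reviewer can trace a stage decision back to the report text that supports it.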

