
How Celsus gathers and evaluates medical feedback


Creating AI-based medical services is not an easy process. This is partly because the field is relatively new and has unique characteristics; healthcare is regarded as one of the most difficult industries for a reason. The medical profession's conservatism is understandable given that patients' lives and health are at stake. As a result, maintaining regular contact with medical professionals becomes a crucial practice for any medical AI developer.

For more than five years, we have been developing AI solutions that help doctors diagnose faster. In this piece, we describe the different kinds of feedback we get from radiologists in the course of our work and how we use it.

Feedback during medical AI training

During the system's development, doctors provide us with the most direct feedback. First, we need medical data labeled by radiologists to train the model. From this labeling, the neural network learns to recognize the patterns and signs typical of different findings on medical images, whether a lymph node or a malignant tumour.

However, if we ask just one doctor for their perspective while creating an AI system, we get only one person's opinion, and different doctors may have quite different views on the same study. In addition, in practice we frequently encounter cases that are unclear, if not contradictory.

To make the data we use to train the neural network as objective as possible, we run something like a virtual consultation, where doctors can exchange thoughts on a specific case and learn the opinions of their peers. Typically, we reach a consensus during such discussions.
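As a rough illustration of how studies might be picked for such a consultation, the sketch below separates studies on which all doctors agree from those that need discussion. The function name, study IDs, doctor names, and labels are all hypothetical; this is not a description of our actual tooling.

```python
from collections import Counter


def split_by_agreement(labels_by_study):
    """Separate unanimously labeled studies from those needing discussion.

    labels_by_study maps a study ID to a {doctor: label} dict
    (a hypothetical structure chosen for this example).
    """
    agreed, disputed = {}, {}
    for study_id, labels in labels_by_study.items():
        counts = Counter(labels.values())
        if len(counts) == 1:
            # Every doctor gave the same label.
            agreed[study_id] = next(iter(counts))
        else:
            # Conflicting (or missing) labels: send to the consultation.
            disputed[study_id] = dict(counts)
    return agreed, disputed


# Illustrative data only.
labels_by_study = {
    "study-001": {"dr_a": "malignant", "dr_b": "malignant", "dr_c": "malignant"},
    "study-002": {"dr_a": "benign", "dr_b": "malignant", "dr_c": "benign"},
}
agreed, disputed = split_by_agreement(labels_by_study)
```

Here "study-002" would go to the doctors for discussion, while "study-001" could be used as labeled.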

We also have a dedicated consultant radiologist for each diagnostic area in which we work (mammography, fluorography, chest computed tomography, and brain computed tomography). Because we, the developers, lack medical knowledge and clinical experience, this more experienced doctor verifies the data labeling done by other doctors. The consultant advises our ML experts, settles disputes over how studies should be assessed, and helps write instructions for the doctors who label data. We describe their duties, and the role of doctors within our team in general, in more detail in a separate post.

User feedback

Doctors who use our products directly are another valuable source of feedback. For example, in Moscow, Celsus solutions are used in more than 100 medical institutions as part of the Department of Health's experiment. The Scientific and Practical Clinical Center of Diagnostics and Telemedicine regularly holds webinars for developers, where doctors discuss how the AI services perform and give feedback.

Medical experts who want artificial intelligence to genuinely serve as their assistant are particularly engaged in the Moscow experiment. Numerous doctors got in touch with us and invited us to their hospitals to give feedback on how we could improve our service. We held meetings of this kind with many doctors a year and a half ago. They gave us two to three hours, during which they tested the new version in action and told us about the system's strengths and areas for improvement.

For example, a new version of our system for processing chest CT scans once detected pleural effusion in the sinuses and corners at the bottom of the lungs. In fact, it was not pleural effusion, only overlapping folds of the lungs. We acted on the comments immediately: we annotated extra data and retrained the model, and within a few weeks the issue was fully eliminated.

Comments from people who have not yet used Celsus

As mentioned above, more than 100 hospitals participate in the Moscow experiment using Celsus solutions. However, services from many other developers also take part in the experiment, and each hospital selects the AI system it considers the best.

To get feedback, we visited more than 20 hospitals that do not use our services and asked which service they thought was the best and why. This allowed us to pinpoint several key points and improve our solutions once more. We then asked the doctors we had interviewed to evaluate our service again in light of the improvements and compare it with the others. As a result, several hospitals decided to use our services after noticing the improvements.

Comments from those planning to use Celsus

Our solutions are used beyond Moscow as well. More than 30 regions of the Russian Federation currently use them, and we are working to broaden their geographic reach. We always run pilot tests before expanding into a new region, because AI services sometimes do not work properly on the different types of medical images produced by different equipment. For this reason, during the pilot stage we continuously check the flow of studies for critical errors (for example, cases where the system has not processed anything at all).

We then ask doctors for their opinion once more. At this stage it typically works like this: a set of studies, for instance 100, is taken and analyzed by the system. The doctor then fills in a table that assesses the results of the system's analysis. There are four possible outcomes:

  • True positive: the system has identified signs of pathology, and they are actually present in the study.
  • False positive: the system has identified pathology, but the study does not contain any.
  • False negative: signs of pathology are present, but the system did not pick them up.
  • True negative: the system has not picked up any signs of pathology, and there are none in the study.

Doctors enter records in the table such as: in this study, a false positive result for this pathology. We also ask them to explain any errors and suggest improvements, and then we pass the document to the appropriate department for correction.
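To show how a table like this could be summarized, here is a minimal Python sketch that counts the four outcomes and derives sensitivity and specificity. The choice of metrics, the outcome codes, the function name, and the numbers are assumptions made for illustration; they are not real pilot results or a description of our internal process.

```python
from collections import Counter

# Hypothetical outcome codes a doctor might enter for each study.
OUTCOMES = ("TP", "FP", "FN", "TN")


def summarize(outcomes):
    """Count the four outcomes and derive sensitivity and specificity."""
    counts = Counter(outcomes)
    unknown = set(counts) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unexpected outcome labels: {sorted(unknown)}")
    tp, fp, fn, tn = (counts[k] for k in OUTCOMES)
    # Return None instead of dividing by zero when a class is absent.
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Illustrative numbers only: 100 studies in total.
pilot = ["TP"] * 18 + ["FP"] * 7 + ["FN"] * 2 + ["TN"] * 73
summary = summarize(pilot)
```

With these illustrative counts, sensitivity is 18 / 20 and specificity is 73 / 80. The doctors' written explanations of each error remain the more valuable part of the document, since they show what to fix.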


Developers need different types of feedback from the medical community at different stages of the system's development, and every stage depends on it. The Celsus team thanks all the doctors who have spoken and worked with us. We admire people like you, who aren't afraid to learn new skills for their jobs and thereby advance medicine.

If you want to test Celsus' artificial intelligence on your own data, send a request for demo access to the system. We'll talk to you soon!