From Veins to Bones: SE-RegUNet’s New Frontier in Medical Imaging

Maksim Sukhodolov October 28th, 2024

Bone diseases can develop unnoticed and lead to severe consequences if not detected in time. Doctors face numerous X-rays every day, and sometimes there’s simply not enough time for a detailed analysis.

Now imagine that AI can detect fractures and highlight problematic areas on its own. This could significantly speed up and simplify diagnostics.

At Setronica, we like to mix our curiosity with the newest technology. We’re all about exploring artificial intelligence (AI) in ways that are practical and get results.

We do three main things:

We try out new AI experiments that others have done, to see if they really work.
We come up with our own new ideas about machine learning and test them out.
We create AI solutions that solve real problems in the world.

This time we took on the task of adapting the SE-RegUNet architecture, originally developed for vascular analysis, to bone damage segmentation.

This model has already demonstrated high accuracy in complex segmentation tasks, and its capabilities were ideal for detecting fine and complex structures, such as bone fractures.

Detecting bone structures in medical imaging

SE-RegUNet overview

But what makes SE-RegUNet so effective? It combines RegNet blocks, which help improve feature extraction, and Squeeze-and-Excitation (SE) blocks, which focus attention on the most important areas of the image. It also uses image preprocessing techniques such as CLAHE and USM. Thanks to these architectural solutions, the model can distinguish complex structures better, even under low-contrast conditions.
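To make the idea of an SE block more concrete, here is a minimal NumPy sketch of the squeeze, excitation, and scale steps. It is an illustration only, not the SE-RegUNet implementation: the function name `se_block`, the random weights, and the reduction ratio of 4 are our assumptions.

```python
import numpy as np

def se_block(features, w1, w2):
    """Reweight the channels of a (C, H, W) feature map, SE style."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))             # shape (C,)
    # Excitation: a small bottleneck (ReLU, then sigmoid) yields per-channel gates.
    hidden = np.maximum(w1 @ squeezed, 0.0)           # shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # shape (C,)
    # Scale: every channel is multiplied by its gate.
    return features * gates[:, None, None], gates

# Illustrative sizes and random (untrained) weights.
rng = np.random.default_rng(0)
channels, reduction = 8, 4
features = rng.normal(size=(channels, 16, 16))
w1 = rng.normal(size=(channels // reduction, channels))
w2 = rng.normal(size=(channels, channels // reduction))
scaled, gates = se_block(features, w1, w2)
```

In a trained network the two weight matrices are learned, so the gates end up amplifying informative channels and damping the rest.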

Images before and after applying CLAHE and USM

CLAHE (Contrast Limited Adaptive Histogram Equalization), an advanced form of histogram equalization, excels at improving local contrast in images. It operates by dividing the image into small tiles and applying histogram equalization to each, while limiting contrast enhancement to mitigate noise amplification. This method is particularly valuable in medical imaging and for enhancing detail in areas that are overexposed or underexposed.
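The core per-tile step of CLAHE can be sketched in a few lines of NumPy. This is a simplified illustration under our own assumptions (the function name `clipped_equalize_tile` and the clip limit of 2.0 are hypothetical choices): it clips and equalizes a single tile, and leaves out the interpolation between neighboring tiles that a full CLAHE implementation, such as OpenCV's `cv2.createCLAHE`, also performs.

```python
import numpy as np

def clipped_equalize_tile(tile, clip_limit=2.0, bins=256):
    """Contrast-limited histogram equalization of one uint8 tile."""
    hist = np.bincount(tile.ravel(), minlength=bins).astype(np.float64)
    # Clip every bin at clip_limit times the average bin height...
    ceiling = clip_limit * tile.size / bins
    excess = np.sum(np.maximum(hist - ceiling, 0.0))
    hist = np.minimum(hist, ceiling)
    # ...and spread the clipped excess evenly over all bins.
    hist += excess / bins
    # Map intensities through the normalized cumulative histogram.
    cdf = np.cumsum(hist) / tile.size
    lut = np.round(cdf * (bins - 1)).astype(np.uint8)
    return lut[tile]

# A synthetic low-contrast tile with intensities between 100 and 120.
rng = np.random.default_rng(0)
tile = rng.integers(100, 121, size=(64, 64), dtype=np.uint8)
enhanced = clipped_equalize_tile(tile)
```

Clipping is what keeps the contrast gain bounded: without it, a nearly uniform tile would be stretched across the full intensity range and its noise amplified with it.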

USM (Unsharp Masking), on the other hand, is a sharpening technique that focuses on enhancing edge contrast in images. The process involves creating a blurred version of the original image, subtracting it from the original to form a mask, and then adding this mask back to the original image. This method effectively brings out fine details and textures, making it a staple in digital photography and image editing software.
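The blur, subtract, and add-back steps translate directly into code. The NumPy sketch below is illustrative rather than the preprocessing code used in the project; the function names and the `sigma` and `amount` defaults are our assumptions.

```python
import numpy as np

def gaussian_blur(image, sigma=2.0):
    """Separable Gaussian blur with reflected borders."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="reflect")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def unsharp_mask(image, sigma=2.0, amount=1.0):
    """Sharpen a uint8 grayscale image by adding back its high-frequency mask."""
    img = image.astype(np.float64)
    mask = img - gaussian_blur(img, sigma)      # original minus blurred copy
    sharpened = img + amount * mask             # add the mask back
    return np.clip(np.rint(sharpened), 0, 255).astype(np.uint8)

# A vertical step edge: dark on the left, bright on the right.
step = np.full((32, 32), 80, dtype=np.uint8)
step[:, 16:] = 160
sharp = unsharp_mask(step)
```

On the step edge the result overshoots on both sides of the boundary, which is exactly the extra edge contrast the technique is meant to produce, while flat regions stay unchanged.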

Optimization of the training process

When we began adapting SE-RegUNet for analyzing bone X-rays, we realized that adjusting the architecture was not enough to achieve stable and accurate results. We had to dig deeper into the training process, and it turned out to be quite challenging.

Fixing random seed

Every time the model initializes weights randomly, the results can vary greatly. This means that in one run, the model can achieve good results, and in another, things might go wrong.

In the early stages of the experiment, we noticed significant discrepancies in Dice Score and Accuracy across devices. We hypothesized that fixing the seed would help stabilize the metrics and reduce the impact of random processes.

With a fixed random seed, the model always starts training from the same initial parameters, which reduces metric fluctuations and speeds up the search for optimal solutions, because differences between runs can be attributed to our changes rather than to chance. Fixing the seed does not make the weights any less random; it makes a given random initialization repeatable, which improves reproducibility.
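A minimal sketch of seed fixing, assuming a setup that draws random numbers from Python's `random` module and NumPy. The helpers `set_seed` and `init_weights` are hypothetical names for illustration; deep learning frameworks have their own seeding calls (for example `torch.manual_seed` in PyTorch) that would need to be set as well.

```python
import random
import numpy as np

def set_seed(seed):
    """Seed every random number generator this sketch relies on."""
    random.seed(seed)
    np.random.seed(seed)

def init_weights(shape):
    """Hypothetical stand-in for a layer's random weight initialization."""
    return np.random.normal(0.0, 0.02, size=shape)

set_seed(42)
first = init_weights((3, 3))
set_seed(42)
second = init_weights((3, 3))
# Both runs start from identical weights because the seed is the same.
```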

Restart due to high initial loss

Early stopping

However, the random seed is not the only trick in our arsenal. We also used an early stopping threshold to speed up the training process.

Early stopping is a regularization technique that helps prevent overfitting by monitoring the model’s performance on a validation set during training. When the model’s performance on the validation set begins to degrade, training is halted, effectively “stopping early” before the model can overfit to the training data.
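A common way to implement this is a patience counter on the validation loss. The sketch below is a generic illustration, not our training code; the function name, the `patience` and `min_delta` parameters, and the loss values are made up for the example.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the index of the epoch at which training would be halted."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Validation loss improved: remember it and reset the counter.
            best = loss
            bad_epochs = 0
        else:
            # No improvement: stop once this has happened `patience` times in a row.
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return len(val_losses) - 1

# Illustrative validation losses: improving until epoch 3, then degrading.
history = [0.90, 0.70, 0.55, 0.50, 0.52, 0.53, 0.56, 0.60]
stop_epoch = early_stopping_epoch(history, patience=3)
```

In practice the weights from the best epoch (epoch 3 in this example) are usually the ones kept.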

When the model showed a large loss early in training (e.g., exceeding 0.6), we automatically stopped the run and restarted it with a new seed. This avoided futile attempts to “pull” the model out of a bad state and moved straight on to a more promising initialization. This approach saves significant time and resources.
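The restart logic can be sketched as a simple loop over seeds. Everything here except the 0.6 threshold mentioned above is an assumption for illustration: `first_epoch_loss` is a hypothetical stand-in that simulates the loss of a short training run, and `max_restarts` is an arbitrary safety limit.

```python
import random

LOSS_THRESHOLD = 0.6  # example threshold from the text

def first_epoch_loss(seed):
    """Hypothetical stand-in for a short training run; returns a simulated loss."""
    return random.Random(seed).uniform(0.3, 0.9)

def train_with_restarts(start_seed=0, max_restarts=20):
    """Try consecutive seeds until the early loss is at or below the threshold."""
    for attempt in range(max_restarts):
        seed = start_seed + attempt
        loss = first_epoch_loss(seed)
        if loss <= LOSS_THRESHOLD:
            return seed, loss          # promising start: keep training with this seed
        # Otherwise abandon the run and try the next seed.
    raise RuntimeError("no seed produced an acceptable initial loss")

chosen_seed, chosen_loss = train_with_restarts()
```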

Mismatch between image and mask tensors

During our work, we also encountered a critical issue where image and mask tensors were mismatched in size. Such mismatches can arise due to various reasons, including inconsistent preprocessing steps, data augmentation errors, or discrepancies in the dataset preparation pipeline.

This caused major errors during model training. The solution involved recalculating and ensuring consistent tensor dimensions, which allowed the model to properly align the images and masks during training.
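One possible way to enforce consistent dimensions is to check each image and mask pair before it reaches the model and resize the mask when needed. The sketch below is an assumption about how such a check could look, not the exact fix we applied; `align_mask` is a hypothetical helper. It uses nearest-neighbor sampling so that a binary mask stays binary.

```python
import numpy as np

def align_mask(image, mask):
    """Resize `mask` to the image's height and width with nearest-neighbor sampling."""
    height, width = image.shape[:2]
    if mask.shape[:2] == (height, width):
        return mask
    # For every target pixel, pick the index of the nearest source pixel.
    rows = np.arange(height) * mask.shape[0] // height
    cols = np.arange(width) * mask.shape[1] // width
    return mask[rows[:, None], cols]

# Illustrative mismatch: the mask is half the size of the image.
image = np.zeros((64, 48), dtype=np.float32)
mask = np.zeros((32, 24), dtype=np.uint8)
mask[8:16, 6:12] = 1
aligned = align_mask(image, mask)
# Fail fast if the shapes still disagree.
assert aligned.shape == image.shape[:2]
```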

Longer training opportunities

But what if you have time for long training? In this case, you can use the classic method – increasing the number of epochs.

Training a neural network is iterative: the model sees the entire training dataset multiple times, and each full pass is called an epoch. Increasing the number of epochs gives the model more passes over the data:

In each epoch, the model processes every sample in the training set.
After each batch of samples, the model’s parameters are updated based on the calculated loss.
At the end of each epoch, validation is typically performed to assess performance.
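The epoch structure described above can be sketched with a toy model. This example fits a straight line with mini-batch gradient descent in NumPy; the data, learning rate, batch size, and number of epochs are all illustrative assumptions and have nothing to do with the SE-RegUNet training setup.

```python
import numpy as np

# Synthetic data: y = 3x + 0.5 plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 3.0 * x + 0.5 + rng.normal(0, 0.05, size=200)
x_train, y_train, x_val, y_val = x[:160], y[:160], x[160:], y[160:]

w, b = 0.0, 0.0
lr, batch_size = 0.1, 16
val_history = []
for epoch in range(30):
    # One epoch: visit every training sample once, in shuffled order.
    order = rng.permutation(len(x_train))
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        err = w * x_train[idx] + b - y_train[idx]
        # Update the parameters from the gradient of the mean squared error.
        w -= lr * 2 * np.mean(err * x_train[idx])
        b -= lr * 2 * np.mean(err)
    # Validate at the end of the epoch.
    val_history.append(float(np.mean((w * x_val + b - y_val) ** 2)))
```

Plotting `val_history` shows the validation loss falling as epochs accumulate, which is the effect that longer training relies on.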

All of this allows the model to gradually accumulate knowledge and correct mistakes. But to ensure that the model is learning effectively, we used a technique called five-fold cross-validation testing.

What is five-fold testing?

It is a method where the data is divided into five parts, and in each cycle, four are used for training and one for testing. Thus, each part of the data is used for testing exactly once.

This allows for

assessing the model’s stability;
ensuring that the results do not depend on one particular random split of the data;
obtaining a more reliable accuracy estimate through broader analysis.

Each fold (cycle) gives its evaluation, and in the end, we can calculate the average Dice Score and Accuracy. This gives us a more objective picture of how the model works on real data.
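The splitting scheme and the Dice Score can be sketched in NumPy as follows. The helper names, the sample count of 23, and the tiny example masks are our assumptions for illustration; libraries such as scikit-learn provide ready-made fold splitters that could be used instead.

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    order = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(order, n_folds)
    for i in range(n_folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, folds[i]

splits = list(five_fold_indices(23))
tested = np.concatenate([test for _, test in splits])
# Every sample index appears in exactly one test fold.

# Dice on a tiny example: one overlapping pixel out of 2 predicted and 1 true.
example_dice = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

In a real run, the model would be trained on each `train` split, scored on the matching test split, and the five per-fold scores averaged, for example with `np.mean`.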

Five-fold cross-validation results

Not only do these measures speed up training, but they also make the model more resilient to “bad” data sets or unexpected errors.

Project discoveries

We weren’t just recreating a model for vascular segmentation – our goal was to adapt it for analyzing bone X-rays. And here SE-RegUNet showed its strength. Thanks to the use of RegNet blocks and SE blocks (Squeeze-and-Excitation), the model effectively extracts important features from images and highlights areas that need special attention, such as zones where fractures might be present.

Advantages of SE-RegUNet for bone analysis

Enhanced structure extraction. The complexity of X-rays lies in the fact that bones can overlap, making visual analysis difficult. SE-RegUNet helps solve this problem by focusing on key areas of the image.
Increased accuracy due to image preprocessing. We used a combination of the CLAHE and USM methods, which improve contrast and sharpness, making fractures more visible.
Model scalability. It can be easily configured for other tasks, for example, lung image analysis or even satellite image structure segmentation. However, it’s important to note that the model isn’t always suitable for all types of images.

Generated mask applied to the input X-ray image

Where can the model be applied?

Medical diagnostics is the main focus of our project. SE-RegUNet can be used to detect fractures, cracks, and other anomalies on bone X-rays. This can speed up doctors’ work and reduce the risk of human error.

Segmentation of natural objects – for example, highlighting rivers on geographical maps. The model shows high accuracy in tasks where it’s important to distinguish complex structures from background objects.

Where should the model not be applied?

Despite all its advantages, SE-RegUNet has limitations. For example, it should not be used in tasks requiring real-time data processing. Due to its complex architecture, the model requires significant computational resources, and its speed may not be sufficient for processing video or streaming data.

Additionally, in some cases, artifacts may appear, especially on image edges, where bones or other objects may be obscured by other structures. We plan to solve this problem in future versions of the model, possibly using other approaches, such as CIDN.

What’s next?

Integration of CIDN methods: We plan to implement CIDN methods to eliminate artifacts on image edges and improve segmentation accuracy.

Performance optimization: Our plans include reducing processing time and lowering resource requirements for more efficient application of the model in practical conditions.

Conclusion

We successfully adapted and implemented the SE-RegUNet model for fracture detection alongside vascular detection. By overcoming technical challenges, we improved the training process, reduced losses by fixing the random seed and applying an early stopping threshold, and achieved high accuracy in our experiments.

We keep up with the newest AI developments and help businesses use them. If you want to

learn about big AI language tools,
deal with AI ethics issues,
create special AI solutions for your needs,

Setronica can help you with all of these – contact us here.

We work together with our clients, mixing careful research with new ideas. This way, your AI projects are based on good science and real-world use. We’d love to work with you to create the future of AI.