It has been only five days since DeepSeek V4 was released, and there has been something new almost every day. Yesterday, researchers predicted that multimodal capabilities were coming, and today a gray-release (limited rollout) test is already under way. Many users have noticed that the DeepSeek web page has added an image recognition mode, which means the model can now understand image content. Although this ability will not directly improve the model's programming or reasoning performance, it makes the product much more convenient to use. If you run into a problem in daily life, you can upload a screenshot and let DeepSeek analyze it directly, which is easier than describing the problem in words.

Users who have been included in the gray-release test have also tried the feature on specialized images. For example, some used hospital CT images to check DeepSeek's image recognition capabilities, and the results were surprising.

The CT image uploaded by @brick, a user in the Linux.do community, comes from a professional paper. DeepSeek accurately identified the content of the image and gave a professional-style analysis. It ultimately produced several results indicating possible directions for the disease, including several different types of pneumonia.

The paper that the CT image comes from states a clear conclusion. Comparing the two shows that DeepSeek's analysis is quite reliable, and in this respect it can take on the role of an AI doctor.

However, AI is still AI. It can help people analyze their situation, but major medical examinations and the confirmation of a disease still require analysis and confirmation by hospitals and doctors.

For conditions that are not serious, AI can in practice serve as a doctor for common medical problems. There are also many AI applications built on specialized medical large models, and they are adequate for assessing a problem and offering suggestions when one comes up. There is no need to go to the hospital and wait in line for minor issues.

Returning to DeepSeek: the team has done multimodal research before, and its open-source OCR technology has even reached a world-leading level. Its visual capabilities are therefore worth looking forward to, and they should further extend both the range of tasks the DeepSeek V4 large model can handle and the upper limit of what it can be used for.