Failure to recognize differently-coloured text within larger chunk of text #69

ngoomie · 2024-06-20T02:37:37Z

Hi! I'm using manga-ocr to help me play through the PC release of the Ace Attorney trilogy. The games often highlight certain keywords in a different colour for emphasis, and when this happens, manga-ocr will usually fail to recognize the differently-coloured text. If the entire block of text is in a colour other than black or white, it's fine, and it's also fine if you select just the word that manga-ocr failed to process on the prior attempt.

An example:

On the first pass, this gets OCR'd as:

裁判が進むと、こんなふうに《．．．》が提出されていくの。

On a second pass over just the red word, it gets OCR'd correctly as 証拠品.

This has happened rather consistently with any instances of text like this that I've found. I can probably provide more examples if needed.

I also understand that this is somewhat out of the scope of manga-ocr since, well, it's not manga! And it technically works fine enough to be usable anyways. So I understand if this issue doesn't get touched on at all, but I figured it might be worth reporting anyways just in case.

ngoomie · 2024-06-20T02:58:47Z

Oh actually, another thing worth mentioning too is sometimes it just stops at the differently-coloured word and doesn't finish the rest of the chunk of text that was white. An example here too:

My OCR result was the following:

殺人現場から逃げていく被告人・矢張くんを

kha-white · 2024-06-21T20:14:09Z

Thank you, I'm not sure if I can do anything about it soon, but it's an interesting insight.

HighLiuk · 2024-08-30T07:32:52Z

@ngoomie MangaOcr preprocesses the image by first converting it to grayscale. This is what your image looks like after that conversion.

As you can see, the red word is barely readable. In your case (white text with some red, on a black background), it's probably better to preprocess the image yourself so that red turns into white, and then run the OCR.

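import numpy as np
from PIL import Image
from manga_ocr import MangaOcr

# The OCR instance that the call at the bottom relies on
mocr = MangaOcr()

img = Image.open('aceattorney.jpg')
# Drop any alpha channel so the array has exactly three channels
img = img.convert('RGB')
# Convert the image to a NumPy array of shape (height, width, 3)
img = np.array(img)
# Take the maximum across the RGB channels for each pixel,
# so saturated colours such as red become as bright as white
img = np.max(img, axis=2)
# Build a new grayscale image from the per-pixel maximum values
img = Image.fromarray(img)

print(mocr(img))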

And this is the recognized text:
裁判が進むと、こんなふうに《証拠品》が提出されていくの。
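A rough way to see why this helps is to compare the two conversions on single pixels. This is only a sketch: it assumes the grayscale step uses the standard luma weighting that Pillow documents for `convert('L')`, and it assumes the keyword is pure red, which the game's actual colour may not be. The helper names `luma` and `channel_max` are made up for this illustration.

```python
def luma(rgb):
    """Approximate ITU-R 601 luma, the weighting Pillow documents for convert('L')."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def channel_max(rgb):
    """Brightness taken as the largest of the three channels."""
    return max(rgb)


# Hypothetical sample pixels; the real screenshot will have other values.
pixels = {
    "white text": (255, 255, 255),
    "red keyword": (255, 0, 0),
    "black background": (0, 0, 0),
}

for name, rgb in pixels.items():
    print(f"{name}: luma={luma(rgb)}, max={channel_max(rgb)}")
```

Under those assumptions the red keyword drops to a dark gray with the luma conversion, much closer to the black background than to the white text, while the channel maximum keeps it as bright as the white text.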

This way of preprocessing the image also works if the text is green, blue, or any other colour. It does not work as-is for dark text on a light background, though: there the maximum pushes coloured text toward white, so you should probably take the minimum instead of the maximum.
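Both cases can be wrapped in one small helper. This is a minimal sketch, not part of manga-ocr: the function name `collapse_channels` and its `dark_text_on_light` flag are hypothetical, and you have to tell it yourself which kind of image you have.

```python
import numpy as np


def collapse_channels(rgb, dark_text_on_light=False):
    """Reduce an (H, W, 3) RGB array to a single (H, W) channel.

    Light text on a dark background: take the per-pixel maximum, so
    coloured text ends up as bright as white text.
    Dark text on a light background: take the per-pixel minimum, so
    coloured text ends up as dark as black text.
    """
    rgb = np.asarray(rgb, dtype=np.uint8)
    reducer = np.min if dark_text_on_light else np.max
    return reducer(rgb, axis=2)


# Tiny one-row examples: a red pixel next to a background pixel.
light_on_dark = [[[255, 0, 0], [0, 0, 0]]]        # red on black
dark_on_light = [[[255, 0, 0], [255, 255, 255]]]  # red on white

print(collapse_channels(light_on_dark).tolist())
print(collapse_channels(dark_on_light, dark_text_on_light=True).tolist())
```

The result is a 2-D `uint8` array, so it can be passed to `Image.fromarray` and then to the OCR in the same way as in the snippet above.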

Hope it helps.

HighLiuk · 2024-08-30T07:45:04Z
