Should LLMs just treat text content as an image?
