Image to Text using Python
In my last post we saw how to download youtube videos using python you can see here.
in this post we will discuss how can we use python for finding the text which is their on the image so lets begin.
Now this task can be achieved in both ways offline and online, if you want to achieve this offline then you need to take lots of headache like downloading the pytesseract , models and setting it up in this article we are going to use online method as it is fast and more convenient.
Requirements:- For this project we need the following things listed below.
- Make Sure Python is installed in your computer(I am using python 3.x ) .
- API
- Libraries :- cv2 , requests , io and PIL
How to get API ?
Ans:-
- visit OCR Space .
- click on the Free OCR API
- Click on "Get your free API" or click Get Your Free Api here
- After the registration on OCR SPACE site you can get your API.
- Now we are ready to go
How to install Libraries ?
Ans:- open your command prompt and type following commands.
- pip install opencv-python
- pip install requests
- pip install pillow
Working :- Before we Jump to the coding part lets understand the working first.
- Python program will take ".jpg" image as an input.
- perform some operation on the image .
- now using API it will send the post request to the ocr space site.
- if everything is ok till this step ocr space will give the response.
- After performing some parsing we can extract the text from the response.
Coding:- Now lets implement this in coding.
step 1 :- Import the libraries which is required in this project.
import cv2 import requests import io from PIL import Image
step 2 :- Once all the required libraries are imported next step is to read or load the image in the program. But we cannot read the image directly we need to take help of PIL library it has method Image.open(path of the image).
im1 = Image.open(input("Enter the image path"))
Now we have the image and we will save the image in the current working directory
im1.save('test.jpg')
step 3 :-We have our image in the current working directory but it is possible that size of the image is large i.e in several mega bytes which can slow down the request process therefore we need to apply compression on the image.but we cannot use image directly for compression first we need to convert it into arrays.opencv has imread(path_of_the_image) method that converts into the array.
path_image=r'test.jpg' img =cv2.imread(path_image)
step 4 :-In the img variable we have the array of our image now we can perform compression.
_ , compressedimage = cv2.imencode(".jpg",img,[1,90])#compression
Now we have the compressed image but for making the request we require the file in bytes format so lets convert in bytes using the io library. it has BytesIO(image) method that takes an image as an input and returns the bytes format of the image.
file_bytes = io.BytesIO(compressedimage)#converting to bytes
step 5 :- Now our image is ready to make request but for making a request we require a url where we will make request.we will use below url and API(enter your api in the api_token)
api_token = "Enter your Api here"#register on http://ocr.space/ for free API
api_token = "Enter your Api here"#register on http://ocr.space/ for free API
step 6 :- Now we will make post request and provide the bytes format of the image and our api_token.
response=requests.post(url_api,files={path_image:file_bytes},data={"apikey":api_token})
it will return the result in json format we can use below code for printing the result.
print("Detected Text are:-")print(response.json()['ParsedResults'][0]['ParsedText'])
You can get complete code here
output:-
Post a Comment