通用文字识别
适用场景
通用文字识别,是通过拍照、扫描等光学输入方式,将各种票据、卡证、表格、报刊、书籍等印刷品文字转化为图像信息,再利用文字识别技术将图像信息转化为计算机等设备可以使用的字符信息的技术。
- 可以对文档翻拍、街景翻拍等图片进行文字检测和识别,也可以集成于其他应用中,提供文字检测、识别的功能,并根据识别结果提供翻译、搜索等相关服务。
- 可以处理来自相机、图库等多种来源的图像数据,提供一个自动检测文本、识别图像中文本位置以及文本内容功能的开放能力。
- 支持特定角度范围内的文本倾斜、拍摄角度倾斜、复杂光照条件以及复杂文本背景等场景的文字识别。
效果如下图所示:

约束与限制
该能力当前不支持模拟器。
| AI能力 | 约束 |
|---|---|
| 文字识别 | - 支持的图片格式:JPEG、JPG、PNG。 - 支持的语言:简体中文、英文、日文、韩文、繁体中文。 - 文本长度:不超过10000字符。 - 支持文档印刷体识别,在识别手写字体方面能力有所欠缺。 - 输入图像具有合适成像的质量(建议720p以上),100px<高度<15210px,100px<宽度<10000px,高宽比例建议10:1以下(高度小于宽度的10倍),接近手机屏幕高宽比例为宜。 - 拍摄角度与文本所在平面垂直方向的夹角应小于30度。 |
开发步骤
-
在使用通用文字识别时,将实现文字识别的相关的类添加至工程。
import { textRecognition } from '@kit.CoreVisionKit'import { image } from '@kit.ImageKit';import { hilog } from '@kit.PerformanceAnalysisKit';import { BusinessError } from '@kit.BasicServicesKit';import { fileIo } from '@kit.CoreFileKit';import { photoAccessHelper } from '@kit.MediaLibraryKit'; -
简单配置页面的布局,并在Button组件添加点击事件,拉起图库,选择图片。
Button('选择图片').type(ButtonType.Capsule).fontColor(Color.White).alignSelf(ItemAlign.Center).width('80%').margin(10).onClick(() => {// 拉起图库,获取图片资源void this.selectImage();}) -
通过图库获取图片资源,将图片转换为PixelMap,并添加初始化和释放方法。
async aboutToAppear(): Promise<void> {const initResult = await textRecognition.init();hilog.info(0x0000, 'OCRDemo', `OCR service initialization result:${initResult}`);}async aboutToDisappear(): Promise<void> {await textRecognition.release();hilog.info(0x0000, 'OCRDemo', 'OCR service released successfully');}private async selectImage() {let uri = await this.openPhoto();if (uri === undefined) {hilog.error(0x0000, 'OCRDemo', "Failed to get uri.");return;}this.loadImage(uri);}private async openPhoto(): Promise<string> {return new Promise<string>((resolve) => {let photoPicker: photoAccessHelper.PhotoViewPicker = new photoAccessHelper.PhotoViewPicker();photoPicker.select({MIMEType: photoAccessHelper.PhotoViewMIMETypes.IMAGE_TYPE,maxSelectNumber: 1}).then((res: photoAccessHelper.PhotoSelectResult) => {resolve(res.photoUris[0]);}).catch((err: BusinessError) => {hilog.error(0x0000, 'OCRDemo', `Failed to get photo image uri. code: ${err.code}, message: ${err.message}`);resolve('');})})}private loadImage(name: string) {setTimeout(async () => {let imageSource: image.ImageSource | undefined = undefined;let fileSource = await fileIo.open(name, fileIo.OpenMode.READ_ONLY);imageSource = image.createImageSource(fileSource.fd);this.chooseImage = await imageSource.createPixelMap();}, 100)} -
实例化VisionInfo对象,并传入待检测图片的PixelMap。
VisionInfo为待OCR检测识别的入参项,目前仅支持PixelMap类型的视觉信息。
let visionInfo: textRecognition.VisionInfo = {pixelMap: this.chooseImage}; -
配置通用文本识别的配置项TextRecognitionConfiguration,用于配置是否支持朝向检测。
let textConfiguration: textRecognition.TextRecognitionConfiguration = {isDirectionDetectionSupported: false}; -
调用textRecognition的recognizeText接口,对识别到的结果进行处理。
当调用成功时,获取文字识别的结果;调用失败时,将返回对应错误码。recognizeText接口提供了三种调用形式,当前以其中一种作为示例,其他方式可参考API参考。
textRecognition.recognizeText(visionInfo, textConfiguration).then((data: textRecognition.TextRecognitionResult) => {// 识别成功,获取对应的结果let recognitionString = JSON.stringify(data);hilog.info(0x0000, 'OCRDemo', `Succeeded in recognizing text: ${recognitionString}`);// 将结果更新到Text中显示this.dataValues = data.value;}).catch((error: BusinessError) => {hilog.error(0x0000, 'OCRDemo', `Failed to recognize text. Code: ${error.code}, message: ${error.message}`);this.dataValues = `Error: ${error.message}`;});
开发实例
点击按钮,识别一张图片的文字内容,并通过日志打印。
Index.ets
import { textRecognition } from '@kit.CoreVisionKit'
import { image } from '@kit.ImageKit';
import { hilog } from '@kit.PerformanceAnalysisKit';
import { BusinessError } from '@kit.BasicServicesKit';
import { fileIo } from '@kit.CoreFileKit';
import { photoAccessHelper } from '@kit.MediaLibraryKit';
@Entry
@Component
struct Index {
private imageSource: image.ImageSource | undefined = undefined;
@State chooseImage: PixelMap | undefined = undefined;
@State dataValues: string = '';
async aboutToAppear(): Promise<void> {
const initResult = await textRecognition.init();
hilog.info(0x0000, 'OCRDemo', `OCR service initialization result:${initResult}`);
}
async aboutToDisappear(): Promise<void> {
await textRecognition.release();
hilog.info(0x0000, 'OCRDemo', 'OCR service released successfully');
}
build() {
Column() {
Image(this.chooseImage)
.objectFit(ImageFit.Fill)
.height('60%')
Text(this.dataValues)
.copyOption(CopyOptions.LocalDevice)
.height('15%')
.margin(10)
.width('60%')
Button('选择图片')
.type(ButtonType.Capsule)
.fontColor(Color.White)
.alignSelf(ItemAlign.Center)
.width('80%')
.margin(10)
.onClick(() => {
// 拉起图库,获取图片资源
void this.selectImage();
})
Button('开始识别')
.type(ButtonType.Capsule)
.fontColor(Color.White)
.alignSelf(ItemAlign.Center)
.width('80%')
.margin(10)
.onClick(() => {
this.textRecognitionTest();
})
}
.width('100%')
.height('100%')
.justifyContent(FlexAlign.Center)
}
private textRecognitionTest() {
if (!this.chooseImage) {
return;
}
// 调用文本识别接口
let visionInfo: textRecognition.VisionInfo = {
pixelMap: this.chooseImage
};
let textConfiguration: textRecognition.TextRecognitionConfiguration = {
isDirectionDetectionSupported: false
};
textRecognition.recognizeText(visionInfo, textConfiguration)
.then((data: textRecognition.TextRecognitionResult) => {
// 识别成功,获取对应的结果
let recognitionString = JSON.stringify(data);
hilog.info(0x0000, 'OCRDemo', `Succeeded in recognizing text: ${recognitionString}`);
// 将结果更新到Text中显示
this.dataValues = data.value;
})
.catch((error: BusinessError) => {
hilog.error(0x0000, 'OCRDemo', `Failed to recognize text. Code: ${error.code}, message: ${error.message}`);
this.dataValues = `Error: ${error.message}`;
});
}
private async selectImage() {
let uri = await this.openPhoto();
if (uri === undefined) {
hilog.error(0x0000, 'OCRDemo', "Failed to get uri.");
return;
}
this.loadImage(uri);
}
private async openPhoto(): Promise<string> {
return new Promise<string>((resolve) => {
let photoPicker: photoAccessHelper.PhotoViewPicker = new photoAccessHelper.PhotoViewPicker();
photoPicker.select({
MIMEType: photoAccessHelper.PhotoViewMIMETypes.IMAGE_TYPE,
maxSelectNumber: 1
}).then((res: photoAccessHelper.PhotoSelectResult) => {
resolve(res.photoUris[0]);
}).catch((err: BusinessError) => {
hilog.error(0x0000, 'OCRDemo', `Failed to get photo image uri. code: ${err.code}, message: ${err.message}`);
resolve('');
})
})
}
private loadImage(name: string) {
setTimeout(async () => {
try {
let fileSource = await fileIo.open(name, fileIo.OpenMode.READ_ONLY);
this.imageSource = image.createImageSource(fileSource.fd);
this.chooseImage = await this.imageSource.createPixelMap();
await fileIo.close(fileSource);
} catch (error) {
hilog.error(0x0000, 'OCRDemo', `Failed to open file. Error: ${error}`);
}
}, 100)
}
}