H5 Image Recognition

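Recognition comparison

1. Baidu recognition

Baidu's image search did not recognize the text particularly well. The test image and the result are shown below:

Test image:

Test result: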

2. Result with tesseract.js

H5 image recognition (using Tesseract.js)

Simple text is recognized reasonably well, but accuracy drops once the image gets slightly more complex. Still learning...

Installation

<code class="javascript"><script src='https://cdn.rawgit.com/naptha/tesseract.js/1.0.10/dist/tesseract.js'></script></code>

Or install it with npm.

PS: if installing with npm fails, cnpm can be used instead.

Usage

demo 1: using then

<code class="javascript">var Tesseract = require('tesseract.js')

Tesseract.recognize(myImage).then(function(result){
    console.log(result)
})</code>

demo 2: switching lang

<code class="javascript">Tesseract.recognize(myImage, {
    lang: 'spa',
    tessedit_char_blacklist: 'e'
}).then(function(result){
    console.log(result)
})</code>
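The options object in demo 2 can also be built in one place, so that the language and the optional character blacklist are set consistently. The sketch below is only an illustration: `buildOptions` is a hypothetical helper, not part of Tesseract.js, and the `chi_sim` fallback is an assumption made for this example.

```javascript
// Hypothetical helper: builds the options object passed as the second
// argument of Tesseract.recognize in demo 2.
function buildOptions(lang, blacklist) {
  // Fallback language is an assumption chosen for this sketch
  var options = { lang: lang || 'chi_sim' };
  if (blacklist) {
    // tessedit_char_blacklist lists characters the engine should not output
    options.tessedit_char_blacklist = blacklist;
  }
  return options;
}

console.log(buildOptions('spa', 'e'));
// { lang: 'spa', tessedit_char_blacklist: 'e' }
```

The result can be passed straight to `Tesseract.recognize(myImage, buildOptions('spa', 'e'))`.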

demo 3: (then, progress, catch, then, finally)

<code class="javascript">Tesseract.recognize(src, {
        lang: "chi_sim",
    })
    .progress(function(message) {
        console.log(message)
    })
    .catch(function(err) {
        console.error(err)
    })
    .then(function(result) {
        console.log(result.text)
    })
    .finally(function(resultOrError) {
        console.log(resultOrError)
    })</code>
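The progress callback in demo 3 only logs the raw message. A small formatter can turn it into a readable status line. This is a sketch under one assumption: that each message is an object with a `status` string and a `progress` number between 0 and 1. `formatProgress` is a hypothetical helper, not a Tesseract.js function.

```javascript
// Hypothetical helper: turns a progress message into "status: NN%".
// Assumed message shape: { status: string, progress: number in [0, 1] }.
function formatProgress(message) {
  var status = message && message.status ? message.status : 'working';
  var ratio = message && typeof message.progress === 'number' ? message.progress : 0;
  // Clamp to [0, 1] before converting to a whole percentage
  var percent = Math.round(Math.min(1, Math.max(0, ratio)) * 100);
  return status + ': ' + percent + '%';
}

console.log(formatProgress({ status: 'recognizing text', progress: 0.42 }));
// recognizing text: 42%
```

It could be used as `.progress(function(message) { console.log(formatProgress(message)) })`.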

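Parameters:

1. image

image is any ImageLike object; what qualifies depends on whether the code runs in the browser or under NodeJS.

It is the first argument, and it can be the path or URL of an image, a base64-encoded image, an Image object, and so on.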
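To make the accepted forms of the first argument concrete, the sketch below classifies an input before it is handed to the recognizer. `describeImageInput` is a hypothetical helper written for this illustration; Tesseract.js performs its own handling internally, and the data-URI check is an assumption about how base64 images are usually supplied.

```javascript
// Hypothetical helper (not part of Tesseract.js): reports which kind of
// image argument was supplied, mirroring the cases listed above.
function describeImageInput(input) {
  if (typeof input === 'string') {
    // Data URIs carry the base64 payload inline: "data:image/png;base64,...."
    if (/^data:image\/[a-z0-9.+-]+;base64,/i.test(input)) {
      return 'base64';
    }
    return 'path';
  }
  if (input !== null && typeof input === 'object') {
    // For example an Image object
    return 'object';
  }
  return 'unsupported';
}

console.log(describeImageInput('img/sample.png'));                     // path
console.log(describeImageInput('data:image/png;base64,iVBORw0KGgo=')); // base64
console.log(describeImageInput({ src: 'img/sample.png' }));            // object
```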

The full implementation:

<code class="javascript"><!DOCTYPE html>
<html>
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width,initial-scale=1,shrink-to-fit=no,user-scalable=no,minimum-scale=1,maximum-scale=1">
        <title>Image recognition</title>
        <style>
            body{margin:0 auto;width:500px;font-size:12px;font-family:arial,helvetica,sans-serif}
            fieldset{margin-bottom:10%;border:1px solid #ddd;border-radius:5px}
            img,select,button{cursor:pointer}
            img{background:#ddd}
            h2{font-weight:500;font-size:16px}
            fieldset legend{margin-left:33%}
        </style>
    </head>
    <body>
        <fieldset>
            <legend>
                <h2> Before recognition </h2>
            </legend>
            Select file: @@##@@
            <p>
                Select language:
                <select id="langsel" onchange="recognizeFile()">
                    <option value='afr'> Afrikaans </option>
                    <option value='ara'> Arabic </option>
                    <option value='aze'> Azerbaijani </option>
                    <option value='bel'> Belarusian </option>
                    <option value='ben'> Bengali </option>
                    <option value='bul'> Bulgarian </option>
                    <option value='cat'> Catalan </option>
                    <option value='ces'> Czech </option>
                    <option value='chi_sim' selected> Chinese </option>
                    <option value='chi_tra'> Traditional Chinese </option>
                    <option value='chr'> Cherokee </option>
                    <option value='dan'> Danish </option>
                    <option value='deu'> German </option>
                    <option value='ell'> Greek </option>
                    <option value='eng'> English </option>
                    <option value='enm'> English (Old) </option>
                    <option value='meme'> Internet Meme </option>
                    <option value='epo'> Esperanto </option>
                    <option value='epo_alt'> Esperanto alternative </option>
                    <option value='equ'> Math </option>
                    <option value='est'> Estonian </option>
                    <option value='eus'> Basque </option>
                    <option value='fin'> Finnish </option>
                    <option value='fra'> French </option>
                    <option value='frk'> Frankish </option>
                    <option value='frm'> French (Old) </option>
                    <option value='glg'> Galician </option>
                    <option value='grc'> Ancient Greek </option>
                    <option value='heb'> Hebrew </option>
                    <option value='hin'> Hindi </option>
                    <option value='hrv'> Croatian </option>
                    <option value='hun'> Hungarian </option>
                    <option value='ind'> Indonesian </option>
                    <option value='isl'> Icelandic </option>
                    <option value='ita'> Italian </option>
                    <option value='ita_old'> Italian (Old) </option>
                    <option value='jpn'> Japanese </option>
                    <option value='kan'> Kannada </option>
                    <option value='kor'> Korean </option>
                    <option value='lav'> Latvian </option>
                    <option value='lit'> Lithuanian </option>
                    <option value='mal'> Malayalam </option>
                    <option value='mkd'> Macedonian </option>
                    <option value='mlt'> Maltese </option>
                    <option value='msa'> Malay </option>
                    <option value='nld'> Dutch </option>
                    <option value='nor'> Norwegian </option>
                    <option value='pol'> Polish </option>
                    <option value='por'> Portuguese </option>
                    <option value='ron'> Romanian </option>
                    <option value='rus'> Russian </option>
                    <option value='slk'> Slovakian </option>
                    <option value='slv'> Slovenian </option>
                    <option value='spa'> Spanish </option>
                    <option value='spa_old'> Old Spanish </option>
                    <option value='sqi'> Albanian </option>
                    <option value='srp'> Serbian (Latin) </option>
                    <option value='swa'> Swahili </option>
                    <option value='swe'> Swedish </option>
                    <option value='tam'> Tamil </option>
                    <option value='tel'> Telugu </option>
                    <option value='tgl'> Tagalog </option>
                    <option value='tha'> Thai </option>
                    <option value='tur'> Turkish </option>
                    <option value='ukr'> Ukrainian </option>
                    <option value='vie'> Vietnamese </option>
                </select>
            </p>
            <p align="center">
                <button onclick="btn()">Run</button>
            </p>
        </fieldset>
        <fieldset>
            <legend>
                <h2> Output </h2>
            </legend>
            <div id="result"></div>
        </fieldset>
        <script src='img/tesseract.js'></script>
        <script>
            // Image to recognize, selected language, and the output element
            var src = document.querySelector("img").src,
                selectOption = "",
                result = document.querySelector("#result");

            // Remember the language chosen in the dropdown
            function recognizeFile() {
                var select = document.querySelector("#langsel")
                selectOption = select.options[select.selectedIndex].value;
            }

            // Run recognition with the chosen language (chi_sim by default)
            function btn() {
                Tesseract.recognize(src, {
                        lang: selectOption ? selectOption : "chi_sim",
                    })
                    .progress(function(message) {
                        console.log(message)
                    })
                    .catch(function(err) {
                        result.innerHTML = err;
                        console.error(err)
                    })
                    // The parameter is named data so it does not shadow the
                    // #result element declared above
                    .then(function(data) {
                        console.log(data.text)
                        result.innerHTML = data.text;
                    })
                    .finally(function(resultOrError) {
                        console.log(resultOrError)
                    })
            }
        </script>
    </body>
</html></code>

2. Supported languages:

| lang | Language |
| --- | --- |
| 'afr' | Afrikaans |
| 'ara' | Arabic |
| 'aze' | Azerbaijani |
| 'bel' | Belarusian |
| 'ben' | Bengali |
| 'bul' | Bulgarian |
| 'cat' | Catalan |
| 'ces' | Czech |
| 'chi_sim' | Chinese |
| 'chi_tra' | Traditional Chinese |
| 'chr' | Cherokee |
| 'dan' | Danish |
| 'deu' | German |
| 'ell' | Greek |
| 'eng' | English |
| 'enm' | English (Old) |
| 'epo' | Esperanto |
| 'epo_alt' | Esperanto alternative |
| 'equ' | Math |
| 'est' | Estonian |
| 'eus' | Basque |
| 'fin' | Finnish |
| 'fra' | French |
| 'frk' | Frankish |
| 'frm' | French (Old) |
| 'glg' | Galician |
| 'grc' | Ancient Greek |
| 'heb' | Hebrew |
| 'hin' | Hindi |
| 'hrv' | Croatian |
| 'hun' | Hungarian |
| 'ind' | Indonesian |
| 'isl' | Icelandic |
| 'ita' | Italian |
| 'ita_old' | Italian (Old) |
| 'jpn' | Japanese |
| 'kan' | Kannada |
| 'kor' | Korean |
| 'lav' | Latvian |
| 'lit' | Lithuanian |
| 'mal' | Malayalam |
| 'mkd' | Macedonian |
| 'mlt' | Maltese |
| 'msa' | Malay |
| 'nld' | Dutch |
| 'nor' | Norwegian |
| 'pol' | Polish |
| 'por' | Portuguese |
| 'ron' | Romanian |
| 'rus' | Russian |
| 'slk' | Slovakian |
| 'slv' | Slovenian |
| 'spa' | Spanish |
| 'spa_old' | Old Spanish |
| 'sqi' | Albanian |
| 'srp' | Serbian (Latin) |
| 'swa' | Swahili |
| 'swe' | Swedish |
| 'tam' | Tamil |
| 'tel' | Telugu |
| 'tgl' | Tagalog |
| 'tha' | Thai |
| 'tur' | Turkish |
| 'ukr' | Ukrainian |
| 'vie' | Vietnamese |
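A language code taken from user input can be checked against the list above before it is passed as lang. The sketch below is a hedged illustration: `LANGUAGES` holds only a handful of the codes listed, and `resolveLang` is a hypothetical helper, not part of Tesseract.js.

```javascript
// A few entries copied from the language list above (not the full set)
var LANGUAGES = {
  chi_sim: 'Chinese',
  chi_tra: 'Traditional Chinese',
  eng: 'English',
  jpn: 'Japanese',
  spa: 'Spanish'
};

// Hypothetical helper: returns code if it is a known language code,
// otherwise the fallback supplied by the caller.
function resolveLang(code, fallback) {
  var known = Object.prototype.hasOwnProperty.call(LANGUAGES, code);
  return known ? code : fallback;
}

console.log(resolveLang('spa', 'chi_sim'));     // spa
console.log(resolveLang('klingon', 'chi_sim')); // chi_sim
```

Using `hasOwnProperty` rather than a plain property lookup keeps inherited names such as `toString` from being accepted as language codes.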

Supported Tesseract parameters:

| Parameter | Default Value | Description |
| --- | --- | --- |
| ambigs_debug_level | | Debug level for unichar ambiguities |
| applybox_debug | | Debug level |
| applybox_exposure_pattern | .exp | Exposure value follows this pattern in the image filename. The name of the image files are expected to be in the form [lang].[fontname].exp[num].tif |
| applybox_learn_chars_and_char_frags_mode | | Learn both character fragments (as is done in the special low exposure mode) as well as unfragmented characters. |
| applybox_learn_ngrams_mode | | Each bounding box is assumed to contain ngrams. Only learn the ngrams whose outlines overlap horizontally. |
| applybox_page | | Page number to apply boxes from |
| assume_fixed_pitch_char_segment | | include fixed-pitch heuristics in char segmentation |
| bestrate_pruning_factor | | Multiplying factor of current best rate to prune other hypotheses |
| bidi_debug | | Debug level for BiDi |
| bland_unrej | | unrej potential with no checks |
| certainty_scale | | Certainty scaling factor |
| chop_center_knob | 0.15 | Split center adjustment |
| chop_centered_maxwidth | | Width of (smaller) chopped blobs above which we don’t care that a chop is not near the center. |
| chop_debug | | Chop debug |
| chop_enable | | Chop enable |
| chop_good_split | | Good split limit |
| chop_inside_angle | -50 | Min Inside Angle Bend |
| chop_min_outline_area | 2000 | Min Outline Area |
| chop_min_outline_points | | Min Number of Points on Outline |
| chop_new_seam_pile | | Use new seam_pile |
| chop_ok_split | 100 | OK split limit |
| chop_overlap_knob | 0.9 | Split overlap adjustment |
| chop_same_distance | | Same distance |
| chop_seam_pile_size | 150 | Max number of seams in seam_pile |
| chop_sharpness_knob | 0.06 | Split sharpness adjustment |
| chop_split_dist_knob | 0.5 | Split length adjustment |
| chop_split_length | 10000 | Split Length |
| chop_vertical_creep | | Vertical creep |
| chop_width_change_knob | | Width change adjustment |
| chop_x_y_weight | | X / Y length weight |
| chs_leading_punct | (‘`” | Leading punctuation |
| chs_trailing_punct1 | ).,;:?! | 1st Trailing punctuation |
| chs_trailing_punct2 | )’`” | 2nd Trailing punctuation |
| classify_adapt_feature_threshold | 230 | Threshold for good features during adaptive 0-255 |
| classify_adapt_proto_threshold | 230 | Threshold for good protos during adaptive 0-255 |
| classify_adapted_pruning_factor | 2.5 | Prune poor adapted results this much worse than best result |
| classify_adapted_pruning_threshold | -1 | Threshold at which classify_adapted_pruning_factor starts |
| classify_bln_numeric_mode | | Assume the input is numbers [0-9]. |
| classify_char_norm_range | 0.2 | Character Normalization Range … |
| classify_character_fragments_garbage_certainty_threshold | -3 | Exclude fragments that do not look like whole characters from training and adaption |
| classify_class_pruner_multiplier | | Class Pruner Multiplier 0-255: |
| classify_class_pruner_threshold | 229 | Class Pruner Threshold 0-255 |
| classify_cp_angle_pad_loose | | Class Pruner Angle Pad Loose |
| classify_cp_angle_pad_medium | | Class Pruner Angle Pad Medium |
| classify_cp_angle_pad_tight | | Class Pruner Angle Pad Tight |
| classify_cp_cutoff_strength | | Class Pruner CutoffStrength: |
| classify_cp_end_pad_loose | 0.5 | Class Pruner End Pad Loose |
| classify_cp_end_pad_medium | 0.5 | Class Pruner End Pad Medium |
| classify_cp_end_pad_tight | 0.5 | Class Pruner End Pad Tight |
| classify_cp_side_pad_loose | 2.5 | Class Pruner Side Pad Loose |
| classify_cp_side_pad_medium | 1.2 | Class Pruner Side Pad Medium |
| classify_cp_side_pad_tight | 0.6 | Class Pruner Side Pad Tight |
| classify_debug_character_fragments | | Bring up graphical debugging windows for fragments training |
| classify_debug_level | | Classify debug level |
| classify_enable_adaptive_debugger | | Enable match debugger |
| classify_enable_adaptive_matcher | | Enable adaptive classifier |
| classify_enable_learning | | Enable adaptive classifier |
| classify_font_name | UnknownFont | Default font name to be used in training |
| classify_integer_matcher_multiplier | | Integer Matcher Multiplier 0-255: |
| classify_learn_debug_str | | Class str to debug learning |
| classify_learning_debug_level | | Learning Debug Level: |
| classify_max_certainty_margin | 5.5 | Veto difference between classifier certainties |
| classify_max_norm_scale_x | 0.325 | Max char x-norm scale … |
| classify_max_norm_scale_y | 0.325 | Max char y-norm scale … |
| classify_max_rating_ratio | 1.5 | Veto ratio between classifier ratings |
| classify_max_slope | 2.41421 | Slope above which lines are called vertical |
| classify_min_norm_scale_x | | Min char x-norm scale … |
| classify_min_norm_scale_y | | Min char y-norm scale … |
| classify_min_slope | 0.414214 | Slope below which lines are called horizontal |
| classify_misfit_junk_penalty | | Penalty to apply when a non-alnum is vertically out of its expected textline position |
| classify_nonlinear_norm | | Non-linear stroke-density normalization |
| classify_norm_adj_curl | | Norm adjust curl … |
| classify_norm_adj_midpoint | | Norm adjust midpoint … |
| classify_norm_method | | Normalization Method … |
| classify_num_cp_levels | | Number of Class Pruner Levels |
| classify_pico_feature_length | 0.05 | Pico Feature Length |
| classify_pp_angle_pad | | Proto Pruner Angle Pad |
| classify_pp_end_pad | 0.5 | Proto Prune End Pad |
| classify_pp_side_pad | 2.5 | Proto Pruner Side Pad |
| classify_save_adapted_templates | | Save adapted templates to a file |
| classify_training_file | MicroFeatures | Training file |
| classify_use_pre_adapted_templates | | Use pre-adapted classifier templates |
| conflict_set_I_l_1 | Il1[] | Il1 conflict set |
| crunch_accept_ok | | Use acceptability in okstring |
| crunch_debug | | As it says |
| crunch_del_cert | -10 | POTENTIAL crunch cert lt this |
| crunch_del_high_word | 1.5 | Del if word gt xht x this above bl |
| crunch_del_low_word | 0.5 | Del if word gt xht x this below bl |
| crunch_del_max_ht | | Del if word ht gt xht x this |
| crunch_del_min_ht | 0.7 | Del if word ht lt xht x this |
| crunch_del_min_width | | Del if word width lt xht x this |
| crunch_del_rating | | POTENTIAL crunch rating lt this |
| crunch_early_convert_bad_unlv_chs | | Take out ~^ early? |
| crunch_early_merge_tess_fails | | Before word crunch? |
| crunch_include_numerals | | Fiddle alpha figures |
| crunch_leave_accept_strings | | Don’t pot crunch sensible strings |
| crunch_leave_lc_strings | | Don’t crunch words with long lower case strings |
| crunch_leave_ok_strings | | Don’t touch sensible strings |
| crunch_leave_uc_strings | | Don’t crunch words with long lower case strings |
| crunch_long_repetitions | | Crunch words with long repetitions |
| crunch_poor_garbage_cert | -9 | crunch garbage cert lt this |
| crunch_poor_garbage_rate | | crunch garbage rating lt this |
| crunch_pot_garbage | | POTENTIAL crunch garbage |
| crunch_pot_indicators | | How many potential indicators needed |
| crunch_pot_poor_cert | -8 | POTENTIAL crunch cert lt this |
| crunch_pot_poor_rate | | POTENTIAL crunch rating lt this |
| crunch_rating_max | | For adj length in rating per ch |
| crunch_small_outlines_size | 0.6 | Small if lt xht x this |
| crunch_terrible_garbage | | As it says |
| crunch_terrible_rating | | crunch rating lt this |
| cube_debug_level | | Print cube debug info. |
| dawg_debug_level | | Set to 1 for general debug info, to 2 for more details, to 3 to see all the debug messages |
| debug_acceptable_wds | | Dump word pass/fail chk |
| debug_file | | File to send tprintf output to |
| debug_fix_space_level | | Contextual fixspace debug |
| debug_noise_removal | | Debug reassignment of small outlines |
| debug_x_ht_level | | Reestimate debug |
| devanagari_split_debugimage | | Whether to create a debug image for split shiro-rekha process. |
| devanagari_split_debuglevel | | Debug level for split shiro-rekha process. |
| disable_character_fragments | | Do not include character fragments in the results of the classifier |
| doc_dict_certainty_threshold | -2.25 | Worst certainty for words that can be inserted into the document dictionary |
| doc_dict_pending_threshold | | Worst certainty for using pending dictionary |
| docqual_excuse_outline_errs | | Allow outline errs in unrejection? |
| edges_boxarea | 0.875 | Min area fraction of grandchild for box |
| edges_childarea | 0.5 | Min area fraction of child outline |
| edges_children_count_limit | | Max holes allowed in blob |
| edges_children_fix | | Remove boxy parents of char-like children |
| edges_children_per_grandchild | | Importance ratio for chucking outlines |
| edges_debug | | turn on debugging for this module |
| edges_max_children_layers | | Max layers of nested children inside a character outline |
| edges_max_children_per_outline | | Max number of children inside a character outline |
| edges_min_nonhole | | Min pixels for potential char in box |
| edges_patharea_ratio | | Max lensq/area for acceptable child outline |
| edges_use_new_outline_complexity | | Use the new outline complexity module |
| editor_dbwin_height | | Editor debug window height |
| editor_dbwin_name | EditorDBWin | Editor debug window name |
| editor_dbwin_width | | Editor debug window width |
| editor_dbwin_xpos | | Editor debug window X Pos |
| editor_dbwin_ypos | 500 | Editor debug window Y Pos |
| editor_debug_config_file | | Config file to apply to single words |
| editor_image_blob_bb_color | | Blob bounding box colour |
| editor_image_menuheight | | Add to image height for menu bar |
| editor_image_text_color | | Correct text colour |
| editor_image_win_name | EditorImage | Editor image window name |
| editor_image_word_bb_color | | Word bounding box colour |
| editor_image_xpos | 590 | Editor image X Pos |
| editor_image_ypos | | Editor image Y Pos |
| editor_word_height | 240 | Word window height |
| editor_word_name | BlnWords | BL normalized word window |
| editor_word_width | 655 | Word window width |
| editor_word_xpos | | Word window X Pos |
| editor_word_ypos | 510 | Word window Y Pos |
| enable_new_segsearch | | Enable new segmentation search path. |
| enable_noise_removal | | Remove and conditionally reassign small outlines when they confuse layout analysis, determining diacritics vs noise |
| equationdetect_save_bi_image | | Save input bi image |
| equationdetect_save_merged_image | | Save the merged image |
| equationdetect_save_seed_image | | Save the seed image |
| equationdetect_save_spt_image | | Save special character image |
| file_type | .tif | Filename extension |
| fixsp_done_mode | | What constitutes done for spacing |
| fixsp_non_noise_limit | | How many non-noise blbs either side? |
| fixsp_small_outlines_size | 0.28 | Small if lt xht x this |
| force_word_assoc | | force associator to run regardless of what enable_assoc is. This is used for CJK where component grouping is necessary. |
| fragments_debug | | Debug character fragments |
| fragments_guide_chopper | | Use information from fragments to guide chopping process |
| fx_debugfile | FXDebug | Name of debugfile |
| gapmap_big_gaps | 1.75 | xht multiplier |
| gapmap_debug | | Say which blocks have tables |
| gapmap_no_isolated_quanta | | Ensure gaps not less than 2 quanta wide |
| gapmap_use_ends | | Use large space at start and end of rows |
| heuristic_max_char_wh_ratio | | max char width-to-height ratio allowed in segmentation |
| heuristic_segcost_rating_base | 1.25 | base factor for adding segmentation cost into word rating. It’s a multiplying factor, the larger the value above 1, the bigger the effect of segmentation cost. |
| heuristic_weight_rating | | weight associated with char rating in combined cost of state |
| heuristic_weight_seamcut | | weight associated with seam cut in combined cost of state |
| heuristic_weight_width | 1000 | weight associated with width evidence in combined cost of state |
| hocr_font_info | | Add font info to hocr output |
| hyphen_debug_level | | Debug level for hyphenated words. |
| il1_adaption_test | | Don’t adapt to i/I at beginning of word |
| include_page_breaks | | Include page separator string in output text after each image/page. |
| interactive_display_mode | | Run interactively? |
| language_model_debug_level | | Language model debug level |
| language_model_fixed_length_choices_depth | | Depth of blob choice lists to explore when fixed length dawgs are on |
| language_model_min_compound_length | | Minimum length of compound words |
| language_model_ngram_nonmatch_score | -40 | Average classifier score of a non-matching unichar. |
| language_model_ngram_on | | Turn on/off the use of character ngram model |
| language_model_ngram_order | | Maximum order of the character ngram model |
| language_model_ngram_rating_factor | | Factor to bring log-probs into the same range as ratings when multiplied by outline length |
| language_model_ngram_scale_factor | 0.03 | Strength of the character ngram model relative to the character classifier |
| language_model_ngram_small_prob | 1e-06 | To avoid overly small denominators use this as the floor of the probability returned by the ngram model. |
| language_model_ngram_space_delimited_language | | Words are delimited by space |
| language_model_ngram_use_only_first_uft8_step | | Use only the first UTF8 step of the given string when computing log probabilities. |
| language_model_penalty_case | 0.1 | Penalty for inconsistent case |
| language_model_penalty_chartype | 0.3 | Penalty for inconsistent character type |
| language_model_penalty_font | | Penalty for inconsistent font |
| language_model_penalty_increment | 0.01 | Penalty increment |
| language_model_penalty_non_dict_word | 0.15 | Penalty for non-dictionary words |
| language_model_penalty_non_freq_dict_word | 0.1 | Penalty for words not in the frequent word dictionary |
| language_model_penalty_punc | 0.2 | Penalty for inconsistent punctuation |
| language_model_penalty_script | 0.5 | Penalty for inconsistent script |
| language_model_penalty_spacing | 0.05 | Penalty for inconsistent spacing |
| language_model_use_sigmoidal_certainty | | Use sigmoidal score for certainty |
| language_model_viterbi_list_max_num_prunable | | Maximum number of prunable (those for which PrunablePath() is true) entries in each viterbi list recorded in BLOB_CHOICEs |
| language_model_viterbi_list_max_size | 500 | Maximum size of viterbi lists recorded in BLOB_CHOICEs |
| load_bigram_dawg | | Load dawg with special word bigrams. |
| load_fixed_length_dawgs | | Load fixed length dawgs (e.g. for non-space delimited languages) |
| load_freq_dawg | | Load frequent word dawg. |
| load_number_dawg | | Load dawg with number patterns. |
| load_punc_dawg | | Load dawg with punctuation patterns. |
| load_system_dawg | | Load system word dawg. |
| load_unambig_dawg | | Load unambiguous word dawg. |
| m_data_sub_dir | tessdata/ | Directory for data files |
| matcher_avg_noise_size | | Avg. noise blob length |
| matcher_bad_match_pad | 0.15 | Bad Match Pad (0-1) |
| matcher_clustering_max_angle_delta | 0.015 | Maximum angle delta for prototype clustering |
| matcher_debug_flags | | Matcher Debug Flags |
| matcher_debug_level | | Matcher Debug Level |
| matcher_debug_separate_windows | | Use two different windows for debugging the matching: One for the protos and one for the features. |
| matcher_good_threshold | 0.125 | Good Match (0-1) |
| matcher_great_threshold | | Great Match (0-1) |
| matcher_min_examples_for_prototyping | | Reliable Config Threshold |
| matcher_perfect_threshold | 0.02 | Perfect Match (0-1) |
| matcher_permanent_classes_min | | Min # of permanent classes |
| matcher_rating_margin | 0.1 | New template margin (0-1) |
| matcher_sufficient_examples_for_prototyping | | Enable adaption even if the ambiguities have not been seen |
| max_permuter_attempts | 10000 | Maximum number of different character choices to consider during permutation. This limit is especially useful when user patterns are specified, since overly generic patterns can result in dawg search exploring an overly large number of options. |
| max_viterbi_list_size | | Maximum size of viterbi list. |
| merge_fragments_in_matrix | | Merge the fragments in the ratings matrix and delete them after merging |
| min_orientation_margin | | Min acceptable orientation margin |
| min_sane_x_ht_pixels | | Reject any x-ht lt or eq than this |
| ngram_permuter_activated | | Activate character-level n-gram-based permuter |
| noise_cert_basechar | -8 | Hingepoint for base char certainty |
| noise_cert_disjoint | -1 | Hingepoint for disjoint certainty |
| noise_cert_factor | 0.375 | Scaling on certainty diff from Hingepoint |
| noise_cert_punc | -3 | Threshold for new punc char certainty |
| noise_maxperblob | | Max diacritics to apply to a blob |
| noise_maxperword | | Max diacritics to apply to a word |
| numeric_punctuation | | Punct. chs expected WITHIN numbers |
| ocr_devanagari_split_strategy | | Whether to use the top-line splitting process for Devanagari documents while performing ocr. |
| ok_repeated_ch_non_alphanum_wds | -?*= | Allow NN to unrej |
| oldbl_corrfix | | Improve correlation of heights |
| oldbl_dot_error_size | 1.26 | Max aspect ratio of a dot |
| oldbl_holed_losscount | | Max lost before fallback line used |
| oldbl_xhfix | | Fix bug in modes threshold for xheights |
| oldbl_xhfract | 0.4 | Fraction of est allowed in calc |
| outlines_2 | ij!?%”:; | Non standard number of outlines |
| outlines_odd | | |
| output_ambig_words_file | | Output file for ambiguities found in the dictionary |
| page_separator | | |
| pageseg_devanagari_split_strategy | | Whether to use the top-line splitting process for Devanagari documents while performing page-segmentation. |
| paragraph_debug_level | | Print paragraph debug info. |
| paragraph_text_based | | Run paragraph detection on the post-text-recognition (more accurate) |
| permute_chartype_word | | Turn on character type (property) consistency permuter |
| permute_debug | | Debug char permutation process |
| permute_fixed_length_dawg | | Turn on fixed-length phrasebook search permuter |
| permute_only_top | | Run only the top choice permuter |
| permute_script_word | | Turn on word script consistency permuter |
| pitsync_fake_depth | | Max advance fake generation |
| pitsync_joined_edge | 0.75 | Dist inside big blob for chopping |
| pitsync_linear_version | | Use new fast algorithm |
| pitsync_offset_freecut_fraction | 0.25 | Fraction of cut for free cuts |
| poly_allow_detailed_fx | | Allow feature extractors to see the original outline |
| poly_debug | | Debug old poly |
| poly_wide_objects_better | | More accurate approx on wide things |
| preserve_interword_spaces | | Preserve multiple interword spaces |
| prioritize_division | | Prioritize blob division over chopping |
| quality_blob_pc | | good_quality_doc gte good blobs limit |
| quality_char_pc | 0.95 | good_quality_doc gte good char limit |
| quality_min_initial_alphas_reqd | | alphas in a good word |
| quality_outline_pc | | good_quality_doc lte outline error limit |
| quality_rej_pc | 0.08 | good_quality_doc lte rejection limit |
| quality_rowrej_pc | 1.1 | good_quality_doc gte good char limit |
| rating_scale | 1.5 | Rating scaling factor |
| rej_1Il_trust_permuter_type | | Don’t double check |
| rej_1Il_use_dict_word | | Use dictword test |
| rej_alphas_in_number_perm | | Extend permuter check |
| rej_trust_doc_dawg | | Use DOC dawg in 11l conf. detector |
| rej_use_good_perm | | Individual rejection control |
| rej_use_sensible_wd | | Extend permuter check |
| rej_use_tess_accepted | | Individual rejection control |
| rej_use_tess_blanks | | Individual rejection control |
| rej_whole_of_mostly_reject_word_fract | 0.85 | if >this fract |
| repair_unchopped_blobs | | Fix blobs that aren’t chopped |
| save_alt_choices | | Save alternative paths found during chopping and segmentation search |
| save_doc_words | | Save Document Words |
| save_raw_choices | | Deprecated, backward compatibility only |
| segment_adjust_debug | | Segmentation adjustment debug |
| segment_debug | | Debug the whole segmentation process |
| segment_nonalphabetic_script | | Don’t use any alphabetic-specific tricks. Set to true in the traineddata config file for scripts that are cursive or inherently fixed-pitch |
| segment_penalty_dict_case_bad | 1.3125 | Default score multiplier for word matches, which may have case issues (lower is better). |
| segment_penalty_dict_case_ok | 1.1 | Score multiplier for word matches that have good case (lower is better). |
| segment_penalty_dict_frequent_word | | Score multiplier for word matches which have good case and are frequent in the given language (lower is better). |
| segment_penalty_dict_nonword | 1.25 | Score multiplier for glyph fragment segmentations which do not match a dictionary word (lower is better). |
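When overriding parameters from the list above, it can help to send only the values that differ from their listed defaults. The following is a rough sketch, not the library's own mechanism: `DEFAULTS` copies just four rows from the list, `changedParams` is a hypothetical helper, and placing parameters next to lang in the options object is an assumption based on how demo 2 passes tessedit_char_blacklist.

```javascript
// A few defaults copied from the parameter list above (not the full set)
var DEFAULTS = {
  chop_ok_split: 100,
  chop_seam_pile_size: 150,
  rating_scale: 1.5,
  matcher_bad_match_pad: 0.15
};

// Hypothetical helper: keeps only overrides that differ from a known default.
// Parameters missing from DEFAULTS are always kept.
function changedParams(overrides) {
  var changed = {};
  Object.keys(overrides).forEach(function (name) {
    var known = Object.prototype.hasOwnProperty.call(DEFAULTS, name);
    if (!known || DEFAULTS[name] !== overrides[name]) {
      changed[name] = overrides[name];
    }
  });
  return changed;
}

// Assumption: parameters sit beside lang in the options object, as
// tessedit_char_blacklist does in demo 2
var options = Object.assign(
  { lang: 'chi_sim' },
  changedParams({ rating_scale: 1.5, chop_ok_split: 80 })
);
console.log(options); // { lang: 'chi_sim', chop_ok_split: 80 }
```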

GitHub repository
