PyTables time performance


I'm working on a project related to text detection in natural images. I have trained a classifier, and I'm using PyTables to store the information. I have:

  • 62 classes (a-z, A-Z, 0-9)
  • each class has between 100 and 600 tables
  • each table has a single column storing 32-bit floats
  • each column has between 2^2 and 2^8 rows (depending on parameters)

My problem is that after training the classifier, it takes a lot of time to read the information back during testing. For example: one database has 27,900 tables (62 classes * 450 tables per class) with 4 rows per table, and it took approximately 4 hours to read and retrieve the information I need. To get the info it needs, the test program reads each table 390 times for the classes a-z and A-Z, and 150 times for the classes 0-9. Is this normal? I tried using the index option on the single column, but I don't see any performance gain. I work on a virtual machine with 2 GB of RAM, on an HP Pavilion dv6 (4 GB DDR3 RAM, Core 2 Duo).

This is because column lookup on tables is one of the slower operations you can do, and it is where most of the information lives. You have two basic options to increase performance with tables that have many columns and few rows:

  1. Pivot the structure so that you have a table with many rows and few columns (see the sketch after this list).

  2. Move to a more efficient data structure, like a CArray or EArray, for every row/column.
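
Here is a minimal sketch of option 1, assuming the numbers from your question (4 rows per table, ~450 tables per class); the file name features.h5, the node name scores, and the column names are all made up for illustration. One wide table with a label column replaces the thousands of single-column tables, and a single in-kernel query fetches everything for a class:

    import numpy as np
    import tables

    N_ROWS = 4  # rows per original table; 2**2 in the example above

    # One row here replaces one of the original single-column tables.
    class Score(tables.IsDescription):
        label    = tables.StringCol(1)                 # class label, e.g. b'a'
        table_id = tables.Int32Col()                   # index within the class (0..449)
        values   = tables.Float32Col(shape=(N_ROWS,))  # the former column of floats

    with tables.open_file('features.h5', mode='w') as h5:
        scores = h5.create_table('/', 'scores', Score, expectedrows=27900)
        row = scores.row
        for label in (b'a', b'b', b'c'):   # toy data; the real set has 62 classes
            for tid in range(450):
                row['label'] = label
                row['table_id'] = tid
                row['values'] = np.random.rand(N_ROWS).astype(np.float32)
                row.append()
        scores.flush()

    # One in-kernel query replaces hundreds of separate table reads.
    with tables.open_file('features.h5', mode='r') as h5:
        vals = h5.root.scores.read_where('label == b"a"', field='values')

With this layout, indexing makes more sense too: one index on the label column (scores.cols.label.create_index()) serves every query, instead of an index per tiny table.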

Additionally, you can try using compression to speed things up (see the sketch below). This is all sort of generic advice, because you haven't included any code.
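
As a sketch of option 2 combined with compression (again untested against your data; the file/node names are invented and Blosc level 5 is just a common starting point, not a tuned choice): one compressed, extendable array per class replaces the ~450 per-class tables, so a whole class comes back in a single chunked read:

    import numpy as np
    import tables

    # Blosc is usually the fastest codec shipped with PyTables; zlib and lzo also work.
    filters = tables.Filters(complevel=5, complib='blosc')

    with tables.open_file('features_arrays.h5', mode='w') as h5:
        # One extendable array per class: 450 tables of 4 floats become a 450x4 array.
        arr = h5.create_earray('/', 'class_a', atom=tables.Float32Atom(),
                               shape=(0, 4), filters=filters, expectedrows=450)
        arr.append(np.random.rand(450, 4).astype(np.float32))

    with tables.open_file('features_arrays.h5', mode='r') as h5:
        data = h5.root.class_a[:]   # one chunked, compressed read for the whole class

Which codec and complevel win depends on your data, so it is worth timing a couple of settings on a real file.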

